| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
This removes one useless SMEM load operations which pointed to
the same descriptor anyway.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
The driver will emit GLC=1.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Both input and output images use the same type.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Cc: 19.1 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
This way we handle linear images etc. correctly.
Acked-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise LLVM can sink them and their texture coordinate calculations
into divergent branches.
v2: simplify the conditions on which the intrinsic is marked as convergent
v3: only mark as convergent in FS and CS with derivative groups
Cc: <[email protected]>
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes -Woverflow warnings with GCC 9.1.1
v2: use a cast instead of a bitwise and
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The driver should only fast depth clears with the graphics path
when the view covers all image layers, otherwise this might
corrupt layers when HTILE is enabled.
Cc: 19.0 19.1 [email protected]
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
It's unsupported, only load/store format with vec3 are supported.
Fixes: 6970a9a6ca9 ("ac,radv: remove the vec3 restriction with LLVM 9+")"
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
trivial
|
| |
|
|
|
|
|
|
|
| |
The automatic header generation unifies identical registers in a series
and only emits definitions for the first one. This is mostly to avoid
emitting excessive definitions for CB registers, but special-casing
an exception for this family of registers doesn't seem worth it.
|
|
|
|
|
|
|
|
|
|
|
| |
The definition of the fields differs, but PITCH_GFX9 is a mere extension
of PITCH_GFX6 that does not conflict with any other fields.
This aligns the definitions with what will be generated from the
register JSON.
The information about how large the fields really are is preserved in
the register database.
|
|
|
|
|
|
|
| |
This "register" name collides with R_370_CONTROL.
This aligns the definitions with what will be generated from the
register JSON.
|
|
|
|
|
|
|
|
|
|
| |
The field layout wasn't actually changed in gfx9, so having the suffix
isn't very useful. The field *contents* were changed, but this is
reflected in the V_xxx_xxx definitions and is taken into account by
the ac_debug logic based on the register JSON.
This aligns the definitions with what will be generated from the
register JSON.
|
| |
|
| |
|
|
|
|
|
| |
The descriptions are mostly derived from parsing the existing
register headers.
|
|
|
|
|
|
|
|
|
|
| |
We will derive both the debugging tables and (the majority of) the
register headers from descriptions in JSON, instead of deriving the
debugging tables from an awkward parsing of the register headers.
Some of the scripts are useful for maintaining the register database
itself. The scripts are designed to output reasonably readable JSON
by default.
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
From the Vulkan spec 1.1.108:
"vkCmdCopyQueryPoolResults is guaranteed to see the effect of
previous uses of vkCmdResetQueryPool in the same queue, without any
additional synchronization."
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This changes requires LLVM r356755.
32706 shaders in 16744 tests
Totals:
SGPRS: 1448848 -> 1455984 (0.49 %)
VGPRS: 1016684 -> 1016220 (-0.05 %)
Spilled SGPRs: 25871 -> 25815 (-0.22 %)
Spilled VGPRs: 122 -> 122 (0.00 %)
Scratch size: 11964 -> 11956 (-0.07 %) dwords per thread
Code Size: 55324500 -> 55301152 (-0.04 %) bytes
Max Waves: 235660 -> 235586 (-0.03 %)
Totals from affected shaders:
SGPRS: 293704 -> 300840 (2.43 %)
VGPRS: 246716 -> 246252 (-0.19 %)
Spilled SGPRs: 159 -> 103 (-35.22 %)
Scratch size: 188 -> 180 (-4.26 %) dwords per thread
Code Size: 8653664 -> 8630316 (-0.27 %) bytes
Max Waves: 60811 -> 60737 (-0.12 %)
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
The driver should already support this without any changes.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Basically, this extension allows applications to use custom
sample locations. It doesn't support variable sample locations
during subpass. Note that we don't have to upload the user
sample locations because the spec doesn't allow this.
The extension is currently disabled because the driver needs to
support variable sample locations during layout transitions. The
depth decompress needs to know them and that's a bit invasive.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
spirv_to_nir() returned the nir_function corresponding to the
entrypoint, as a way to identify it. There's now a bool is_entrypoint
in nir_function and also a helper function to get the entry_point from
a nir_shader.
The return type reflects better what the function name suggests. It
also helps drivers avoid the mistake of reusing internal shader
references after running NIR_PASS on it. When using NIR_TEST_CLONE or
NIR_TEST_SERIALIZE, those would be invalidated right in the first pass
executed.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Replace its uses with checking for is_entrypoint and calling
nir_shader_get_entrypoint().
This is a preparation to change spirv_to_nir() return type.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It makes sense to use the image view formats when resolving
inside subpasses, while we have to use the image formats for
normal resolves.
Original patch by Philip Rebohle.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110348
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure to sync all previous work if the given command buffer
has pending active queries. Otherwise the GPU might write queries
data after the reset operation.
This fixes a bunch of new dEQP-VK.query_pool.* CTS failures.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If the driver waits for CP DMA to be idle and emit an EOP event
we need more space.
This fixes a crash with Quake Champions.
Cc: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes use of radv_meta_resolve_compute_image() by filling
a VkImageResolve region instead of duplicating code.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
it's the same design
|
|
|
|
|
|
|
| |
Based on ANV.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The old code was not wrong because the transitions performed
after the resolves should re-emit the framebuffer if needed.
This change is mostly a no-op but it improves consistency
regarding other meta operations that need to save/restore subpasses.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This helper will be useful for clearing HTILE after some
depth/stencil resolves.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Just move the block that checks the availability bit into the
switch like other query types.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The difference between imov and fmov has been a constant source of
confusion in NIR for years. No one really knows why we have two or when
to use one vs. the other. The real reason is that they do different
things in the presence of source and destination modifiers. However,
without modifiers (which many back-ends don't have), they are identical.
Now that we've reworked nir_lower_to_source_mods to leave one abs/neg
instruction in place rather than replacing them with imov or fmov
instructions, we don't need two different instructions at all anymore.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Acked-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This flag has caused more confusion than good in most cases. You can
validly use imov for floats or fmov for integers because, without source
modifiers, neither modify their input in any way. Using imov for floats
is more reliable so we go that direction.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On machines with many cores, you can run into that issue :
../mesa-9999/src/vulkan/overlay-layer/overlay.cpp:42:10: fatal error: vk_enum_to_str.h: No such file or directory
v2: Move declare_dependency around (Eric)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reported-by: Jan Ziak
Cc: <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
From the Vulkan spec 1.1.108:
"After query pool creation, each query must be reset before
it is used."
So, the driver doesn't need to do this at creation time.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It should be 7, not 8.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
The driver only supports up to 8 samples.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Instead of setting the glsl types of the pointers for each resource,
set the nir_address_format, from which we can derive the glsl_type,
and in the future the bit pattern representing a NULL pointer.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This fixes some CTS failures related to VK_EXT_sample_locations.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|