| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
for internal radeonsi shaders
|
|
|
|
|
|
| |
radeonsi will use this.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
Test that found this: dEQP-VK.geometry.layered.1d_array.secondary_cmd_buffer
Fixes: 49e6c2fb78c "radv: Store color/depth surface info in attachment info instead of framebuffer."
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Can result in different shaders.
Fixes: 8a86908e9a7 "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders"
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Fixes: 8a86908e9a7 "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders"
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Instead of having the three values everywhere. This is also more
future proof if we want the driver to make those decisions eventually.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
| |
CC: <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Mirroring radeonsi.
CC: <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
v2: Hide it behind a perftest flag.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Because it is about to be generation dependent.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Previously we relied on stores not using DCC but that is going to
change, so disable compression explicitly.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
For extra args. Unlike image creation, I'm not embedding the vk
struct in there, so all the inline structs can be kept.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
We handle render loops properly now and STORAGE still disables
DCC/TC-compat HTILE in general.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
And do nothing with it yet.
Everything outside a renderpass has no render loop.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
VK spec 7.3:
"Applications must ensure that all accesses to memory that backs
image subresources used as attachments in a given renderpass instance
either happen-before the load operations for those attachments, or
happen-after the store operations for those attachments."
So the only renderloops we can have is with input attachments. Detect
these.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Using the wrong bounds
Fixes: "219d6939df8 radv: add more assertions to make sure packets are correctly emitted"
Reviewed-by: Andres Rodriguez <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
Fixes: 028ce527 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
Since it can introduce comparisons.
Fixes: 028ce527395 "radv: Add non-uniform indexing lowering."
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
results from radeonsi NIR:
Totals from affected shaders:
SGPRS: 704 -> 464 (-34.09 %)
VGPRS: 2056 -> 672 (-67.32 %)
Spilled SGPRs: 24 -> 0 (-100.00 %)
Spilled VGPRs: 28406 -> 0 (-100.00 %)
Private memory VGPRs: 0 -> 3182 (0.00 %)
Scratch size: 1064 -> 3228 (203.38 %) dwords per thread
Code Size: 935260 -> 40180 (-95.70 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 28 -> 70 (150.00 %)
Wait states: 0 -> 0 (0.00 %)
results from radv:
Totals from affected shaders:
SGPRS: 80 -> 48 (-40.00 %)
VGPRS: 204 -> 108 (-47.06 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 256 (0.00 %) dwords per thread
Code Size: 15792 -> 9504 (-39.82 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1 -> 2 (100.00 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Tested-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This automates the include_directories and dependencies tracking so that
all users of libmesa_util don't need to add them manually.
Next commit will remove the ones that were only added for that reason.
Signed-off-by: Eric Engestrom <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Tested-by: Vinson Lee <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
So we can use it with imageless framebuffers.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
That way we can use it for imageless framebuffers.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
When the application does not ask for robust buffer access.
Only implemented the check in radv.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The driver should now rely on cmask_offset because CMASK can be
disabled by the driver for some reasons (eg. mipmaps). Apply the
same change for FMASK, although it should be useless.
Fixes: ad1bc8621df ("radv: remove radv_get_image_fmask_info()")
Fixes: 10d08da52c6 ("radv/gfx10: add missing dcc_tile_swizzle tweak")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It's unnecessary to duplicate fields in another struct.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Fixes: c90f46700dd ("radv/gfx10: mask DCC tile swizzle by alignment")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It's unnecessary to duplicate fields in another struct.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It's 0 for depth surfaces with TC compat HTILE enabled.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM 9 does not have a 64-bit buffer compswap intrinsic, so this
extracts the ptr, does a bound check and then uses a cmpxchg LLVM
instruction.
Not ideal, but the earliest release we're going to get a proper
intrinsic is LLVM 10.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
This reverts commit bfea7e4d2965269bff8f1f6449cb99c312fd7384.
|
|
|
|
|
|
| |
This reverts commit d3c80733cdfe8552b2f447ec8ed62465d0f2af1a.
These were only appearing due to memory corruption.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it cheaper to just change the dynamic offsets with
the same descriptor sets.
This optimization has been reverted a while back because of
random GPU hangs on GFX9, no it looks fine, at least CTS no longer
hangs on GFX9 and it doesn't hang on GFX10 as well.
It fixes a performance problem with Wolfenstein Youngblood.
Suggested-by: Philip Rebohle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It can be enabled with RADV_PERFTEST=gewave32.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It can be enabled with RADV_PERFTEST=pswave32.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|