| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
This refactors the code out to share it between radv and radeonsi.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This follows what radeonsi does.
Ported from radeonsi:
radeonsi: emit PA_SC_RASTER_CONFIG_1 only once
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This just makes this common code between the two drivers.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Missed this on initial radeonsi port, we shouldn't use this value
on gfx9, but also in gfx8 only for when we have a geom shader.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes crashes for the following CTS:
dEQP-VK.glsl.texture_functions.query.texturequerylod.*
Cubemaps are the same as 2D arrays.
Fixes: 625dcbbc456 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
Suggested by Nicolai.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bunch of CTS fails with 1D arrays:
dEQP-VK.glsl.texture_functions.texture*.sampler1darray_*
Fixes: 625dcbbc456 ("amd/common: pass address components individually to
ac_build_image_intrinsic")
Cc: 18.1 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
This is what radeonsi does.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The copr repo from che was using LTO and he reported radv broke
recently with it. When testing with lto builds here I noticed
that we weren't seeing any instance extensions reported.
It appears LTO was treating the const without extern as an empty
struct, this is possibly a gcc bug, but we can work around it
just by marking these with extern.
Acked-by: Samuel Pitoiset <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For subpass attachments we need one more coordinate with
the layer, so make them array types.
This fixes a bunch of CTS fails with RADV.
Fixes: 24fb3e6aa1 ("ac/nir: use ac_build_image_opcode for image intrinsics")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise a lot of games complain about not having enough memory,
and it is sort of local so this seems reasonable to me.
CC: 18.0 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SI family doesn't support chaining which means the maximum
size in dwords per CS is limited. When that limit was reached
we failed to submit the CS and the application crashed.
This patch allows to submit up to 4 IBs which is currently the
limit, but recent amdgpu supports more than that.
Please note that we can reach the limit of 4 IBs per submit
but currently we can't improve that. The only solution is to
upgrade libdrm. That will be improved later but for now this
should fix crashes on SI or when using RADV_DEBUG=noibs.
Fixes: 36cb5508e89 ("radv/winsys: Fail early on overgrown cs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105775
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes a ton of CTS crashes.
Fixes: c366f422f0 ("nir: Offset vertex_id by first_vertex instead of base_vertex")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Ported from RadeonSI.
Local BOs ignore BO priorities, and we don't need those on APUs.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Maintaining two different paths is annoying but this gets
rid of the performance regression introduced by the global
BO list.
We might find a better solution in the future, but for now
just keeps two paths.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to reduce a performance regression introduced by
4b13fe55a4 ("radv: Keep a global BO list for VkMemory."),
we are going to maintain two different paths.
One when VK_EXT_descriptor_indexing is enabled by the
application because we need to have a global BO list, and
one (the old one) when it's not enabled.
With Talos on Polaris, the global BO list reduces performance
by 10% which is too much for me.
This reverts commit ab6cadd3ecc7fbdd9079808b407674e0b19c52f0.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
So that we'll use the dimension-aware intrinsics in the future.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
In preparation of dimension-aware LLVM image intrinsics.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
This is in preparation for the new image intrinsics.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This is in preparation for the new, dimension-aware LLVM image
intrinsics.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The LLVM instruction returns { i32, i1 }, where the i1 indicates success.
We're only interested in the first part, which is the loaded value.
Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.*
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
I have seen a few applications and games do the dynamic buffer bounds incorrectly, this
make it easier to work around, e.g. for debugging.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This can be enabled with RADV_PERFTEST=dccmsaa.
DCC for MSAA textures is actually not as easy to implement. It
looks like there is some corner cases. I will improve support
incrementally.
Vega support, as well as Polaris improvements, will be added later.
No CTS changes on Polaris using RADV_DEBUG=zerovram and
RADV_PERFTEST=dccmsaa.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Multisampled source images (ie. color attachments) can be now
DCC compressed, so the driver needs to perform a DCC decompression
pass before resolving
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This should be fixed at some point in order to improve
performance.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
CMASK is required because it should be cleared to
0xCCCCCCCC for MSAA textures.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
When DCC is enabled with MSAA textures, CMASK should be
cleared to 0xCCCCCCCC.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes some random CTS failures:
dEQP-VK.renderpass.multisample.*.
Performing a fast-clear eliminate is still useless, but it
seems that we need to sync.
Found while running CTS with RADV_DEBUG=zerovram.
Fixes: 56a171a499c ("radv: don't fast-clear eliminate after resolving a subpass with compute")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Might be useful for debugging purposes, especially when we
want to replace a shader on the fly.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This adds everything except non-uniform indexing, which needs a bit
more work and testing.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
The continue means we do alignment differently than during creation,
making the buffer smaller than expected.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
Previously we did not care about havin the set storage in order,
but for variable descriptor count we want the highest binding
at the end of the storage.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With update after bind we can't attach bo's to the command buffer
from the descriptor set anymore, so we have to have a global BO
list.
I am somewhat surprised this works really well even though we have
implicit synchronization in the WSI based on the bo list associations
and with the new behavior every command buffer is associated with
every swapchain image. But I could not find slowdowns in games because
of it.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
'scale[i]' can be non-integer.
Original patch by Philip Rebohle.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074
Fixes: 0f3de89a56a ("radv: Use the guard band.")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
To handle the source color image transitions in the same place.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
That looks useless, and I think radv_handle_image_transition()
will do a fast-clear eliminate because it's called after the
resolve.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
DCC implies a fast-clear eliminate, so I think this sounds
reasonable.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Into radv_handle_color_image_transition().
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
In order to separate initialization from decompression. In the
future, that will allow us to init DCC/FMASK/CMASK in one shot.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|