| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
The LLVM instruction returns { i32, i1 }, where the i1 indicates success.
We're only interested in the first part, which is the loaded value.
Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.*
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
I have seen a few applications and games do the dynamic buffer bounds incorrectly, this
make it easier to work around, e.g. for debugging.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This can be enabled with RADV_PERFTEST=dccmsaa.
DCC for MSAA textures is actually not as easy to implement. It
looks like there is some corner cases. I will improve support
incrementally.
Vega support, as well as Polaris improvements, will be added later.
No CTS changes on Polaris using RADV_DEBUG=zerovram and
RADV_PERFTEST=dccmsaa.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Multisampled source images (ie. color attachments) can be now
DCC compressed, so the driver needs to perform a DCC decompression
pass before resolving
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This should be fixed at some point in order to improve
performance.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
CMASK is required because it should be cleared to
0xCCCCCCCC for MSAA textures.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
When DCC is enabled with MSAA textures, CMASK should be
cleared to 0xCCCCCCCC.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes some random CTS failures:
dEQP-VK.renderpass.multisample.*.
Performing a fast-clear eliminate is still useless, but it
seems that we need to sync.
Found while running CTS with RADV_DEBUG=zerovram.
Fixes: 56a171a499c ("radv: don't fast-clear eliminate after resolving a subpass with compute")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Might be useful for debugging purposes, especially when we
want to replace a shader on the fly.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This adds everything except non-uniform indexing, which needs a bit
more work and testing.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
The continue means we do alignment differently than during creation,
making the buffer smaller than expected.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
Previously we did not care about havin the set storage in order,
but for variable descriptor count we want the highest binding
at the end of the storage.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With update after bind we can't attach bo's to the command buffer
from the descriptor set anymore, so we have to have a global BO
list.
I am somewhat surprised this works really well even though we have
implicit synchronization in the WSI based on the bo list associations
and with the new behavior every command buffer is associated with
every swapchain image. But I could not find slowdowns in games because
of it.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
'scale[i]' can be non-integer.
Original patch by Philip Rebohle.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074
Fixes: 0f3de89a56a ("radv: Use the guard band.")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
To handle the source color image transitions in the same place.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
That looks useless, and I think radv_handle_image_transition()
will do a fast-clear eliminate because it's called after the
resolve.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
DCC implies a fast-clear eliminate, so I think this sounds
reasonable.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Into radv_handle_color_image_transition().
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
In order to separate initialization from decompression. In the
future, that will allow us to init DCC/FMASK/CMASK in one shot.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
To handle CMASK, FMASK and DCC transitions in the same place.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Mostly because DCC implies a fast-clear eliminate and we
should be able to skip some DCC decompressions by setting
a predicate like for CMASK and FMASK.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
When decompressing DCC we don't enable it, so it's useless
to disable it. This reduces the number of prediction packets
sent to the GPU when performing color decompression passes.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Niuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
No clue how I missed those ...
Fixes: 4503ff760c "ac/nir: Add workaround for GFX9 buffer views."
CC: <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105320
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Co-authored-by: Connor Abbott <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Local BOs ignore BO priorities, and we don't need those on APUs.
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
- remove mtypes.h from most header files
- add main/menums.h for often used definitions
- remove main/core.h
v2: fix radv build
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Pretty straight forward, just pass the divisors through the shader
key and then do a LLVM divide.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For dcn1 && < 64 bpp displayable surfaces, addrlib only accepts
S swizzles.
At the same time addrlib prefers D swizzles is allowed, so we can
just allow S swizzles as fallback.
Fixes: b64b712558 "ac/surface/gfx9: request desired micro tile mode explicitly"
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
num_dcc_levels means that DCC is supported, but this doesn't
mean that it's enabled by the driver. Instead, we should rely
on radv_image_has_dcc().
This fixes some multisample regressions since 0babc8e5d66
("radv: fix picking the method for resolve subpass") on Vega.
This is because the resolve method changed from HW to FS, but
those fails are totally unexpected, so there might some
differences between Polaris and Vega here.
Fixes: 44fcf587445 ("radv: Disable DCC for GENERAL layout and compute transfer dest.")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This helper shares common code before resolving using either
a fragment or a compute shader.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
And add some comments.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Fixes linking errors against:
anv_GetPhysicalDeviceImageFormatProperties2KHR
radv_GetPhysicalDeviceImageFormatProperties2KHR
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This will give us meaningful wave information in the case of a hang where
shaders are still running in an infinite loop.
Note that we call umr multiple times for different sections of the ddebug
hang dump, and so the wave information will not necessarily match up
between sections.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
All the information in vk_android_native_buffer.xml is now in vk.xml.
The only exception is the extension type attribute which we can work
around in the generators while we wait for the XML to be fixed.
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to Marek, not enabling it on Stoney has a significant
negative performance impact. (And I guess this might impact
performance on Raven as well)
The register settings are pretty much copied from radeonsi. I did
not put this in the pipeline as that would make the pipeline more
dependent on the format which mean we would have to have more
pipelines for the meta shaders.
v2: Don't clear RB+ regs if not enabled as the CLEAR_STATE packet
does already.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The source and destination image parameters were swapped.
No CTS changes on Polaris10, but I suspect this might
fix something.
Fixes: 2a04f5481df ("radv/meta: select resolve paths")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, the shader BOs are not added to the list on SI because
prefetching isn't supported. Calling radv_cs_add_buffer() in the
prefetch codepath was a bad idea.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105952
Fixes: 4ad7595f35 ("radv: rename radv_emit_prefetch() to radv_emit_prefetch_L2")
Signed-off-by: Samuel Pitoiset <[email protected]>
Tested-by: Turo Lamminen <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|