| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
It doesn't support GFX9.
Acked-by: Dave Airlie <[email protected]>
Acked-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
GFX9 uses LDS instead.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GFX9:
Totals from affected shaders:
SGPRS: 472 -> 464 (-1.69 %)
VGPRS: 576 -> 584 (1.39 %)
Code Size: 45432 -> 44324 (-2.44 %) bytes
Max Waves: 40 -> 40 (0.00 %)
VI:
SGPRS: 720 -> 720 (0.00 %)
VGPRS: 728 -> 728 (0.00 %)
Code Size: 45348 -> 43992 (-2.99 %) bytes
Max Waves: 120 -> 120 (0.00 %)
This affects Rise of Tomb Raider and the three Vulkan demos
that use a geometry shader (geometryshader, deferredshadows
and viewportarray).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
That way the winsys might use a faster path when the global
BO list is NULL.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Having an entrypoint different than "main" doesn't mean we
have multiple shaders per module.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It's always true.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
By using the geometry shader output usage mask.
This improves all Vulkan demos that use a geometry shader
(ie. geometryshader, deferredshadows, viewportarray).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
For reducing the number of parameters that are exported by
the GS copy shader.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
For further optimizations.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
An upcoming patch will run the shader info pass on the
geometry shader just before emitting the GS copy shader.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The hardware always interprets the alpha as unsigned and fixing it
in the shader is going to add unacceptable overheads.
CC: 18.0 18.1 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pre-Vega HW always interprets the alpha for this format as unsigned,
so we have to implement a fixup to do the sign correctly for signed
formats.
v2: Improve indexing mess.
CC: 18.0 18.1 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Basic sampling support for linear tiling.
No CTS regressions, but it seems the blitting coverage is not very
extensive.
https://bugs.freedesktop.org/show_bug.cgi?id=106331
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
radeonsi could pass them through but the enum changed between
Gallium and Vulkan, so we have to translate.
In progress I made the register defines a bit more readable.
CC: 18.0 18.1 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100430
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves the extra queries to after the main query ended, instead
of doing it after the begin and hence doing nesting.
We also emit only (view count - 1) extra queries, as the main query
is already there for the first view.
This fixes the CTS occasionally getting stuck in
dEQP-VK.multiview.queries* waiting on results.
Fixes: 32b4f3c38dc "radv/query: handle multiview queries properly. (v3)"
CC: 18.1 <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
radv_can_dump_shader() already handles if module is NULL.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
These are going to be crazy and we are probably going to add
more scan stuff in the future. Also use switch cases instead.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I don't think the hw resolve path can't handle multi-layer images.
This fixes all the:
dEQP-VK.renderpass.multisample_resolve.layers_*
tests on my VI card.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This path should iterate across all layers, I've some ideas
for doing this in a single pass, but this is simpler for now.
This passes the tests because we don't use the fragment path
unless we have DCC, and we don't have DCC on layered images.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: <[email protected]>
|
|
|
|
|
|
|
|
| |
For a multi-layer subpass resolve we want to make sure we flush all
the layers.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT is set we skip NIR
linking optimisations and only run over the NIR optimisation loop
once similar to the GLSLOptimizeConservatively constant used by
some GL drivers.
We need to run over the opts at least once to avoid errors in LLVM
(e.g. dead vars it can't handle) and also to reduce the time spent
compiling the IR in LLVM.
With this change the Blacksmith Unity demos compilation times
go from 329760 ms -> 299881 ms when using Wine and DXVK.
V2: add bit to radv_pipeline_key
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106246
|
|
|
|
|
|
|
| |
These helpers will be needed for future work.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
instead of invoking FMASK computation separately.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This seems to be broken at the moment for opengl interop.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
It's a per-application optimization, so it makes more sense
to do that in radv_handle_per_app_options().
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Trivial.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
This fixes the fmask descriptor generation to handle 2d ms arrays.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
It can do 32-bit packing too now.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes
dEQP-VK.api.device_init.create_instance_invalid_api_version
CC: 18.1 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently the somewhere between 1.1.70 and 1.1.73 the loader started
depending on this. The loader then creates a 1.0 instance, which gets
into funny situation because we have a 1.1 device.
No idea how to do line wrapping in Mako though, my random guesses
did not work.
CC: 18.1 <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously before fb077b0728, the LOD parameter was being used in place of the
sample index, which would only copy the first sample to all samples in the
destination image. After that multisample image copies wouldn't copy anything
from my observations.
This fixes some copy_and_blit CTS tests.
v3.1: - set lod to 0 for nir_txf_ms (Samuel)
v2: - use GLSL_SAMPLER_DIM_MS instead of 2D (Samuel)
- updated commit description (Samuel)
Fix this properly by copying each sample in a separate radv_CmdDraw and using a
pipeline with the correct rasterizationSamples for the destination image.
Cc: 18.0 18.1 <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
As the implementation is conservative, we can now enable it
by default. It can be disabled with RADV_DEBUG=nooutoforder.
Don't expect much more than 1% of improvements, but the gain
seems consistent.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Only count color attachments twice if resolves are used, also
account for the depth stencil attachment if present.
Cc: 18.0 18.1 <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is needed for gfx9 and later for all fmask surface index.
(Mentioned by Marek on irc)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
A bo's ref_count was not being initialized when imported from an fd.
Therefore, we would fail to free the resource during VkFreeMemory().
This patch fixes applications like hifi VR in threaded mode, which
perform frequent imports/releases of IPC shared memory.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
CC: 18.0 18.1 <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Tested-by: Benedikt Schemmer <ben at besd.de>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
num_channels has been introduced since "ac/surface: don't set
the display flag for obviously unsupported cases".
Based on RadeonSI.
Fixes: e29facff315 ("ac/surface: don't set the display flag for obviously unsupported cases (v2)")
Cc: 18.1 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dcc_msaa_allowed is always false on GFX9+ and only true on VI
if RADV_PERFTEST=dccmsaa is set. This means DCC was disabled
in some situations where it should not.
This is likely going to fix a performance regression.
Fixes: 2f63b3dd09 ("radv: enable DCC for MSAA 2x textures on VI under an option")
Cc: 18.1 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This refactors the code out to share it between radv and radeonsi.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This follows what radeonsi does.
Ported from radeonsi:
radeonsi: emit PA_SC_RASTER_CONFIG_1 only once
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|