| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Let's make it clear what includes are being added everywhere, so that
they can be cleaned up.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
|
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4324>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of emitting 1.0 / x which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.
No pipeline-db changes with VEGA10/LLVM 9.
pipeline-db (VEGA10/LLVM 10):
Totals from affected shaders:
SGPRS: 6672 -> 6672 (0.00 %)
VGPRS: 6652 -> 6652 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 561780 -> 561692 (-0.02 %) bytes
Max Waves: 1043 -> 1043 (0.00 %)
pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 84608 -> 83768 (-0.99 %)
VGPRS: 106768 -> 106636 (-0.12 %)
Spilled SGPRs: 1625 -> 1713 (5.42 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 10850936 -> 10726712 (-1.14 %) bytes
Max Waves: 3152 -> 3180 (0.89 %)
LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.
Cc: 20.0 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of emitting 1.0 / sqrt(x) which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.
pipeline-db (VEGA10/LLVM 9):
Totals from affected shaders:
SGPRS: 16872 -> 16864 (-0.05 %)
VGPRS: 15320 -> 15464 (0.94 %)
Spilled SGPRs: 2021 -> 2133 (5.54 %)
Code Size: 1915464 -> 1917476 (0.11 %) bytes
Max Waves: 641 -> 639 (-0.31 %)
pipeline-db (VEGA10/LLVM 10):
Totals from affected shaders:
SGPRS: 43936 -> 44120 (0.42 %)
VGPRS: 41776 -> 41972 (0.47 %)
Spilled SGPRs: 875 -> 875 (0.00 %)
Code Size: 4468164 -> 4468120 (-0.00 %) bytes
Max Waves: 2412 -> 2414 (0.08 %)
pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 60096 -> 60096 (0.00 %)
VGPRS: 63552 -> 63648 (0.15 %)
Spilled SGPRs: 6135 -> 6117 (-0.29 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 6252996 -> 6249772 (-0.05 %) bytes
Max Waves: 2324 -> 2337 (0.56 %)
LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.
Cc: 20.0 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of emitting 1.0 / x which includes a slow division that
LLVM doesn't always optimize even if the metadata is correctly set.
pipeline-db (VEG10/LLVM 9):
Totals from affected shaders:
SGPRS: 50384 -> 50312 (-0.14 %)
VGPRS: 42572 -> 42696 (0.29 %)
Spilled SGPRs: 1372 -> 1372 (0.00 %)
Code Size: 5692040 -> 5691428 (-0.01 %) bytes
Max Waves: 3954 -> 3951 (-0.08 %)
pipeline-db (VEG10/LLVM 10):
Totals from affected shaders:
SGPRS: 78512 -> 78464 (-0.06 %)
VGPRS: 62408 -> 62484 (0.12 %)
Spilled SGPRs: 1502 -> 1502 (0.00 %)
Code Size: 8106188 -> 8103372 (-0.03 %) bytes
Max Waves: 7759 -> 7753 (-0.08 %)
pipeline-db (VEGA10/LLVM 11 - 92744f62478):
Totals from affected shaders:
SGPRS: 112760 -> 113232 (0.42 %)
VGPRS: 111132 -> 110568 (-0.51 %)
Spilled SGPRs: 5870 -> 5940 (1.19 %)
Spilled VGPRs: 650 -> 652 (0.31 %)
Code Size: 11887232 -> 11561744 (-2.74 %) bytes
Max Waves: 8964 -> 9015 (0.57 %)
LLVM 11 (master) is more affected than previous versions, but
based on the small impact with LLVM 9/10, I decided to emit it
unconditionally.
Cc: 20.0 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
|
|
|
|
|
|
|
|
|
|
|
| |
If there was no demote() in the shader, ac_build_is_helper_invocation
behaves exactly the same as ac_build_load_helper_invocation, i.e.
the helper lanes are the same as they were at the beginning of the shader.
Fixes: de57ea2a3da ("amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation")
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enabling a Vulkan extension doesn't mean that all features need
to be implemented. DOOM Eternal crashes at launch if that ext
is not supported but it doesn't matter if the features are enabled
or not.
Let's enable it like we did for VK_KHR_16bit_storage.
Cc: 19.3 20.0 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4299>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4299>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
64-bit VGPR constant copies can happen because of 64-bit constant copy
propagation. Since this optimization is beneficial and more annoying to
deal with in the optimizer, I've implemented 64-bit VGPR constant copies
in handle_operands().
This also sets copy_operation::size correctly for 64-bit constant copies.
Cc: 20.0 <[email protected]>
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260>
|
|
|
|
|
|
|
|
| |
Cc: <[email protected]>
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4285>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4285>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes NIR such as:
if (divergent) {
a = sgpr()
} else {
break;
}
use(a)
Previously we would have emitted:
if (divergent) {
a = sgpr()
}
if (!divergent) {
break;
}
use(a)
But "a" isn't available at it's use. Now we emit:
if (divergent) {
}
if (!divergent) {
break;
}
a = sgpr()
use(a)
pipeline-db (Navi):
Totals from affected shaders:
SGPRS: 1936 -> 1936 (0.00 %)
VGPRS: 1264 -> 1264 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 159408 -> 159152 (-0.16 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 81 -> 81 (0.00 %)
Signed-off-by: Rhys Perry <[email protected]>
CC: <[email protected]>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2557
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old code would have previously caught:
loop {
...
break
}
when it was meant to just catch:
loop {
if (...)
break
else
break
}
Signed-off-by: Rhys Perry <[email protected]>
CC: <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
|
|
|
|
|
|
|
|
|
|
| |
NIR removes most of this but undef instructions for loop header phis can
remain. These were harmless because ACO would DCE them itself.
Signed-off-by: Rhys Perry <[email protected]>
CC: <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
|
|
|
|
|
|
|
|
|
|
| |
Usually a loop ends with a uniform continue. If it doesn't and we end up
adding our own continue edges (because of continue_or_break or divergent
breaks at the end), we have to add extra operands to the loop header phis.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
CC: <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
CC: <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bpermute only exists on GFX8+ and only with Wave32 on GFX10. Instead
we have to use readlane with a waterfall loop to defeat the LLVM
backend.
This fixes DOOM Eternal which requires subgroup shuffle.
Cc: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
|
|
|
|
|
|
|
|
|
|
| |
To avoid crashes when RADV_FORCE_FAMILY is set to GFX9+ because
num_render_backends is used to compute binning state.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4282>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4282>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4282>
|
|
|
|
|
|
|
|
|
|
| |
This stopped working with LLVM 11 and might occasionally have been broken
on older LLVM, because the metadata was set on the mul, not on the rcp.
Cc: 19.3 20.0 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268>
|
|
|
|
|
|
|
|
|
|
| |
Needed for Vulkan 1.1+
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4249>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4249>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Vulkan spec 1.2.135 says:
"pSizes is an optional array of buffer sizes, specifying the maximum
number of bytes to capture to the corresponding transform feedback
buffer. If pSizes is NULL, or the value of the pSizes array element
is VK_WHOLE_SIZE, then the maximum bytes captured will be the size
of the corresponding buffer minus the buffer offset."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2650
Fixes: b4eb029062a ("radv: implement VK_EXT_transform_feedback")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the MESA_ENABLE_LLVM checks in Android.mk
files in order to get mesa3d to build w/ AOSP
using mmma.
This tries to re-create a change that was introduced
in the following merge in the AOSP branch:
69f2c0128d2b Merge branch 'aosp/upstream-18.0'
Acked-by: Tapani Pälli <[email protected]>
Acked-by: Mauro Rossi <[email protected]>
Signed-off-by: John Stultz <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4175>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pipeline-db (Navi, ACO):
Totals from affected shaders:
SGPRS: 11840 -> 11840 (0.00 %)
VGPRS: 19012 -> 19124 (0.59 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 3696 -> 3696 (0.00 %) dwords per thread
Code Size: 998680 -> 921388 (-7.74 %) bytes
LDS: 19646 -> 19646 (0.00 %) blocks
Max Waves: 3398 -> 3401 (0.09 %)
pipeline-db (Navi, LLVM):
Totals from affected shaders:
SGPRS: 17016 -> 17128 (0.66 %)
VGPRS: 19564 -> 14876 (-23.96 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 3872 -> 3872 (0.00 %) dwords per thread
Code Size: 820416 -> 743576 (-9.37 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 3367 -> 3534 (4.96 %)
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Timur Kristóf <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4193>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4193>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ooops. For some reasons, I have been confused with Wave32 on GFX10,
but it's still possible to require a specific subgroup size if
only Wave64 is supported.
Fixes: 672d1061998 ("radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4227>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4227>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The shader module name is used to compute the pipeline key. The
driver used to load the wrong pipelines because the shader names
were similar.
This should fix random failures of
dEQP-VK.pipeline.depth_range_unrestricted.*
Fixes: f11ea226664 ("radv: fix a performance regression with graphics depth/stencil clears")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
piglit/bin/arb_bindless_texture-limit -auto -fbo:
Needed to deal with non-NULL dynamic_index without deref in tex instructions.
piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto:
Need to deal with non-deref images in enter_waterfall_imae.
Fixes: b83c9aca4a5 "amd/llvm: Fix divergent descriptor indexing. (v3)"
Acked-by: Marek Olšák <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
|
|
|
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
|
|
|
|
|
|
|
|
| |
See: https://reviews.llvm.org/D71358
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
|
|
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If compute shaders require a specific subgroup size (ie. Wave32),
we have to use the correct ballot size.
Fixes dEQP-VK.subgroups.ballot_other.compute.*_requiredsubgroupSize.
Fixes: fb07fd4e6cb ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If compute shaders require a specific subgroup size (ie. Wave32),
we have to return the correct one.
Fixes dEQP-VK.subgroups.size_control.compute.required_subgroup_size_*.
Fixes: fb07fd4e6cb ("radv: implement VK_EXT_subgroup_size_control")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Vulkan 1.2.134 spec update clarified when implicit subpass
dependencies should be injected by the driver. They only make
sense if automatic layout transitions are performed.
This should fix a performance regression with RPCS3 (although
they added a workaround for RADV since the regression has been found).
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2502
Fixes: e60de085473 ("radv: handle missing implicit subpass dependencies")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>
|
|
|
|
|
|
|
|
| |
Fixes: a952bf3946 ('aco: Fix LS VGPR init bug on affected hardware.')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Timur Kristóf <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201>
|
|
|
|
|
|
|
| |
Fixes: a952bf3946 ('aco: Fix LS VGPR init bug on affected hardware.')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Timur Kristóf <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently needed for Stoney Ridge, Kabini and Mullins APUs.
gfx702 also has 16-bank LDS and https://llvm.org/docs/AMDGPUUsage.html
lists some dGPUs under there. Those GPUs seem to be Hawaii actually
(gfx701) and we don't seem to have gotten any interpolation related bugs
reported with them so far.
The late kill flag was tested by running pipeline-db with
ACO_DEBUG=validatera while setting late kill for SMEM buffer loads,
emit_vop2_instruction() and texture instructions. I also tested with
just setting the flag for v_interp_p1_f32.
As far as I know, the only other thing we have to consider for 16-bank LDS
is something to do with 16-bit interpolation. We don't do that yet.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>
|
|
|
|
|
|
|
|
|
| |
To avoid wasting CPU cycles when thread trace is not enabled.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4180>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4180>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows us to move v_cvt_pkrtz_f16_f32 instructions upwards, improving
schedules and (somewhat unintentionally) moving the exports slightly
closer together.
Totals from affected shaders:
SGPRS: 1030224 -> 1030248 (0.00 %)
VGPRS: 794080 -> 794392 (0.04 %)
Spilled SGPRs: 127117 -> 127117 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 89028152 -> 89032312 (0.00 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 65252 -> 65219 (-0.05 %)
SMEM score: 843808.00 -> 843918.00 (0.01 %)
VMEM score: 5331687.00 -> 5397802.00 (1.24 %)
SMEM clauses: 567659 -> 567655 (-0.00 %)
VMEM clauses: 290715 -> 290716 (0.00 %)
Instructions: 17143219 -> 17144259 (0.01 %)
Cycles: 1098442808 -> 1098446968 (0.00 %)
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Much better scheduling apparently in 160 shaders
Totals from affected shaders:
SGPRS: 6272 -> 6344 (1.15 %)
VGPRS: 4832 -> 4844 (0.25 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 467192 -> 467428 (0.05 %) bytes
LDS: 459 -> 459 (0.00 %) blocks
Max Waves: 1407 -> 1409 (0.14 %)
SMEM score: 9309.00 -> 11216.00 (20.49 %)
VMEM score: 26679.00 -> 33652.00 (26.14 %)
SMEM clauses: 1817 -> 1776 (-2.26 %)
VMEM clauses: 2286 -> 2288 (0.09 %)
Instructions: 86537 -> 86596 (0.07 %)
Cycles: 676260 -> 676568 (0.05 %)
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pipeline-db changes in 721 shaders.
Totals from affected shaders:
SGPRS: 42336 -> 42656 (0.76 %)
VGPRS: 38368 -> 38636 (0.70 %)
Spilled SGPRs: 11967 -> 11967 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 5268088 -> 5269840 (0.03 %) bytes
LDS: 1069 -> 1069 (0.00 %) blocks
Max Waves: 4473 -> 4447 (-0.58 %)
SMEM score: 41155.00 -> 41826.00 (1.63 %)
VMEM score: 146339.00 -> 147471.00 (0.77 %)
SMEM clauses: 24434 -> 24535 (0.41 %)
VMEM clauses: 16637 -> 16592 (-0.27 %)
Instructions: 996037 -> 996388 (0.04 %)
Cycles: 76476112 -> 75281416 (-1.56 %)
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
|
|
|
|
|
|
|
|
| |
No pipeline-db changes
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
|
|
|
|
|
|
|
|
|
| |
To match aco_compile_shader().
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
|
|
|
|
|
|
|
|
| |
They are already included from src/amd/llvm.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
|
|
|
|
|
|
|
|
|
| |
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
|
|
|
|
|
|
|
|
| |
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
|
|
|
|
|
|
|
|
| |
Based on PAL and RadeonSI.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
|