path: root/src/amd
Commit message (Author, Age, Files, Lines)
* meson: inline `inc_common` (Eric Engestrom, 2020-03-28, 5 files, -5/+5)

  Let's make it clear what includes are being added everywhere, so that
  they can be cleaned up.

  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4360>
* radv: stop including files from mesa/main (Marek Olšák, 2020-03-27, 9 files, -7/+14)

  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4324>
* ac/nir: use llvm.amdgcn.rcp in ac_build_fdiv() (Samuel Pitoiset, 2020-03-27, 2 files, -20/+12)

  Instead of emitting 1.0 / x which includes a slow division that LLVM
  doesn't always optimize even if the metadata is correctly set.

  No pipeline-db changes with VEGA10/LLVM 9.

  pipeline-db (VEGA10/LLVM 10):
  Totals from affected shaders:
  SGPRS: 6672 -> 6672 (0.00 %)
  VGPRS: 6652 -> 6652 (0.00 %)
  Spilled SGPRs: 0 -> 0 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Code Size: 561780 -> 561692 (-0.02 %) bytes
  Max Waves: 1043 -> 1043 (0.00 %)

  pipeline-db (VEGA10/LLVM 11 - 92744f62478):
  Totals from affected shaders:
  SGPRS: 84608 -> 83768 (-0.99 %)
  VGPRS: 106768 -> 106636 (-0.12 %)
  Spilled SGPRs: 1625 -> 1713 (5.42 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Code Size: 10850936 -> 10726712 (-1.14 %) bytes
  Max Waves: 3152 -> 3180 (0.89 %)

  LLVM 11 (master) is more affected than previous versions, but based on
  the small impact with LLVM 9/10, I decided to emit it unconditionally.

  Cc: 20.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
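The percentages in the pipeline-db totals read as relative changes from the old value to the new one. A minimal sketch of that arithmetic follows; the helper name `percent_change` is hypothetical and is not part of any pipeline-db tooling.

```python
def percent_change(old: int, new: int) -> float:
    """Relative change from old to new, in percent, rounded to two decimals."""
    if old == 0:
        # Assumption: a 0 -> 0 row is reported as unchanged; growth from
        # zero has no finite relative change.
        return 0.0 if new == 0 else float("inf")
    return round((new - old) / old * 100, 2)


# "Code Size" row of the VEGA10/LLVM 11 run quoted in the commit message.
code_size_delta = percent_change(10850936, 10726712)
# "Spilled SGPRs" row of the same run.
spilled_sgprs_delta = percent_change(1625, 1713)
```

A negative value for code size is an improvement, while a positive value for spilled SGPRs is a regression, so the sign has to be read per row.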
* ac/nir: use llvm.amdgcn.rsq for nir_op_frsq (Samuel Pitoiset, 2020-03-27, 1 file, -3/+2)

  Instead of emitting 1.0 / sqrt(x) which includes a slow division that
  LLVM doesn't always optimize even if the metadata is correctly set.

  pipeline-db (VEGA10/LLVM 9):
  Totals from affected shaders:
  SGPRS: 16872 -> 16864 (-0.05 %)
  VGPRS: 15320 -> 15464 (0.94 %)
  Spilled SGPRs: 2021 -> 2133 (5.54 %)
  Code Size: 1915464 -> 1917476 (0.11 %) bytes
  Max Waves: 641 -> 639 (-0.31 %)

  pipeline-db (VEGA10/LLVM 10):
  Totals from affected shaders:
  SGPRS: 43936 -> 44120 (0.42 %)
  VGPRS: 41776 -> 41972 (0.47 %)
  Spilled SGPRs: 875 -> 875 (0.00 %)
  Code Size: 4468164 -> 4468120 (-0.00 %) bytes
  Max Waves: 2412 -> 2414 (0.08 %)

  pipeline-db (VEGA10/LLVM 11 - 92744f62478):
  Totals from affected shaders:
  SGPRS: 60096 -> 60096 (0.00 %)
  VGPRS: 63552 -> 63648 (0.15 %)
  Spilled SGPRs: 6135 -> 6117 (-0.29 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Code Size: 6252996 -> 6249772 (-0.05 %) bytes
  Max Waves: 2324 -> 2337 (0.56 %)

  LLVM 11 (master) is more affected than previous versions, but based on
  the small impact with LLVM 9/10, I decided to emit it unconditionally.

  Cc: 20.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
* ac/nir: use llvm.amdgcn.rcp for nir_op_frcp (Samuel Pitoiset, 2020-03-27, 1 file, -2/+2)

  Instead of emitting 1.0 / x which includes a slow division that LLVM
  doesn't always optimize even if the metadata is correctly set.

  pipeline-db (VEGA10/LLVM 9):
  Totals from affected shaders:
  SGPRS: 50384 -> 50312 (-0.14 %)
  VGPRS: 42572 -> 42696 (0.29 %)
  Spilled SGPRs: 1372 -> 1372 (0.00 %)
  Code Size: 5692040 -> 5691428 (-0.01 %) bytes
  Max Waves: 3954 -> 3951 (-0.08 %)

  pipeline-db (VEGA10/LLVM 10):
  Totals from affected shaders:
  SGPRS: 78512 -> 78464 (-0.06 %)
  VGPRS: 62408 -> 62484 (0.12 %)
  Spilled SGPRs: 1502 -> 1502 (0.00 %)
  Code Size: 8106188 -> 8103372 (-0.03 %) bytes
  Max Waves: 7759 -> 7753 (-0.08 %)

  pipeline-db (VEGA10/LLVM 11 - 92744f62478):
  Totals from affected shaders:
  SGPRS: 112760 -> 113232 (0.42 %)
  VGPRS: 111132 -> 110568 (-0.51 %)
  Spilled SGPRs: 5870 -> 5940 (1.19 %)
  Spilled VGPRs: 650 -> 652 (0.31 %)
  Code Size: 11887232 -> 11561744 (-2.74 %) bytes
  Max Waves: 8964 -> 9015 (0.57 %)

  LLVM 11 (master) is more affected than previous versions, but based on
  the small impact with LLVM 9/10, I decided to emit it unconditionally.

  Cc: 20.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4326>
* ac: fix ac_build_is_helper_invocation when postponed_kill is null (Pierre-Eric Pelloux-Prayer, 2020-03-25, 1 file, -0/+3)

  If there was no demote() in the shader, ac_build_is_helper_invocation
  behaves exactly the same as ac_build_load_helper_invocation, i.e. the
  helper lanes are the same as they were at the beginning of the shader.

  Fixes: de57ea2a3da ("amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation")
  Reviewed-by: Daniel Schürmann <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4301>
* radv: enable VK_KHR_8bit_storage on GFX6-GFX7 (Samuel Pitoiset, 2020-03-24, 1 file, -1/+1)

  Enabling a Vulkan extension doesn't mean that all features need to be
  implemented. DOOM Eternal crashes at launch if that ext is not
  supported but it doesn't matter if the features are enabled or not.

  Let's enable it like we did for VK_KHR_16bit_storage.

  Cc: 19.3 20.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4299>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4299>
* aco: implement 64-bit VGPR constant copies in handle_operands() (Rhys Perry, 2020-03-24, 2 files, -4/+39)

  64-bit VGPR constant copies can happen because of 64-bit constant copy
  propagation. Since this optimization is beneficial and more annoying to
  deal with in the optimizer, I've implemented 64-bit VGPR constant
  copies in handle_operands().

  This also sets copy_operation::size correctly for 64-bit constant
  copies.

  Cc: 20.0 <[email protected]>
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260>
* aco: remove dead code in handle_operands() (Rhys Perry, 2020-03-24, 1 file, -19/+3)

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4260>
* aco: fix boolean undef regclass (Rhys Perry, 2020-03-23, 1 file, -0/+2)

  Cc: <[email protected]>
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4285>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4285>
* aco: emit IR in IF's merge block instead if the other side ends in a jump (Rhys Perry, 2020-03-23, 1 file, -6/+92)

  Fixes NIR such as:

      if (divergent) {
         a = sgpr()
      } else {
         break;
      }
      use(a)

  Previously we would have emitted:

      if (divergent) {
         a = sgpr()
      }
      if (!divergent) {
         break;
      }
      use(a)

  But "a" isn't available at its use. Now we emit:

      if (divergent) {
      }
      if (!divergent) {
         break;
      }
      a = sgpr()
      use(a)

  pipeline-db (Navi):
  Totals from affected shaders:
  SGPRS: 1936 -> 1936 (0.00 %)
  VGPRS: 1264 -> 1264 (0.00 %)
  Spilled SGPRs: 0 -> 0 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Scratch size: 0 -> 0 (0.00 %) dwords per thread
  Code Size: 159408 -> 159152 (-0.16 %) bytes
  LDS: 0 -> 0 (0.00 %) blocks
  Max Waves: 81 -> 81 (0.00 %)

  Signed-off-by: Rhys Perry <[email protected]>
  CC: <[email protected]>
  Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2557
  Reviewed-by: Daniel Schürmann <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
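The placement rule in this commit message can be illustrated with a toy model. The sketch below assumes each side of the IF is a flat list of instruction strings; `JUMPS`, `ends_in_jump` and `place_if_sides` are hypothetical names, and the real ACO code works on NIR control flow rather than on strings.

```python
JUMPS = {"break", "continue"}


def ends_in_jump(block):
    """True if the last instruction of the block is a jump."""
    return bool(block) and block[-1] in JUMPS


def place_if_sides(then_body, else_body):
    """Return (then_body, else_body, merge_prefix) after the placement rule.

    If exactly one side ends in a jump, the other side's instructions are
    emitted in the merge block, so values they define stay available to
    code after the IF.
    """
    then_jumps = ends_in_jump(then_body)
    else_jumps = ends_in_jump(else_body)
    if else_jumps and not then_jumps:
        return [], else_body, then_body
    if then_jumps and not else_jumps:
        return then_body, [], else_body
    # Neither side jumps, or both do: leave the instructions where they are.
    return then_body, else_body, []


then_out, else_out, merge_out = place_if_sides(["a = sgpr()"], ["break"])
```

With the example from the commit message, `a = sgpr()` ends up in the merge block, directly before `use(a)`.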
* aco: improve check for unreachable loop continue blocks (Rhys Perry, 2020-03-23, 1 file, -6/+10)

  The old code would have previously caught:

      loop {
         ...
         break
      }

  when it was meant to just catch:

      loop {
         if (...)
            break
         else
            break
      }

  Signed-off-by: Rhys Perry <[email protected]>
  CC: <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
* aco: skip NIR in unreachable merge blocks (Rhys Perry, 2020-03-23, 1 file, -2/+6)

  NIR removes most of this but undef instructions for loop header phis
  can remain. These were harmless because ACO would DCE them itself.

  Signed-off-by: Rhys Perry <[email protected]>
  CC: <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
* aco: handle when ACO adds new continue edges (Rhys Perry, 2020-03-23, 2 files, -1/+92)

  Usually a loop ends with a uniform continue. If it doesn't and we end
  up adding our own continue edges (because of continue_or_break or
  divergent breaks at the end), we have to add extra operands to the loop
  header phis.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
* aco: handle missing second predecessors at merge block phis (Rhys Perry, 2020-03-23, 1 file, -0/+4)

  Signed-off-by: Rhys Perry <[email protected]>
  CC: <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
* aco: set has_divergent_branch for discards in loops (Rhys Perry, 2020-03-23, 1 file, -0/+3)

  Signed-off-by: Rhys Perry <[email protected]>
  CC: <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3658>
* radv/llvm: fix subgroup shuffle for chips without bpermute (Samuel Pitoiset, 2020-03-23, 2 files, -5/+30)

  bpermute only exists on GFX8+ and only with Wave32 on GFX10. Instead we
  have to use readlane with a waterfall loop to defeat the LLVM backend.

  This fixes DOOM Eternal which requires subgroup shuffle.

  Cc: <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4284>
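The "readlane with a waterfall loop" approach can be pictured with a small lane-level emulation. This is only a sketch of the general technique under simplifying assumptions (per-lane Python lists stand in for VGPRs, a list of booleans stands in for the exec mask); `shuffle_waterfall` is a hypothetical name and does not reflect how the RADV/LLVM code is written.

```python
def shuffle_waterfall(data, index, active):
    """Emulate a subgroup shuffle: lane i receives data[index[i]].

    data, index: per-lane lists of equal length.
    active: per-lane booleans, standing in for the exec mask.
    Inactive lanes keep None as their result.
    """
    result = [None] * len(data)
    remaining = list(active)
    while any(remaining):
        # readfirstlane: make the index of the first waiting lane uniform.
        first = remaining.index(True)
        uniform_idx = index[first]
        # readlane: a single read of the source lane chosen by that index.
        value = data[uniform_idx]
        # Every waiting lane that asked for the same source lane is served
        # in this iteration and then drops out of the loop.
        for lane in range(len(data)):
            if remaining[lane] and index[lane] == uniform_idx:
                result[lane] = value
                remaining[lane] = False
    return result


shuffled = shuffle_waterfall([10, 20, 30, 40], [3, 3, 0, 1], [True] * 4)
```

The loop runs once per distinct index among the active lanes, so it is cheap when the index is nearly uniform and takes at most one iteration per lane in the worst case.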
* radv/winsys: spoof some values for num_render_backends in the null winsys (Samuel Pitoiset, 2020-03-23, 1 file, -40/+32)

  To avoid crashes when RADV_FORCE_FAMILY is set to GFX9+ because
  num_render_backends is used to compute binning state.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4282>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4282>
* radv/winsys: fix wrong PCI ID for Vega10 in the null winsys (Samuel Pitoiset, 2020-03-23, 1 file, -1/+1)

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4282>
* ac: fix fast division (Marek Olšák, 2020-03-21, 1 file, -5/+4)

  This stopped working with LLVM 11 and might occasionally have been
  broken on older LLVM, because the metadata was set on the mul, not on
  the rcp.

  Cc: 19.3 20.0 <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4268>
* radv/winsys: set has_syncobj_wait_for_submit in the null winsys (Rhys Perry, 2020-03-20, 1 file, -0/+1)

  Needed for Vulkan 1.1+

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4249>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4249>
* radv: fix optional pSizes parameter when binding streamout buffers (Samuel Pitoiset, 2020-03-20, 1 file, -1/+6)

  The Vulkan spec 1.2.135 says:

     "pSizes is an optional array of buffer sizes, specifying the maximum
      number of bytes to capture to the corresponding transform feedback
      buffer. If pSizes is NULL, or the value of the pSizes array element
      is VK_WHOLE_SIZE, then the maximum bytes captured will be the size
      of the corresponding buffer minus the buffer offset."

  Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2650
  Fixes: b4eb029062a ("radv: implement VK_EXT_transform_feedback")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4232>
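The rule quoted from the spec reduces to a small size computation. The following is a minimal sketch of that rule only, not of the RADV change itself; `streamout_capture_size` and its parameters are hypothetical names, and `None` stands in for a NULL `pSizes` pointer.

```python
# VK_WHOLE_SIZE is defined as (~0ULL) in the Vulkan headers.
VK_WHOLE_SIZE = 0xFFFFFFFFFFFFFFFF


def streamout_capture_size(buffer_size, offset, sizes=None, i=0):
    """Maximum number of bytes captured to transform feedback buffer i.

    sizes models the optional pSizes array (None when pSizes is NULL).
    """
    if sizes is None or sizes[i] == VK_WHOLE_SIZE:
        # Whole remaining range: buffer size minus the binding offset.
        return buffer_size - offset
    return sizes[i]


whole_range = streamout_capture_size(4096, 256)
explicit = streamout_capture_size(4096, 256, sizes=[1024])
```

Treating a missing array and a `VK_WHOLE_SIZE` element identically keeps both optional paths in one branch.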
* Android.mk: Tweak MESA_ENABLE_LLVM checks (John Stultz, 2020-03-19, 1 file, -0/+4)

  Change the MESA_ENABLE_LLVM checks in Android.mk files in order to get
  mesa3d to build w/ AOSP using mmma.

  This tries to re-create a change that was introduced in the following
  merge in the AOSP branch:

      69f2c0128d2b Merge branch 'aosp/upstream-18.0'

  Acked-by: Tapani Pälli <[email protected]>
  Acked-by: Mauro Rossi <[email protected]>
  Signed-off-by: John Stultz <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4175>
* radv: call nir_shader_gather_info again (Rhys Perry, 2020-03-19, 1 file, -0/+3)

  pipeline-db (Navi, ACO):
  Totals from affected shaders:
  SGPRS: 11840 -> 11840 (0.00 %)
  VGPRS: 19012 -> 19124 (0.59 %)
  Spilled SGPRs: 0 -> 0 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Scratch size: 3696 -> 3696 (0.00 %) dwords per thread
  Code Size: 998680 -> 921388 (-7.74 %) bytes
  LDS: 19646 -> 19646 (0.00 %) blocks
  Max Waves: 3398 -> 3401 (0.09 %)

  pipeline-db (Navi, LLVM):
  Totals from affected shaders:
  SGPRS: 17016 -> 17128 (0.66 %)
  VGPRS: 19564 -> 14876 (-23.96 %)
  Spilled SGPRs: 0 -> 0 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Scratch size: 3872 -> 3872 (0.00 %) dwords per thread
  Code Size: 820416 -> 743576 (-9.37 %) bytes
  LDS: 0 -> 0 (0.00 %) blocks
  Max Waves: 3367 -> 3534 (4.96 %)

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-By: Timur Kristóf <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4193>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4193>
* radv: remove wrong assert that checks compute subgroup size (Samuel Pitoiset, 2020-03-18, 1 file, -5/+4)

  Oops. For some reason, I got confused with Wave32 on GFX10, but it's
  still possible to require a specific subgroup size if only Wave64 is
  supported.

  Fixes: 672d1061998 ("radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4227>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4227>
* radv: fix random depth range unrestricted failures due to a cache issue (Samuel Pitoiset, 2020-03-18, 1 file, -2/+6)

  The shader module name is used to compute the pipeline key. The driver
  used to load the wrong pipelines because the shader names were similar.

  This should fix random failures of
  dEQP-VK.pipeline.depth_range_unrestricted.*

  Fixes: f11ea226664 ("radv: fix a performance regression with graphics depth/stencil clears")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4216>
* amd/llvm: Fix divergent descriptor regressions with radeonsi. (Bas Nieuwenhuizen, 2020-03-17, 1 file, -11/+13)

  piglit/bin/arb_bindless_texture-limit -auto -fbo:
  Needed to deal with non-NULL dynamic_index without deref in tex
  instructions.

  piglit/bin/shader_runner tests/spec/arb_bindless_texture/execution/images/multiple-resident-images-reading.shader_test -auto:
  Need to deal with non-deref images in enter_waterfall_image.

  Fixes: b83c9aca4a5 "amd/llvm: Fix divergent descriptor indexing. (v3)"
  Acked-by: Marek Olšák <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4191>
* ac: don't set old denormals flags with LLVM >= 11 (Marek Olšák, 2020-03-17, 1 file, -1/+2)

  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
* ac: set new LLVM denormal flags (Marek Olšák, 2020-03-17, 1 file, -0/+9)

  See: https://reviews.llvm.org/D71358

  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
* ac: unify denorm setting enforcement (Marek Olšák, 2020-03-17, 2 files, -14/+13)

  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4196>
* radv/gfx10: fix required ballot size with VK_EXT_subgroup_size_control (Samuel Pitoiset, 2020-03-17, 4 files, -7/+27)

  If compute shaders require a specific subgroup size (i.e. Wave32), we
  have to use the correct ballot size.

  Fixes dEQP-VK.subgroups.ballot_other.compute.*_requiredsubgroupSize.

  Fixes: fb07fd4e6cb ("radv: implement VK_EXT_subgroup_size_control")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
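Why the ballot size has to follow the required subgroup size can be seen in a toy model: a ballot packs one bit per lane, so its width is the wave size actually used by the shader. The sketch below is an illustration under that assumption only; `pick_wave_size` and `ballot` are hypothetical helpers, not RADV functions.

```python
def pick_wave_size(default_wave_size, required_subgroup_size=None):
    """Wave size for a compute shader: the required size wins if present."""
    if required_subgroup_size is not None:
        return required_subgroup_size
    return default_wave_size


def ballot(predicate, wave_size):
    """Pack one predicate bit per lane into an integer mask."""
    if len(predicate) != wave_size:
        raise ValueError("one predicate per lane is expected")
    mask = 0
    for lane, value in enumerate(predicate):
        if value:
            mask |= 1 << lane
    return mask


# A shader that requires Wave32 on hardware whose default is Wave64.
wave = pick_wave_size(64, required_subgroup_size=32)
mask = ballot([True, False, True, True] + [False] * 28, wave)
```

If the ballot were still built for 64 lanes here, the mask width would no longer match the number of lanes in the subgroup.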
* radv/gfx10: fix required subgroup size with VK_EXT_subgroup_size_control (Samuel Pitoiset, 2020-03-17, 3 files, -4/+17)

  If compute shaders require a specific subgroup size (i.e. Wave32), we
  have to return the correct one.

  Fixes dEQP-VK.subgroups.size_control.compute.required_subgroup_size_*.

  Fixes: fb07fd4e6cb ("radv: implement VK_EXT_subgroup_size_control")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
* radv: only inject implicit subpass dependencies if necessary (Samuel Pitoiset, 2020-03-17, 1 file, -3/+39)

  The Vulkan 1.2.134 spec update clarified when implicit subpass
  dependencies should be injected by the driver. They only make sense if
  automatic layout transitions are performed.

  This should fix a performance regression with RPCS3 (although they
  added a workaround for RADV since the regression has been found).

  Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2502
  Fixes: e60de085473 ("radv: handle missing implicit subpass dependencies")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4210>
* aco: fix operand order for LS VGPR init bug workaround (Rhys Perry, 2020-03-16, 1 file, -3/+3)

  Fixes: a952bf3946 ('aco: Fix LS VGPR init bug on affected hardware.')
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-By: Timur Kristóf <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201>
* aco: fix instruction encoding for LS VGPR init bug workaround (Rhys Perry, 2020-03-16, 1 file, -3/+3)

  Fixes: a952bf3946 ('aco: Fix LS VGPR init bug on affected hardware.')
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-By: Timur Kristóf <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4201>
* aco: set late kill for v_interp_p1_f32 for some APUs (Rhys Perry, 2020-03-16, 3 files, -2/+8)

  Apparently needed for Stoney Ridge, Kabini and Mullins APUs.

  gfx702 also has 16-bank LDS and https://llvm.org/docs/AMDGPUUsage.html
  lists some dGPUs under there. Those GPUs seem to be Hawaii actually
  (gfx701) and we don't seem to have gotten any interpolation related
  bugs reported with them so far.

  The late kill flag was tested by running pipeline-db with
  ACO_DEBUG=validatera while setting late kill for SMEM buffer loads,
  emit_vop2_instruction() and texture instructions. I also tested with
  just setting the flag for v_interp_p1_f32.

  As far as I know, the only other thing we have to consider for 16-bank
  LDS is something to do with 16-bit interpolation. We don't do that yet.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>
* aco: add a late kill flag (Rhys Perry, 2020-03-16, 5 files, -25/+77)

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>
* aco: move some register demand helpers into aco_live_var_analysis.cpp (Rhys Perry, 2020-03-16, 4 files, -45/+55)

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3914>
* radv/sqtt: handle thread trace capture in sqtt_QueuePresentKHR() (Samuel Pitoiset, 2020-03-16, 2 files, -44/+49)

  To avoid wasting CPU cycles when thread trace is not enabled.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4180>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4180>
* aco: don't stop scheduling at exports (Rhys Perry, 2020-03-13, 1 file, -5/+7)

  This allows us to move v_cvt_pkrtz_f16_f32 instructions upwards,
  improving schedules and (somewhat unintentionally) moving the exports
  slightly closer together.

  Totals from affected shaders:
  SGPRS: 1030224 -> 1030248 (0.00 %)
  VGPRS: 794080 -> 794392 (0.04 %)
  Spilled SGPRs: 127117 -> 127117 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Scratch size: 0 -> 0 (0.00 %) dwords per thread
  Code Size: 89028152 -> 89032312 (0.00 %) bytes
  LDS: 0 -> 0 (0.00 %) blocks
  Max Waves: 65252 -> 65219 (-0.05 %)
  SMEM score: 843808.00 -> 843918.00 (0.01 %)
  VMEM score: 5331687.00 -> 5397802.00 (1.24 %)
  SMEM clauses: 567659 -> 567655 (-0.00 %)
  VMEM clauses: 290715 -> 290716 (0.00 %)
  Instructions: 17143219 -> 17144259 (0.01 %)
  Cycles: 1098442808 -> 1098446968 (0.00 %)

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
* aco: allow barriers to be skipped during scheduling (Rhys Perry, 2020-03-13, 1 file, -17/+25)

  Much better scheduling apparently in 160 shaders

  Totals from affected shaders:
  SGPRS: 6272 -> 6344 (1.15 %)
  VGPRS: 4832 -> 4844 (0.25 %)
  Spilled SGPRs: 0 -> 0 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Scratch size: 0 -> 0 (0.00 %) dwords per thread
  Code Size: 467192 -> 467428 (0.05 %) bytes
  LDS: 459 -> 459 (0.00 %) blocks
  Max Waves: 1407 -> 1409 (0.14 %)
  SMEM score: 9309.00 -> 11216.00 (20.49 %)
  VMEM score: 26679.00 -> 33652.00 (26.14 %)
  SMEM clauses: 1817 -> 1776 (-2.26 %)
  VMEM clauses: 2286 -> 2288 (0.09 %)
  Instructions: 86537 -> 86596 (0.07 %)
  Cycles: 676260 -> 676568 (0.05 %)

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
* aco: add helpers for ensuring correct ordering while scheduling (Rhys Perry, 2020-03-13, 2 files, -193/+171)

  Pipeline-db changes in 721 shaders.

  Totals from affected shaders:
  SGPRS: 42336 -> 42656 (0.76 %)
  VGPRS: 38368 -> 38636 (0.70 %)
  Spilled SGPRs: 11967 -> 11967 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Scratch size: 0 -> 0 (0.00 %) dwords per thread
  Code Size: 5268088 -> 5269840 (0.03 %) bytes
  LDS: 1069 -> 1069 (0.00 %) blocks
  Max Waves: 4473 -> 4447 (-0.58 %)
  SMEM score: 41155.00 -> 41826.00 (1.63 %)
  VMEM score: 146339.00 -> 147471.00 (0.77 %)
  SMEM clauses: 24434 -> 24535 (0.41 %)
  VMEM clauses: 16637 -> 16592 (-0.27 %)
  Instructions: 996037 -> 996388 (0.04 %)
  Cycles: 76476112 -> 75281416 (-1.56 %)

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
* aco: add helpers for moving instructions for scheduling (Rhys Perry, 2020-03-13, 1 file, -364/+321)

  No pipeline-db changes

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3776>
* radv: add llvm_compiler_shader() helper (Samuel Pitoiset, 2020-03-13, 3 files, -40/+44)

  To match aco_compile_shader().

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
* radv: remove unnecessary LLVM includes (Samuel Pitoiset, 2020-03-13, 6 files, -16/+0)

  They are already included from src/amd/llvm.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
* radv: remove radv_shader_variant::aco_used (Samuel Pitoiset, 2020-03-13, 3 files, -3/+1)

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
* radv: cleanup occurrences of use_aco everywhere (Samuel Pitoiset, 2020-03-13, 3 files, -31/+27)

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
* radv: use ac_gpu_info::use_late_alloc (Samuel Pitoiset, 2020-03-12, 1 file, -4/+6)

  Based on PAL and RadeonSI.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
* radv: rewrite late alloc computation (Samuel Pitoiset, 2020-03-12, 1 file, -34/+43)

  Based on PAL and RadeonSI.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>
* radv: tune primitive binning for small chips (Samuel Pitoiset, 2020-03-12, 1 file, -2/+7)

  Based on PAL and RadeonSI.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4144>