path: root/src/amd
Commit message | Author | Age | Files | Lines
* aco: add Instruction::usesModifiers() and add more checks in the optimizer | Rhys Perry | 2019-11-08 | 2 | -7/+23
  No pipeline-db changes.
  v2: use early-exit for VOP3
  Reviewed-by: Daniel Schürmann <[email protected]> (v1)
* radv: adjust loop unrolling heuristics for int64 | Rhys Perry | 2019-11-07 | 2 | -7/+16
  In particular, increase the cost of 64-bit integer division. Fixes huge shaders with dEQP-VK.spirv_assembly.type.scalar.i64.mod_geom; with ACO used for GS, this test creates shaders requiring a branch with a >32767 dword offset.
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
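The 32767 limit mentioned above is the largest value a signed 16-bit field can hold. As a minimal sketch, assuming the branch distance is encoded as a signed 16-bit immediate counted in dwords (the helper name is hypothetical and not taken from Mesa), the range check looks like this:

```cpp
#include <cstdint>

// Hypothetical helper, not part of Mesa: returns true if a branch distance,
// measured in dwords, fits into a signed 16-bit immediate field.
constexpr bool example_fits_in_simm16(int64_t offset_dwords)
{
   return offset_dwords >= INT16_MIN && offset_dwords <= INT16_MAX;
}
```

Under that assumption, a distance of 32767 dwords is still encodable and 32768 is not, which is why keeping such shaders smaller avoids the problem.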
* radv/gfx10: fix primitive indices orientation for NGG GS | Samuel Pitoiset | 2019-11-07 | 2 | -9/+45
  The primitive indices have to be swapped to follow the drawing order. This fixes corruption with Overwatch when NGG GS is force enabled.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* aco: workaround Tonga/Iceland hardware bug | Daniel Schürmann | 2019-11-07 | 1 | -5/+5
  The workaround got accidentally moved to the wrong place.
  Fixes: 08d510010b7586387e363460b98e6a45bbe97164 aco: increase accuracy of SGPR limits
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: implement VK_EXT_subgroup_size_control | Samuel Pitoiset | 2019-11-06 | 5 | -3/+59
  This extension allows controlling the subgroup size by allowing a varying subgroup size and by specifying a required subgroup size. This implementation only allows specifying a required subgroup size for compute shaders because there are some caveats with other shader stages (e.g. NGG with geometry shaders). This basically allows apps to use Wave32 for compute shaders. The extension is enabled for all chips, but only GFX10 supports Wave32. ACO doesn't support it.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: rely on shader's wavesize when computing NGG info | Samuel Pitoiset | 2019-11-06 | 1 | -1/+10
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: determine shaders wavesize at pipeline level | Samuel Pitoiset | 2019-11-06 | 6 | -19/+28
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: hardcode the number of waves for the GFX6 LS-HS bug | Samuel Pitoiset | 2019-11-06 | 1 | -1/+1
  It's always 64.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: enable wave32 for compute based on shader's wavesize | Samuel Pitoiset | 2019-11-06 | 3 | -3/+7
  This will allow changing the wavesize on demand.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix 32-bit compiler warnings | Samuel Pitoiset | 2019-11-06 | 1 | -3/+3
  Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2031
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add a note about perftest/debug options | Samuel Pitoiset | 2019-11-06 | 1 | -0/+1
  Now that all environment variables are documented, it would be appreciated if we can keep this up-to-date.
  [skip ci]
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* meson: move the generic symbols check arguments to a common variable | Eric Engestrom | 2019-11-05 | 1 | -1/+1
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviwed-by: Dylan Baker <dylan@pnwbakers>
* meson: add variable to control the symbols checks | Eric Engestrom | 2019-11-05 | 1 | -1/+1
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviwed-by: Dylan Baker <dylan@pnwbakers>
* util: rename PIPE_ARCH_*_ENDIAN to UTIL_ARCH_*_ENDIAN | Dylan Baker | 2019-11-05 | 4 | -7/+7
  As requested by Tim. This was generated with:
    grep 'PIPE_ARCH_.*_ENDIAN' -rIl | xargs sed -ie 's@PIPE_ARCH_\(.*\)_ENDIAN@UTIL_ARCH_\1_ENDIAN@'g
  v2:
  - add this patch
  Reviewed-by: Eric Engestrom <[email protected]>
* util/u_endian: set PIPE_ARCH_*_ENDIAN to 1 | Dylan Baker | 2019-11-05 | 4 | -7/+7
  This will allow it to be used as a drop in replacement for _mesa_little_endian in a number of cases.
  v2:
  - Always define PIPE_ARCH_LITTLE_ENDIAN and PIPE_ARCH_BIG_ENDIAN, define the one that reflects the host system to 1 and the other to 0
  - replace all uses of #ifdef, #ifndef, and #if defined() with #if and #if ! with PIPE_ARCH_*_ENDIAN
  Reviewed-by: Eric Engestrom <[email protected]>
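The "always define both macros, one to 1 and the other to 0" pattern from the v2 notes can be sketched as follows. This is an illustration only: the `EXAMPLE_ARCH_*` names and both helper functions are made up, and the detection relies on the GCC/Clang `__BYTE_ORDER__` predefined macro, falling back to little endian when it is unavailable (an assumption, not what `u_endian.h` does).

```cpp
#include <cstdint>

// Illustrative names only; the detection below is an assumption for this sketch.
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define EXAMPLE_ARCH_LITTLE_ENDIAN 0
#define EXAMPLE_ARCH_BIG_ENDIAN 1
#else
#define EXAMPLE_ARCH_LITTLE_ENDIAN 1
#define EXAMPLE_ARCH_BIG_ENDIAN 0
#endif

// Both macros are always defined, so "#if" (not "#ifdef") is the right test.
inline uint32_t example_to_le32(uint32_t v)
{
#if EXAMPLE_ARCH_BIG_ENDIAN
   // Byte-swap on big-endian hosts so the value is stored LSB first.
   return (v << 24) | ((v & 0xff00u) << 8) | ((v >> 8) & 0xff00u) | (v >> 24);
#else
   return v;
#endif
}

// Because the macro expands to 0 or 1, it also works as an ordinary expression.
inline bool example_is_little_endian()
{
   return EXAMPLE_ARCH_LITTLE_ENDIAN;
}
```

With a 0/1 macro, a misspelled name in `#if` can be caught with `-Wundef`, whereas a misspelled `#ifdef` silently evaluates to false. The macro can also appear in a plain `if` or a ternary, which is what makes it usable as a replacement for a runtime helper.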
* ac: add missing Arcturus to the info of pc lines | Leo Liu | 2019-11-04 | 1 | -0/+2
  Signed-off-by: Leo Liu <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Cc: Marek Olšák <[email protected]>
* aco: fix accidential reordering of instructions when scheduling | Daniel Schürmann | 2019-11-04 | 1 | -10/+47
  Fixes: 86786999189c43b4a2c8e1c1a18b55cd2f369fff "aco: implement VGPR spilling"
  Reviewed-by: Rhys Perry <[email protected]>
* aco: only use single-dword loads/stores for spilling | Daniel Schürmann | 2019-11-04 | 1 | -41/+10
  Fixes: 86786999189c43b4a2c8e1c1a18b55cd2f369fff "aco: implement VGPR spilling"
  Reviewed-by: Rhys Perry <[email protected]>
* aco: fix immediate offset for spills if scratch is used | Daniel Schürmann | 2019-11-04 | 1 | -6/+6
  Fixes: 86786999189c43b4a2c8e1c1a18b55cd2f369fff "aco: implement VGPR spilling"
  Reviewed-by: Rhys Perry <[email protected]>
* radv: fix compute pipeline keys when optimizations are disabled | Samuel Pitoiset | 2019-11-04 | 1 | -2/+18
  If an app first creates a compute pipeline with VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT set and then creates it again without that flag, the driver should re-compile the compute shader. Otherwise, it will return the unoptimized one.
  Fixes: ce188813bfe ("radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
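The underlying idea is that every input that changes the compiled result has to be part of the cache key. The sketch below illustrates this with an invented key struct and an FNV-1a style hash. None of these names come from radv, and the real driver may derive its keys differently.

```cpp
#include <cstdint>

// Hypothetical key layout for illustration; not radv's actual pipeline key.
struct ExampleComputeKey {
   uint64_t shader_hash;         // stand-in for the hash of the shader module
   bool optimisations_disabled;  // mirrors the "disable optimization" create flag
};

// FNV-1a style mixing over the key fields. The flag is mixed in as its own
// byte, so two keys that differ only in the flag never hash to the same value.
inline uint64_t example_cache_key(const ExampleComputeKey &key)
{
   uint64_t h = 0xcbf29ce484222325ull;
   for (int i = 0; i < 8; i++) {
      h ^= (key.shader_hash >> (8 * i)) & 0xffu;
      h *= 0x100000001b3ull;
   }
   h ^= key.optimisations_disabled ? 1u : 0u;
   h *= 0x100000001b3ull;
   return h;
}
```

If the flag were left out of the key, the second pipeline creation would hit the cache entry produced by the first one and hand back the unoptimized shader, which is the behaviour the commit describes.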
* radv: Close all unnecessary fds in secure compile. | Bas Nieuwenhuizen | 2019-11-01 | 1 | -29/+64
  The seccomp filter allows read/write, let us make sure nobody can do anything with this.
  Fixes: cff53da3748 "radv: enable secure compile support"
  Reviewed-by: Timothy Arceri <[email protected]>
* radv: drop unnecessary xmlpool_options_h | Eric Engestrom | 2019-10-31 | 1 | -1/+1
  idep_xmlconfig already covers that
  Signed-off-by: Eric Engestrom <[email protected]>
  Acked-by: Dylan Baker <[email protected]>
* radv: Fix disk_cache_get size argument. | Bas Nieuwenhuizen | 2019-10-31 | 1 | -2/+2
  Got some int->pointer warnings and 20 is not a valid pointer ....
  Fixes: 2e3a635ee69 "radv: Add an early exit in the secure compile if we already have the cache entries."
  Reviewed-by: Timothy Arceri <[email protected]>
* radv: Remove _mesa_locale_init/fini calls. | Bas Nieuwenhuizen | 2019-10-31 | 1 | -3/+0
  The resulting locale is not used for Vulkan, and it is not reference counted, giving issues when multiple instances are created.
  CC: 19.2 19.3 <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* android: aco: fix Lower to CSSA | Mauro Rossi | 2019-10-31 | 1 | -0/+1
  Fixes the following building error:
    external/mesa/src/amd/compiler/aco_spill.cpp:1768: error: undefined reference to 'aco::lower_to_cssa(aco::Program*, aco::live&, radv_nir_compiler_options const*)'
  Fixes: 0b8216b ("aco: Lower to CSSA")
  Signed-off-by: Mauro Rossi <[email protected]>
* radv: declare NGG scratch for VS or TES and only on GFX10 | Samuel Pitoiset | 2019-10-31 | 1 | -5/+3
  Do not need to declare it for other stages because this is for streamout.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Fix timeout handling in syncobj wait. | Bas Nieuwenhuizen | 2019-10-31 | 1 | -1/+1
  libdrm returns -errno instead of directly the ioctl ret of -1.
  Fixes: 1c3cda7d277 "radv: Add syncobj signal/reset/wait to winsys."
  Reviewed-by: Samuel Pitoiset <[email protected]>
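The distinction matters for callers that test for a timeout. A raw ioctl reports failure as -1 with the reason in `errno`, while a wrapper following the convention described above returns the negated error code directly. The sketch below classifies such a return value. The enum and function are hypothetical, and it assumes `-ETIME` is the timeout code, which the commit message does not spell out.

```cpp
#include <cerrno>

// Hypothetical result type for this sketch; not a radv or libdrm type.
enum class ExampleWaitResult { Signalled, TimedOut, Failed };

// Classifies the return value of a wrapper that reports failures as -errno.
// Checking "ret == -1 && errno == ETIME" would miss the timeout here, because
// the error code is carried in the return value itself.
inline ExampleWaitResult example_classify_wait(int ret)
{
   if (ret == 0)
      return ExampleWaitResult::Signalled;
   if (ret == -ETIME)
      return ExampleWaitResult::TimedOut;
   return ExampleWaitResult::Failed;
}
```

For example, `example_classify_wait(-ETIME)` yields `TimedOut`, while any other negative value is treated as a genuine failure.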
* radv: Allocate space for temp. semaphore parts. | Bas Nieuwenhuizen | 2019-10-30 | 1 | -0/+1
  Calculated the number for allocation and did not reserve space ....
  Fixes: 2117c53b723 "radv: Add temporary datastructure for submissions."
  Reviewed-by: Samuel Pitoiset <[email protected]>
* aco: implement VGPR spilling | Daniel Schürmann | 2019-10-30 | 1 | -7/+162
  VGPR spilling is implemented via MUBUF instructions and scratch memory.
  Reviewed-by: Rhys Perry <[email protected]>
* aco: always set scratch_offset in startpgm | Daniel Schürmann | 2019-10-30 | 3 | -23/+22
  This patch also moves private_segment_buffer and scratch_offset to Program to easily access it.
  Reviewed-by: Rhys Perry <[email protected]>
* aco: omit linear VGPRs as spill variables | Daniel Schürmann | 2019-10-30 | 1 | -4/+8
  Reviewed-by: Rhys Perry <[email protected]>
* aco: ensure that spilled VGPR reloads are done after p_logical_start | Daniel Schürmann | 2019-10-30 | 1 | -34/+43
  Reviewed-by: Rhys Perry <[email protected]>
* aco: simplify calculation of target register pressure when spilling | Daniel Schürmann | 2019-10-30 | 1 | -39/+12
  Reviewed-by: Rhys Perry <[email protected]>
* aco: fix new_demand calculation for first instructions | Rhys Perry | 2019-10-30 | 1 | -4/+7
  Reviewed-by: Daniel Schürmann <[email protected]>
* aco: don't add interferences between spilled phi operands | Daniel Schürmann | 2019-10-30 | 1 | -8/+8
  Reviewed-by: Rhys Perry <[email protected]>
* aco: consider loop_exit blocks like merge blocks, even if they have only one predecessor | Daniel Schürmann | 2019-10-30 | 1 | -2/+2
  Reviewed-by: Rhys Perry <[email protected]>
* aco: don't insert the exec mask into set of live-out variables when spilling | Daniel Schürmann | 2019-10-30 | 1 | -14/+6
  Reviewed-by: Rhys Perry <[email protected]>
* aco: fix transitive affinities of spilled variables | Daniel Schürmann | 2019-10-30 | 1 | -25/+79
  Variables spilled on both branch legs need to be assigned to the same spilling slot. These affinities can be transitive through multiple merge blocks.
  Reviewed-by: Rhys Perry <[email protected]>
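"Transitive" here means that if variable A must share a slot with B, and B with C, then A and C must share a slot too. One generic way to model such equivalence classes is a union-find structure, sketched below. This is a swapped-in textbook technique chosen for illustration, and the class and method names are invented. The commit does not say that ACO's spiller is implemented this way.

```cpp
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Illustrative union-find over spill variable ids; not ACO's data structure.
class ExampleAffinities {
public:
   explicit ExampleAffinities(std::size_t count) : parent(count)
   {
      // Every variable starts in its own affinity group.
      std::iota(parent.begin(), parent.end(), 0u);
   }

   // Returns the representative of x's group, halving paths along the way.
   uint32_t find(uint32_t x)
   {
      while (parent[x] != x) {
         parent[x] = parent[parent[x]];
         x = parent[x];
      }
      return x;
   }

   // Records that a and b must end up in the same spill slot.
   void join(uint32_t a, uint32_t b) { parent[find(a)] = find(b); }

   // True if a and b are related directly or through a chain of joins.
   bool same_slot(uint32_t a, uint32_t b) { return find(a) == find(b); }

private:
   std::vector<uint32_t> parent;
};
```

After `join(0, 1)` at one merge block and `join(1, 2)` at another, `same_slot(0, 2)` holds even though variables 0 and 2 never met at the same block.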
* aco: fix live-range splits of phis | Daniel Schürmann | 2019-10-30 | 1 | -14/+23
  Reviewed-by: Rhys Perry <[email protected]>
* aco: remove potential critical edge on loops. | Daniel Schürmann | 2019-10-30 | 2 | -18/+23
  Reviewed-by: Rhys Perry <[email protected]>
* aco: improve live variable analysis | Daniel Schürmann | 2019-10-30 | 1 | -25/+64
  This patch makes the live variable analysis more precise w.r.t. killed phi operands and the block's register pressure.
  Reviewed-by: Rhys Perry <[email protected]>
* aco: Lower to CSSA | Daniel Schürmann | 2019-10-30 | 4 | -41/+268
  Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes. Previously, it was possible for phi operands to have intersecting live-ranges, so they couldn't be spilled to the same spilling slot. For this reason, ACO tried to avoid spilling phis, even if it was beneficial. This patch implements a conversion pass which is currently only called if spilling is necessary.
  Reviewed-by: Rhys Perry <[email protected]>
* radv: Start signalling semaphores in WSI acquire. | Bas Nieuwenhuizen | 2019-10-30 | 1 | -7/+27
  Winsys semaphores without signal operation get silently ignored. Not so for syncobjs, so actually signal them.
  Fixes: 84d9551b232 "radv: Always enable syncobj when supported for all fences/semaphores."
  Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2030
  Reviewed-by: Samuel Pitoiset <[email protected]>
* aco: rename README to README.md | Rhys Perry | 2019-10-30 | 1 | -0/+0
  Closes: #1974
  Signed-off-by: Rhys Perry <[email protected]>
* aco: a couple loop handling fixes for GFX10 hazard pass | Rhys Perry | 2019-10-30 | 1 | -3/+3
  It was joining from the wrong blocks and block.kind is a bitmask instead of an enum.
  Reviewed-By: Timur Kristóf <[email protected]>
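The bitmask remark points at a common pitfall: when a field can carry several flags at once, comparing it with `==` against a single flag fails as soon as any other flag is set. A minimal sketch with invented flag names (not ACO's actual block kinds):

```cpp
#include <cstdint>

// Hypothetical flags for illustration; the names are not taken from ACO.
enum example_block_kind : uint16_t {
   example_kind_top_level = 1 << 0,
   example_kind_loop_header = 1 << 1,
   example_kind_loop_exit = 1 << 2,
};

// Correct test for a bitmask: mask out the flag of interest.
constexpr bool example_is_loop_header(uint16_t kind)
{
   return (kind & example_kind_loop_header) != 0;
}
```

A block that is both top-level and a loop header has `kind == 3`, so `kind == example_kind_loop_header` is false while `example_is_loop_header(kind)` is true.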
* radv: Enable ACO on Navi. | Timur Kristóf | 2019-10-30 | 1 | -2/+1
  Signed-off-by: Timur Kristóf <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* aco: try to group together VMEM loads of the same resource | Rhys Perry | 2019-10-30 | 1 | -10/+56
  v2: remove accidental shaderInt16 change
  v2: simplify can_move_down initialization
  v2: simplify VMEM_CLAUSE_MAX_GRAB_DIST
  Reviewed-by: Daniel Schürmann <[email protected]>
* aco: don't schedule instructions through depending VMEM instructions | Daniel Schürmann | 2019-10-30 | 1 | -0/+3
  Previously, the scheduler tried to move up instructions from below depending VMEM instructions only to move them down again when scheduling the VMEM instruction.
  Reviewed-by: Rhys Perry <[email protected]>
* aco: add can_reorder flags to load_ubo and load_constant | Daniel Schürmann | 2019-10-30 | 1 | -5/+9
  These got lost due to some refactoring. Due to the way our scheduler works currently, for now we add back the reorder flag for divergent loads only.
  Reviewed-by: Rhys Perry <[email protected]>
* aco: only skip RAR dependencies if the variable is killed somewhere | Daniel Schürmann | 2019-10-30 | 1 | -21/+46
  This patch changes VMEM scheduling in a way that they can only be moved upwards by previous VMEM instructions but not downwards. This way, it improves the order of VMEM instructions in relation to their users.
  Reviewed-by: Rhys Perry <[email protected]>