aboutsummaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* radv: fix allocating number of user sgprs if streamout is usedSamuel Pitoiset2019-09-131-1/+1
| | | | | | | | | | streamout_buffers is assigned after that function, so the previous fix was completely wrong. This probably fix something when streamout buffers and push constants are used/inlined in the same shader. Fixes: 378e2d24143 ("radv: fix computing number of user SGPRs for streamout buffers") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: replace HAVE_LLVM with LLVM_VERSION_MAJOR for atomic-optimizationsMarek Olšák2019-09-111-1/+1
| | | | trivial
* radv/gfx10: declare a LDS symbol for the NGG emit spaceSamuel Pitoiset2019-09-103-32/+19
| | | | | | | | | | | | | This fixes some interactions when NGG GS is enabled. It fixes: - dEQP-VK.clipping.user_defined.clip_cull_distance_dynamic_index.*geom* - dEQP-VK.tessellation.geometry_interaction.passthrough.* For some reasons, using the computed ESGS ring size randomly hangs with CTS. For now, just use the maximum LDS size for ESGS. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: calculate GFX9 GS and GFX10 NGG states before compiling shader variantsSamuel Pitoiset2019-09-101-35/+48
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: store the ESGS ring size as part of gfx10_ngg_infoSamuel Pitoiset2019-09-102-1/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: store GFX10 NGG state as part of the shader infoSamuel Pitoiset2019-09-102-44/+46
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: store GFX9 GS state as part of the shader infoSamuel Pitoiset2019-09-102-31/+33
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fill shader info for all stages in the pipelineSamuel Pitoiset2019-09-104-20/+130
| | | | | | | | | This shouldn't be in NIR->LLVM because ACO also needs the shader info. This will also help for computing some NGG values that are necessary for declaring LDS symbols. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not pass all compiler options to the shader info passSamuel Pitoiset2019-09-103-28/+33
| | | | | | | Only the pipeline layout and the shader keys are needed. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: move texture storage allocation outside of radeonsiMarek Olšák2019-09-092-2/+65
| | | | | | possible code sharing with radv Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: move HTILE allocation outside of radeonsiMarek Olšák2019-09-092-3/+11
| | | | | | | ac_surface computes it for amdgpu. radeon_drm_surface computes it for radeon. Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac/surface: add RADEON_SURF_NO_FMASKMarek Olšák2019-09-092-4/+8
| | | | | | This controls FMASK and CMASK computation for MSAA. Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: fix wave occupancy computationsMarek Olšák2019-09-093-7/+24
| | | | | Cc: 19.2 <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: use fma on gfx10Marek Olšák2019-09-092-1/+9
| | | | Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: enable LLVM atomic optimizationsMarek Olšák2019-09-091-1/+9
|
* radv: add support for vk_x11_override_min_image_countEric Engestrom2019-09-061-0/+1
| | | | | | | Cc: [email protected] Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* amd: move adaptive sync to performance section, as it is defined in xmlpoolEric Engestrom2019-09-061-1/+1
| | | | | | | | Fixes: 3844ed8d44677588bc29 ("radv: Add adaptive_sync driconfig option and enable it by default.") Fixes: e260493f2ab2483e5a55 ("radeonsi: Enable adaptive_sync by default for radeon") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* amd: replace major llvm version checks with LLVM_VERSION_MAJOREric Engestrom2019-09-066-20/+26
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* radv/gfx10: determine the number of vertices per primitive for TESSamuel Pitoiset2019-09-061-1/+16
| | | | | | | This doesn't fix anything known but it's correct now. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: make use the output usage mask when exporting NGG GS paramsSamuel Pitoiset2019-09-061-3/+8
| | | | | | | | | It shouldn't matter much because output varyings should have been compacted during NIR shader linking but it mirrors what the driver does when emitting NGG GS vertex parameters. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: account for the subpass view for the NGG GS storageSamuel Pitoiset2019-09-061-0/+3
| | | | | | | | | If the fragment shader needs the layer index, we have to allocate one more dword in the NGG GS storage. Found by inspection. This doesn't fix anything known. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: calculate esgs_itemsize in the shader info passSamuel Pitoiset2019-09-062-14/+20
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: calculate the GSVS vertex size in the shader info passSamuel Pitoiset2019-09-062-15/+11
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: gather primitive ID in the shader info passSamuel Pitoiset2019-09-062-3/+17
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: gather layer in the shader info passSamuel Pitoiset2019-09-062-10/+20
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: gather viewport in the shader info passSamuel Pitoiset2019-09-062-8/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: gather pointsize in the shader info passSamuel Pitoiset2019-09-062-8/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: gather clip/cull distances in the shader info passSamuel Pitoiset2019-09-062-21/+25
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: move ac_fill_shader_info() to radv_nir_shader_info_pass()Samuel Pitoiset2019-09-062-45/+38
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: merge radv_shader_variant_info into radv_shader_infoSamuel Pitoiset2019-09-066-293/+275
| | | | | | | Having two different structs is useless. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: always set ballot_mask_bits to 64Samuel Pitoiset2019-09-061-2/+1
| | | | | | | | The codegen handles it and it adds the correct casts. This fixes a bunch of LLVM validation errors when enabling Wave32 for compute. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: allow specifying filter callback in lower_alu_to_scalarVasily Khoruzhick2019-09-061-1/+1
| | | | | | | | | | | | | Set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has vector conditional select operation, but condition is always scalar. Lowering all the vector selects to scalar increases instruction number, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* radv: Call nir_propagate_invariant()Connor Abbott2019-09-051-0/+2
| | | | | | | | Without this, invariant qualifiers don't do anything. Together with a fix to the game, this fixes flickering in No Man's Sky. Cc: [email protected] Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Enable nir_opt_large_constantsConnor Abbott2019-09-051-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | vkpipeline-db numbers: Totals: SGPRS: 1740306 -> 1741322 (0.06 %) VGPRS: 1331124 -> 1331712 (0.04 %) Spilled SGPRs: 21201 -> 21316 (0.54 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 256 -> 256 (0.00 %) dwords per thread Code Size: 79022628 -> 78694788 (-0.41 %) bytes LDS: 6500 -> 6500 (0.00 %) blocks Max Waves: 301413 -> 301302 (-0.04 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 53633 -> 54649 (1.89 %) VGPRS: 53000 -> 53588 (1.11 %) Spilled SGPRs: 3454 -> 3569 (3.33 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 5284232 -> 4956392 (-6.20 %) bytes LDS: 2 -> 2 (0.00 %) blocks Max Waves: 4239 -> 4128 (-2.62 %) Wait states: 0 -> 0 (0.00 %) (The biggest VGPR and max wave regression is due to unrolling a loop, which made the scheduler more aggressive, but in this case it's able to effectively hide latency so it's actually probably a win.) shader-db numbers with radeonsi NIR: Totals: SGPRS: 3526496 -> 3526512 (0.00 %) VGPRS: 2198576 -> 2198576 (0.00 %) Spilled SGPRs: 10463 -> 10463 (0.00 %) Spilled VGPRs: 86 -> 86 (0.00 %) Private memory VGPRs: 3182 -> 2528 (-20.55 %) Scratch size: 3308 -> 2640 (-20.19 %) dwords per thread Code Size: 74117280 -> 74106140 (-0.02 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 775846 -> 775844 (-0.00 %) Wait states: 0 -> 0 (0.00 %) Totals from affected shaders: SGPRS: 856 -> 872 (1.87 %) VGPRS: 680 -> 680 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 654 -> 0 (-100.00 %) Scratch size: 668 -> 0 (-100.00 %) dwords per thread Code Size: 49652 -> 38512 (-22.44 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 182 -> 180 (-1.10 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Support load_constant intrinsicsConnor Abbott2019-09-051-0/+55
| | | | | | | Setup a constant global variable that LLVM will stick in a .rodata section and generate PC-relative loads for. Reviewed-by: Marek Olšák <[email protected]>
* radv/radeonsi: Don't count read-only data when reporting code sizeConnor Abbott2019-09-055-3/+13
| | | | | | | | | | We usually use these counts as a simple way to figure out if a change reduces the number of instructions or shrinks an instruction. However, since .rodata sections aren't executed, we shouldn't be counting their size for this analysis. Make the linker return the total executable size, and use it to report the more useful size in both drivers. Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Fix gather4 integer wa with unnormalized coordinatesConnor Abbott2019-09-031-4/+26
| | | | | | | | | | | | | | This adds a bit of unneccesary code on radeonsi, since whether unnormalized coordinates are used is known at compile time with GL, but I wasn't sure if it was worth the few instructions to plumb everything through, especially for something so rare -- my shader-db doesn't have any instances where this changes anything. Fixes CTS tests I created at https://github.com/cwabbott0/VK-GL-CTS/tree/unnorm-gather-tests Acked-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Rewrite gather4 integer workaround based on radeonsiConnor Abbott2019-09-031-65/+85
| | | | | | | | | | The workaround was originally written based on amdgpu-pro traces, but since then radeonsi has got its own slightly different version. Use the radeonsi version instead, to be consistent and because it'll be slightly more convenient for handling unnormalized coordinates. Acked-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: keep a pointer to a NIR shader into radv_shader_contextSamuel Pitoiset2019-08-301-36/+24
| | | | | | | This avoids multiple copies for nothing and it's more elegant. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: move setting can_discard to ac_fill_shader_info()Samuel Pitoiset2019-08-301-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: replace ac_nir_build_if by ac_build_ifccSamuel Pitoiset2019-08-301-107/+13
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: remove radv_init_llvm_target() helperSamuel Pitoiset2019-08-301-33/+1
| | | | | | | RADV no longer uses specific LLVM options compared to the common code. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: remove useless ac_llvm_util.h include from the WSI codeSamuel Pitoiset2019-08-301-1/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: remove unused shader_info parameter in ac_compile_llvm_module()Samuel Pitoiset2019-08-301-3/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: remove some unused fields from radv_shader_contextSamuel Pitoiset2019-08-301-2/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: move lowering PS inputs/outputs at the right placeSamuel Pitoiset2019-08-303-5/+8
| | | | | | | At shaders creation, just after NIR linking. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: gather info about PS inputs in the shader info passSamuel Pitoiset2019-08-304-74/+53
| | | | | | | It's the right place to do that. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac: drop now useless lookup_interp_param from ABISamuel Pitoiset2019-08-303-39/+32
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: import linear/perspective PS input parameters from radv/radeonsiSamuel Pitoiset2019-08-302-17/+23
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv/gfx10: compute the LDS size for exporting PrimID for VSSamuel Pitoiset2019-08-291-0/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>