path: root/src/amd/common
Commit message (Collapse)  Author  Age  Files  Lines
* ac: add cpdma_prefetch_writes_memory to ac_gpu_info  Samuel Pitoiset  2019-08-27  2  -0/+3
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_out_of_order_rast to ac_gpu_info  Samuel Pitoiset  2019-08-27  2  -0/+4
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_load_ctx_reg_pkt to ac_gpu_info  Samuel Pitoiset  2019-08-27  2  -0/+6
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_rbplus to ac_gpu_info  Samuel Pitoiset  2019-08-27  2  -0/+4
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_dcc_constant_encode to ac_gpu_info  Samuel Pitoiset  2019-08-27  2  -0/+5
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_distributed_tess to ac_gpu_info  Samuel Pitoiset  2019-08-27  2  -0/+4
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_clear_state to ac_gpu_info  Samuel Pitoiset  2019-08-27  2  -0/+7
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* ac: drop llvm8 from some load/store helpers  Samuel Pitoiset  2019-08-27  1  -115/+75
|   Cleanup.
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Remove gfx9_stride_size_workaround_for_atomic  Connor Abbott  2019-08-26  2  -5/+1
|   The workaround was entirely in common code, and it's needed in
|   radeonsi too so just always do it when necessary.
|   Fixes KHR-GL45.shader_image_load_store.advanced-allStages-oneImage
|   on gfx9 with LLVM 8.
|   Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: add a workaround for viewing a slice of 3D as a 2D image  Connor Abbott  2019-08-26  1  -3/+32
|   GL and Vulkan allow you to bind a single layer of a 3D texture to a
|   2D image, and we weren't implementing a workaround for that on gfx9
|   that TGSI was. Copy it over.
|   Fixes KHR-GL45.shader_image_load_store.non-layered_binding with
|   radeonsi NIR.
|   Reviewed-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Assert GS input index is constant  Connor Abbott  2019-08-23  1  -0/+1
|   If it's not we silently ignore indir_index which is definitely a bug.
|   Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Handle const array offsets in get_deref_offset()  Connor Abbott  2019-08-23  1  -6/+11
|   Some users of this function (e.g. GS inputs) currently only work with
|   constant offsets. We got lucky since all the tests used an array
|   index of 0, so the non-constant part was always 0. But we still need
|   to handle this.
|   This doesn't fix any CTS test, but was noticed while debugging one.
|   Reviewed-by: Marek Olšák <[email protected]>
* ac,radv,radeonsi: remove LLVM 7 support  Samuel Pitoiset  2019-08-23  6  -290/+58
|   Now that LLVM 9 will be released soon, we will only support LLVM 8,
|   9 and master (10).
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* ac: fix exclusive scans on GFX8-GFX9  Samuel Pitoiset  2019-08-22  1  -4/+3
|   This fixes a regression introduced with scan&reduce operations on
|   GFX10. Note that some subgroups CTS still fail on GFX10 but I assume
|   it's a different issue.
|   This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*.
|   Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations"
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Add explicit signs to image min/max intrinsics  Jason Ekstrand  2019-08-21  1  -17/+30
|   This better matches all the other atomic intrinsics such as those for
|   SSBOs and shared variables where the sign is part of the intrinsic
|   opcode. Both generators (GLSL and SPIR-V) know the sign from the type
|   of the image variable or handle. In SPIR-V, signed min/max are
|   separate opcodes from unsigned.
|   Reviewed-by: Kenneth Graunke <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it  Marek Olšák  2019-08-19  3  -2/+11
|   This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks.
|   This solution is better, because the IR isn't dependent on wave32.
* ac/nir: set image=true when loading FMASK for images  Marek Olšák  2019-08-19  1  -1/+1
|   Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac/nir: Fix store_scratch with a non-full writemask  Connor Abbott  2019-08-18  3  -5/+42
|   By adding one more helper to ac_llvm_build, we can also easily keep
|   vector stores together.
|   Fixes the
|   tests/spec/glsl-1.30/execution/fs-large-local-array-vec4.shader_test
|   piglit test.
|   Fixes: 74470baebbd ("ac/nir: Lower large indirect variables to scratch")
|   Reviewed-by: Marek Olšák <[email protected]>
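The idea of storing with a partial writemask while keeping adjacent components in one store can be sketched in plain C. This is an illustration only, not the Mesa implementation: `next_consecutive_range` and `store_masked` are hypothetical names, and a `memcpy` stands in for whatever vector store the real code emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pops the lowest run of consecutive set bits from
 * *mask and reports where it starts and how long it is.
 * Returns 0 once the mask is empty. */
static int next_consecutive_range(uint32_t *mask, int *start, int *count)
{
    if (*mask == 0)
        return 0;

    int s = 0;
    while (!((*mask >> s) & 1u))
        s++;

    int c = 0;
    while (s + c < 32 && ((*mask >> (s + c)) & 1u))
        c++;

    /* Build the bit pattern of the run and clear it from the mask. */
    uint32_t run = (c == 32) ? ~0u : (((1u << c) - 1u) << s);
    *mask &= ~run;

    *start = s;
    *count = c;
    return 1;
}

/* Hypothetical store: writes only the components selected by writemask,
 * issuing one copy per run of adjacent components instead of one copy
 * per component. */
static void store_masked(float *dst, const float *src, uint32_t writemask)
{
    int start, count;

    assert(dst != NULL && src != NULL);
    while (next_consecutive_range(&writemask, &start, &count))
        memcpy(dst + start, src + start, (size_t)count * sizeof(float));
}
```

With a writemask of 0xB (components x, y and w), the sketch issues two copies: one covering x and y together and one for w, leaving z untouched.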
* radeonsi: add support for Renoir  Marek Olšák  2019-08-14  3  -1/+4
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/nir: implement default tess level system values  Marek Olšák  2019-08-12  2  -3/+10
|   Reviewed-by: Connor Abbott <[email protected]>
* compiler: add SYSTEM_VALUE_USER_DATA_AMD  Marek Olšák  2019-08-12  2  -0/+5
|   for internal radeonsi shaders
* compiler: add ACCESS_STREAM_CACHE_POLICY  Marek Olšák  2019-08-12  1  -0/+3
|   radeonsi will use this.
|   Reviewed-by: Connor Abbott <[email protected]>
* amd: prepare dropping include of p_compiler.h  Lionel Landwerlin  2019-08-09  2  -3/+4
|   Signed-off-by: Lionel Landwerlin <[email protected]>
|   Acked-by: Eric Engestrom <[email protected]>
* radeonsi: add support for nir atomic_inc_wrap/atomic_dec_wrap  Pierre-Eric Pelloux-Prayer  2019-08-06  1  -0/+25
|   Reviewed-by: Marek Olšák <[email protected]>
* ac: add ac_atomic_inc_wrap / ac_atomic_dec_wrap support  Pierre-Eric Pelloux-Prayer  2019-08-06  2  -0/+4
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: handle key.mono.u.ps.interpolate_at_sample_force_center  Marek Olšák  2019-08-06  2  -0/+4
|   Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac/nir: Use correct cast for readfirstlane and ptrs.  Bas Nieuwenhuizen  2019-08-06  1  -0/+2
|   Fixes: 028ce527 "radv: Add non-uniform indexing lowering."
|   Reviewed-by: Dave Airlie <[email protected]>
|   Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Lower large indirect variables to scratch  Connor Abbott  2019-08-05  1  -0/+54
|   results from radeonsi NIR:
|
|   Totals from affected shaders:
|   SGPRS: 704 -> 464 (-34.09 %)
|   VGPRS: 2056 -> 672 (-67.32 %)
|   Spilled SGPRs: 24 -> 0 (-100.00 %)
|   Spilled VGPRs: 28406 -> 0 (-100.00 %)
|   Private memory VGPRs: 0 -> 3182 (0.00 %)
|   Scratch size: 1064 -> 3228 (203.38 %) dwords per thread
|   Code Size: 935260 -> 40180 (-95.70 %) bytes
|   LDS: 0 -> 0 (0.00 %) blocks
|   Max Waves: 28 -> 70 (150.00 %)
|   Wait states: 0 -> 0 (0.00 %)
|
|   results from radv:
|
|   Totals from affected shaders:
|   SGPRS: 80 -> 48 (-40.00 %)
|   VGPRS: 204 -> 108 (-47.06 %)
|   Spilled SGPRs: 0 -> 0 (0.00 %)
|   Spilled VGPRs: 0 -> 0 (0.00 %)
|   Private memory VGPRs: 0 -> 0 (0.00 %)
|   Scratch size: 0 -> 256 (0.00 %) dwords per thread
|   Code Size: 15792 -> 9504 (-39.82 %) bytes
|   LDS: 0 -> 0 (0.00 %) blocks
|   Max Waves: 1 -> 2 (100.00 %)
|   Wait states: 0 -> 0 (0.00 %)
|
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir,radv: Optimize bounds check for 64 bit CAS.  Bas Nieuwenhuizen  2019-08-02  2  -17/+27
|   When the application does not ask for robust buffer access.
|   Only implemented the check in radv.
|   Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Implement LLVM9 64-bit buffer compare & exchange.  Bas Nieuwenhuizen  2019-08-02  1  -4/+64
|   LLVM 9 does not have a 64-bit buffer compswap intrinsic, so this
|   extracts the ptr, does a bound check and then uses a cmpxchg LLVM
|   instruction. Not ideal, but the earliest release we're going to get
|   a proper intrinsic is LLVM 10.
|   Reviewed-by: Samuel Pitoiset <[email protected]>
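The sequence described in the commit above (extract the pointer, bounds-check, then cmpxchg) can be sketched on the CPU side in C11. This is an illustration only, not the generated LLVM IR or the Mesa code: `buffer_cmpswap64`, its parameters, the alignment requirement, and returning 0 for a rejected access are all assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 64-bit compare-and-swap at a byte offset inside a
 * buffer of num_bytes bytes. Returns the value that was in memory before
 * the operation. Rejected accesses (out of bounds or misaligned) return 0
 * and leave memory untouched; that policy is an assumption of this sketch. */
static uint64_t buffer_cmpswap64(_Atomic uint64_t *base, size_t num_bytes,
                                 size_t offset, uint64_t compare,
                                 uint64_t swap)
{
    assert(base != NULL);

    /* Bounds check: the whole 8-byte access must fit inside the buffer. */
    if (num_bytes < sizeof(uint64_t) ||
        offset > num_bytes - sizeof(uint64_t))
        return 0;

    /* This sketch only handles naturally aligned accesses. */
    if (offset % sizeof(uint64_t) != 0)
        return 0;

    /* "Extract the pointer": turn base + offset into an element pointer. */
    _Atomic uint64_t *ptr = base + offset / sizeof(uint64_t);

    /* On failure, atomic_compare_exchange_strong writes the current memory
     * value into expected; on success expected still equals the old value.
     * Either way it ends up holding the pre-operation contents. */
    uint64_t expected = compare;
    atomic_compare_exchange_strong(ptr, &expected, swap);
    return expected;
}
```

For a two-element buffer holding {5, 7}, a call with offset 8, compare 7 and swap 9 returns 7 and leaves 9 in the second element, while an offset of 16 fails the bounds check and changes nothing.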
* Revert "ac/nir: handle negate modifier"  Connor Abbott  2019-08-02  1  -12/+1
|   This reverts commit bfea7e4d2965269bff8f1f6449cb99c312fd7384.
* Revert "ac/nir: handle abs modifier"  Connor Abbott  2019-08-02  1  -30/+11
|   This reverts commit d3c80733cdfe8552b2f447ec8ed62465d0f2af1a.
|   These were only appearing due to memory corruption.
* tree-wide: replace MAYBE_UNUSED with ASSERTED  Eric Engestrom  2019-07-31  1  -7/+7
|   Suggested-by: Jason Ekstrand <[email protected]>
|   Signed-off-by: Eric Engestrom <[email protected]>
|   Reviewed-by: Matt Turner <[email protected]>
* ac/nir: fix incorrect Phis if callbacks use control flow inside control flow  Marek Olšák  2019-07-30  1  -2/+2
* ac/nir: handle abs modifier  Marek Olšák  2019-07-30  1  -11/+30
* ac: fix a memory leak in the error path of ac_build_type_name_for_intr  Marek Olšák  2019-07-30  1  -0/+1
* ac: allow control flow statements in NIR callbacks  Marek Olšák  2019-07-30  2  -20/+29
|   This fixes a crash when compiling geometry shaders on radeonsi.
* ac/nir: handle negate modifier  Marek Olšák  2019-07-30  1  -1/+12
* radeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced  Marek Olšák  2019-07-30  2  -0/+7
* ac/surface: allow linear swizzle mode automatic selection on gfx9 & 10  Marek Olšák  2019-07-30  1  -1/+0
|   let addrlib make the decision to get the same result as PAL.
* amd: add support for Arcturus  Marek Olšák  2019-07-29  2  -0/+3
|   Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: add support for compute-only chips  Marek Olšák  2019-07-29  3  -1/+6
|   Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: do not crash when the buffer data format is invalid  Samuel Pitoiset  2019-07-29  1  -0/+1
|   This might happen when a pipeline doesn't define the vertex input
|   state, so the buffer data format is 0 (aka INVALID).
|   This fixes crashes when compiling some shaders on GFX10.
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: fix txf_ms with an offset  Rhys Perry  2019-07-29  1  -2/+2
|   Seems to fix some hair artifacts in Max Payne 3:
|   https://github.com/daniel-schuermann/mesa/issues/76
|   Signed-off-by: Rhys Perry <[email protected]>
|   Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver')
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: implement nir_op_pack_{us}norm_2x16  Marek Olšák  2019-07-23  1  -5/+14
|   Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac/nir: do not clamp shadow reference on GFX10  Samuel Pitoiset  2019-07-22  1  -2/+6
|   RadeonSI only uses Z32_FLOAT_CLAMP for upgraded depth textures on
|   GFX10 and RADV doesn't promote Z16 or Z24.
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/gfx10: use 32-bit wavemasks for Wave32  Marek Olšák  2019-07-19  3  -8/+23
|   Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|   Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: create the LLVM builder in ac_llvm_context_init  Marek Olšák  2019-07-19  2  -3/+4
|   Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: create the LLVM module for Wave32 or Wave64 in ac_llvm_context_init  Marek Olšák  2019-07-19  2  -1/+6
|   Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/rtld: add support for Wave32  Marek Olšák  2019-07-19  4  -5/+15
|   Reviewed-by: Samuel Pitoiset <[email protected]>