path: root/src/amd/common/ac_llvm_build.h
Commit message | Author | Age | Files | Lines
* ac: Introduce ac_build_expand() | Connor Abbott | 2018-10-22 | 1 | -0/+3
  And implement ac_build_expand_to_vec4() on top of it.
  Fixes: 7e7ee82698247d8f93fe37775b99f4838b0247dd ("ac: add support for 16bit buffer loads")
  Reviewed-by: Marek Olšák
  Reviewed-by: Bas Nieuwenhuizen
* ac: add helpers for fast integer division by a constant | Marek Olšák | 2018-10-16 | 1 | -0/+17
* radv: emit the GLC bit for SSBO loads/stores when needed | Samuel Pitoiset | 2018-10-12 | 1 | -1/+2
  This fixes some new memory model tests:
  dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.*
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108112
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
* ac: add ac_build_round | Marek Olšák | 2018-10-06 | 1 | -0/+1
* ac: define all address spaces properly | Marek Olšák | 2018-10-06 | 1 | -4/+6
* ac: add 16-bit constant values for zero and one | Samuel Pitoiset | 2018-09-17 | 1 | -0/+2
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
* ac: add ac_build_bitfield_reverse() helper | Samuel Pitoiset | 2018-09-17 | 1 | -0/+3
  Are we missing 64-bit support?
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
* ac: add ac_build_bit_count() helper | Samuel Pitoiset | 2018-09-17 | 1 | -0/+2
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
* radeonsi: fix GPU hangs with bindless textures and LLVM 7.0 | Marek Olšák | 2018-09-10 | 1 | -0/+4
  Tested-by: Dieter Nützel
* ac: fix WAITCNT flags for GFX9 | Marek Olšák | 2018-08-22 | 1 | -0/+6
  Reviewed-by: Samuel Pitoiset
* ac: add imad & fmad helpers | Marek Olšák | 2018-08-21 | 1 | -0/+4
  Reviewed-by: Samuel Pitoiset
* ac: add ac_build_s_barrier | Marek Olšák | 2018-08-21 | 1 | -0/+1
  Reviewed-by: Samuel Pitoiset
* ac: add support for 16bit UBO loads | Daniel Schürmann | 2018-07-23 | 1 | -0/+8
  Reviewed-by: Bas Nieuwenhuizen
* ac: fold LLVMContext creation into ac_llvm_context_init | Marek Olšák | 2018-07-04 | 1 | -1/+1
  Reviewed-by: Dave Airlie
* ac/nir: use ac_build_image_opcode for image intrinsics | Nicolai Hähnle | 2018-04-20 | 1 | -6/+0
  So that we'll use the dimension-aware intrinsics in the future.
  Acked-by: Marek Olšák
* radeonsi: generate image load/store/atomic ops using ac_build_image_opcode | Nicolai Hähnle | 2018-04-20 | 1 | -5/+32
  In preparation for dimension-aware LLVM image intrinsics.
  Acked-by: Marek Olšák
* amd/common: pass address components individually to ac_build_image_intrinsic | Nicolai Hähnle | 2018-04-20 | 1 | -7/+7
  This is in preparation for the new image intrinsics.
  Acked-by: Marek Olšák
* amd/common: pass new enum ac_image_dim to ac_build_image_opcode | Nicolai Hähnle | 2018-04-20 | 1 | -1/+12
  This is in preparation for the new, dimension-aware LLVM image intrinsics.
  Acked-by: Marek Olšák
* ac: add LLVM build functions for subgroup intrinsics | Daniel Schürmann | 2018-04-14 | 1 | -1/+29
  Co-authored-by: Connor Abbott
  Reviewed-by: Bas Nieuwenhuizen
* radeonsi: move FMASK shader logic to shared code | Marek Olšák | 2018-04-02 | 1 | -0/+3
  We'll need it for FBFETCH in both TGSI and NIR paths.
  Tested-by: Dieter Nützel
* ac/nir: Add workaround for GFX9 buffer views. | Bas Nieuwenhuizen | 2018-03-29 | 1 | -0/+10
  On GFX9, whether the buffer size is interpreted as elements or bytes
  depends on whether IDXEN is enabled in the instruction. If the index is a
  constant zero, LLVM optimizes IDXEN to 0. The size in elements is then
  interpreted as a size in bytes, which of course results in out-of-bounds
  accesses.

  The correct fix is most likely to disable the LLVM optimization, but we
  need something that works with LLVM <= 6.0.

  radeonsi takes the max of stride and element count on the CPU, but that
  makes the size intrinsics return the wrong size for the buffer. This
  would cause CTS errors for radv.

  v2: Also include the store changes.

  Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."'
  Reviewed-by: Samuel Pitoiset
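The unit mismatch described in this message can be illustrated with a toy model. This is only a sketch under stated assumptions: the `toy_*` functions are hypothetical, the bounds rule is deliberately simplified, and none of it is the real hardware range check or the code this commit added. The numbers (four elements of 16 bytes each) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model, NOT the real hardware range check: the descriptor carries one
 * size field, and the instruction's IDXEN bit decides which unit it is in. */
static bool toy_in_bounds(uint32_t size_field, bool idxen,
                          uint32_t index, uint32_t byte_offset,
                          uint32_t access_bytes)
{
    if (idxen)
        return index < size_field;                    /* field counts elements */
    return byte_offset + access_bytes <= size_field;  /* field counts bytes */
}

/* CPU-side adjustment the message attributes to radeonsi: max(stride, count). */
static uint32_t toy_cpu_size_field(uint32_t stride, uint32_t num_elements)
{
    return stride > num_elements ? stride : num_elements;
}

/* Walks through the scenario for a 4-element buffer of 16-byte elements. */
static void toy_gfx9_demo(void)
{
    const uint32_t stride = 16, count = 4;

    /* Indexed access to element 0: accepted. */
    assert(toy_in_bounds(count, true, 0, 0, stride));
    /* Same access once IDXEN has been optimized away: "4" now means 4 bytes. */
    assert(!toy_in_bounds(count, false, 0, 0, stride));
    /* max(stride, count) admits the access, but a size query now sees 16. */
    assert(toy_in_bounds(toy_cpu_size_field(stride, count), false, 0, 0, stride));
    assert(toy_cpu_size_field(stride, count) != count);
}
```

Calling `toy_gfx9_demo()` exercises all three cases. The last assertion shows why the CPU-side max is unsuitable when the shader can query the buffer size: the stored value no longer equals the element count.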
* ac/nir: move unpack_param() to ac_llvm_build.c | Samuel Pitoiset | 2018-03-13 | 1 | -0/+3
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
* ac/nir: move trim_vector to ac_llvm_build.c | Samuel Pitoiset | 2018-03-13 | 1 | -0/+3
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
* ac/nir: move cast_ptr() to ac_llvm_build.c | Samuel Pitoiset | 2018-03-13 | 1 | -0/+3
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
* ac/nir: move ac_build_alloca() to ac_llvm_build.c | Samuel Pitoiset | 2018-03-13 | 1 | -0/+5
  As well as si_build_alloca_undef() and drop the si prefix.
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
* ac: add if/loop build helpers | Timothy Arceri | 2018-03-08 | 1 | -0/+20
  These have been ported over from radeonsi.
  Reviewed-by: Marek Olšák
* ac: add ac_build_fsign() | Samuel Pitoiset | 2018-03-05 | 1 | -0/+3
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Timothy Arceri
* ac: add ac_build_isign() | Samuel Pitoiset | 2018-03-05 | 1 | -0/+3
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Timothy Arceri
* ac: add ac_build_fract() | Samuel Pitoiset | 2018-03-05 | 1 | -0/+3
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Timothy Arceri
* radeonsi: implement 32-bit pointers in user data SGPRs (v2) | Marek Olšák | 2018-02-17 | 1 | -0/+5
  User SGPRs changes:
    VS: 14 -> 9
    TCS: 14 -> 10
    TES: 10 -> 6
    GS: 8 -> 4
    GSCOPY: 2 -> 1
    PS: 9 -> 5
    Merged VS-TCS: 24 -> 16
    Merged VS-GS: 18 -> 11
    Merged TES-GS: 18 -> 11

  SGPRS: 2170102 -> 2158430 (-0.54 %)
  VGPRS: 1645656 -> 1641516 (-0.25 %)
  Spilled SGPRs: 9078 -> 8810 (-2.95 %)
  Spilled VGPRs: 130 -> 114 (-12.31 %)
  Scratch size: 1508 -> 1492 (-1.06 %) dwords per thread
  Code Size: 52094872 -> 52692540 (1.15 %) bytes
  Max Waves: 371848 -> 372723 (0.24 %)

  v2:
  - the shader cache needs to take address32_hi into account
  - set amdgpu-32bit-address-high-bits

  Reviewed-by: Samuel Pitoiset (v1)
* ac: Use the renumbered const address space for LLVM 7. | Bas Nieuwenhuizen | 2018-02-14 | 1 | -1/+2
  The LLVM AMDGPU backend decided to renumber the constant address space ....
  Reviewed-by: Samuel Pitoiset
* ac: move get_elem_bits() to ac_llvm_build.c | Timothy Arceri | 2018-02-09 | 1 | -0/+3
  Reviewed-by: Marek Olšák
* ac: add ac_build_export_null() helper | Samuel Pitoiset | 2018-02-08 | 1 | -0/+2
  Imported from RadeonSI.
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Marek Olšák
* ac/radeonsi: create ac_build_shader_clock() helper | Timothy Arceri | 2018-02-07 | 1 | -0/+2
  Reviewed-by: Marek Olšák
* radeonsi: use pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics | Marek Olšák | 2018-02-02 | 1 | -0/+13
  Reviewed-by: Samuel Pitoiset
* ac: add glc parameter to ac_build_buffer_load_format | Marek Olšák | 2018-02-01 | 1 | -0/+1
  Reviewed-by: Samuel Pitoiset
* radeonsi: load the right number of components for VS inputs and TBOs | Marek Olšák | 2018-02-01 | 1 | -0/+3
  The supported counts are 1, 2, 4. (3=4)

  The following snippet loads float, vec2, vec3, and vec4:

  Before:
    buffer_load_format_x v9, v4, s[0:3], 0 idxen ; E0002000 80000904
    buffer_load_format_xyzw v[0:3], v5, s[8:11], 0 idxen ; E00C2000 80020005
    s_waitcnt vmcnt(0) ; BF8C0F70
    buffer_load_format_xyzw v[2:5], v6, s[12:15], 0 idxen ; E00C2000 80030206
    s_waitcnt vmcnt(0) ; BF8C0F70
    buffer_load_format_xyzw v[5:8], v7, s[4:7], 0 idxen ; E00C2000 80010507

  After:
    buffer_load_format_x v10, v4, s[0:3], 0 idxen ; E0002000 80000A04
    buffer_load_format_xy v[8:9], v5, s[8:11], 0 idxen ; E0042000 80020805
    buffer_load_format_xyzw v[0:3], v6, s[12:15], 0 idxen ; E00C2000 80030006
    s_waitcnt vmcnt(0) ; BF8C0F70
    buffer_load_format_xyzw v[3:6], v7, s[4:7], 0 idxen ; E00C2000 80010307

  Reviewed-by: Samuel Pitoiset
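The "1, 2, 4 (3=4)" rule in this message can be written out as a small mapping from requested component count to load width. The sketch below is an illustration only: `toy_load_width` and `toy_load_suffix` are hypothetical names, not functions from ac_llvm_build.h, and the suffixes are simply the ones visible in the "After" listing.

```c
#include <assert.h>

/* Hypothetical helper: round a requested component count (1..4) to a
 * supported load width. Per the commit message, 3 is loaded as 4. */
static unsigned toy_load_width(unsigned num_components)
{
    assert(num_components >= 1 && num_components <= 4);
    return num_components == 3 ? 4 : num_components;
}

/* Hypothetical helper: the buffer_load_format_* suffix for that width,
 * matching the instructions in the "After" listing. */
static const char *toy_load_suffix(unsigned num_components)
{
    switch (toy_load_width(num_components)) {
    case 1:
        return "x";
    case 2:
        return "xy";
    default:
        return "xyzw";
    }
}
```

With this mapping, float and vec2 get `x` and `xy`, while vec3 and vec4 both get `xyzw`. That matches the "After" listing, where only the vec2 load changed from `xyzw` to `xy`.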
* ac: rename and move si_const_array into common code | Marek Olšák | 2018-01-27 | 1 | -0/+3
  Reviewed-by: Samuel Pitoiset
* ac: move address space definitions to common code | Marek Olšák | 2018-01-27 | 1 | -0/+1
  Reviewed-by: Samuel Pitoiset
* ac: pass the number of channels to ac_build_buffer_load_format() | Samuel Pitoiset | 2018-01-26 | 1 | -0/+1
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
  Reviewed-by: Nicolai Hähnle
* ac: add i64_0 and i64_1 to llvm build context | Timothy Arceri | 2018-01-14 | 1 | -0/+2
  These will be used in the following patch.
  Reviewed-by: Bas Nieuwenhuizen
* ac: add f64_0 to the llvm build context | Timothy Arceri | 2018-01-12 | 1 | -0/+1
  Reviewed-by: Samuel Pitoiset
* ac: add f64_1 to the llvm build context | Timothy Arceri | 2018-01-12 | 1 | -0/+1
  Reviewed-by: Samuel Pitoiset
* ac: add ac_build_fmin/fmax helpers | Marek Olšák | 2018-01-06 | 1 | -1/+4
  Reviewed-by: Samuel Pitoiset
* ac: move some helpers to ac_llvm_build.c | Timothy Arceri | 2018-01-05 | 1 | -0/+8
  We will call these from the radeonsi NIR backend.
  Reviewed-by: Nicolai Hähnle
  Reviewed-by: Marek Olšák
* amd/common: pass the family to ac_llvm_context_init() | Samuel Pitoiset | 2017-12-22 | 1 | -1/+2
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
* amd/common: add ac_build_waitcnt() | Samuel Pitoiset | 2017-12-14 | 1 | -0/+2
  Signed-off-by: Samuel Pitoiset
  Reviewed-by: Bas Nieuwenhuizen
* ac: move build_varying_gather_values() to ac_llvm_build.h and expose | Timothy Arceri | 2017-12-04 | 1 | -0/+4
  Reviewed-by: Nicolai Hähnle
  Reviewed-by: Marek Olšák
* ac: add v2f32 to the common code and make use of it | Timothy Arceri | 2017-11-03 | 1 | -0/+1
  Reviewed-by: Marek Olšák
  Acked-by: Nicolai Hähnle
* ac: add v3i32 to the common code and make use of it | Timothy Arceri | 2017-11-03 | 1 | -0/+1
  Reviewed-by: Marek Olšák
  Acked-by: Nicolai Hähnle