path: root/src/amd/common
Commit message | Author | Age | Files | Lines
...
* ac/nir: implement 8-bit conversions | Rhys Perry | 2019-03-21 | 1 | -0/+4
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: add 8-bit types to glsl_base_to_llvm_type | Rhys Perry | 2019-03-21 | 1 | -0/+3
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: implement 8-bit ssbo stores | Rhys Perry | 2019-03-21 | 1 | -2/+7
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_tbuffer_store_byte() helper | Samuel Pitoiset | 2019-03-21 | 2 | -0/+28
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: implement 8-bit push constant, ssbo and ubo loads | Rhys Perry | 2019-03-21 | 1 | -10/+55
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_tbuffer_load_byte() helper | Samuel Pitoiset | 2019-03-21 | 2 | -0/+26
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add various int8 definitions | Samuel Pitoiset | 2019-03-21 | 2 | -2/+10
  Original patch by Rhys Perry.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword() | Samuel Pitoiset | 2019-03-20 | 1 | -40/+26
  New buffer intrinsics have a separate soffset parameter.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use new LLVM 8 intrinsic when storing 16-bit values | Samuel Pitoiset | 2019-03-20 | 3 | -21/+33
  vindex is always 0.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_{struct,raw}_tbuffer_store() helpers | Samuel Pitoiset | 2019-03-20 | 2 | -0/+156
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use new LLVM 8 intrinsics in ac_build_buffer_load() | Samuel Pitoiset | 2019-03-20 | 1 | -0/+8
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: use ac_build_buffer_store_dword() for SSBO store operations | Samuel Pitoiset | 2019-03-20 | 1 | -14/+9
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: use ac_build_buffer_load() for SSBO load operations | Samuel Pitoiset | 2019-03-20 | 1 | -29/+6
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: use new LLVM 8 intrinsics for SSBO atomic operations | Samuel Pitoiset | 2019-03-20 | 1 | -24/+42
  Use the raw version (ie. IDXEN=0) because vindex is unused.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: remove one useless check in visit_store_ssbo() | Samuel Pitoiset | 2019-03-20 | 1 | -6/+3
  Trivial.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_buffer_store_format() helper | Samuel Pitoiset | 2019-03-20 | 3 | -21/+119
  Similar to ac_build_buffer_load_format().

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: set attrib flags for SSBO and image store operations | Samuel Pitoiset | 2019-03-20 | 1 | -3/+6
  For consistency regarding other store operations.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: make use of ac_get_store_intr_attribs() where possible | Samuel Pitoiset | 2019-03-20 | 1 | -6/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use llvm.amdgcn.fract intrinsic for nir_op_ffract | Samuel Pitoiset | 2019-03-20 | 1 | -5/+4
  Noticed with a Doom shader.

  29077 shaders in 15096 tests
  Totals:
  SGPRS: 1282125 -> 1282133 (0.00 %)
  VGPRS: 908716 -> 908616 (-0.01 %)
  Spilled SGPRs: 24811 -> 24779 (-0.13 %)
  Code Size: 49048176 -> 48936488 (-0.23 %) bytes
  Max Waves: 244232 -> 244226 (-0.00 %)

  Totals from affected shaders:
  SGPRS: 229584 -> 229592 (0.00 %)
  VGPRS: 163268 -> 163168 (-0.06 %)
  Spilled SGPRs: 8682 -> 8650 (-0.37 %)
  Code Size: 12819572 -> 12707884 (-0.87 %) bytes
  Max Waves: 24398 -> 24392 (-0.02 %)

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir_to_llvm: add assert to emit_bcsel() | Timothy Arceri | 2019-03-18 | 1 | -0/+2
  nir to llvm assumes we have already split vectors to scalars via nir_lower_alu_to_scalar().

  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use the raw tbuffer version for 16-bit SSBO loads | Samuel Pitoiset | 2019-03-13 | 3 | -6/+3
  vindex is always 0.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_{struct,raw}_tbuffer_load() helpers | Samuel Pitoiset | 2019-03-13 | 2 | -17/+68
  The struct version sets IDXEN=1, while the raw version sets IDXEN=0.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: rework typed buffers loads for LLVM 7 | Samuel Pitoiset | 2019-03-13 | 3 | -57/+83
  Be more generic, this will be used by an upcoming series.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: fix 16-bit ssbo stores | Rhys Perry | 2019-03-12 | 1 | -0/+2
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc() | Timothy Arceri | 2019-03-06 | 1 | -1/+1
  Replace done using:

  find ./src -type f -exec sed -i -- \
     's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \;

  Acked-by: Karol Herbst <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename record_location_offset() -> struct_location_offset() | Timothy Arceri | 2019-03-06 | 1 | -2/+2
  Replace done using:

  find ./src -type f -exec sed -i -- \
     's/record_location_offset(/struct_location_offset(/g' {} \;

  Acked-by: Karol Herbst <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* radv: Fix float16 interpolation set up. | Bas Nieuwenhuizen | 2019-02-22 | 2 | -0/+39
  float16 types can have non-flat interpolation so set up the HW correctly for that.

  Fixes: 62024fa7750 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Handle clip+cull distances more generally as compact arrays. | Bas Nieuwenhuizen | 2019-02-20 | 1 | -2/+11
  Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 .

  That MR keeps the clip and cull arrays split. So we have to handle
  - compact arrays with location_frac != 0
  - VARYING_SLOT_CLIP_DIST1

  Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: Go back to using llvm.pow intrinsic for nir_op_fpow | Kenneth Graunke | 2019-02-19 | 1 | -0/+4
  ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL leaves it undefined). Performing fpow lowering in NIR would break this behavior, preventing us from using prog_to_nir.

  According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>, which presumably does a zero-wins multiply. Lowering in NIR results in a non-legacy multiply, where:

     pow(0, 0) = 2^(log2(0) * 0) = 2^(-INF * 0) = 2^(-NaN) = -NaN

  which isn't the desired result.

  This reverts:
  - commit d6b75392067712908bdc372f1007e085439bf9f5 (ac/nir: remove emission of nir_op_fpow)
  - commit 22430224fec31591432d4a3e65c6f457ba1c1653 (radeonsi/nir: enable lowering of fpow)

  and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir after enabling prog_to_nir in st/mesa later in this series.

  Reviewed-by: Timothy Arceri <[email protected]>
* ac/nir: implement half-float nir_op_ldexp | Rhys Perry | 2019-02-19 | 1 | -1/+3
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: implement half-float nir_op_frsq | Rhys Perry | 2019-02-19 | 1 | -2/+1
  v2: don't use ac_get_onef()

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: implement half-float nir_op_frcp | Rhys Perry | 2019-02-19 | 1 | -2/+1
  v2: don't use ac_get_onef()

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: make ac_build_fdiv support 16-bit floats | Rhys Perry | 2019-02-19 | 1 | -1/+1
  v2: don't use ac_get_onef()

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: make ac_build_isign work on all bit sizes | Rhys Perry | 2019-02-19 | 1 | -23/+4
  v2: don't use ac_get_zero(), ac_get_one() and ac_int_of_size()

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: make ac_build_clamp work on all bit sizes | Rhys Perry | 2019-02-19 | 1 | -4/+9
  v2: don't use ac_get_zerof() and ac_get_onef()
  v3: rename "intr" to "name"

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: fix 64-bit nir_op_f2f16_rtz | Rhys Perry | 2019-02-19 | 1 | -0/+2
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: implement 8-bit nir_load_const_instr | Rhys Perry | 2019-02-19 | 1 | -0/+4
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: use new LLVM 8 intrinsic when loading 16-bit values | Samuel Pitoiset | 2019-02-18 | 1 | -14/+27
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_llvm8_tbuffer_load() helper | Samuel Pitoiset | 2019-02-18 | 2 | -0/+52
  It uses the new LLVM intrinsics.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: make use of ac_build_expand_to_vec4() in visit_image_store() | Samuel Pitoiset | 2019-02-14 | 3 | -8/+6
  And make ac_build_expand() a static function.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for push constants inlining when possible | Samuel Pitoiset | 2019-02-12 | 2 | -3/+29
  This removes some scalar loads from shaders, but it increases the number of SET_SH_REG packets. This is currently basic but it could be improved if needed. Inlining dynamic offsets might also help.

  Original idea from Dave Airlie.

  29077 shaders in 15096 tests
  Totals:
  SGPRS: 1321325 -> 1357101 (2.71 %)
  VGPRS: 936000 -> 932576 (-0.37 %)
  Spilled SGPRs: 24804 -> 24791 (-0.05 %)
  Code Size: 49827960 -> 49642232 (-0.37 %) bytes
  Max Waves: 242007 -> 242700 (0.29 %)

  Totals from affected shaders:
  SGPRS: 290989 -> 326765 (12.29 %)
  VGPRS: 244680 -> 241256 (-1.40 %)
  Spilled SGPRs: 1442 -> 1429 (-0.90 %)
  Code Size: 8126688 -> 7940960 (-2.29 %) bytes
  Max Waves: 80952 -> 81645 (0.86 %)

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: Implement global memory accesses. | Bas Nieuwenhuizen | 2019-02-06 | 1 | -20/+131
  Needed for VK_EXT_buffer_device_address.

  The pointers are implemented as i8*, since I could not figure out how to emulate setting struct offsets in LLVM based on the SPIR-V offsets (and more weird stuff like row major matrices).

  Acked-by: Samuel Pitoiset <[email protected]>
* amd/common: Do not use 32-bit loads for shared memory. | Bas Nieuwenhuizen | 2019-02-06 | 1 | -6/+12
  We use a straight glsl->llvm type conversion so types should already be right.

  Also even though the writemasks were changed we were not actually doing 32-bit things, so this fails miserably.

  Reviewed-by: Samuel Pitoiset <[email protected]>
* amd/common: handle nir_deref_cast for shared memory from integers. | Bas Nieuwenhuizen | 2019-02-06 | 1 | -68/+82
  Can happen e.g. after a phi.

  Fixes: a2b5cc3c399 "radv: enable variable pointers"
  Reviewed-by: Samuel Pitoiset <[email protected]>
* amd/common: Handle nir_deref_type_ptr_as_array for shared memory. | Bas Nieuwenhuizen | 2019-02-06 | 1 | -0/+4
  Fixes: a2b5cc3c399 "radv: enable variable pointers"
  Reviewed-by: Samuel Pitoiset <[email protected]>
* amd/common: Fix stores to derefs with unknown variable. | Bas Nieuwenhuizen | 2019-02-06 | 1 | -8/+13
  Fixes: a2b5cc3c399 "radv: enable variable pointers"
  Reviewed-by: Samuel Pitoiset <[email protected]>
* amd/common: Use correct writemask for shared memory stores. | Bas Nieuwenhuizen | 2019-02-06 | 1 | -1/+1
  The check was for 1 bit being set, which is clearly not what we want.

  CC: <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* amd/common: Implement ptr->int casts in ac_to_integer. | Bas Nieuwenhuizen | 2019-02-06 | 1 | -0/+13
  For the implicit casts inherent in nir. This should probably have been done for shared memory for VK_KHR_variable_pointers.

  Reviewed-by: Samuel Pitoiset <[email protected]>
* amd/common: Add gep helper for pointer increment. | Bas Nieuwenhuizen | 2019-02-06 | 2 | -0/+13
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/radv/radeonsi: add ac_get_num_physical_sgprs() helper | Timothy Arceri | 2019-02-01 | 1 | -0/+6
  Reviewed-by: Samuel Pitoiset <[email protected]>