mesa.git

	Commit message	Author	Age	Files	Lines
*	ac: add a bug workaround for the 100% NGG culling case	Marek Olšák	2020-03-09	1	-0/+33
	Fixes: 8db00a51f85 - radeonsi/gfx10: implement NGG culling for 4x wave32 subgroups
	Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4079>
*	amd: join emit_kill() from radv and radeonsi in ac_nir_to_llvm	Daniel Schürmann	2020-03-09	2	-3/+1
	Reviewed-by: Marek Olšák <marek.olsak@amd.com>
	Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>
*	amd/llvm: implement nir_intrinsic_demote(_if) and nir_intrinsic_is_helper_invocation	Daniel Schürmann	2020-03-09	3	-11/+132
	The current implementation uses a temporary helper variable to ensure correct behavior until LLVM provides an intrinsic.
	Reviewed-by: Marek Olšák <marek.olsak@amd.com>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>
*	radeonsi: remove AMD_DEBUG=sisched option	Pierre-Eric Pelloux-Prayer	2020-03-06	2	-11/+9
	sisched is not maintained anymore in LLVM.
	Reviewed-by: Marek Olšák <marek.olsak@amd.com>
	Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4059>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4059>
*	ac/llvm: flush denorms for nir_op_fmed3 on GFX8 and older gens	Samuel Pitoiset	2020-02-27	1	-0/+5
	The hardware doesn't flush denorms, exactly like fmin/fmax, so we have to do it manually. This doesn't fix anything known.
	Fixes: d6a07732c9c ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
	Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
*	ac/llvm: fix 16-bit fmed3 on GFX8 and older gens	Samuel Pitoiset	2020-02-27	1	-2/+4
	16-bit med3 is only supported on GFX9+.
	Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f16.*.
	Fixes: d6a07732c9c ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
*	ac/llvm: fix 64-bit fmed3	Samuel Pitoiset	2020-02-27	1	-17/+31
	Lower 64-bit fmed3 because LLVM doesn't expose an intrinsic.
	Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.mid3.f64.*.
	Fixes: d6a07732c9c ("ac: use llvm.amdgcn.fmed3 intrinsic for nir_op_fmed3")
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3962>
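For readers unfamiliar with what "lowering fmed3" means, here is a minimal scalar sketch of expressing a median-of-three with only min and max operations. It is an illustration of the general technique, not the code from this commit: the names `min_f64`, `max_f64`, and `fmed3_lowered` are hypothetical, and NaN and denormal handling are deliberately ignored.

```c
/* Hypothetical helpers: plain comparisons, no NaN handling. */
static double min_f64(double a, double b) { return a < b ? a : b; }
static double max_f64(double a, double b) { return a > b ? a : b; }

/* Median of three values built only from min/max.
 * Sketch of a common lowering; not taken from the Mesa source. */
static double fmed3_lowered(double a, double b, double c)
{
    double lo = min_f64(a, b);          /* smaller of the first two */
    double hi = max_f64(a, b);          /* larger of the first two */
    return max_f64(lo, min_f64(hi, c)); /* clamp c into [lo, hi] */
}
```

The result is `c` clamped to the range spanned by `a` and `b`, which is the median of the three inputs when none of them is NaN.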
*	ac/llvm: implement VK_AMD_shader_explicit_vertex_parameter	Samuel Pitoiset	2020-01-29	1	-20/+49
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
*	ac/llvm: fix missing casts in ac_build_readlane()	Samuel Pitoiset	2020-01-24	1	-6/+9
	Because ac_build_optimization_barrier() overwrites the original src_type, we have to keep track of it before emitting that barrier. Otherwise, wrong conversions are expected for pointers or small bitsizes.
	By doing this, we no longer need to do the cast dance in ac_build_readlane_no_opt_barrier(), it was just necessary for ac_build_optimization_barrier().
	This fixes a bunch of crashes with subgroups related tests when RADV_DEBUG=checkir is enabled, and it also fixes a compiler crash with The Surge 2.
	Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2395
	Fixes: 0f45d4dc2b1 ("ac: add ac_build_readlane without optimization barrier")
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Marek Olšák <marek.olsak@amd.com>
	Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
	Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3535>
*	ac/nir: add support for nir_texop_fragment_{mask}_fetch	Samuel Pitoiset	2020-01-23	1	-3/+35
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3304>
*	ac: add helper ac_build_triangle_strip_indices_to_triangle	Marek Olšák	2020-01-20	2	-0/+39
	Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
*	ac: add ac_build_readlane without optimization barrier	Marek Olšák	2020-01-20	2	-4/+17
	Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
*	ac: add prefix bitcount functions	Marek Olšák	2020-01-20	2	-0/+64
	Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
*	ac/cull: don't read Position.Z if it's not needed for culling	Marek Olšák	2020-01-15	1	-1/+1
	It could be NULL.
	Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
*	nir/lower_atomics_to_ssbo: Also lower barriers	Jason Ekstrand	2020-01-13	1	-2/+0
	This is more correct for a pass which is supposed to completely lower away atomic counters. It also lets us stop supporting atomic counter barriers in most of the drivers.
	Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
	Reviewed-by: Eric Anholt <eric@anholt.net>
	Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
*	nir: Rename nir_intrinsic_barrier to control_barrier	Jason Ekstrand	2020-01-13	1	-2/+2
	This is a more explicit name now that we don't want it to be doing any memory barrier stuff for us.
	Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
*	nir: Add a new memory_barrier_tcs_patch intrinsic	Jason Ekstrand	2020-01-13	1	-0/+2
	Right now, it's implemented as a no-op for everyone. For most drivers, it's a switch case in the NIR -> whatever which just breaks. For ir3, they already have code to delete tessellation barriers so we just add a case to also delete memory_barrier_tcs_patch.
	Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
	Reviewed-by: Eric Anholt <eric@anholt.net>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3307>
*	ac/llvm: Fix ac_build_reduce in wave32 mode.	Timur Kristóf	2020-01-10	1	-6/+9
	Previously, when cluster_size was set to 0, it always worked as if the cluster size was 64. This commit fixes it in wave32 mode by changing to work as if the cluster size was set to 32.
	Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
	Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
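The behavior described in this message can be pictured with a tiny sketch: a `cluster_size` of 0 is treated as "the whole wave", so it should resolve to the current wave size rather than a hard-coded 64. The function name `effective_cluster_size` and its parameters are hypothetical and are not taken from the Mesa source.

```c
/* Hypothetical helper: resolve cluster_size == 0 to the wave size
 * (32 in wave32 mode, 64 in wave64 mode) instead of assuming 64. */
static unsigned effective_cluster_size(unsigned cluster_size,
                                       unsigned wave_size)
{
    return cluster_size == 0 ? wave_size : cluster_size;
}
```

An explicit non-zero cluster size passes through unchanged in this sketch; only the zero case depends on the wave size.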
*	amd/llvm: handle nir_intrinsic_image_deref_{load,store} with lod	Samuel Pitoiset	2020-01-09	1	-2/+10
	Use image_load_mip and image_store_mip respectively if the lod parameter isn't zero.
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	ac: add ac_build_s_endpgm	Marek Olšák	2020-01-08	2	-0/+7
	Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
*	ac: add 128-bit bitcount	Marek Olšák	2020-01-08	2	-0/+12
	Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
*	ac: unify primitive export code	Marek Olšák	2020-01-08	2	-0/+66
	Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
*	ac: unify build_sendmsg_gs_alloc_req	Marek Olšák	2020-01-08	2	-0/+23
	Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
*	ac: fix the return value in cull_bbox when bbox culling is disabled	Marek Olšák	2019-12-16	1	-1/+1
	Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
*	ac: fix ac_get_i1_sgpr_mask for Wave32	Marek Olšák	2019-12-16	1	-2/+11
	Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
	Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3095>
*	ac/nir: fix out-of-bound access when loading constants from global	Samuel Pitoiset	2019-12-12	1	-4/+14
	Global load/store instructions can't know if the offset is out-of-bound because they don't use descriptors (no range). Fix this by clamping the offset for arrays that are indexed with a non-constant offset that's greater or equal to the array size.
	This fixes VM faults and GPU hangs with Dead Rising 4.
	Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2148
	Fixes: 71a67942003 ("ac/nir: Enable nir_opt_large_constants")
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
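The clamping idea in this message can be sketched on the CPU side as follows. This is only an illustration under assumptions: the helper names `clamp_index` and `load_constant` are hypothetical, and clamping to the last valid element is a choice made for the example, since the message does not say which in-range value the driver clamps to.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: keep a dynamic index inside [0, array_len - 1].
 * Clamping to the last element is an assumption made for this sketch. */
static uint32_t clamp_index(uint32_t index, uint32_t array_len)
{
    assert(array_len > 0);
    return index < array_len ? index : array_len - 1;
}

/* A load with no descriptor has no range check of its own,
 * so the index is clamped before the memory access happens. */
static float load_constant(const float *table, uint32_t array_len,
                           uint32_t index)
{
    return table[clamp_index(index, array_len)];
}
```

With the clamp in place, an out-of-range dynamic index reads a valid (if arbitrary) element instead of touching memory outside the array.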
*	ac/llvm: fix atomic var operations if source isn't a deref	Samuel Pitoiset	2019-12-03	1	-7/+9
	Fixes some CTS regressions.
	Fixes: e61a826f396 ("ac/llvm: fix pointer type for global atomics")
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	ac/llvm: improve sync scope for global atomics	Rhys Perry	2019-12-02	1	-0/+3
	Stronger ordering is implemented in SPIRV->NIR with barriers.
	Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
	Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	ac/llvm: fix pointer type for global atomics	Rhys Perry	2019-12-02	1	-0/+6
	Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
	Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	radv,ac/nir: lower deref operations for shared memory	Samuel Pitoiset	2019-11-29	1	-22/+26
	This shouldn't introduce any functional changes for RadeonSI when NIR is enabled because these operations are already lowered.
	pipeline-db (NAVI10/LLVM):
	SGPRS: 9043 -> 9051 (0.09 %)
	VGPRS: 7272 -> 7292 (0.28 %)
	Code Size: 638892 -> 621628 (-2.70 %) bytes
	LDS: 1333 -> 1331 (-0.15 %) blocks
	Max Waves: 1614 -> 1608 (-0.37 %)
	Found this while glancing at some F12019 shaders.
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	amd/llvm: Refactor ac_build_scan.	Bas Nieuwenhuizen	2019-11-28	1	-40/+51
	Split out the logic for exclusive scans into a separate function that makes clear what it does instead of having this opaque 60 line if.
	Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	ac/llvm: convert src operands to pointers if necessary	Samuel Pitoiset	2019-11-28	1	-0/+11
	To avoid generating invalid LLVM IR when both operands don't have the same type. This might happen when performing pointer comparisons with SPIRV 1.4.
	Fixes invalid LLVM IR for:
	dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrequal.variable_pointers_ssbo_equal
	dEQP-VK.spirv_assembly.instruction.spirv1p4.opptrnotequal.variable_pointers_ssbo_not_equal
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	ac/nir: don't rely on data.patch for tess factors	Marek Olšák	2019-11-27	1	-2/+6
	Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
*	ac: add 8-bit and 16-bit supports to ac_build_permlane16()	Samuel Pitoiset	2019-11-27	1	-8/+16
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	radv/gfx10: fix implementation of exclusive scans	Samuel Pitoiset	2019-11-27	1	-24/+52
	This implementation is loosely based on ROCm.
	https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl
	This fixes dEQP-VK.subgroups.arithmetic..subgroupexclusive on GFX10.
	Fixes: 227c29a80de ("amd/common/gfx10: implement scan & reduce operations")
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
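As a reminder of what an exclusive scan computes, here is a scalar reference sketch for integer addition. It only defines the expected result per lane; it is not the cross-lane GPU implementation this commit touches, and the function name `exclusive_scan_add` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for an exclusive add-scan: out[i] holds the sum of
 * all inputs strictly before position i, so out[0] is the identity (0).
 * Hypothetical helper for illustration only. */
static void exclusive_scan_add(const uint32_t *in, uint32_t *out, size_t n)
{
    uint32_t acc = 0; /* identity element for addition */
    for (size_t i = 0; i < n; i++) {
        out[i] = acc;  /* write the running total before adding in[i] */
        acc += in[i];
    }
}
```

For the input {1, 2, 3, 4} this yields {0, 1, 3, 6}, whereas an inclusive scan of the same input would yield {1, 3, 6, 10}.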
*	ac/llvm: fix warning in ac_build_canonicalize()	Samuel Pitoiset	2019-11-26	1	-1/+1
	../src/amd/llvm/ac_llvm_build.c: In function ‘ac_build_canonicalize’:
	../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘intr’ may be used uninitialized in this function [-Wmaybe-uninitialized]
	 4567 |  return ac_build_intrinsic(ctx, intr, type, params, 1,
	      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
	 4568 |         AC_FUNC_ATTR_READNONE);
	      |         ~~~~~~~~~~~~~~~~~~~~~~
	../src/amd/llvm/ac_llvm_build.c:4567:9: warning: ‘type’ may be used uninitialized in this function [-Wmaybe-uninitialized]
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
*	radeonsi/nir: don't run si_nir_opts again if there is no change	Marek Olšák	2019-11-25	2	-7/+10
	0.3% less overhead
	Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
	Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
*	ac: set swizzled bit in cache policy as a hint not to merge loads/stores	Marek Olšák	2019-11-25	3	-10/+7
	LLVM now merges loads and stores for all opcodes, so this must be set.
	Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	ac/nir, radv, radeonsi: Switch to using ac_shader_args	Connor Abbott	2019-11-25	3	-77/+65
	Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Acked-by: Marek Olšák <marek.olsak@amd.com>
*	ac: Add a shared interface between radv, radeonsi, LLVM and ACO	Connor Abbott	2019-11-25	3	-0/+105
	ac_shader_args will be similar to ac_shader_abi, except for being free from LLVM-specific concepts and therefore capable of being shared between LLVM and ACO. This will help us accomplish a few different things:
	- Decouple setting up SGPR and VGPR arguments from translating to LLVM, so that we can reference these arguments in NIR lowering passes, which will let us lower e.g. descriptor sets in NIR.
	- Stop using radv-specific structures for things like determining the chip generation in ACO.
	In the end, we should replace ac_shader_abi with this structure + driver-specific lowering passes.
	Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	ac/llvm: fix the local invocation index for wave32	Samuel Pitoiset	2019-11-25	1	-0/+4
	Fixes dEQP-VK.compute.builtin_var.local_invocation_index with RADV_PERFTEST=cswave32.
	My initial fix was to lower it but Rhys suggested the shift-right and it's much better like this.
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Marek Olšák <marek.olsak@amd.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	amd/llvm: Add Subgroup Scan functions for SI	Daniel Schürmann	2019-11-20	1	-6/+75
	The idea of this implementation is taken from the ROCm Device Libs:
	https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl
	Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
*	nir: move data.image.access to data.access	Marek Olšák	2019-11-19	1	-2/+2
	The size of the data structure doesn't change.
	Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
*	ac: add 16-bit float support to ac_build_alu_op()	Samuel Pitoiset	2019-11-19	1	-4/+5
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	ac: add 8-bit and 16-bit supports to ac_build_optimization_barrier()	Samuel Pitoiset	2019-11-19	1	-2/+13
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	ac: add 8-bit and 16-bit supports to ac_build_wwm()	Samuel Pitoiset	2019-11-19	1	-3/+18
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	ac: add 8-bit and 16-bit supports to get_reduction_identity()	Samuel Pitoiset	2019-11-19	1	-1/+33
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	ac: add 8-bit and 16-bit supports to ac_build_swizzle()	Samuel Pitoiset	2019-11-19	1	-6/+13
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	ac: add 8-bit and 16-bit supports to ac_build_dpp()	Samuel Pitoiset	2019-11-19	1	-13/+20
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
*	ac: add 8-bit and 16-bit supports to ac_build_set_inactive()	Samuel Pitoiset	2019-11-19	1	-0/+9
	Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
	Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>