mesa.git

	Commit message	Author	Age	Files	Lines
*	ac/nir: restrict fmask lookup to image load intrinsics	Samuel Pitoiset	2018-12-20	1	-1/+1
	We don't ever want to do the fmask lookup on an atomic or store; the fmask should have been decompressed if the surface has been moved to IMAGE_LAYOUT. Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: Work around non-renderable 128bpp compressed 3d textures on GFX9.	Bas Nieuwenhuizen	2018-12-20	2	-2/+5
	Exactly what the title says: the new addrlib does not allow the above with certain dimensions that the CTS seems to hit. Work around it by not allowing the app to render to it via compat with other 128bpp formats, and do not render to it ourselves during copies. Fixes: 776b9113656 "amd/addrlib: update Mesa's copy of addrlib" Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	ac/nir: remove the bitfield_extract workaround for LLVM 8	Samuel Pitoiset	2018-12-20	1	-9/+15
	This workaround was introduced by 3d41757788a and is no longer needed as of LLVM r346422. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi/gfx9: use SET_UCONFIG_REG_INDEX packets when available	Nicolai Hähnle	2018-12-19	2	-0/+3
	Reviewed-by: Marek Olšák <[email protected]>
*	ac/surface: 3D and cube surfaces are never displayable	Nicolai Hähnle	2018-12-19	1	-3/+5
	Reviewed-by: Marek Olšák <[email protected]>
*	amd/common: add i1 special case to ac_build_{inclusive,exclusive}_scan	Nicolai Hähnle	2018-12-19	1	-2/+25
	Allow for a unified but efficient treatment of adding a bitmask over a wave or an entire threadgroup. Reviewed-by: Marek Olšák <[email protected]>
*	amd/common: scan/reduce across waves of a workgroup	Nicolai Hähnle	2018-12-19	2	-4/+227
	Order-aware scan/reduce can trade off LDS traffic for external atomics memory traffic in producer/consumer compute shaders. Reviewed-by: Marek Olšák <[email protected]>
*	amd/common: add ac_build_ifcc	Nicolai Hähnle	2018-12-19	2	-4/+4
	Reviewed-by: Marek Olšák <[email protected]>
*	amd/common: whitespace fixes	Nicolai Hähnle	2018-12-19	1	-10/+8
	Reviewed-by: Marek Olšák <[email protected]>
*	amd/sid_tables: add additional python3 compatibility imports	Nicolai Hähnle	2018-12-19	1	-1/+1
	This happened to bite me while doing some experiments. Reviewed-by: Marek Olšák <[email protected]>
*	nir: Rename Boolean-related opcodes to include 32 in the name	Jason Ekstrand	2018-12-16	1	-11/+11
	This is a squash of a bunch of individual changes:
	    nir/builder: Generate 32-bit bool opcodes transparently
	    nir/algebraic: Remap Boolean opcodes to the 32-bit variant
	    Use 32-bit opcodes in the NIR producers and optimizations
	    Generated with a little hand-editing and the following sed commands:
	        sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
	        sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
	        sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
	        sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
	        sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
	        sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
	        sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
	        sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
	        sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
	        sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c
	    Use 32-bit opcodes in the NIR back-ends
	    Generated with a little hand-editing and the following sed commands:
	        sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
	        sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
	        sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
	        sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
	        sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
	        sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
	        sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
	        sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
	        sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
	        sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Bas Nieuwenhuizen <[email protected]>
	Tested-by: Bas Nieuwenhuizen <[email protected]>
*	ac: split 16-bit ssbo loads that may not be dword aligned	Rhys Perry	2018-12-16	1	-0/+2
	Fixes: 7e7ee826982 ('ac: add support for 16bit buffer loads') Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108114 Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	ac: refactor visit_load_buffer	Rhys Perry	2018-12-16	2	-44/+42
	This is so that we can split different types of loads more easily. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	amd: remove support for LLVM 6.0	Samuel Pitoiset	2018-12-06	6	-298/+32
	Users are encouraged to switch to LLVM 7.0, released in September 2018. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	nir: Make boolean conversions sized just like the others	Jason Ekstrand	2018-12-05	1	-4/+8
	Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is one of 8, 16, 32, or 64. This leads to having a few more opcodes, but now everything is consistent and booleans aren't a weird special case anymore. Reviewed-by: Connor Abbott <[email protected]>
*	amd/addrlib: update Mesa's copy of addrlib	Nicolai Hähnle	2018-11-29	1	-2/+2
	Update to the internal master as of 2018-11-15. This has a lot of gratuitous whitespace changes, but on the plus side it's built using the same tooling that's used for AMDVLK, which should help going forward.
*	ac/surface/gfx9: let addrlib choose the preferred swizzle kind	Nicolai Hähnle	2018-11-29	1	-18/+4
	Our choices here are simply redundant as long as sin.flags is set correctly. (v2: - remove unused function parameter) Reviewed-by: Marek Olšák <[email protected]>
*	radv: remove dependency on addrlib gfx9_enum.h	Nicolai Hähnle	2018-11-29	1	-0/+3
	v2: - use SI_CONTEXT_REG_OFFSET Reviewed-by: Dave Airlie <[email protected]>
*	ac: handle cast derefs	Dave Airlie	2018-11-21	1	-0/+3
	Just give back the same value for now. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: handle loading from shared pointers	Dave Airlie	2018-11-21	1	-9/+18
	We won't have a var to load from, so don't try to do the processing required if we don't need it. This avoids crashes in: dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.workgroup_two_buffers Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: avoid casting pointers on bcsel and stores	Dave Airlie	2018-11-21	3	-3/+14
	For variable pointers we really don't want to cast the pointers to int without a good reason, so just add a wrapper for bcsel loading and result storing. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: fix intrinsic name string size in visit_image_atomic()	Samuel Pitoiset	2018-11-20	1	-1/+1
	Fixes an assertion in SoTTR. Fixes: dd0172e865 ("radv: Use structured intrinsics instead of indexing workaround for GFX9.") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: Use structured intrinsics instead of indexing workaround for GFX9.	Bas Nieuwenhuizen	2018-11-19	2	-7/+74
	These force the index to be used in the instruction so we don't need the workaround.
	Totals:
	    SGPRS: 1321642 -> 1321802 (0.01 %)
	    VGPRS: 943664 -> 943788 (0.01 %)
	    Spilled SGPRs: 28468 -> 28480 (0.04 %)
	    Spilled VGPRs: 88 -> 89 (1.14 %)
	    Private memory VGPRs: 0 -> 0 (0.00 %)
	    Scratch size: 80 -> 80 (0.00 %) dwords per thread
	    Code Size: 52415292 -> 52338932 (-0.15 %) bytes
	    LDS: 400 -> 400 (0.00 %) blocks
	    Max Waves: 233903 -> 233803 (-0.04 %)
	    Wait states: 0 -> 0 (0.00 %)
	Totals from affected shaders:
	    SGPRS: 238344 -> 238504 (0.07 %)
	    VGPRS: 232732 -> 232856 (0.05 %)
	    Spilled SGPRs: 13125 -> 13137 (0.09 %)
	    Spilled VGPRs: 88 -> 89 (1.14 %)
	    Private memory VGPRs: 0 -> 0 (0.00 %)
	    Scratch size: 80 -> 80 (0.00 %) dwords per thread
	    Code Size: 15752712 -> 15676352 (-0.48 %) bytes
	    LDS: 139 -> 139 (0.00 %) blocks
	    Max Waves: 31680 -> 31580 (-0.32 %)
	    Wait states: 0 -> 0 (0.00 %)
	Reviewed-by: Samuel Pitoiset <[email protected]>
*	ac/surface: remove the overallocation workaround for Vega12	Marek Olšák	2018-11-09	1	-4/+0
	Not needed anymore (probably since the tile_swizzle fix). Reviewed-by: Samuel Pitoiset <[email protected]>
*	radv: use LOAD_CONTEXT_REG when loading fast clear values	Samuel Pitoiset	2018-11-08	1	-0/+1
	This avoids syncing the Micro Engine. This is only supported for VI+ currently. There is probably a way to use LOAD_CONTEXT_REG on previous chips, but that could be done later. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	ac/nir_to_llvm: fix b2f for f64	Timothy Arceri	2018-11-07	1	-3/+12
	Fixes: d7e0d47b9de3 ("nir: Add a bunch of b2[if] optimizations") Reviewed-by: Dave Airlie <[email protected]>
*	amd: Make vgpr-spilling depend on llvm version	Jan Vesely	2018-11-02	1	-1/+2
	The option was removed in LLVM r345763. Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
*	ac/nir: make use of i1false in few more places	Samuel Pitoiset	2018-11-01	1	-3/+3
	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	radv: use WAIT_REG_MEM_GREATER_OR_EQUAL instead of a magic value	Samuel Pitoiset	2018-10-31	1	-0/+1
	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	radeonsi: add support for Raven2 (v2)	Marek Olšák	2018-10-30	5	-0/+16
	v2: fix enabling primitive binning Reviewed-by: Samuel Pitoiset <[email protected]>
*	ac: fix ac_build_fdiv for f64	Marek Olšák	2018-10-29	1	-1/+2
	Trivial. Fixes: a5f35aa742c
*	radv: implement VK_EXT_transform_feedback	Samuel Pitoiset	2018-10-29	1	-0/+1
	This implementation should work and potential bugs can be fixed during the release candidates window anyway. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	util: use C99 declaration in the for-loop hash_table_foreach() macro	Eric Engestrom	2018-10-25	1	-1/+0
	Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	amd/common: check DRM version 3.27 for JPEG decode	Leo Liu	2018-10-23	1	-1/+1
	JPEG was added after DRM version 3.26. Signed-off-by: Leo Liu <[email protected]> Fixes: 4558758c51749 (amd/common: add vcn jpeg ip info query) Cc: Boyuan Zhang <[email protected]> Cc: Alex Smith <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	amd/common: add vcn jpeg ip info query	Boyuan Zhang	2018-10-23	1	-2/+12
	Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Leo Liu <[email protected]>
*	ac: Fix loading a dvec3 from an SSBO	Connor Abbott	2018-10-22	1	-2/+2
	The comment was wrong, since the loop above casts to a type with the correct bitsize already. Fixes: 7e7ee82698247d8f93fe37775b99f4838b0247dd ("ac: add support for 16bit buffer loads") Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: Introduce ac_build_expand()	Connor Abbott	2018-10-22	2	-14/+29
	And implement ac_build_expand_to_vec4() on top of it. Fixes: 7e7ee82698247d8f93fe37775b99f4838b0247dd ("ac: add support for 16bit buffer loads") Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: add helpers for fast integer division by a constant	Marek Olšák	2018-10-16	2	-0/+78
*	radeonsi: save raster config in screen, add se_tile_repeat	Marek Olšák	2018-10-16	2	-3/+13
*	radeonsi: rename si_gfx_* functions to si_cp_*	Marek Olšák	2018-10-16	1	-0/+1
	and write_event_eop -> release_mem
*	radeonsi: make si_gfx_write_event_eop more configurable	Marek Olšák	2018-10-16	1	-0/+5
*	ac/nir: Use context-specific LLVM types	Alex Smith	2018-10-16	1	-2/+2
	LLVMInt*Type() return types from the global context and therefore are not safe for use in other contexts. Use types from our own context instead. Fixes frequent crashes seen when doing multithreaded pipeline creation. Fixes: 4d0b02bb5a "ac: add support for 16bit load_push_constant" Fixes: 7e7ee82698 "ac: add support for 16bit buffer loads" Cc: "18.2" <[email protected]> Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	radv: emit the GLC bit for SSBO loads/stores when needed	Samuel Pitoiset	2018-10-12	3	-8/+22
	This fixes some new memory model tests: dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.* Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108112 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: add ac_build_round	Marek Olšák	2018-10-06	3	-3/+19
*	ac: correct PKT3_COPY_DATA definitions	Marek Olšák	2018-10-06	1	-2/+9
*	ac: simplify LLVM alloca helpers	Marek Olšák	2018-10-06	1	-7/+4
*	ac: define all address spaces properly	Marek Olšák	2018-10-06	3	-10/+12
*	radv: do not use the availability bit for timestamp queries	Samuel Pitoiset	2018-09-28	1	-0/+1
	It's unnecessary because we can just check if the timestamp is different from the default value when a pool is created or reset. Instead of waiting for the availability bit to be 1, we have to emit a not-equal WAIT_REG_MEM to check if the timestamp is ready. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	ac: add 16-bit support to ac_build_bitfield_reverse()	Samuel Pitoiset	2018-09-17	1	-0/+5
	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: add 16-bit support to ac_build_bit_count()	Samuel Pitoiset	2018-09-17	1	-0/+5
	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>