mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	radeonsi: move CP_DMA_ALIGNMENT definition	Marek Olšák	2017-02-10	2	-10/+10
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: remove SI_CONTEXT_FLUSH_AND_INV_FRAMEBUFFER	Marek Olšák	2017-02-10	3	-6/+6
\| \| \| \| \| \|	not necessary Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: remove separate CB/DB_META flush flags	Marek Olšák	2017-02-10	3	-17/+8
\| \| \| \| \| \|	not used separately Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: reduce the number of FMASK input coordinates	Marek Olšák	2017-02-10	1	-7/+3
\| \| \| \| \| \| \| \| \|	Before: image_load v3, v[0:3] ... After: image_load v3, v[0:1] ... Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: write shader asm annotated with wave info into GPU hang reports	Marek Olšák	2017-02-10	3	-3/+252
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Note that the disassembly is written twice - first the unmodified compiler output and then the wave-annotated output only if there are waves executing the shader.
	Sample output from a real GPU hang most likely caused by image_sample:
	The number of active waves = 28
	Pixel Shader - annotated disassembly:
	    s_mov_b64 s[6:7], exec ; BE86017E [PC=0x10f3e3800, off=0, size=4]
	    s_wqm_b64 exec, exec ; BEFE077E [PC=0x10f3e3804, off=4, size=4]
	    ...
	    image_sample v[7:9], v[0:1], s[12:19], s[20:23] dmask:0x7 ; F0800700 00A30700 [PC=0x10f3e3a94, off=660, size=8]
	    s_buffer_load_dword s20, s[0:3], 0x50 ; C0220500 00000050 [PC=0x10f3e3a9c, off=668, size=8]
	    s_load_dwordx4 s[24:27], s[4:5], 0x170 ; C00A0602 00000170 [PC=0x10f3e3aa4, off=676, size=8]
	    s_load_dwordx8 s[12:19], s[4:5], 0x140 ; C00E0302 00000140 [PC=0x10f3e3aac, off=684, size=8]
	    s_buffer_load_dword s11, s[0:3], 0x5c ; C02202C0 0000005C [PC=0x10f3e3ab4, off=692, size=8]
	    s_buffer_load_dword s21, s[0:3], 0x54 ; C0220540 00000054 [PC=0x10f3e3abc, off=700, size=8]
	    s_buffer_load_dword s22, s[0:3], 0x58 ; C0220580 00000058 [PC=0x10f3e3ac4, off=708, size=8]
	    s_waitcnt vmcnt(0) ; BF8C0F70 [PC=0x10f3e3acc, off=716, size=4]
	    ^ SE0 SH0 CU1 SIMD1 WAVE0 EXEC=aaaaaaa555aaaaaa INST32=BF8C0F70
	    ^ SE0 SH0 CU1 SIMD2 WAVE0 EXEC=aaaa85555555552a INST32=BF8C0F70
	    ^ SE0 SH0 CU1 SIMD3 WAVE0 EXEC=000000000000000a INST32=BF8C0F70
	    ^ SE0 SH0 CU6 SIMD1 WAVE0 EXEC=25a5a5aa82aaaaaa INST32=BF8C0F70
	    ^ SE0 SH0 CU6 SIMD3 WAVE0 EXEC=50aaaa8fffa55555 INST32=BF8C0F70
	    ^ SE0 SH0 CU7 SIMD0 WAVE0 EXEC=5554aaaaaaa1a555 INST32=BF8C0F70
	    ^ SE0 SH0 CU7 SIMD0 WAVE1 EXEC=aaaa5555ffffffff INST32=BF8C0F70
	    ^ SE0 SH0 CU7 SIMD1 WAVE0 EXEC=555557aaaaaaaaa5 INST32=BF8C0F70
	    ^ SE0 SH0 CU7 SIMD3 WAVE0 EXEC=5555aaaaaaaaaa85 INST32=BF8C0F70
	    ^ SE1 SH0 CU3 SIMD1 WAVE0 EXEC=aaaaaaaaaaaaaaaa INST32=BF8C0F70
	    ^ SE1 SH0 CU4 SIMD0 WAVE0 EXEC=aaaaaaaa5a5a5a5a INST32=BF8C0F70
	    ^ SE1 SH0 CU4 SIMD1 WAVE0 EXEC=aaaaaaa5a5a5a4a5 INST32=BF8C0F70
	    ^ SE1 SH0 CU4 SIMD2 WAVE0 EXEC=5555555000000000 INST32=BF8C0F70
	    ^ SE1 SH0 CU4 SIMD3 WAVE0 EXEC=aa555554155aaaaa INST32=BF8C0F70
	    ^ SE1 SH0 CU5 SIMD0 WAVE0 EXEC=55ffff55555555aa INST32=BF8C0F70
	    ^ SE1 SH0 CU5 SIMD1 WAVE0 EXEC=555555555aaaaaaa INST32=BF8C0F70
	    ^ SE1 SH0 CU5 SIMD2 WAVE0 EXEC=a0aaaaaaa8555555 INST32=BF8C0F70
	    ^ SE1 SH0 CU5 SIMD3 WAVE0 EXEC=8aaaaaaaaaaaa555 INST32=BF8C0F70
	    ^ SE1 SH0 CU6 SIMD0 WAVE0 EXEC=000000002aaaaaaa INST32=BF8C0F70
	    ^ SE2 SH0 CU1 SIMD0 WAVE0 EXEC=5aaaa5400aaaa15a INST32=BF8C0F70
	    ^ SE2 SH0 CU1 SIMD1 WAVE0 EXEC=00aaaaaaaa5555aa INST32=BF8C0F70
	    ^ SE2 SH0 CU1 SIMD2 WAVE0 EXEC=aa00005555554555 INST32=BF8C0F70
	    ^ SE2 SH0 CU1 SIMD3 WAVE0 EXEC=aaaaaaa000000000 INST32=BF8C0F70
	    ^ SE3 SH0 CU4 SIMD0 WAVE0 EXEC=5555aaaaaaaaaaaa INST32=BF8C0F70
	    ^ SE3 SH0 CU4 SIMD2 WAVE0 EXEC=ffaaaaaaaaaa5555 INST32=BF8C0F70
	    ^ SE3 SH0 CU4 SIMD3 WAVE0 EXEC=aaaa55555555aa00 INST32=BF8C0F70
	    ^ SE3 SH0 CU5 SIMD0 WAVE0 EXEC=00aaaaaaaaaaaa5a INST32=BF8C0F70
	    ^ SE3 SH0 CU5 SIMD1 WAVE0 EXEC=5a555555005555ff INST32=BF8C0F70
	    v_mul_f32_e32 v7, s6, v7 ; 0A0E0E06 [PC=0x10f3e3ad0, off=720, size=4]
	    ...
	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: write wave information into GPU hang reports	Marek Olšák	2017-02-10	1	-0/+20
\| \| \| \| \| \| \| \|	UMR is our new debugging tool. It must have +s set for Mesa to use it without root privileges: sudo chmod +s .../umr Reviewed-by: Nicolai Hähnle <[email protected]>
*	tgsi-dump: dump label if instruction has one	Marc-André Lureau	2017-02-10	1	-11/+13
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The instruction has an associated label when Instruction.Label == 1, as can be seen in ureg_emit_label() or tgsi_build_full_instruction(). This fixes dump generating extra :0 labels on conditionals, and virgl parsing more than the expected tokens and eventually reaching "Illegal command buffer" (when parsing more than a safety margin of 10 we currently have). Signed-off-by: Marc-André Lureau <[email protected]> Cc: "13.0 17.0" <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	tgsi: remove ureg_label_insn	Marc-André Lureau	2017-02-10	2	-38/+0
\| \| \| \| \| \| \|	Unused since commit 2897cb3dba9287011f9c43cd2f214100952370c0. Signed-off-by: Marc-André Lureau <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radv: handle queue submission with no cs but semaphores	Dave Airlie	2017-02-09	1	-2/+20
\| \| \| \| \| \| \| \| \|	It's legal to submit just semaphores with no command streams. This patch fixes this case by emitting the empty cs; it also handles the fence emission for this case better. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	util/disk_cache: error check asprintf()	Timothy Arceri	2017-02-10	1	-5/+7
\| \| \| \| \| \| \|	Fixes: f3d911463e8 "util/disk_cache: stop using ralloc_asprintf() unnecessarily" Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
*	nvc0/ir: fix ubo max clamp, reset file index	Ilia Mirkin	2017-02-09	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \|	We just increased the max UBO, so we should also increase the clamp that we do for robustness. Similarly, as we're including the fileIndex in the new indirect value, we should reset fileIndex to 0 so that it is not added in a second time. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
*	nv50/ir: always return 0 when trying to read thread id along unit dim	Ilia Mirkin	2017-02-09	4	-5/+17
\| \| \| \| \| \| \| \| \| \| \| \| \|	Many many many compute shaders only define a 1- or 2-dimensional block, but then continue to use system values that take the full 3d into account (like gl_LocalInvocationIndex, etc). So for the special case that a dimension is exactly 1, we know that the thread id along that axis will always be 0, so return it as such and allow constant folding to fix things up. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
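The effect of the commit above can be sketched in plain C. This is only an illustrative model, not nouveau's code: `read_thread_id` and `local_invocation_index` are hypothetical names, and the arrays stand in for the block dimensions and the hardware thread id registers.

```c
/* Hypothetical model: reading the thread id along an axis whose block
 * dimension is exactly 1 always yields 0, whatever the register holds. */
static unsigned
read_thread_id(const unsigned block_size[3], const unsigned hw_tid[3],
               unsigned axis)
{
   if (block_size[axis] == 1)
      return 0; /* a constant, so later folding can remove dependent math */
   return hw_tid[axis];
}

/* A flattened index in the style of gl_LocalInvocationIndex. With a
 * 1-dimensional block the y and z terms reduce to zero. */
static unsigned
local_invocation_index(const unsigned block_size[3], const unsigned hw_tid[3])
{
   unsigned x = read_thread_id(block_size, hw_tid, 0);
   unsigned y = read_thread_id(block_size, hw_tid, 1);
   unsigned z = read_thread_id(block_size, hw_tid, 2);

   return z * block_size[0] * block_size[1] + y * block_size[0] + x;
}
```

For a 64x1x1 block the index collapses to the x thread id alone, which is the kind of simplification the commit message leaves to constant folding.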
*	nvc0/ir: fix robustness guarantees for constbuf loads on kepler+ compute	Ilia Mirkin	2017-02-09	1	-25/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	Kepler and up unfortunately only support up to 8 constbufs. We work around this by loading from constbufs as if they were storage buffers. However we were not consistently applying limits to loads from these buffers. Make sure to do the same thing we do for storage buffers. Fixes GL45-CTS.robust_buffer_access_behavior.uniform_buffer Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
*	nvc0: increase number of ubo binding points	Ilia Mirkin	2017-02-09	1	-3/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Apparently GL 4.5 requires 14 of these (there's a "*" in the spec, but it's unclear what it refers to). We need to expose an extra binding point for the "program parameters", which means this must be 15. Remove the last vestige of the "use c14 for immediates" idea. Fixes GL45-CTS.shading_language_420pack.binding_uniform_block_array Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
*	nvc0: expose int64	Ilia Mirkin	2017-02-09	1	-1/+1
\| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: make it possible to have the flags def in def0	Ilia Mirkin	2017-02-09	5	-12/+15
\| \| \| \| \| \| \|	There's all kinds of logic that doesn't like there being holes in defs or srcs lists. Avoid them. This also fixes the sched logic for maxwell. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: add support for 64-bit shift lowering on SM20/SM30	Ilia Mirkin	2017-02-09	1	-6/+62
\| \| \| \| \| \| \|	Unfortunately there is no SHF.L/SHF.R instruction pre-SM35. So we have to do a bit more work to get the job done. Signed-off-by: Ilia Mirkin <[email protected]>
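As a rough illustration of the extra work involved when no funnel-shift instruction is available, a 64-bit left shift can be composed from 32-bit operations as below. This is a sketch in C under stated assumptions, not the code generator's actual lowering: `shl64_via_32` is a hypothetical name and the shift amount is assumed to be in the range 0..63.

```c
#include <stdint.h>

/* Hypothetical helper: 64-bit left shift using only 32-bit operations.
 * The value is passed as low/high halves; 'shift' must be in [0, 63]. */
static void
shl64_via_32(uint32_t lo, uint32_t hi, unsigned shift,
             uint32_t *out_lo, uint32_t *out_hi)
{
   if (shift == 0) {
      /* Handled separately: 'lo >> 32' below would be undefined in C. */
      *out_lo = lo;
      *out_hi = hi;
   } else if (shift < 32) {
      /* Bits shifted out of the low half move into the high half. */
      *out_lo = lo << shift;
      *out_hi = (hi << shift) | (lo >> (32 - shift));
   } else {
      /* The whole low half moves into the high half. */
      *out_lo = 0;
      *out_hi = lo << (shift - 32);
   }
}
```

A right shift follows the same pattern with the roles of the halves swapped, plus sign extension for the arithmetic variant.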
*	nvc0/ir: add support for all the new int64 tgsi opcodes	Ilia Mirkin	2017-02-09	6	-5/+302
\| \| \| \| \| \| \| \| \| \| \| \|	A few thoughts: - Some of that LegalizeSSA logic should really live much earlier and be subject to the likes of DCE and other useful passes - Some of the "lowering" done in from_tgsi should be done later so that proper optimization might be done. However this all works and the above can be improved upon later. Signed-off-by: Ilia Mirkin <[email protected]>
*	nv50/ir: Split 64-bit integer MAD/MUL operations	Pierre Moreau	2017-02-09	1	-0/+116
\| \| \| \| \| \| \|	Hardware does not support 64-bit integer MAD and MUL operations, so we need to transform them into 32-bit operations. Signed-off-by: Pierre Moreau <[email protected]>
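The arithmetic behind such a split can be sketched in C as follows. This shows the general decomposition only and makes no claim about the exact instruction sequence the pass emits; `mul64_via_32` is a hypothetical name, and the widening product of the low halves stands in for a 32-bit multiply-low plus multiply-high pair.

```c
#include <stdint.h>

/* Hypothetical helper: the low 64 bits of a * b computed from 32-bit halves. */
static uint64_t
mul64_via_32(uint64_t a, uint64_t b)
{
   uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
   uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

   /* Full 32x32 -> 64 product of the low halves. */
   uint64_t lo_prod = (uint64_t)a_lo * b_lo;
   uint32_t res_lo = (uint32_t)lo_prod;
   uint32_t carry = (uint32_t)(lo_prod >> 32);

   /* The cross terms only affect the high half, so their low 32 bits are
    * enough. a_hi * b_hi would land above bit 63 and is dropped. */
   uint32_t res_hi = carry + a_lo * b_hi + a_hi * b_lo;

   return ((uint64_t)res_hi << 32) | res_lo;
}
```

A MAD follows by adding the third operand to this result with a 32-bit add and an add-with-carry.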
*	nvc0/ir: add a "high" subop for shifts, emit shf.l/shf.r for 64-bit	Ilia Mirkin	2017-02-09	3	-3/+74
\| \| \| \| \| \|	Note that this is not available for SM20/SM30. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: fix SET and SLCT emission	Ilia Mirkin	2017-02-09	2	-0/+6
\| \| \| \| \| \| \| \|	We were never emitting a .X flag for consuming condition code on SET, and weren't emitting a signed type for SLCT comparison. Discovered while working on int64 logic. Signed-off-by: Ilia Mirkin <[email protected]>
*	nvc0/ir: add support for emitting partial min/max ops for int64	Ilia Mirkin	2017-02-09	4	-1/+14
\| \| \| \| \| \| \| \| \| \|	These operations allow you to compute min/max on arbitrary-width integers, 32 bits at a time. Note that the low/med ops implicitly set the condition code, and the med/high ops implicitly consume it. Signed-off-by: Ilia Mirkin <[email protected]>
*	gallium: add separate PIPE_CAP_INT64_DIVMOD	Ilia Mirkin	2017-02-09	18	-0/+21
\| \| \| \| \| \| \| \| \| \| \|	Nouveau does not currently have logic to implement this as a library function. Even though such a library could be written, there's no big advantage to do it that way for now given that int64 is a very uncommon use-case. Allow a driver to expose INT64 without supporting division and modulo operations. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	glsl: Allow compatibility shaders with MESA_GL_VERSION_OVERRIDE=...	Matt Turner	2017-02-09	4	-4/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously if you used MESA_GL_VERSION_OVERRIDE=3.3COMPAT, Mesa exposed an OpenGL 3.3 compatibility profile context (with various unimplemented features and bugs), but still refused to compile shaders with #version 330 compatibility. This patch simply adds a small bit of plumbing to let that through. Of course the same caveats apply: compatibility profile is still not supported (and will not be supported), so there are no guarantees that anything will work. Tested-by: Dylan Baker <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965/fs: add support for int64 to bool conversion	Samuel Iglesias Gonsálvez	2017-02-09	1	-2/+13
\| \| \| \| \| \|	Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
*	nir: add opcode to perform int64 to bool conversions	Samuel Iglesias Gonsálvez	2017-02-09	2	-0/+2
\| \| \| \| \| \|	Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
*	i965/fs: Add support for nir_op_[iu]2[iu]32	Samuel Iglesias Gonsálvez	2017-02-09	1	-0/+4
\| \| \| \| \| \|	Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
*	i965/fs: Add support for nir_op_[iu]642f	Samuel Iglesias Gonsálvez	2017-02-09	1	-0/+2
\| \| \| \| \| \|	Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
*	i965/fs: legalize [u]int64 to 32-bit data conversions in lower_d2x	Samuel Iglesias Gonsálvez	2017-02-09	1	-1/+3
\| \| \| \| \| \|	Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
*	i965/fs: Add support for nir_op_[iu]642d	Jason Ekstrand	2017-02-09	1	-0/+2
\| \| \| \| \|	Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	i965: Allow int64 conversion operations in channel_expressions	Jason Ekstrand	2017-02-09	1	-24/+24
\| \| \| \| \| \| \|	This fixes 143 of the new piglit tests added by Nicolai. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
*	util/disk_cache: stop using ralloc_asprintf() unnecessarily	Timothy Arceri	2017-02-09	1	-13/+12
\| \| \| \|	Reviewed-by: Anuj Phogat <[email protected]>
*	glsl: add param to force shader recompile	Timothy Arceri	2017-02-09	4	-4/+5
\| \| \| \| \| \|	This will be used to skip checking the cache and force a recompile. Reviewed-by: Anuj Phogat <[email protected]>
*	util: add a disk_cache_remove() function	Timothy Arceri	2017-02-09	2	-0/+34
\| \| \| \| \| \| \| \| \| \|	This will be used to remove cache items created with old versions of Mesa or other invalid cache items from the cache. V2: rename stub function (cache_* functions were renamed disk_cache_* in master). Reviewed-by: Anuj Phogat <[email protected]>
*	st/mesa/i965: create link status enum	Timothy Arceri	2017-02-09	13	-21/+32
\| \| \| \| \| \| \| \| \| \| \| \|	For the on-disk shader cache we want to be able to differentiate between a program that was linked and one that was loaded from cache. V2: - don't return the new enum directly to the application when queried, instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome corruptions when using cache. Reviewed-by: Anuj Phogat <[email protected]>
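The V2 note above can be illustrated with a small C sketch. The enum and function names here are placeholders chosen for this example, not necessarily the identifiers used in Mesa, and the two constants mirror the values of GL_TRUE and GL_FALSE.

```c
/* Placeholder three-state link status: the internal state distinguishes a
 * freshly linked program from one loaded from the on-disk cache. */
enum link_status_sketch {
   LINK_FAILURE = 0,
   LINK_SUCCESS,
   LINK_SKIPPED /* program was loaded from the shader cache */
};

#define SKETCH_GL_FALSE 0
#define SKETCH_GL_TRUE  1

/* What an application-facing query returns: the internal enum is never
 * exposed directly, only a boolean. */
static int
query_link_status(enum link_status_sketch status)
{
   return status != LINK_FAILURE ? SKETCH_GL_TRUE : SKETCH_GL_FALSE;
}
```

Returning the raw enum would hand applications a value of 2 for cached programs, which is the kind of mismatch a strict `== GL_TRUE` check would trip over.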
*	radv: Add CPU color packing for VK_FORMAT_A2B10G10R10_UNORM_PACK32.	Bas Nieuwenhuizen	2017-02-08	1	-2/+6
\| \| \| \| \| \| \| \| \|	For allowing fast color clears in the main render targets of dota2. [airlied: fix clear_vals[1] as suggested by Andres.] Signed-off-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	mesa: (trivial) include <inttypes.h> for PRIx64 macros	Roland Scheidegger	2017-02-08	1	-0/+1
\| \| \| \|	Fixes a compile error with mingw.
*	swr: [rasterizer jitter] Pass LLVM-IR size into jitter	Tim Rowley	2017-02-08	3	-3/+4
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer core] Frontend SIMD16 WIP	Tim Rowley	2017-02-08	4	-293/+331
\| \| \| \| \| \| \| \| \|	Removed temporary scaffolding in PA, widened the PA_STATE interface for SIMD16, and implemented PA_STATE_CUT and PA_TESS for SIMD16. PA_STATE_CUT and PA_TESS now work in SIMD16. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer jitter] Disable unsafe FP optimizations in the jitter	Tim Rowley	2017-02-08	1	-1/+1
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer core] Frontend SIMD16 WIP	Tim Rowley	2017-02-08	4	-142/+243
\| \| \| \| \| \| \|	Widen simdvertex to SIMD16/simd16vertex in frontend for passing VS attributes from VS to PA. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer jitter] Add DEBUGTRAP jit builder function	Tim Rowley	2017-02-08	2	-1/+9
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer jitter] Multisample blend jit fix	Tim Rowley	2017-02-08	1	-2/+2
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer jitter] Change SimdVector representation to array	Tim Rowley	2017-02-08	2	-6/+2
\| \| \| \| \| \| \| \| \| \|	Make all SimdVectors in LLVM represented as simdscalar[4] rather than a struct. Fixes issues with promotion of values from i32 to i64 to match register width. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer jitter] Fix issues with stream-out on llvm>=3.8	Tim Rowley	2017-02-08	3	-6/+6
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer jitter] Adjust jitter header includes	Tim Rowley	2017-02-08	6	-11/+11
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer core] Frontend SIMD16 WIP	Tim Rowley	2017-02-08	5	-43/+813
\| \| \| \| \| \| \| \|	SIMD16 Primitive Assembly (PA) only supports TriList and RectList. CUT_AWARE_PA, TESS, GS, and SO disabled in the SIMD16 front end. Reviewed-by: Bruce Cherniak <[email protected]>
*	r600/sb: Fix memory leak	Bartosz Tomczyk	2017-02-08	1	-1/+7
\| \| \| \|	Signed-off-by: Marek Olšák <[email protected]>
*	mesa: use PRId64/PRIu64 when printing 64-bit ints	Timothy Arceri	2017-02-08	1	-2/+2
\| \| \| \| \| \|	V2: actually use PRIu64 Reviewed-by: Dave Airlie <[email protected]>
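For reference, the `<inttypes.h>` format macros are used by splicing them into the format string. The helper below is a minimal sketch; `format_int64_pair` is a hypothetical name, not a Mesa function.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: formats one signed and one unsigned 64-bit value.
 * PRId64/PRIu64 expand to the correct length modifier for the platform,
 * unlike a hard-coded "%ld" or "%lld". */
static int
format_int64_pair(char *buf, size_t size, int64_t s, uint64_t u)
{
   return snprintf(buf, size, "%" PRId64 " %" PRIu64, s, u);
}
```

Using PRId64 for an unsigned value would misprint anything above INT64_MAX, which is presumably what the "actually use PRIu64" note in V2 refers to.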
*	mesa/st: fix strict aliasing issue in int64 code.	Dave Airlie	2017-02-08	1	-4/+2
\| \| \| \| \| \| \|	This fixes the int64 code in the same way as the double code. Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
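A common way to avoid strict aliasing violations when splitting a 64-bit value into 32-bit words is to copy the bytes with `memcpy` instead of casting pointers. The sketch below assumes that approach for illustration; the commit message does not spell out the exact change, and both function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: split an int64 into two 32-bit words.
 * Writing '*(int64_t *)words = value' would access uint32_t storage
 * through an incompatible type; memcpy has no such problem. */
static void
int64_to_words(int64_t value, uint32_t words[2])
{
   memcpy(words, &value, sizeof(value));
}

/* The inverse, again via memcpy rather than a pointer cast. */
static int64_t
words_to_int64(const uint32_t words[2])
{
   int64_t value;

   memcpy(&value, words, sizeof(value));
   return value;
}
```

Compilers typically turn a fixed-size `memcpy` like this into plain register moves, so the safe form costs nothing at runtime.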