mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	gallium/radeon: use gart_page_size instead of hardcoded 4096	Marek Olšák	2016-05-10	6	-10/+18
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	winsys/radeon: use gart_page_size instead of private size_align	Marek Olšák	2016-05-10	3	-14/+11
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	winsys/amdgpu: move gart_page_size to struct radeon_winsys	Marek Olšák	2016-05-10	4	-10/+10
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
*	gallivm: print declarations of intrinsics with GALLIVM_DEBUG=ir	Roland Scheidegger	2016-05-10	1	-0/+5
\| \| \| \| \| \| \|	Those aren't really interesting, however outputting them is helpful when trying to feed the IR to llvm llc (or opt) for debugging. Reviewed-by: Jose Fonseca <[email protected]>
*	gallivm: use InternalLinkage instead of PrivateLinkage for texture functions	Roland Scheidegger	2016-05-10	1	-1/+1
\| \| \| \| \| \| \|	At least with MCJIT the disassembler will crash otherwise when trying to disassemble such functions. Reviewed-by: Jose Fonseca <[email protected]>
*	gallivm: disable avx512 features	Roland Scheidegger	2016-05-10	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We don't target this yet, and some llvm versions incorrectly enable it based on cpu string, causing crashes. (Albeit this is a losing battle, it is pretty much guaranteed when the next new feature comes along llvm will mistakenly enable it on some future cpu, thus we would have to proactively disable all new features as llvm adds them.) This should fix https://bugs.freedesktop.org/show_bug.cgi?id=94291 (untested) Tested-by: Timo Aaltonen <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> CC: <[email protected]>
*	freedreno/ir3: lower lrp when operating with double operands	Samuel Iglesias Gonsálvez	2016-05-10	1	-0/+1
\| \| \| \| \| \| \| \| \|	Lower lrp when operating with double operands because float version of lrp is also lowered. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	nv50/ir: silence unsupported TGSI_PROPERTY_CS_FIXED_BLOCK_*	Samuel Pitoiset	2016-05-09	1	-0/+5
\| \| \| \| \| \| \|	We don't need them for compute shaders. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	freedreno/ir3: fix fallout from new block iterators	Rob Clark	2016-05-09	1	-1/+1
\| \| \| \| \| \| \|	Since this is potentially modifying the block structure of the shader, it needs the _safe() version of the iterator. Signed-off-by: Rob Clark <[email protected]>
*	radeonsi: workaround for tesselation on SI	Nicolai Hähnle	2016-05-09	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We request more than 32KB of LDS here, which SI doesn't have. Since LLVM recently started checking the size of declared LDS allocations, all shaders involved in tesselation fail to compile on SI. Note that the entire calculation here seems wrong, given how we calculate indices for generic attributes, so the number ends up wrong on CI+ as well. A proper solution is clearly needed, but this patch should serve as a band-aid for SI in the meantime. Also note that the real size of the LDS allocation in hardware is independent from what we tell LLVM, so this is really more of a "cosmetic" change. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95198 Cc: "11.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: always allocate export memory for pixel shaders	Nicolai Hähnle	2016-05-09	1	-5/+10
\| \| \| \| \| \| \| \|	Experiments with framebuffer-no-attachments type draw calls have shown that NULL exports stall terribly unless we ensure that export memory is allocated by the SPI. Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: expose performance counters as 64 bit	Nicolai Hähnle	2016-05-09	2	-16/+19
\| \| \| \| \| \| \|	This is useful for shader-related counters, since they tend to quickly exceed 32 bits. Reviewed-by: Marek Olšák <[email protected]>
*	gallium: enable intel jitevents profiling	Tim Rowley	2016-05-09	1	-0/+9
\| \| \| \| \| \| \| \|	LLVM when configured with "intel jitevents" enabled can inform VTune about dynamic code, so individual shaders are attributed profiling data and the resulting assembly can be examined. Acked-by: Roland Scheidegger <[email protected]>
*	swr: Add missing break in query switch statement.	Bruce Cherniak	2016-05-09	1	-0/+1
\| \| \| \| \| \|	Missed a switch break in query stat collection when refactoring queries. Reviewed-by: George Kyriazis <[email protected]>
*	freedreno/ir3: allow for additional VS sysval inputs	Rob Clark	2016-05-09	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \| \|	There are a total of four possible currently, rather than 2. So we need to be prepared for the input array to grow by 16 components. We could get away with less if we could pack sysval inputs.. and the way this is handled currently isn't really the nicest thing. But it's a tactical fix for an issue hit in: GL31-CTS.gtf30.GL3Tests.transform_feedback.transform_feedback_vertex_id Signed-off-by: Rob Clark <[email protected]>
*	r300g: add support for PIPE_FORMAT_x8R8G8B8_*	Marek Olšák	2016-05-09	2	-15/+77
\| \| \| \| \| \| \| \| \| \| \| \|	And set endian swap for packed formats the way it should be done in theory. This allows big endian to work again, but it can still be buggy. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789 Cc: 11.1 11.2 <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: fix undefined behavior (memcpy arguments must be non-NULL)	Nicolai Hähnle	2016-05-07	1	-1/+3
\| \| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: fix some reported undefined left-shifts	Nicolai Hähnle	2016-05-07	1	-3/+3
\| \| \| \| \| \| \| \|	One of these is an unsigned bitfield, which I suspect is a false positive, but gcc 5.3.1 complains about it with -fsanitize=undefined. Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	gallium/radeon: clean left-shift undefined behavior	Nicolai Hähnle	2016-05-07	11	-3989/+3989
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Shifting into the sign bit of a signed int is undefined behavior. Unfortunately, there are potentially many places where this happens using the register macros. This commit is the result of running sed -ie "s/(((\(\w\+\)) & 0x\(\w\+\)) << \(\w\+\))/(((unsigned)(\1) \& 0x\2) << \3)/g" on all header files in gallium/{r600,radeon,radeonsi}. Reviewed-by: Marek Olšák <[email protected]>
*	gallium: fix various undefined left shifts into sign bit	Nicolai Hähnle	2016-05-07	4	-5/+5
\| \| \| \| \| \| \| \| \|	Funnily enough, some of these were turned into a compile-time error by gcc with -fsanitize=undefined ("initializer is not a constant"). Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	radeonsi: Compute correct LDS size for fragment shaders.	Bas Nieuwenhuizen	2016-05-06	1	-3/+6
\| \| \| \| \| \| \| \|	Not sure where the 36 came from, but we clearly need at least 48 bytes per attribute per primitive. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
*	vc4: Add support for loading immediate values in QIR.	Eric Anholt	2016-05-06	4	-0/+32
\| \| \| \| \| \| \|	This will be used for resetting the uniform stream in the presence of branching, but may also be useful as an optimization to reduce how many uniforms we have to copy out per draw call (in exchange for increasing icache pressure).
*	vc4: Make vc4_qpu_validate() produce more verbose failures.	Eric Anholt	2016-05-06	1	-35/+71
\| \| \| \| \| \|	Seeing the expansion of a QPU_GET_FIELD in an assert isn't very informative, and it's hard to find what's going wrong without getting a dump of the instruction that failed.
*	vc4: Add a small QIR validate pass.	Eric Anholt	2016-05-06	4	-0/+127
\| \| \| \| \|	This has caught a couple of bugs during loop development so far, and I should probably have written it long ago.
*	vc4: Fix the src count on exp2/log2.	Eric Anholt	2016-05-06	1	-2/+2
\| \| \| \|	Found by the upcoming QIR validate pass.
*	vc4: Reuse QPU disasm's cond flags in QIR.	Eric Anholt	2016-05-06	3	-27/+46
\| \| \| \|	In the process, this made me flatten out the "%s%s%s%s" fprintf arguments.
*	vc4: When emitting an instruction to an existing temp, mark it non-SSA.	Eric Anholt	2016-05-06	1	-0/+2
\| \| \| \|	Prevents a bug in the later control-flow support series.
*	vc4: Make sure that we don't overwrite the signal for PROG_END.	Eric Anholt	2016-05-06	1	-0/+8
\| \| \| \| \| \| \| \|	We should have already emitted a NOP due to the last instruction being a TLB or VPM write. However, if you disable dead code elimination then you might get dead code at the end, and that dead code might have the signal bits set to something non-default, at which point you die in assertion failure.
*	nvc0: unreference images when the context is destroyed	Samuel Pitoiset	2016-05-06	1	-0/+4
\| \| \| \| \| \| \|	Like other resources, we need to unreference all images. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	radeonsi: set DECOMPRESS_Z_ON_FLUSH if nr_samples >= 4	Marek Olšák	2016-05-06	1	-1/+2
\| \| \| \| \| \| \| \|	Vulkan always sets this. It only affects in-place Z decompression. This is recommended for performance, but what app uses MSAA depth texturing? Reviewed-by: Nicolai Hähnle <[email protected]>
*	r600g: use the hw MSAA resolving if formats are compatible	Marek Olšák	2016-05-06	1	-1/+2
\| \| \| \| \| \| \|	This allows resolving RGBA into RGBX. This should improve HL2 Lost Coast performance. Reviewed-by: Alex Deucher <[email protected]>
*	st/omx/enc: fix incorrect reference picture order for B frames	Leo Liu	2016-05-05	1	-7/+12
\| \| \| \| \| \| \| \| \|	Stacking frames is for drivers that are capable of dual-instance encoding. That feature is not currently enabled for B frames. Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Cc: "11.1 11.2" <[email protected]>
*	vc4: fixup for new nir_foreach_block()	Connor Abbott	2016-05-05	4	-48/+20
\| \| \| \|	Reviewed-by: Eric Anholt <[email protected]>
*	ir3: fixup for new nir_foreach_block()	Connor Abbott	2016-05-05	1	-30/+21
*	swr: [rasterizer core] Faster modulo operator in ProcessVerts	Tim Rowley	2016-05-05	1	-1/+4
\| \| \| \| \| \|	Avoid % operator, since we know that curVertex is always incrementing. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer] Small warning cleanup	Tim Rowley	2016-05-05	2	-8/+4
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer] Add SWR_ASSUME / SWR_ASSUME_ASSERT macros	Tim Rowley	2016-05-05	2	-14/+52
\| \| \| \| \| \|	Fix static code analysis errors found by coverity on Linux Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer] Miscellaneous backend changes	Tim Rowley	2016-05-05	3	-22/+31
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer] Add support for X24_TYPELESS_G8_UINT format	Tim Rowley	2016-05-05	3	-7/+41
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer jitter] Fix printing bugs for tracing.	Tim Rowley	2016-05-05	1	-81/+24
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer memory] Add missing store tiles function	Tim Rowley	2016-05-05	1	-1/+4
\| \| \| \| \| \|	Storing color hot tile to 8bit w-major stencil format. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer jitter] Add asserts for supported formats in fetch shader	Tim Rowley	2016-05-05	1	-0/+2
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer core] Fix thread allocation	Tim Rowley	2016-05-05	1	-17/+47
\| \| \| \| \| \| \| \|	Fix Windows in 32-bit mode when hyperthreading is disabled on Xeons. Some support for asymmetric processor topologies. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer core] Fix threadviz support in buckets	Tim Rowley	2016-05-05	3	-12/+14
\| \| \| \| \| \| \|	Need to do lazy eval of the threadviz knob since order of globals is undefined. Reviewed-by: Bruce Cherniak <[email protected]>
*	swr: [rasterizer] Whitespace cleanup and misc changes	Tim Rowley	2016-05-05	5	-5/+2
\| \| \| \|	Reviewed-by: Bruce Cherniak <[email protected]>
*	radeonsi: mark descriptor loads as using dynamically uniform indices	Nicolai Hähnle	2016-05-05	1	-5/+17
\| \| \| \| \| \| \| \|	This tells LLVM to always use SMEM loads for descriptors. It fixes a regression in piglit's arb_shader_storage_buffer_object/execution/indirect.shader_test that was caused by LLVM r268259 (but the proper fix is really here in Mesa). Reviewed-by: Marek Olšák <[email protected]>
*	swr: Remove stall waiting for core query counters.	Bruce Cherniak	2016-05-05	4	-124/+81
\| \| \| \| \| \| \| \|	When gathering query results, swr_gather_stats was unnecessarily stalling the entire pipeline. Results are now collected asynchronously, with a fence marking completion. Reviewed-By: George Kyriazis <[email protected]>
*	freedreno: remove null check before free	Thomas Hindoe Paaboel Andersen	2016-05-05	1	-2/+1
\| \| \| \|	Reviewed-by: Eduardo Lima Mitev <[email protected]>
*	r600,compute: create vtx buffer for text + rodata	Jan Vesely	2016-05-04	1	-2/+10
\| \| \| \| \| \| \|	Reserve buffer id 2 Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
*	freedreno: allow ctx->draw_vbo to fail	Rob Clark	2016-05-04	5	-30/+37
\| \| \| \| \| \| \|	Pretty much only happens if shader variant compile fails. But in this case, if we haven't emitted cmdstream, we don't want to set needs_flush. Signed-off-by: Rob Clark <[email protected]>
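
The "gallium/radeon: clean left-shift undefined behavior" entry above describes a sed rewrite that adds an (unsigned) cast in front of the masked value in the register macros. The following is a minimal sketch of what that rewrite does to a single macro, not code taken from the Mesa headers: the macro names S_EXAMPLE_FIELD_OLD, S_EXAMPLE_FIELD and G_EXAMPLE_FIELD and the helper pack_example_field are hypothetical, and the sketch assumes a platform where int and unsigned int are 32 bits wide.

```c
#include <stdint.h>

/* Hypothetical register-field macros, named for illustration only.
 * They place an 8-bit value into bits 31:24 of a 32-bit register word. */

/* Old pattern: ((x) & 0xFF) has type int when x is an int, so shifting
 * a value >= 0x80 left by 24 moves a 1 into the sign bit. That is
 * undefined behavior in C. Kept here only for comparison; never used. */
#define S_EXAMPLE_FIELD_OLD(x)  (((x) & 0xFF) << 24)

/* New pattern, in the shape the sed expression produces: the cast makes
 * the mask and the shift operate on unsigned int, where the result of
 * the shift is well defined (assuming 32-bit unsigned int). */
#define S_EXAMPLE_FIELD(x)      (((unsigned)(x) & 0xFF) << 24)

/* Matching getter, used to check the round trip. */
#define G_EXAMPLE_FIELD(x)      (((x) >> 24) & 0xFF)

/* Packs a signed input through the fixed macro. */
static uint32_t pack_example_field(int value)
{
    return S_EXAMPLE_FIELD(value);
}
```

With these definitions, pack_example_field(0x80) sets only bit 31 without ever forming a negative int, which is the case -fsanitize=undefined reports for the old pattern.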