mesa.git

	Commit message	Author	Age	Files	Lines
*	gallium/radeon: change some driver query types to Hz	Marek Olšák	2015-08-06	1	-2/+2
\| \| \| \| \|	Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: before storing tess levels, load them from LDS instead of temporary	Marek Olšák	2015-08-06	1	-79/+57
\| \| \| \| \| \| \| \| \|	Also use only one store if stride <= 4. All the fetches from and stores to temporaries can be removed now. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91461 Reviewed-by: Michel Dänzer <[email protected]>
*	gallium/radeon: always use the llvm. prefix in intrinsic names	Marek Olšák	2015-08-06	1	-6/+16
\| \| \| \| \|	Acked-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
*	gallium/radeon: allow the winsys to choose the IB size	Marek Olšák	2015-08-06	9	-15/+14
\| \| \| \| \| \| \|	Picked from the amdgpu branch. Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
*	gallium/radeon: suspend timer queries between IBs	Marek Olšák	2015-08-06	5	-25/+66
\| \| \| \| \| \| \|	When we are measuring the time spent in a draw call, an unexpected flush can distort the result. Reviewed-by: Michel Dänzer <[email protected]>
*	vc4: Use nir_lower_load_const_to_scalar().	Eric Anholt	2015-08-04	1	-0/+1
*	vc4: Don't bother de-SSAing values that aren't part of phi webs.	Eric Anholt	2015-08-04	1	-15/+44
\| \| \| \|	We can just support them the same way we do load_const's SSA values.
*	vc4: Don't bother saturating the dst color for blending.	Eric Anholt	2015-08-04	1	-8/+2
\| \| \| \| \| \| \| \| \|	Since we just pulled it out of the destination as 8-bit unorm, we know it's in [0, 1] already. shader-db: total instructions in shared programs: 100040 -> 98208 (-1.83%); instructions in affected programs: 14084 -> 12252 (-13.01%).
*	vc4: Make r4-writes implicitly move to a temp, and allocate temps to r4.	Eric Anholt	2015-08-04	8	-107/+106
\| \| \| \| \| \| \| \| \| \| \|	Previously, SFU values always moved to a temporary, and TLB color reads and texture reads always lived in r4. Instead, we can have these results just be normal temporaries, and the register allocator can leave the values in r4 when they don't interfere with anything else using r4. shader-db results: total instructions in shared programs: 100809 -> 100040 (-0.76%); instructions in affected programs: 42383 -> 41614 (-1.81%).
*	vc4: Drop a dead prototype.	Eric Anholt	2015-08-04	1	-8/+0
*	freedreno/a4xx: add independent blend function support	Rob Clark	2015-08-04	2	-8/+10
\| \| \| \| \| \|	Needed for MRT. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/a4xx: MRT support	Rob Clark	2015-08-04	12	-132/+212
\| \| \| \|	Signed-off-by: Rob Clark <[email protected]>
*	freedreno: move the half-precision logic into core	Rob Clark	2015-08-04	4	-31/+38
\| \| \| \| \| \| \| \|	Both a3xx and a4xx need the same logic to decide if half-precision can be used for blit shaders. So move it to core and simplify things a bit with a helper that considers all render targets. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: simplify/cleanup resource status tracking	Rob Clark	2015-08-04	4	-48/+71
\| \| \| \| \| \| \| \| \|	Collapse dirty/reading bools into status bitmask (and drop writing which should really be the same as dirty). And use 'used_resources' list for all tracking, including zsbuf/cbufs, rather than special casing the color and depth/stencil buffers. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: fix stream-out caps vec4->components	Rob Clark	2015-08-04	1	-2/+2
\| \| \| \| \| \|	Should be in units of components, not vec4s. Signed-off-by: Rob Clark <[email protected]>
*	freedreno: small bit of cleanup about max rendertargets	Rob Clark	2015-08-04	13	-17/+40
\| \| \| \| \| \| \| \|	We hard-coded 4 or 8 as the max in various places. Switch it all to a define since the limit will go up with a4xx (and maybe even again in the future?) Signed-off-by: Rob Clark <[email protected]>
*	r600,compute: force tiling on 2D and 3D texture compute resources	Zoltan Gilian	2015-08-03	1	-2/+9
\| \| \| \| \| \| \|	This circumvents a problem that occurs when LINEAR_ALIGNED array mode is selected on a TEXTURE_2D RAT. This configuration causes MEM_RAT STORE_TYPED to write to incorrect locations.
*	r600g: re-enable single-sample fast clear	Marek Olšák	2015-08-03	1	-6/+1
\| \| \| \| \| \| \|	Fixed by the CB_SHADER_MASK fix. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	r600g: fix the CB_SHADER_MASK setup	Marek Olšák	2015-08-03	2	-4/+5
\| \| \| \| \| \| \| \|	This fixes the single-sample fast clear hang. Cc: 10.6 <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	r600g: fix the single-sample fast clear setup	Marek Olšák	2015-08-03	1	-2/+6
\| \| \| \| \| \| \|	No effect, but this is what we should be doing. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	radeonsi: flush if the memory usage for an IB is too high	Marek Olšák	2015-08-02	2	-0/+17
\| \| \| \| \| \| \|	Picked from the amdgpu branch. Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Christian König <[email protected]>
*	Revert "gallium/radeon: re-enable unsafe math for graphics shaders"	Marek Olšák	2015-08-01	1	-4/+0
\| \| \| \| \| \|	This reverts commit 8559f6ce62a9d5b52fa8189ba2352cd48bdabccf. It causes hangs in DOTA 2 Reborn.
*	radeonsi: copy *8_SNORM bits exactly in resource_copy_region	Marek Olšák	2015-07-31	1	-1/+3
\| \| \| \| \| \| \| \| \| \|	Disabling the FP16 mode didn't help. If needed, we can use this trick for blits too, but not for scaled blits. + 4 piglits Reviewed-by: Michel Dänzer <[email protected]>
*	r600g: early exit in r600_clear if there's nothing to do	Marek Olšák	2015-07-31	1	-0/+2
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: early exit in si_clear if there's nothing to do	Marek Olšák	2015-07-31	1	-0/+2
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: fix a regression since the resource_copy_region cleanup	Marek Olšák	2015-07-31	1	-1/+1
\| \| \| \| \| \| \| \| \|	Broken since: 46b2b3b - radeonsi: don't change pipe_resource in resource_copy_region Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91444 Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
*	radeonsi: fix broken st/nine from merging tessellation	Marek Olšák	2015-07-31	1	-2/+7
\| \| \| \|	st/nine uses GENERIC slots greater than 60.
*	radeonsi: move CP DMA functions to their own file	Marek Olšák	2015-07-31	6	-236/+274
\| \| \| \|	Reviewed-by: Michel Dänzer <[email protected]>
*	radeonsi: add a debug flag that disables printing ISA in shader dumps	Marek Olšák	2015-07-31	3	-9/+13
*	radeonsi: add a debug flag that disables printing TGSI in shader dumps	Marek Olšák	2015-07-31	3	-1/+3
\| \| \| \|	Reviewed-by: Dave Airlie <[email protected]>
*	radeonsi: add a debug flag that disables printing the LLVM IR in shader dumps	Marek Olšák	2015-07-31	6	-29/+29
\| \| \| \|	This is for shader-db and should reduce size of shader dumps.
*	radeonsi: store shader disassemblies in memory for future users	Marek Olšák	2015-07-31	7	-17/+18
\| \| \| \| \|	This will be used by the new ddebug pipe. I'm including it now to avoid conflicts with other patches.
*	radeonsi: don't use llvm.AMDIL.fraction for FRC and DFRAC	Marek Olšák	2015-07-31	1	-4/+16
\| \| \| \| \| \| \| \|	There are 2 reasons for this: (1) LLVM optimization passes can work with floor; (2) there are patterns to select v_fract from floor anyway. There is no change in the generated code.
*	gallium/radeon: re-enable unsafe math for graphics shaders	Marek Olšák	2015-07-31	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This reverts commit 4db985a5fa9ea985616a726b1770727309502d81. The disappearing grass that prompted that revert no longer occurs. This might affect tessellation. We'll see. Totals from affected shaders: SGPRS: 151672 -> 150232 (-0.95 %); VGPRS: 90620 -> 89776 (-0.93 %); Code Size: 3980472 -> 3920836 (-1.50 %) bytes; LDS: 67 -> 67 (0.00 %) blocks; Scratch: 1357824 -> 1202176 (-11.46 %) bytes per wave. Reviewed-by: Tom Stellard <[email protected]>
*	gallium/radeon: don't use rsq_action	Marek Olšák	2015-07-31	1	-7/+3
\| \| \| \|	Reviewed-by: Dave Airlie <[email protected]>
*	gallium/radeon: move r600-specific code to r600g	Marek Olšák	2015-07-31	2	-152/+150
\| \| \| \|	Reviewed-by: Tom Stellard <[email protected]>
*	gallium/radeon: remove unused variables and old comments	Marek Olšák	2015-07-31	4	-35/+0
\| \| \| \|	Reviewed-by: Dave Airlie <[email protected]>
*	gallium/radeon: remove build_intrinsic and build_tgsi_intrinsic	Marek Olšák	2015-07-31	4	-108/+58
\| \| \| \| \| \|	They are duplicated now. Reviewed-by: Dave Airlie <[email protected]>
*	gallivm: add LLVMAttribute parameter to lp_build_intrinsic	Marek Olšák	2015-07-31	2	-9/+9
\| \| \| \| \| \|	This will help remove some duplicated code from radeon. Reviewed-by: Dave Airlie <[email protected]>
*	radeonsi: completely rework updating descriptors without CP DMA	Marek Olšák	2015-07-31	4	-271/+128
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The patch has a better explanation. Just a summary here: (1) the CPU always uploads a whole descriptor array to previously-unused memory; (2) CP DMA isn't used; (3) no caches need to be flushed; (4) all descriptors are always up-to-date in memory even after a hang, because CP DMA doesn't serve as a middle man to update them. This should bring: (1) better hang recovery (descriptors are always up-to-date); (2) better GPU performance (no KCACHE and TC flushes); (3) worse CPU performance for partial updates (only whole arrays are uploaded); (4) less used IB space (no CP_DMA and WRITE_DATA packets); (5) simpler code; (6) hopefully, some of the corruption issues with SI cards will go away. If not, we'll know the issue is not here. Reviewed-by: Michel Dänzer <[email protected]>
*	vc4: Lower uniform loads to scalar in NIR.	Eric Anholt	2015-07-30	2	-31/+81
\| \| \| \| \|	This also moves the vec4-to-byte-addressing math into NIR, so that algebraic has a chance at it.
*	vc4: Move some FS input lowering into NIR.	Eric Anholt	2015-07-30	2	-35/+50
*	vc4: Move program keys to the header file.	Eric Anholt	2015-07-30	2	-47/+49
\| \| \| \| \|	I want to be able to inspect them from other files for lowering passes in NIR.
*	vc4: Lower NIR inputs to scalar as well.	Eric Anholt	2015-07-30	2	-4/+44
\| \| \| \| \|	For now this is just scalarizing, but it also means we'll get to dump a bunch of QIR-based lowering in a moment.
*	vc4: Start adding a NIR-based output lowering pass.	Eric Anholt	2015-07-30	4	-7/+137
\| \| \| \| \| \|	For now, this just splits up store_output intrinsics to be scalars, and drops unused outputs in the coordinate shader. My goal is to be able to drop a bunch of my VC4-specific optimization by letting NIR handle it.
*	vc4: Mark our shaders as single-threaded.	Eric Anholt	2015-07-30	2	-0/+6
\| \| \| \| \|	I had my understanding of this bit flipped. We're using the full register space, so we need to say so.
*	vc4: Avoid leaking indirect array access UBOs.	Eric Anholt	2015-07-30	1	-0/+2
*	vc4: Avoid overflowing various static tables.	Eric Anholt	2015-07-30	4	-4/+4
*	vc4: Fix return values from recent validation changes.	Eric Anholt	2015-07-30	1	-4/+4
*	radeonsi: enable GL4.1 and update documentation (v2)	Dave Airlie	2015-07-30	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	This enables GL4.1 for radeonsi, and updates the docs in the correct places. v2: enable only for llvm 3.7 which has fixes in place. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
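
Several entries in this log quote shader-db style summaries of the form "before -> after (percent)". As a rough sanity check, the percentages can be recomputed from the two counts. The sketch below is only an illustration: `percent_change` and `report` are hypothetical helper names, not functions from Mesa or shader-db, and the output format merely imitates the lines quoted in the commit messages.

```c
#include <stdio.h>

/* Hypothetical helper (not part of Mesa or shader-db): relative change
 * from `before` to `after`, in percent. Negative means a reduction. */
static double percent_change(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}

/* Hypothetical helper: print one summary line in a style similar to the
 * figures quoted in the commit messages. */
static void report(const char *label, long before, long after)
{
    printf("%s: %ld -> %ld (%.2f%%)\n", label, before, after,
           percent_change(before, after));
}
```

For example, `report("total instructions in shared programs", 100040, 98208)` prints `total instructions in shared programs: 100040 -> 98208 (-1.83%)`, which agrees with the figure quoted in the vc4 blending commit.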