mesa.git

	Commit message	Author	Age	Files	Lines
*	glsl: Optimize open-coded lrp into lrp.	Jordan Justen	2014-01-21	1	-0/+52
	total instructions in shared programs: 1498191 -> 1487051 (-0.74%)
	instructions in affected programs: 669388 -> 658248 (-1.66%)
	GAINED: 1
	LOST: 0
	Reviewed-by: Matt Turner <[email protected]>
	Signed-off-by: Jordan Justen <[email protected]>
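As a rough illustration of what "open-coded lrp" means, the sketch below recognizes x*(1 - a) + y*a in a toy expression tree and replaces it with a single lrp node. The tuple-based IR and the names `is_op` and `match_open_coded_lrp` are assumptions made for this example; they are not Mesa's ir_expression classes, and the real pass may match other forms as well.

```python
def is_op(expr, name):
    """True if expr is a binary node such as ('mul', lhs, rhs) with the given name."""
    return isinstance(expr, tuple) and len(expr) == 3 and expr[0] == name


def match_open_coded_lrp(expr):
    """Rewrite x*(1 - a) + y*a into ('lrp', x, y, a); return expr unchanged otherwise."""
    if not is_op(expr, 'add'):
        return expr
    _, lhs, rhs = expr
    # Addition is commutative, so try both assignments of the two products.
    for first, second in ((lhs, rhs), (rhs, lhs)):
        if not (is_op(first, 'mul') and is_op(second, 'mul')):
            continue
        # Multiplication is commutative too: look for the (1 - a) factor on either side.
        for x, one_minus_a in ((first[1], first[2]), (first[2], first[1])):
            if is_op(one_minus_a, 'sub') and one_minus_a[1] == 1.0:
                a = one_minus_a[2]
                # The other product must be y * a, in either operand order.
                if second[1] == a:
                    return ('lrp', x, second[2], a)
                if second[2] == a:
                    return ('lrp', x, second[1], a)
    return expr
```

For example, `match_open_coded_lrp(('add', ('mul', 'x', ('sub', 1.0, 'a')), ('mul', 'y', 'a')))` yields `('lrp', 'x', 'y', 'a')`, while an unrelated addition is returned untouched.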
*	glsl: Optimize pow(2, x) --> exp2(x).	Kenneth Graunke	2014-01-07	1	-0/+11
	On Haswell, POW takes 24 cycles, while EXP2 only takes 14. Plus, using POW requires putting 2.0 in a register, while EXP2 doesn't. I believe that EXP2 will be faster than POW on basically all GPUs, so it makes sense to optimize it.
	Looking at the savage2 subset of shader-db:
	total instructions in shared programs: 113225 -> 113179 (-0.04%)
	instructions in affected programs: 2139 -> 2093 (-2.15%)
	instances of 'math pow': 795 -> 749 (-6.14%)
	instances of 'math exp': 389 -> 435 (11.8%)
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	glsl: Optimize pow(1.0, X) --> 1.0.	Kenneth Graunke	2014-01-07	1	-0/+6
	Surprisingly, this helps one vertex shader in 3DMMES.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
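The two pow() rewrites (constant base 1.0 here, constant base 2.0 in the commit listed above) could be sketched as a single fold. This is a minimal sketch under stated assumptions: operands are either Python floats (constants) or strings (non-constant values), and `fold_pow` is a hypothetical name, not a function in opt_algebraic.

```python
def fold_pow(base, exponent):
    """Return a simplified expression for pow(base, exponent).

    Operands are floats (constants) or strings (non-constant values).
    """
    if base == 1.0:
        # pow(1.0, x) is 1.0 for any x, so the whole expression becomes a constant.
        return 1.0
    if base == 2.0:
        # pow(2.0, x) computes the same value as exp2(x), with no 2.0 operand needed.
        return ('exp2', exponent)
    return ('pow', base, exponent)
```

A non-constant base compares unequal to both constants and falls through to the unchanged pow node.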
*	glsl: Apply the transformation "1/rsq(x) == sqrt(x)" in opt_algebraic.	Eric Anholt	2013-11-15	1	-3/+4
	The comment was stale, because the lowering in question wasn't happening in lower_instructions.cpp. Presumably if the lowering ever moves there, we can plumb the lowering mask through to opt_algebraic.
	total instructions in shared programs: 1618696 -> 1616810 (-0.12%)
	instructions in affected programs: 243018 -> 241132 (-0.78%)
	GAINED: 0
	LOST: 0
	Reviewed-by: Jordan Justen <[email protected]>
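A minimal sketch of the "1/rsq(x) == sqrt(x)" rewrite on a toy tuple IR follows. The node names 'rcp', 'rsq' and 'sqrt' mirror the operations named in the subject line, but `fold_rcp` and the tuple representation are assumptions for illustration only.

```python
def fold_rcp(operand):
    """Simplify rcp(operand), i.e. 1.0 / operand, on a toy tuple IR."""
    if isinstance(operand, tuple) and operand[0] == 'rsq':
        # 1 / (1 / sqrt(x)) is sqrt(x): one operation instead of two.
        return ('sqrt', operand[1])
    return ('rcp', operand)
```

Numerically the identity holds up to rounding: for x = 9.0, 1.0 / (1.0 / 3.0) and 3.0 agree to within floating-point tolerance.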
*	glsl: Apply the transformation "(a ^^ a) -> false" in opt_algebraic.	Eric Anholt	2013-11-15	1	-1/+3
	Reviewed-by: Jordan Justen <[email protected]>
*	glsl: Apply the transformation "(a && a) -> a" in opt_algebraic.	Eric Anholt	2013-11-15	1	-1/+3
	Reviewed-by: Jordan Justen <[email protected]>
*	glsl: Apply the transformation "(a \|\| a) -> a" in opt_algebraic.	Eric Anholt	2013-11-15	1	-1/+3
	total instructions in shared programs: 1732385 -> 1732373 (-0.00%)
	instructions in affected programs: 416 -> 404 (-2.88%)
	GAINED: 0
	LOST: 0
	(That's 4 already-short fragment shaders in dota2)
	Reviewed-by: Jordan Justen <[email protected]>
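The three same-operand logic rewrites from this commit and the two listed above it ((a || a) -> a, (a && a) -> a, (a ^^ a) -> false) can be sketched together. The tuple IR and the name `fold_same_operand` are hypothetical; plain Python equality stands in for whatever expression-equality test the real pass uses.

```python
def fold_same_operand(op, lhs, rhs):
    """Simplify a logical binary operation whose two operands are identical."""
    if lhs != rhs:
        return (op, lhs, rhs)
    if op in ('and', 'or'):
        # (a && a) and (a || a) both reduce to a.
        return lhs
    if op == 'xor':
        # (a ^^ a) is always false.
        return False
    return (op, lhs, rhs)


def identities_hold():
    """Check the three identities over both truth values."""
    return all(
        (a and a) == a and (a or a) == a and (a != a) is False
        for a in (False, True)
    )
```

`identities_hold()` confirms the truth tables; `fold_same_operand('xor', 'a', 'a')` gives `False`, and operands that differ are left alone.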
*	glsl: Drop no-op shifts involving 0.	Eric Anholt	2013-10-28	1	-0/+10
	I noticed this in a shader in Unigine Heaven that was spilling. While it doesn't really reduce register pressure, it shaves a few instructions anyway (7955 -> 7882).
	v2: Fix turning "0 >> x" into "x" instead of "0" (caught by Erik Faye-Lund).
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
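The distinction the v2 note draws (a zero value versus a zero shift amount) is easy to get backwards, so here is a small sketch of the two cases. It assumes integer constants are Python ints and non-constant operands are strings; `fold_shift` is a hypothetical helper, not code from the patch.

```python
def fold_shift(op, value, amount):
    """Simplify value << amount or value >> amount when one side is the constant 0."""
    if value == 0:
        # Shifting zero by anything is still zero (the case the v2 note fixed).
        return 0
    if amount == 0:
        # Shifting by zero leaves the value unchanged.
        return value
    return (op, value, amount)
```

So `fold_shift('rshift', 0, 'x')` returns `0`, while `fold_shift('rshift', 'x', 0)` returns `'x'`.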
*	glsl: Use ir_builder more in opt_algebraic.	Eric Anholt	2013-10-28	1	-30/+10
	While ir_builder is slightly less efficient, we're only increasing the work when there's actual optimization being done, and it's way more readable code.
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	glsl: Move common code out of opt_algebraic's handle_expression().	Eric Anholt	2013-10-28	1	-78/+39
	Matt and I had each screwed up these common required patterns recently, in ways that wouldn't have been noticed for a long time if not for code review. Just enforce it in the caller so that we don't rely on code review catching these bugs.
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	glsl: Optimize (not A) and (not B) into not (A or B).	Matt Turner	2013-10-25	1	-0/+9
	No shader-db changes, but seems like a good idea.
	Reviewed-by: Eric Anholt <[email protected]>
*	glsl: Optimize (not A) or (not B) into not (A and B).	Matt Turner	2013-10-25	1	-0/+12
	A few Serious Sam 3 shaders affected:
	instructions in affected programs: 4384 -> 4344 (-0.91%)
	Reviewed-by: Eric Anholt <[email protected]>
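Both De Morgan rewrites (this commit and the one listed above it) trade two NOTs for one. A minimal sketch on a toy tuple IR, with a truth-table check of the underlying identities, might look like the following; `is_not`, `fold_de_morgan` and `identities_hold` are hypothetical names.

```python
def is_not(expr):
    """True if expr is a node of the form ('not', operand)."""
    return isinstance(expr, tuple) and len(expr) == 2 and expr[0] == 'not'


def fold_de_morgan(op, lhs, rhs):
    """Rewrite (not A) op (not B) into not (A op' B), where op' is the dual of op."""
    if op in ('and', 'or') and is_not(lhs) and is_not(rhs):
        flipped = 'or' if op == 'and' else 'and'
        return ('not', (flipped, lhs[1], rhs[1]))
    return (op, lhs, rhs)


def identities_hold():
    """Check both De Morgan laws over all four input combinations."""
    return all(
        ((not a) and (not b)) == (not (a or b))
        and ((not a) or (not b)) == (not (a and b))
        for a in (False, True)
        for b in (False, True)
    )
```

The rewrite only fires when both operands are negated; a single negated operand is left as it was.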
*	glsl: Optimize -(-expr) into expr.	Matt Turner	2013-10-21	1	-0/+10
	Reviewed-by: Paul Berry <[email protected]>
	Reviewed-by: Eric Anholt <[email protected]>
*	glsl: Optimize abs(-expr) and abs(abs(expr)) into abs(expr).	Matt Turner	2013-10-21	1	-0/+18
	Reviewed-by: Paul Berry <[email protected]>
	Reviewed-by: Eric Anholt <[email protected]>
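The nested-unary rewrites from this commit and the -(-expr) commit listed above it can be sketched on a toy tuple IR. The representation and the name `fold_unary` are assumptions for illustration, not the code in the patches.

```python
def fold_unary(op, operand):
    """Simplify neg(neg(e)), abs(neg(e)) and abs(abs(e)) on a toy tuple IR."""
    inner = operand[0] if isinstance(operand, tuple) else None
    if op == 'neg' and inner == 'neg':
        # -(-e) is e.
        return operand[1]
    if op == 'abs' and inner in ('neg', 'abs'):
        # abs discards the sign, so an inner neg or abs is redundant.
        return ('abs', operand[1])
    return (op, operand)
```

For instance, `fold_unary('abs', ('neg', 'x'))` gives `('abs', 'x')`, and `fold_unary('neg', ('neg', 'x'))` gives `'x'`.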
*	glsl: Use saved values instead of recomputing them.	Matt Turner	2013-10-21	1	-8/+4
	Reviewed-by: Paul Berry <[email protected]>
	Reviewed-by: Eric Anholt <[email protected]>
*	glsl: Optimize mul(a, -1) into neg(a).	Matt Turner	2013-10-16	1	-0/+23
	Two extra instructions in some heroesofnewerth shaders, but a win for everything else.
	total instructions in shared programs: 1531352 -> 1530815 (-0.04%)
	instructions in affected programs: 121898 -> 121361 (-0.44%)
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
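A minimal sketch of the mul(a, -1) rewrite, assuming constants are Python floats and non-constant operands are strings; `fold_mul` is a hypothetical name used only here.

```python
def fold_mul(lhs, rhs):
    """Rewrite a multiplication by the constant -1.0 as a negation."""
    if rhs == -1.0:
        return ('neg', lhs)
    if lhs == -1.0:
        # Multiplication is commutative, so the constant may be on either side.
        return ('neg', rhs)
    return ('mul', lhs, rhs)
```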
*	glsl: Add support for new bit built-ins in ARB_gpu_shader5.	Matt Turner	2013-05-06	1	-3/+3
	v2: Move use of ir_binop_bfm and ir_triop_bfi to a later patch.
	Reviewed-by: Chris Forbes <[email protected]>
*	glsl: Optimize ir_triop_lrp(x, y, a) with a = 0.0f or 1.0f	Matt Turner	2013-02-28	1	-0/+11
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: Convert mix() to use a new ir_triop_lrp opcode.	Kenneth Graunke	2013-02-28	1	-3/+3
	Many GPUs have an instruction to do linear interpolation which is more efficient than simply performing the algebra necessary (two multiplies, an add, and a subtract).
	Pattern matching or peepholing this is more desirable, but can be tricky. By using an opcode, we can at least make shaders which use the mix() built-in get the more efficient behavior.
	Currently, all consumers lower ir_triop_lrp. Subsequent patches will actually generate different code.
	v2 [mattst88]:
	- Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a subsequent patch and ir_triop_lrp translated directly.
	v3 [mattst88]:
	- Move changes from the next patch to opt_algebraic.cpp to accept 3-src operations.
	Reviewed-by: Matt Turner <[email protected]>
	Reviewed-by: Eric Anholt <[email protected]>
	Signed-off-by: Kenneth Graunke <[email protected]>
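The algebra this message refers to (two multiplies, an add, and a subtract) is the usual definition of mix(). The sketch below writes it out and also shows the a = 0.0f / 1.0f special cases from the follow-up commit listed above. `lrp_arith` and `fold_lrp` are hypothetical names; in the fold, operands are floats (constants) or strings (non-constant values).

```python
def lrp_arith(x, y, a):
    """Linear interpolation spelled out: two multiplies, one subtract, one add."""
    return x * (1.0 - a) + y * a


def fold_lrp(x, y, a):
    """Simplify lrp(x, y, a) when the interpolation factor is the constant 0.0 or 1.0."""
    if a == 0.0:
        # All of the weight is on x.
        return x
    if a == 1.0:
        # All of the weight is on y.
        return y
    return ('lrp', x, y, a)
```

With x = 2.0 and y = 10.0, `lrp_arith` returns 2.0 at a = 0.0 and 10.0 at a = 1.0, which is why the two endpoint cases can be folded without evaluating anything.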
*	glsl: Transform dot product by a basis vector into a swizzle	Matt Turner	2012-06-12	1	-0/+24
	Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: Check for zero vectors in ir_binop_dot	Matt Turner	2012-06-12	1	-0/+7
	Reviewed-by: Kenneth Graunke <[email protected]>
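The two dot-product rewrites (zero vectors here, basis vectors in the commit listed above) could be sketched as one fold over a constant operand. This is an illustration under assumptions: the non-constant vector is represented by a name, the constant vector by a tuple of floats, and `fold_dot` is a hypothetical helper. Like the zero-vector rewrite it models, it ignores NaN and infinity in the non-constant operand.

```python
def fold_dot(vec, const):
    """Simplify dot(vec, const), where const is a tuple of float constants."""
    if all(c == 0.0 for c in const):
        # Every product is zero, so the sum is zero.
        return 0.0
    nonzero = [i for i, c in enumerate(const) if c != 0.0]
    if len(nonzero) == 1 and const[nonzero[0]] == 1.0:
        # A basis vector picks out exactly one component: a swizzle.
        return ('swizzle', vec, 'xyzw'[nonzero[0]])
    return ('dot', vec, const)
```

For example, `fold_dot('v', (0.0, 1.0, 0.0))` becomes `('swizzle', 'v', 'y')`, while a constant with two non-zero components stays a dot product.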
*	glsl: Put a bunch of optimization visitors under anonymous namespaces.	Eric Anholt	2012-06-11	1	-0/+4
	Because these classes are used entirely from their own source files and not from separate DSOs, the linker gets to produce massively less code. This cuts about 13k of text in the libdricore case. In the non-libdricore case, the additional linkage information allows the compiler to inline some code, so libglsl.a size actually increases by about 300 bytes.
	For a dricore build, improves shader_runner runtime on glsl-fs-copy-propagation-texcoords-1 by 0.21% +/- 0.03% (n=353574, outliers removed). No statistically significant difference with n=322 on glslparsertest on a yofrankie shader intended to test compiler performance.
	Reviewed-by: Kenneth Graunke <[email protected]>
*	Convert everything from the talloc API to the ralloc API.	Kenneth Graunke	2011-01-31	1	-1/+1
*	glsl: fix matrix type check in ir_algebraic	Aras Pranckevicius	2010-11-30	1	-2/+2
	Fixes glsl-mat-mul-1.
*	glsl: Add ir_quadop_vector expression	Ian Romanick	2010-11-19	1	-1/+1
	The vector operator collects 2, 3, or 4 scalar components into a vector. Doing this has several advantages. First, it will make ud-chain tracking for components of vectors much easier. Second, a later optimization pass could collect scalars into vectors to allow generation of SWZ instructions (or similar as operands to other instructions on R200 and i915). It also enables an easy way to generate IR for SWZ instructions in the ARB_vertex_program assembler.
*	glsl: Eliminate assumptions about size of ir_expression::operands	Ian Romanick	2010-11-19	1	-0/+1
	This may grow in the near future.
*	glsl: Fix Doxygen tag \file in recently renamed files	Chad Versace	2010-11-17	1	-1/+1
*	glsl: Refactor is_vec_{zero,one} to be methods of ir_constant	Ian Romanick	2010-11-16	1	-68/+4
	These predicates will be used in other places soon.
*	glsl: Rename various ir_* files to lower_* and opt_*.	Kenneth Graunke	2010-11-15	1	-0/+474
	This helps distinguish between lowering passes, optimization passes, and other compiler code.