mesa.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	glsl: Add a lowering pass to remove reads of shader output variables.	Vincent Lejeune	2012-01-06	1	-0/+1
	This is similar to Gallium's existing glsl_to_tgsi::remove_output_read lowering pass, but done entirely inside the GLSL compiler.
	Signed-off-by: Vincent Lejeune <[email protected]>
	Signed-off-by: Kenneth Graunke <[email protected]>
	Signed-off-by: Dave Airlie <[email protected]>
*	glsl: Add uniform_locations_assigned parameter to do_dead_code opt pass	Ian Romanick	2011-10-25	1	-2/+4
	Setting this flag prevents declarations of uniforms from being removed from the IR. Since the IR is directly used by several API functions that query uniforms in shaders, uniform declarations cannot be removed after the locations have been set. However, it should still be safe to reorder the declarations (this is not tested).
	Signed-off-by: Ian Romanick <[email protected]>
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41980
	Tested-by: Brian Paul <[email protected]>
	Reviewed-by: Bryan Cain <[email protected]>
	Cc: Vinson Lee <[email protected]>
	Cc: José Fonseca <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Yuanhan Liu <[email protected]>
*	glsl: Implement a lowering pass for gl_ClipDistance.	Paul Berry	2011-09-23	1	-0/+1
	In i965 GEN6+ (and I suspect most other hardware), gl_ClipDistance needs to be laid out as a pair of vec4's (the first containing clip distances 0-3, and the second containing clip distances 4-7). However, it is declared in GLSL as an array of 8 floats. This lowering pass acts at the GLSL level, modifying the declaration of gl_ClipDistance so that it is an array of vec4's rather than an array of floats, and renaming it to gl_ClipDistanceMESA. In addition, it modifies all accesses to the array so that they access the appropriate component of one of the vec4's.
	Since some hardware may not internally represent gl_ClipDistance as a pair of vec4's, this lowering pass is optional. To enable it, set the LowerClipDistance flag in gl_shader_compiler_options to true.
	Reviewed-by: Kenneth Graunke <[email protected]>
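	The index remapping this commit describes can be illustrated with a small Python model. This is only a sketch of the arithmetic (float index i maps to vec4 i / 4, component i % 4); the function names are hypothetical and are not taken from the Mesa sources.

```python
def lower_clip_distance_index(i):
    """Map an index into the 8-float array to (vec4 index, component index)."""
    if not 0 <= i < 8:
        raise IndexError("gl_ClipDistance has 8 elements")
    return i // 4, i % 4


def pack_clip_distances(distances):
    """Lay out 8 clip distances as a pair of 4-component vectors."""
    if len(distances) != 8:
        raise ValueError("expected exactly 8 clip distances")
    packed = [[0.0] * 4 for _ in range(2)]
    for i, d in enumerate(distances):
        vec, comp = lower_clip_distance_index(i)
        packed[vec][comp] = d
    return packed
```

	For instance, clip distance 5 lands in the second vec4, component 1.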
*	glsl: Use a separate div_to_mul_rcp lowering flag for integers.	Bryan Cain	2011-08-31	1	-6/+7
	Using multiply and reciprocal for integer division involves potentially lossy floating point conversions. This is okay for older GPUs that represent integers as floating point, but undesirable for GPUs with native integer division instructions.
	TGSI, for example, has UDIV/IDIV instructions for integer division, so it makes sense to handle this directly. Likewise for i965.
	Reviewed-by: Ian Romanick <[email protected]>
	Signed-off-by: Bryan Cain <[email protected]>
	Signed-off-by: Kenneth Graunke <[email protected]>
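	A rough Python model of why the float path is lossy for integers, assuming 32-bit floats are used for the conversion (the commit does not state the precision, and the helper names here are hypothetical). Integers above 2^24 cannot all be represented in a 32-bit float, so the division result can be off even when the divisor is 1.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def div_via_mul_rcp(a, b):
    """Model a / b computed as float(a) * rcp(float(b)) in 32-bit floats."""
    fa = to_float32(float(a))
    rcp = to_float32(1.0 / to_float32(float(b)))
    return int(to_float32(fa * rcp))


a = 2**24 + 1                    # not exactly representable as a 32-bit float
lossy = div_via_mul_rcp(a, 1)    # goes through the float conversion
exact = a // 1                   # native integer division
```

	Small operands such as 12 / 4 still come out exact in this model; the loss only shows up once the integer exceeds the float's mantissa range.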
*	glsl: Factor out code that generates block of index comparisons	Ian Romanick	2011-07-23	1	-0/+4
	Reviewed-by: Eric Anholt <[email protected]>
*	glsl: Remove unused function prototypes.	Paul Berry	2011-07-08	1	-2/+0
	No functional change. Remove prototypes for do_mod_to_fract() and do_sub_to_add_neg(), which haven't existed since November 2010.
*	glsl: Add a new opt_copy_propagation variant that does it channel-wise.	Eric Anholt	2011-02-04	1	-0/+1
	This patch cleans up many of the extra copies in GLSL IR introduced by i965's scalarizing passes. It doesn't result in a statistically significant performance difference on nexuiz high settings (n=3) or my demo (n=10), due to brw_fs.cpp's register coalescing covering most of those extra moves anyway. However, it does make the debug of wine's GLSL shaders much more tractable, and reduces instruction count of glsl-fs-convolution-2 from 376 to 288.
*	glsl: Support if-flattening beyond a given maximum nesting depth.	Kenneth Graunke	2010-12-27	1	-1/+1
	This adds a new optional max_depth parameter (defaulting to 0) to lower_if_to_cond_assign, and makes the pass only flatten if-statements nested deeper than that. By default, all if-statements will be flattened, just like before.
	This patch also renames do_if_to_cond_assign to lower_if_to_cond_assign, to match the new naming conventions.
*	glsl: Lower ir_binop_pow to a sequence of EXP2 and LOG2	Ian Romanick	2010-12-01	1	-2/+3
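	The identity behind this lowering is pow(x, y) = exp2(y * log2(x)) for x > 0. A minimal Python check of that identity follows; the function name is hypothetical and this is not the pass itself.

```python
import math


def lowered_pow(x, y):
    """pow(x, y) rewritten as EXP2(y * LOG2(x)); valid for x > 0."""
    return 2.0 ** (y * math.log2(x))
```

	The result agrees with a direct power computation up to floating-point rounding.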
*	glsl: Add a lowering pass to move discards out of if-statements.	Kenneth Graunke	2010-12-01	1	-0/+1
	This should allow lower_if_to_cond_assign to work in the presence of discards, fixing bug #31690 and likely #31983.
	NOTE: This is a candidate for the 7.9 branch.
*	glsl: Add an optimization pass to simplify discards.	Kenneth Graunke	2010-12-01	1	-0/+1
	NOTE: This is a candidate for the 7.9 branch.
*	glsl: Combine many instruction lowering passes into one.	Kenneth Graunke	2010-11-19	1	-2/+8
	This should save on the overhead of tree-walking and provide a convenient place to add more instruction lowering in the future.
	Signed-off-by: Ian Romanick <[email protected]>
*	glsl: Add ir_quadop_vector expression	Ian Romanick	2010-11-19	1	-0/+1
	The vector operator collects 2, 3, or 4 scalar components into a vector. Doing this has several advantages. First, it will make ud-chain tracking for components of vectors much easier. Second, a later optimization pass could collect scalars into vectors to allow generation of SWZ instructions (or similar as operands to other instructions on R200 and i915). It also enables an easy way to generate IR for SWZ instructions in the ARB_vertex_program assembler.
*	glsl: Add a lowering pass for texture projection.	Eric Anholt	2010-09-30	1	-0/+1
*	glsl2: Add flags to enable variable index lowering	Ian Romanick	2010-09-17	1	-1/+2
*	glsl: add pass to lower variable array indexing to conditional assignments	Luca Barbieri	2010-09-17	1	-0/+1
	Currently GLSL happily generates indirect addressing of any kind of arrays. Unfortunately DirectX 9 GPUs are not guaranteed to support any of them in general.
	This pass fixes that by lowering such constructs to a binary search on the values, followed at the end by vectorized generation of equality masks, and 4 conditional assignments for each mask generation.
	Note that this requires the ir_binop_equal change so that we can emit SEQ to generate the boolean masks.
	Unfortunately, ir_structure_splitting is too dumb to turn the resulting constant array references to individual variables, so this will need to be added too before this pass can actually be effective for temps.
	Several patches in the glsl2-lower-variable-indexing branch were squashed into this commit. These patches fix bugs in Luca's original implementation, and the individual patches can be seen in that branch. This was done to aid bisecting in the future.
	Signed-off-by: Ian Romanick <[email protected]>
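	A loose Python model of the idea: a variable-index read is replaced by a binary search over the index range, ending in a block of up to four equality tests that each guard an access with a constant index. The function name, the recursion, and the exact split points are assumptions for illustration; the real pass emits GLSL IR and vectorizes the mask generation.

```python
def lowered_array_read(array, index, lo=0, hi=None):
    """Read array[index] using only constant-index accesses."""
    if hi is None:
        hi = len(array)
    if hi - lo <= 4:
        # Leaf block: one equality mask per candidate, each guarding a
        # conditional assignment from a constant array element.
        result = None
        for k in range(lo, hi):   # would be fully unrolled in a lowered shader
            mask = (index == k)
            if mask:
                result = array[k]
        return result
    # Binary search step: narrow the candidate range by half.
    mid = (lo + hi) // 2
    if index < mid:
        return lowered_array_read(array, index, lo, mid)
    return lowered_array_read(array, index, mid, hi)
```

	In this model an out-of-range index simply leaves the result unset (None); the commit message does not say how the real pass treats that case.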
*	glsl2: Add pass to remove redundant jumps	Ian Romanick	2010-09-13	1	-0/+1
*	glsl: add continue/break/return unification/elimination pass (v2)	Luca Barbieri	2010-09-13	1	-1/+1
	Changes in v2:
	- Base class renamed to ir_control_flow_visitor
	- Tried to comply with coding style
	This is a new pass that supersedes ir_if_return and "lowers" jumps to if/else structures.
	Currently it causes no regressions on softpipe and nv40, but I'm not sure whether the piglit glsl tests are thorough enough, so consider this experimental.
	It can be asked to:
	1. Pull jumps out of ifs where possible
	2. Remove all "continue"s, replacing them with an "execute flag"
	3. Replace all "break" with a single conditional one at the end of the loop
	4. Replace all "return"s with a single return at the end of the function, for the main function and/or other functions
	This gives several great benefits:
	1. All functions can be inlined after this pass
	2. nv40 and other pre-DX10 chips without "continue" can be supported
	3. nv30 and other pre-DX10 chips with no control flow at all are better supported
	Note that for full effect we should also teach the unroller to unroll loops with a fixed maximum number of iterations but with the canonical conditional "break" that this pass will insert if asked to.
	Continues are lowered by adding a per-loop "execute flag", initialized to TRUE, that when cleared inhibits all execution until the end of the loop.
	Breaks are lowered to continues, plus setting a "break flag" that is checked at the end of the loop, and triggers the unique "break".
	Returns are lowered to breaks/continues, plus adding a "return flag" that causes loops to break again out of their enclosing loops until all the loops are exited: then the "execute flag" logic will ignore everything until the end of the function.
	Note that "continue" and "return" can also be implemented by adding a dummy loop and using break. However, this is bad for hardware with limited nesting depth, and prevents further optimization, and thus is not currently performed.
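	The execute-flag and break-flag scheme described in this commit can be mimicked by hand in Python. The two functions below are a hypothetical before/after pair written for illustration, not output of the pass; the flag names follow the commit message.

```python
def original(values):
    """Loop using 'continue' and 'break' directly."""
    total = 0
    for v in values:
        if v < 0:
            continue
        if v > 100:
            break
        total += v
    return total


def lowered(values):
    """Same loop with jumps replaced by flags and one final break."""
    total = 0
    for v in values:
        execute_flag = True       # per-iteration flag, initialized to TRUE
        break_flag = False
        if v < 0:
            execute_flag = False  # was: continue
        if execute_flag:
            if v > 100:
                break_flag = True     # was: break
                execute_flag = False  # a break is lowered to a continue too
        if execute_flag:
            total += v
        if break_flag:
            break                 # the single conditional break at loop end
    return total
```

	Both versions return the same sum for any input; the lowered form only needs a conditional break at the end of the loop body.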
*	glsl2: Add lowering pass to remove noise opcodes	Ian Romanick	2010-09-09	1	-0/+1
*	glsl: add several EmitNo* options, and MaxUnrollIterations	Luca Barbieri	2010-09-08	1	-1/+1
	This increases the chance that GLSL programs will actually work. Note that continues and returns are not yet lowered, so linking will just fail if not supported.
	Signed-off-by: Ian Romanick <[email protected]>
*	glsl2: Add a pass to strip out noop swizzles.	Eric Anholt	2010-08-13	1	-0/+1
	With the glsl2-965 branch, the optimization of glsl-algebraic-rcp-rcp regressed due to noop swizzles hiding information from ir_algebraic. This cleans up those noop swizzles for us.
*	glsl2: Move the common optimization passes to a helper function.	Eric Anholt	2010-08-13	1	-0/+2
	These are passes that we expect all codegen to be happy with. The other lowering passes for Mesa IR are moved to the Mesa IR generator.
*	glsl2: Add a pass to transform ir_binop_sub to add(op0, neg(op1))	Eric Anholt	2010-08-09	1	-0/+1
	All the current HW backends transform subtract to adding the negation, so I haven't bothered peepholing it back out in Mesa IR. This allows some subtract of subtract to get removed in ir_algebraic.
*	glsl2: Add constant propagation.	Eric Anholt	2010-08-09	1	-0/+1
	Whereas constant folding evaluates constant expressions at rvalue nodes, constant propagation tracks constant components of vectors across execution to replace (possibly swizzled) variable dereferences with constant values, triggering possible constant folding or reduced variable liveness.
*	glsl2: Add a pass to convert exp and log to exp2 and log2.	Eric Anholt	2010-08-05	1	-0/+1
	Fixes ir_to_mesa handling of unop_log, which used the weird ARB_vp LOG opcode that doesn't do what we want. This also lets the multiplication coefficients in there get constant-folded, possibly.
	Fixes: glsl-fs-log
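	The conversion rests on two standard identities: exp(x) = exp2(x * log2(e)) and log(x) = log2(x) * ln(2). The Python sketch below uses hypothetical names; the two module-level constants stand in for the "multiplication coefficients" that the commit says can be constant-folded.

```python
import math

LOG2_E = math.log2(math.e)   # coefficient for exp -> exp2
LN_2 = math.log(2.0)         # coefficient for log -> log2


def lowered_exp(x):
    """exp(x) expressed through exp2."""
    return 2.0 ** (x * LOG2_E)


def lowered_log(x):
    """log(x) expressed through log2; valid for x > 0."""
    return math.log2(x) * LN_2
```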
*	ir_structure_splitting: New pass to chop structures into their components.	Eric Anholt	2010-08-05	1	-0/+1
	This doesn't do anything if your structure goes through an uninlined function call or if whole-structure assignment occurs. As such, the impact is limited, at least until we do some global copy propagation to reduce whole-structure assignment.
*	glsl2: Add a pass for removing unused functions.	Eric Anholt	2010-08-05	1	-0/+1
	For a shader involving many small functions, this avoids running optimization across all of them after they've been inlined post-linking. Reduces the runtime of linking and running a fragment shader from Yo Frankie from 1.6 seconds to 0.9 seconds (-44.9%, +/- 3.3%).
*	glsl2: Add new tree grafting optimization pass.	Eric Anholt	2010-07-31	1	-0/+1
*	glsl2: Make the dead code handler make its own talloc context.	Eric Anholt	2010-07-27	1	-4/+2
	This way, we don't need to pass in a parse state, and the context doesn't grow with the number of passes through optimization.
*	glsl2: Add optimization pass for algebraic simplifications.	Eric Anholt	2010-07-27	1	-1/+2
	This cleans up the assembly output of almost all the non-logic tests glsl-algebraic-*. glsl-algebraic-pow-two needs love (basically, flattening to a temporary and squaring it).
*	glsl2: Add a pass for converting if statements to conditional assignment.	Eric Anholt	2010-07-19	1	-0/+1
	This will be used on 915 and similar hardware of that generation.
*	glsl2: Add a new pass at the IR level to break down matrix ops to vector ops.	Eric Anholt	2010-07-12	1	-0/+1
	This will be used by the Mesa IR and likely most HW backends, as it allows other optimizations to occur that might not otherwise. Fixes glsl-vs-mat-sub-1, glsl-vs-mat-div-1.
*	glsl2: Add a pass to simplify if statements returning from both sides.	Eric Anholt	2010-07-07	1	-0/+1
	This allows function inlining making the following tests work even without function calls implemented:
	glsl-fs-functions-2
	glsl-fs-functions-3
	glsl-vs-functions
	glsl-vs-functions-2
	glsl-vs-functions-3
	glsl-vs-vec4-indexing-5
	(Note that those tests were designed to trigger actual function calls, and this defeats them. However, those testcases ended up catching the bug in the previous commit.)
*	glsl2: Add pass for supporting variable vector indexing in rvalues.	Eric Anholt	2010-07-06	1	-0/+1
	The Mesa IR needs this to support vector indexing correctly, and hardware backends such as 915 would want this behavior as well. Fixes glsl-vs-vec4-indexing-2.
*	glsl2: Add a pass to break ir_binop_div to _mul and _rcp.	Eric Anholt	2010-07-02	1	-0/+1
	This results in constant folding of a constant divisor.
*	glsl2: Add a pass to convert mod(a, b) to b * fract(a/b).	Eric Anholt	2010-07-01	1	-0/+1
	This is used by the Mesa IR backend to implement mod, fixing glsl-fs-mod.
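	The rewrite named in the commit title, mod(a, b) = b * fract(a/b), can be checked with a few lines of Python. Here fract(x) is taken as x - floor(x), matching the GLSL definition; the function names are hypothetical.

```python
import math


def fract(x):
    """Fractional part as GLSL defines it: x - floor(x)."""
    return x - math.floor(x)


def lowered_mod(a, b):
    """mod(a, b) rewritten as b * fract(a / b)."""
    return b * fract(a / b)
```

	Because fract uses floor, the result takes the sign of b for a negative a, as GLSL's mod does.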
*	glsl2: Use the parser state as the talloc context for dead code elimination.	Eric Anholt	2010-06-25	1	-2/+4
	This cuts runtime by around 20% from talloc_parent() lookups.
*	glsl2: Move the compiler to the subdirectory it will live in in Mesa.	Eric Anholt	2010-06-24	1	-0/+41