mesa.git

	Commit message	Author	Age	Files	Lines
*	glsl: Rewrite and fix min/max to saturate optimization.	Matt Turner	2015-02-27	1	-29/+46
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There were some bugs, and the code was really difficult to follow. We would optimize min(max(x, b), 1.0) into max(sat(x), b) but not pay attention to the order of min/max, and also turn max(min(x, b), 1.0) into max(sat(x), b). Corrects four shaders from Champions of Regnum that do min(max(x, 1), 10) and corrects rendering of Mass Effect under VMware Workstation. Cc: "10.4 10.5" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89180 Reviewed-by: Abdiel Janulgue <[email protected]> Reviewed-by: Ian Romanick <[email protected]> (cherry picked from commit cb25087c7bd5f1ad2515647278b32d3f07803f77)
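The two rewrites quoted in this message can be checked numerically. What follows is a minimal Python sketch, not Mesa code: the helper names (`sat`, `min_of_max`, `max_of_min`, `rewritten`) are hypothetical, and the equivalence is only expected to hold under the assumption 0 <= b <= 1.

```python
def sat(x):
    """Clamp x to [0, 1], like a saturate operation."""
    return min(max(x, 0.0), 1.0)

def min_of_max(x, b):
    """The pattern min(max(x, b), 1.0)."""
    return min(max(x, b), 1.0)

def max_of_min(x, b):
    """The swapped pattern max(min(x, b), 1.0)."""
    return max(min(x, b), 1.0)

def rewritten(x, b):
    """The replacement max(sat(x), b) named in the commit message."""
    return max(sat(x), b)

# Sample points; b is restricted to [0, 1] (assumption of this sketch).
xs = [-2.0, -0.5, 0.0, 0.25, 0.5, 0.75, 1.0, 1.5, 3.0]
bs = [0.0, 0.25, 0.5, 1.0]

# True if the first rewrite preserves the value at every sample point.
same = all(min_of_max(x, b) == rewritten(x, b) for x in xs for b in bs)

# Sample points where applying the same rewrite to the swapped pattern
# changes the result.
differs = [(x, b) for x in xs for b in bs if max_of_min(x, b) != rewritten(x, b)]
```

On these samples the first pattern and its replacement agree everywhere, while the swapped pattern disagrees at points such as x = 0.5, b = 0.5, which is why the order of min and max matters.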
*	Avoid fighting with Solaris headers over isnormal()	Alan Coopersmith	2015-02-24	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	When compiling in C99 or C++11 modes, Solaris defines isnormal() as a macro via <math.h>, which causes the function definition to become too mangled to compile. Signed-off-by: Alan Coopersmith <[email protected]> Cc: "10.5" <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]> (cherry picked from commit d602fbd861e2c3c5570b55f0839361a6f8bd32c7)
*	Remove extraneous ; after DECL_TYPE usage	Alan Coopersmith	2015-02-24	1	-33/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The macro is defined to provide a trailing ; so this caused the expansion to end in ";;" which made the Solaris Studio compilers issue warnings for every line of: "builtin_type_macros.h", line 113: Warning: extra ";" ignored. for every file that included the header, filling build logs with thousands of useless warnings. Signed-off-by: Alan Coopersmith <[email protected]> Cc: "10.5" <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]> (cherry picked from commit 815b3bd096a3eab9f00f9270d45a6885d73180e9)
*	glsl: Reduce memory consumption of copy propagation passes.	Kenneth Graunke	2015-02-24	2	-6/+25
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	opt_copy_propagation and opt_copy_propagation_elements create new ACP and Kill sets each time they enter a new control flow block. For if blocks, they also copy the entire existing ACP set contents into the new set. When we exit the control flow block, we discard the new sets. However, we weren't freeing them - so they lived on until the pass finished. This can waste a lot of memory (57MB on one pessimal shader). This patch makes the pass allocate ACP entries using this->acp as the memory context, and Kill entries out of this->kill. It also steals kill entries when moving them from the inner kill list to the parent. It then frees the lists, including their contents. v2: Move ralloc_free(this->acp) just before this->acp = orig_acp (suggested by Eric Anholt). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Cc: "10.5 10.4" <[email protected]> (cherry picked from commit 76960a55e6656bb0022e9c31ae7542010da130e3)
*	nir: add missing header to the sources list	Emil Velikov	2015-02-12	1	-0/+1
\| \| \| \| \| \| \|	Cc: "10.5" <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	nir: resolve nir.h dependency list (fix make distcheck)	Emil Velikov	2015-02-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Use nir/nir_opcodes.h as is (w/o the absolute path), as it is the target name used to generate the actual file. Otherwise the target is missing, the file won't get generated and the build will fail. Cc: "10.5" <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	nir: Fix broken fsat recognizer.	Eric Anholt	2015-02-06	1	-1/+1
\| \| \| \| \| \| \| \|	We've probably never seen this ridiculous pattern in the wild, so it didn't matter. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Slightly simplify algebraic code generation by reusing a struct.	Eric Anholt	2015-02-06	1	-6/+3
\| \| \| \|	Reviewed-by: Jason Ekstrand <[email protected]>
*	glsl: GLSL ES identifiers cannot exceed 1024 characters	Iago Toral Quiroga	2015-02-06	1	-1/+7
\| \| \| \| \| \| \| \| \| \| \|	v2 (Ian Romanick) - Move the check to the lexer before rallocing a copy of the large string. Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_vertex dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_fragment Reviewed-by: Ian Romanick <[email protected]>
*	nir: add an optimization to remove useless phi nodes	Connor Abbott	2015-02-03	3	-0/+112
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This removes phi nodes whose sources all point to the same thing. Shader-db results: total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%) NIR instructions in affected programs: 126564 -> 122480 (-3.23%) helped: 615 HURT: 0 total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%) FS instructions in affected programs: 24622 -> 23174 (-5.88%) helped: 138 HURT: 0 Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Jason Ekstrand <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
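The idea of removing phi nodes whose sources all point to the same thing can be sketched on a toy representation. This is an illustrative Python sketch only; the dictionary layout and the function name `remove_useless_phis` are hypothetical and do not reflect NIR's data structures.

```python
def remove_useless_phis(phis):
    """One pass over a toy phi table.

    phis maps a phi's name to the list of value names it selects from.
    Returns (remaining, replacements): the phis that were kept, with their
    sources rewritten, and a map from each removed phi to its single source.
    """
    replacements = {}
    remaining = {}
    for name, srcs in phis.items():
        # A phi is useless here if every source is the same value
        # (and that value is not the phi itself).
        if srcs and srcs[0] != name and all(s == srcs[0] for s in srcs):
            replacements[name] = srcs[0]
        else:
            remaining[name] = list(srcs)

    def resolve(value):
        # Follow replacement chains, guarding against cycles.
        seen = set()
        while value in replacements and value not in seen:
            seen.add(value)
            value = replacements[value]
        return value

    remaining = {n: [resolve(s) for s in srcs] for n, srcs in remaining.items()}
    return remaining, replacements


phis = {"p1": ["a", "a"], "p2": ["a", "b"], "p3": ["p1", "a"]}
first_pass, replaced = remove_useless_phis(phis)
second_pass, replaced_again = remove_useless_phis(first_pass)
```

In this example removing `p1` turns `p3` into a phi with identical sources, so a second pass removes it as well; a pass of this kind would typically be repeated until it makes no further progress.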
*	nir/validate: Ensure that phi sources are SSA-only	Jason Ekstrand	2015-02-03	1	-10/+3
\| \| \| \|	Reviewed-by: Connor Abbott <[email protected]>
*	nir/validate: Validate that only float ALU outputs are saturated	Jason Ekstrand	2015-02-03	1	-0/+8
\| \| \| \|	Reviewed-by: Connor Abbott <[email protected]>
*	nir/lower_source_mods: Don't lower saturate for non-float outputs	Jason Ekstrand	2015-02-03	1	-0/+4
\| \| \| \|	Reviewed-by: Connor Abbott <[email protected]>
*	nir: Add a pass to lower vector phi nodes to scalar phi nodes	Jason Ekstrand	2015-02-03	3	-0/+293
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	v2 Jason Ekstrand <[email protected]>: - Add better comments - Use nir_ssa_dest_init and nir_src_for_ssa more places - Fix some void * casts v3 Jason Ekstrand <[email protected]>: - Rework the way we determine whether or not to scalarize a phi node to make the recursion non-bogus - Treat load_const instructions as scalarizable v4 Jason Ekstrand <[email protected]>: - Allow uniform and input loads to be scalarizable v5 Jason Ekstrand <[email protected]>: - Also consider loads of inputs (varying, uniform, or ubo) to be scalarizable. We were already doing this for load_var on uniforms and inputs. Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl/list: Note that exec_lists may not be realloc'd.	Matt Turner	2015-02-03	1	-0/+4
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	glsl: Improve precision of mod(x,y)	Iago Toral Quiroga	2015-02-03	3	-28/+41
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement mod(x,y) as y * fract(x/y). This implementation has a down side though: it introduces precision errors due to the fract() operation. Even worse, since the result of fract() is multiplied by y, the larger y gets the larger the precision error we produce, so for large enough numbers the precision loss is significant. Some examples on i965:
	Operation                           Precision error
	-----------------------------------------------------
	mod(-1.951171875, 1.9980468750)     0.0000000447
	mod(121.57, 13.29)                  0.0000023842
	mod(3769.12, 321.99)                0.0000762939
	mod(3769.12, 1321.99)               0.0001220703
	mod(-987654.125, 123456.984375)     0.0160663128
	mod( 987654.125, 123456.984375)     0.0312500000
	This patch replaces the current lowering pass with a different one (MOD_TO_FLOOR) that follows the recommended implementation in the GLSL man pages: mod(x,y) = x - y * floor(x/y) This implementation eliminates the precision errors at the expense of an additional add instruction on some systems. On systems that can do negate with multiply-add in a single operation this new implementation would come at no additional cost.
	v2 (Ian Romanick) - Do not clone operands because when they are expressions we would be duplicating them and that can lead to suboptimal code.
	Fixes the following 16 dEQP tests: dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_* dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_*
	Reviewed-by: Ian Romanick <[email protected]>
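The two lowerings named in this message can be written side by side. The following Python sketch only illustrates the formulas; the function names are hypothetical, and because Python floats are double precision it does not reproduce the single-precision error figures reported for i965.

```python
import math

def fract(v):
    """Fractional part, v - floor(v)."""
    return v - math.floor(v)

def mod_to_fract(x, y):
    """The older lowering: mod(x, y) = y * fract(x / y)."""
    return y * fract(x / y)

def mod_to_floor(x, y):
    """The newer lowering: mod(x, y) = x - y * floor(x / y)."""
    return x - y * math.floor(x / y)

# Operand pairs taken from the commit message.
samples = [
    (-1.951171875, 1.998046875),
    (121.57, 13.29),
    (3769.12, 321.99),
    (3769.12, 1321.99),
    (-987654.125, 123456.984375),
    (987654.125, 123456.984375),
]

# Pair up the results of both formulas for each sample.
results = [(mod_to_fract(x, y), mod_to_floor(x, y)) for x, y in samples]
```

In the fract-based form any rounding error in x / y is scaled by y, which matches the message's observation that the error grows with y; the floor-based form multiplies y by an integer-valued floor result instead.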
*	glsl: can't have 'const' qualifier used with struct or interface block members	Iago Toral Quiroga	2015-02-03	1	-0/+7
\| \| \| \| \| \| \| \|	Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_vertex dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_fragment Reviewed-by: Ian Romanick <[email protected]>
*	glsl: interface blocks must be declared at global scope	Iago Toral Quiroga	2015-02-03	1	-0/+8
\| \| \| \| \| \| \| \|	Fixes the following 2 dEQP tests: dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_vertex dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_fragment Reviewed-by: Ian Romanick <[email protected]>
*	glsl: Pick ast_conditional branch regardless of op1/2 being constant.	Kenneth Graunke	2015-02-02	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If the ?: operator's condition is a constant value, and both branches were pure expressions, we can just make the resulting value one or the other. Previously, we only did this if op[1] and op[2] were also constant values - but there's no actual reason for that restriction. No changes in shader-db, probably because we usually optimize this later anyway. But it does make us generate less stupid code up front. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
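The simplification described here can be sketched on a toy expression form. This is a hypothetical Python illustration, not the GLSL compiler's AST code; it assumes, as the message does, that both branches are pure expressions.

```python
def fold_conditional(cond, then_expr, else_expr):
    """Pick a branch of cond ? then_expr : else_expr when cond is constant.

    cond is True or False when it is a known constant; any other value
    stands for a non-constant condition. The branches may be arbitrary
    (non-constant) expressions, assumed to have no side effects.
    """
    if cond is True:
        return then_expr
    if cond is False:
        return else_expr
    # Non-constant condition: keep the conditional as-is.
    return ("?:", cond, then_expr, else_expr)


picked = fold_conditional(True, ("add", "x", "y"), "z")
kept = fold_conditional("c", "a", "b")
```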
*	nir/opt_algebraic: Add some constant bcsel reductions	Jason Ekstrand	2015-01-29	1	-2/+28
\| \| \| \| \| \| \| \|	total instructions in shared programs: 5998190 -> 5997603 (-0.01%) instructions in affected programs: 54276 -> 53689 (-1.08%) helped: 293 Reviewed-by: Kenneth Graunke <[email protected]>
*	nir/opt_algebraic: Add some boolean simplifications	Jason Ekstrand	2015-01-29	1	-4/+5
\| \| \| \| \| \| \| \|	total instructions in shared programs: 5998321 -> 5998287 (-0.00%) instructions in affected programs: 4520 -> 4486 (-0.75%) helped: 8 Reviewed-by: Kenneth Graunke <[email protected]>
*	nir/algebraic: Support specifying variable as constant or by type	Jason Ekstrand	2015-01-29	2	-6/+26
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	nir/algebraic: Fail to compile if a variable is used in a replace but not the search	Jason Ekstrand	2015-01-29	1	-0/+7
\| \| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	nir/search: Allow for matching variables based on types	Jason Ekstrand	2015-01-29	2	-0/+23
\| \| \| \| \| \| \| \| \| \| \|	This allows you to match on an unknown value but only if it is of a given type. 90% of the uses of this are for matching only booleans, but adding the generality of arbitrary types is no more complex. nir_algebraic.py doesn't handle this yet but that's ok because the C language will ensure that the default type on all variables is void. Reviewed-by: Kenneth Graunke <[email protected]>
*	nir/search: Add support for matching unknown constants	Jason Ekstrand	2015-01-29	2	-0/+13
\| \| \| \| \| \| \| \| \| \| \| \| \|	There are some algebraic transformations that we want to do but only if certain things are constants. For instance, we may want to replace a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant. While this generates more instructions, some of it will get constant folded. nir_algebraic.py doesn't handle this yet, but that's ok because the C language will make sure that false is the default for now. Reviewed-by: Kenneth Graunke <[email protected]>
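The distributive rewrite used as an example in this message can be sketched as follows. This is an illustrative Python sketch over tuple expressions, not the nir_search implementation; `is_const`, `fold`, and `distribute` are hypothetical names.

```python
def is_const(e):
    """In this toy form, numbers are constants and strings are variables."""
    return isinstance(e, (int, float))

def fold(expr):
    """Constant-fold a binary 'mul' or 'add' whose operands are both constants."""
    if isinstance(expr, tuple):
        op, lhs, rhs = expr
        if is_const(lhs) and is_const(rhs):
            return lhs * rhs if op == "mul" else lhs + rhs
    return expr

def distribute(expr):
    """Rewrite a * (b + c) as (a * b) + (a * c), but only when a is
    constant and at least one of b or c is constant."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "mul":
        _, a, rhs = expr
        if isinstance(rhs, tuple) and len(rhs) == 3 and rhs[0] == "add":
            _, b, c = rhs
            if is_const(a) and (is_const(b) or is_const(c)):
                return ("add", fold(("mul", a, b)), fold(("mul", a, c)))
    return expr


rewritten_expr = distribute(("mul", 2.0, ("add", 3.0, "x")))
untouched_expr = distribute(("mul", "y", ("add", 3.0, "x")))
```

The first call produces one folded constant and one remaining multiply, which is the case the message describes: more instructions are generated, but some of them are constant folded away.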
*	nir: Add an invalid type	Jason Ekstrand	2015-01-29	1	-0/+1
\| \| \| \| \| \|	This allows us to indicate a concept of an invalid type. Reviewed-by: Kenneth Graunke <[email protected]>
*	nir: Add variants of some of the comparison simplifications.	Eric Anholt	2015-01-29	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \|	We end up with these from TGSI-to-NIR because the pass generating the comparisons doesn't know if the arg is actually a bool input or not. vc4 results: total instructions in shared programs: 41801 -> 41508 (-0.70%) instructions in affected programs: 4253 -> 3960 (-6.89%) Reviewed-by: Matt Turner <[email protected]>
*	nir: Don't try to to-SSA ALU instructions that are already SSA.	Eric Anholt	2015-01-29	1	-0/+3
\| \| \| \|	Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Fix a bit of broken indentation.	Eric Anholt	2015-01-29	1	-1/+1
\| \| \| \| \|	Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add a couple of helpers for glsl types.	Eric Anholt	2015-01-29	2	-1/+16
\| \| \| \| \| \| \| \| \| \|	This will be used by tgsi_to_nir, which needs to get vec4 types for declaring shader input/output variables. v2: Add a missing space. Reviewed-by: Matt Turner <[email protected]> (v2) Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Make vec-to-movs handle src/dest aliasing.	Eric Anholt	2015-01-28	1	-10/+72
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It now emits vector MOVs instead of a series of individual MOVs, which should be useful to any vector backends. This pushes the problem of src/dest aliasing of channels on a scalar chip to the backend, but if there are any vector operations in your shader then you needed to be handling this already. Fixes fs-swap-problem with my scalarizing patches. v2: Rename to insert_mov(), and add a comment about what it does. v3: Rewrite the comment. Reviewed-by: Connor Abbott <[email protected]> (v3)
*	nir/opcodes: Use a return type of tfloat for ldexp	Jason Ekstrand	2015-01-28	1	-1/+1
\| \| \| \| \|	Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	Revert "nir/opcodes: Use fpclassify() instead of isnormal() for ldexp"	Jason Ekstrand	2015-01-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	This reverts commit d7d340fb2f68c46bd5a0008ecf53c6693e29c916. We have an isnormal() implementation available, the only problem was that we had the wrong return type (fixed in a later patch). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88806 Acked-by: Matt Turner <[email protected]>
*	nir/opcodes: Use fpclassify() instead of isnormal() for ldexp	Jason Ekstrand	2015-01-28	1	-1/+1
\| \| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88806 Reviewed-by: Ian Romanick <[email protected]>
*	nir: fix a bug with constant folding non-per-component instructions	Connor Abbott	2015-01-26	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	Before, we were only copying the first N channels, where N is the size of the SSA destination, which is fine for per-component instructions, but non-per-component instructions like fdot3 can have more source components than destination components. Fix this using the helper function introduced in the last patch. v2: use new helper name Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
*	nir: add a helper function for getting the number of source components	Connor Abbott	2015-01-26	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Unlike with non-SSA ALU instructions, where if they're per-component you have to look at the writemask to know which source channels are being used, SSA ALU instructions always have all the possible channels enabled so we can just look at the number of components in the SSA definition for per-component instructions to say how many source components are being used. v2: use new name nir_ssa_alu_instr_src_components() Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
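The rule described here can be sketched in a few lines. This Python sketch is illustrative only: the table `OP_INPUT_SIZES` and the convention that a size of 0 marks a per-component source are assumptions of the sketch, and `alu_src_components` is a hypothetical stand-in for the C helper.

```python
# Hypothetical opcode table: one entry per source. In this sketch a size
# of 0 means "per-component", i.e. the source uses as many channels as
# the SSA destination has components.
OP_INPUT_SIZES = {
    "fadd": (0, 0),   # per-component
    "fdot3": (3, 3),  # always reads 3 channels per source, writes 1
}

def alu_src_components(op, dest_components, src):
    """Number of channels read from source `src` of an SSA ALU instruction."""
    size = OP_INPUT_SIZES[op][src]
    return size if size > 0 else dest_components


fadd_channels = alu_src_components("fadd", 4, 0)
fdot3_channels = alu_src_components("fdot3", 1, 1)
```

The fdot3 case shows why copying only as many channels as the destination has is not enough for non-per-component instructions.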
*	nir/opcodes: Don't go through doubles when constant-folding iabs	Jason Ekstrand	2015-01-26	1	-1/+1
\| \| \| \| \| \| \| \| \|	Previously, we called the abs() function in math.h. However, this involves unnecessarily going through double. This commit changes it to use integers directly with a ternary. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	nir/opcodes: Simplify and fix the unpack_half__split_ constant expressions	Jason Ekstrand	2015-01-26	1	-6/+4
\| \| \| \| \| \| \| \| \| \|	Previously, these functions were explicitly writing to dst.x and dst.y. However they both return only one component so writing to dst.y is invalid. Also, since they only return one component, we don't need the explicit assignment in the expression and can simplify it to use an implicit assignment. Reviewed-by: Connor Abbott <[email protected]>
*	nir: Use pointers for nir_src_copy and nir_dest_copy	Jason Ekstrand	2015-01-26	10	-53/+47
\| \| \| \| \| \| \| \|	This avoids the overhead of copying structures and better matches the newly added nir_alu_src_copy and nir_alu_dest_copy. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
*	nir/constant_folding: use the new constant folding infrastructure	Connor Abbott	2015-01-24	1	-158/+21
\| \| \| \| \|	Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: add new constant folding infrastructure	Jason Ekstrand	2015-01-24	6	-184/+787
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a required field to the Opcode class, const_expr, that contains an expression or statement that computes the result of the opcode given known constant inputs. Then take those const_expr's and expand them into a function that takes an opcode and an array of constant inputs and spits out the constant result. This means that when adding opcodes, there's one less place to update, and almost all the opcodes are self-documenting since the information on how to compute the result is right next to the definition. The helper functions in nir_constant_expressions.c were taken from ir_constant_expressions.cpp. v3 Jason Ekstrand <[email protected]> - Use mako to generate one function per opcode instead of doing piles of string splicing v4 Jason Ekstrand <[email protected]> - More comments and better indentation in the mako - Add a description of the constant expression language in nir_opcodes.py - Added nir_constant_expressions.py to EXTRA_DIST in Makefile.am Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
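The approach of storing a constant expression next to each opcode and expanding it into a folding function can be sketched as follows. This is a simplified Python illustration under stated assumptions: the `OPCODES` table and its contents are hypothetical, and the sketch builds Python lambdas with `eval` where the commit describes using Mako to generate one C function per opcode.

```python
# Hypothetical opcode table: name -> (number of sources, const_expr).
# The expression refers to its inputs as src0, src1, ...
OPCODES = {
    "fadd": (2, "src0 + src1"),
    "fmul": (2, "src0 * src1"),
    "fneg": (1, "-src0"),
}

def build_folders(opcodes):
    """Expand each const_expr into a callable taking that opcode's sources."""
    folders = {}
    for name, (num_srcs, const_expr) in opcodes.items():
        args = ", ".join("src%d" % i for i in range(num_srcs))
        # Build "lambda src0, src1: <const_expr>" from the table entry.
        folders[name] = eval("lambda %s: %s" % (args, const_expr))
    return folders

FOLDERS = build_folders(OPCODES)

def constant_fold(op, srcs):
    """Compute the constant result of `op` applied to constant inputs."""
    return FOLDERS[op](*srcs)


folded_sum = constant_fold("fadd", [1.5, 2.0])
folded_neg = constant_fold("fneg", [4.0])
```

Adding an opcode to the table is then enough to make it foldable, which mirrors the message's point that there is one less place to update.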
*	nir: use Python to autogenerate opcode information	Connor Abbott	2015-01-24	8	-401/+478
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before, we used a system where a file, nir_opcodes.h, defined some macros that were included to generate the enum values and the nir_op_infos structure. This worked pretty well, but for development the error messages were never very useful, Python tools couldn't understand the opcode list, and it was difficult to use nir_opcodes.h to do other things like autogenerate a builder API. Now, we store opcode information in nir_opcodes.py, and we have nir_opcodes_c.py to generate the old nir_opcodes.c and nir_opcodes_h.py to generate nir_opcodes.h, which contains all the enum names and gets included into nir.h like before. In addition to solving the above problems, using Python and Mako to generate everything means that it's much easier to keep information centralized as we add new things like constant propagation that require per-opcode information. v2: - make Opcode derive from object (Dylan) - don't use assert like it's a function (Dylan) - style fixes for fnoise, use xrange (Dylan) - use iterkeys() in nir_opcodes_h.py (Dylan) - use pydoc-style comments (Jason) - don't make fmin/fmax commutative and associative yet (Jason) Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> v3 Jason Ekstrand <[email protected]> - Alphabetize source file lists - Generate nir_opcodes.h in the builddir instead of the source dir - Include $(builddir)/src/glsl/nir in the i965 build - Rework nir_opcodes.h generation so it generates a complete header file instead of one that has to be embedded inside an enum declaration
*	glsl: Add a foreach_in_list_reverse_safe macro.	Matt Turner	2015-01-23	1	-0/+6
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	nir: Expose nir_print_instr() for debug prints	Eric Anholt	2015-01-23	2	-2/+8
\| \| \| \| \| \| \| \| \|	It's nice to have this present in your default cases so you can see what instruction is triggering an abort. v2: Just pass a NULL state, now that it won't crash when you do. Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: When asked to print with a NULL state, just use bare variable names.	Eric Anholt	2015-01-23	1	-6/+16
\| \| \| \|	Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add nir_lower_alu_to_scalar.	Eric Anholt	2015-01-23	3	-0/+188
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is the equivalent of brw_fs_channel_expressions.cpp, which I wanted for vc4. v2: Use the nir_src_for_ssa() helper, and another instance of nir_alu_src_copy(). v3: Drop the non-SSA support. All intended callers will have SSA-only ALU ops. v4: Use insert_before, drop stale bcsel/fcsel comment, drop now-unused unsupported() function, drop lower_context struct. v5: Completely rename the pass to nir_lower_alu_to_scalar(), add an assert about weird input_sizes[]. Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Make some helpers for copying ALU src/dests.	Eric Anholt	2015-01-23	4	-9/+25
\| \| \| \| \| \| \| \| \|	There aren't many users yet, but I wanted to do this from my scalarizing pass. v2: Constify the src arguments. Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add algebraic optimizations for division and reciprocal.	Kenneth Graunke	2015-01-23	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These also exist in opt_algebraic.cpp. total NIR instructions in shared programs: 2011430 -> 2011211 (-0.01%) NIR instructions in affected programs: 42221 -> 42002 (-0.52%) helped: 198 total i965 instructions in shared programs: 6020553 -> 6020116 (-0.01%) i965 instructions in affected programs: 84322 -> 83885 (-0.52%) helped: 394 HURT: 1 (by 1 instruction) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	nir: Add algebraic optimizations for exponential/logarithmic functions.	Kenneth Graunke	2015-01-23	1	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most of these exist in the GLSL IR algebraic pass already. However, SSA allows us to find more instances of the patterns. total NIR instructions in shared programs: 2015593 -> 2011430 (-0.21%) NIR instructions in affected programs: 124189 -> 120026 (-3.35%) helped: 604 total i965 instructions in shared programs: 6025505 -> 6018717 (-0.11%) i965 instructions in affected programs: 261295 -> 254507 (-2.60%) helped: 1295 HURT: 3 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	nir: Add algebraic optimizations for simplifying comparisons.	Kenneth Graunke	2015-01-23	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The first batch removes bonus fnot/inot operations, possibly allowing other optimizations to better recognize patterns. The next batch replaces a fadd and constant 0.0 with an fneg - negation is usually free on GPUs, while addition is not. total NIR instructions in shared programs: 2020814 -> 2015593 (-0.26%) NIR instructions in affected programs: 411143 -> 405922 (-1.27%) helped: 2233 HURT: 214 A few shaders are hurt by a few instructions due to moving neg such that it has a constant operand, which is then folded, resulting in two distinct load_consts for x and -x. We can always clean that up later. total i965 instructions in shared programs: 6035392 -> 6025505 (-0.16%) i965 instructions in affected programs: 784980 -> 775093 (-1.26%) helped: 4508 HURT: 2 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>