summaryrefslogtreecommitdiffstats
path: root/src/glsl/opt_algebraic.cpp
Commit message (Collapse)AuthorAgeFilesLines
* glsl: Make sure not to dereference NULLNeil Roberts2015-07-061-1/+1
| | | | | | | | | | In this bit of code point_five can be NULL if the expression is not a constant. This fixes it to match the pattern of the rest of the chunk of code so that it checks for NULLs. Cc: Matt Turner <[email protected]> Cc: "10.6" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add missing check for whether an expression is an add operationNeil Roberts2015-07-061-1/+1
| | | | | | | | | | | | | | | There is a piece of code that is trying to match expressions of the form (mul (floor (add (abs x) 0.5) (sign x))). However the check for the add expression wasn't checking whether it had the expected operation. It looks like this was just an oversight because it doesn't match the pattern for the rest of the code snippet. The existing line to check whether add_expr!=NULL was added as part of a coverity fix in 3384179f. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91226 Cc: Matt Turner <[email protected]> Cc: "10.6" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Transform pow(x, 4) into (x*x)*(x*x).Matt Turner2015-04-241-0/+20
| | | | | Reviewed-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Make sure not to dereference NULL.Matt Turner2015-04-011-0/+2
| | | | Found by Coverity.
* glsl: Reassociate multiplication of mat*mat*vec.Matt Turner2015-03-311-0/+14
| | | | | | | | | | | | | | | | | | | The typical case of mat4*mat4*vec4 is 80 scalar multiplications, but mat4*(mat4*vec4) is only 32. On HSW (with vec4 vertex shaders): instructions in affected programs: 4420 -> 3194 (-27.74%) On BDW (with scalar vertex shaders): instructions in affected programs: 12756 -> 6726 (-47.27%) Implementing a general matrix chain ordering is harder (or at least tedious) because of having to walk the GLSL IR to create a list of multiplicands. I'm guessing that this patch handles 90+% of cases, but of course to tell definitively you'd have to implement the general thing. Reviewed-by: Chris Forbes <[email protected]>
* glsl: Recognize sat(add(b2f(a), b2f(b))) as a logical OR.Matt Turner2015-03-241-0/+12
| | | | | | Transform this into b2f(or(a, b)). Reviewed-by: Ian Romanick <[email protected]>
* glsl: Recognize mul(b2f(a), b2f(b)) as a logical AND.Matt Turner2015-03-241-0/+4
| | | | | | | | | | Transform this into b2f(and(a, b)). total instructions in shared programs: 6190291 -> 6189225 (-0.02%) instructions in affected programs: 267247 -> 266181 (-0.40%) helped: 866 Reviewed-by: Ian Romanick <[email protected]>
* glsl: optimize (0 cmp x + y) into (-x cmp y).Samuel Iglesias Gonsalvez2015-03-131-3/+12
| | | | | | | | | | | | | The optimization done by commit 34ec1a24d did not take it into account. Fixes: dEQP-GLES3.functional.shaders.random.all_features.fragment.20 Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: "10.4 10.5" <[email protected]>
* glsl: silence uninitialized var warning on MinGWBrian Paul2015-02-271-0/+1
| | | | Reviewed-by: Anuj Phogat <[email protected]>
* glsl: Rewrite and fix min/max to saturate optimization.Matt Turner2015-02-251-29/+46
| | | | | | | | | | | | | | | | | | | | | | There were some bugs, and the code was really difficult to follow. We would optimize min(max(x, b), 1.0) into max(sat(x), b) but not pay attention to the order of min/max and also do max(min(x, b), 1.0) into max(sat(x), b) Corrects four shaders from Champions of Regnum that do min(max(x, 1), 10) and corrects rendering of Mass Effect under VMware Workstation. Cc: "10.4 10.5" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89180 Reviewed-by: Abdiel Janulgue <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add support doubles in optimization passesDave Airlie2015-02-191-4/+22
| | | | | | Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Optimize (f2i(trunc x)) into (f2i x).Matt Turner2015-02-111-0/+9
| | | | | | total instructions in shared programs: 5950326 -> 5949286 (-0.02%) instructions in affected programs: 88264 -> 87224 (-1.18%) helped: 692
* glsl: Optimize round-half-up pattern.Matt Turner2015-02-111-0/+33
| | | | | Hurts some Psychonauts shaders, but after the next patch (which this enables) they're fewer instructions than before this patch.
* glsl: Optimize 1/exp(x) into exp(-x).Matt Turner2015-02-101-0/+6
| | | | | | | | | | | | | Lots of shaders divide by exp2(...) which we turn into a multiplication by the reciprocal. We can avoid the reciprocal by simply negating exp2's argument. total instructions in shared programs: 5947154 -> 5946695 (-0.01%) instructions in affected programs: 118661 -> 118202 (-0.39%) helped: 380 Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Don't optimize min/max into saturate when EmitNoSat is setAbdiel Janulgue2014-12-081-1/+1
| | | | | | | v3: Fix multi-line comment format (Ian) Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* glsl: Optimize scalar all_equal/any_nequal into equal/nequal.Matt Turner2014-12-051-0/+10
| | | | | | | | | Cuts an instruction from two shaders in Tesseract, by allowing the (x+y) cmp 0 -> x cmp -y optimization to take place. instructions in affected programs: 1198 -> 1194 (-0.33%) Reviewed-by: Eric Anholt <[email protected]>
* glsl: Remove now useless dot optimization on basis vectMatt Turner2014-11-031-23/+0
| | | | | | | The optimization in commit d056863b covers these cases, which were the first optimizations I added to the GLSL compiler. Reviewed-by: Ian Romanick <[email protected]>
* glsl: Emit mul instead of dot if only one component left.Matt Turner2014-11-031-1/+4
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85683 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85691 Reviewed-by: Ian Romanick <[email protected]>
* glsl: Drop constant 0.0 components from dot products.Matt Turner2014-10-291-0/+27
| | | | | | | | | Helps a small number of vertex shaders in the games Dungeon Defenders and Shank, as well as an internal benchmark. instructions in affected programs: 2801 -> 2719 (-2.93%) Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Recognize open-coded pow(x, y).Matt Turner2014-09-271-0/+14
| | | | | | | | pow(x, y) is equivalent to exp(log(x) * y). instructions in affected programs: 578 -> 458 (-20.76%) Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Optimize clamp(x, b, 1.0), where b > 0.0 as max(saturate(x),b)Abdiel Janulgue2014-08-311-0/+23
| | | | | | | | | | | | v2: - Output max(saturate(x),b) instead of saturate(max(x,b)) - Make sure we do component-wise comparison for vectors (Ian Romanick) v3: - Add missing condition where the outer constant value is > 0.0 and inner constant is 1.0. - Fix comments to show that the optimization is a commutative operation (Matt Turner) Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x),b)Abdiel Janulgue2014-08-311-0/+39
| | | | | | | | | | | v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by Ilia Mirkin - Make sure we do component-wise comparison for vectors (Ian Romanick) v3: - Add missing condition where the outer constant value is zero and inner constant is < 1 - Fix comments to reflect we are doing a commutative operation (Matt Turner) Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* glsl: Optimize clamp(x, 0, 1) as saturate(x)Abdiel Janulgue2014-08-311-0/+36
| | | | | | | | | | | v2: - Check that the base type is float (Ian Romanick) v3: - Make sure comments reflect that we are doing a commutative operation - Add missing condition where the inner constant is 1.0 and outer constant is 0.0 - Make indexing of operands easier to read (Matt Turner) Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* glsl: Don't convert reductions of ivec to a dot-productIan Romanick2014-06-251-1/+3
| | | | | | | | | | | | | | | | | | Mesa has an optimization that converts expressions like "v.x + v.y + v.z + v.w" into dot(v, 1.0). And therein lies the rub: the other operand to the dot-product is always a float... even if the vector is an ivec or uvec. This results in an assertion failure in ir_builder. If the base type of the operand is not float, don't try the optimization. Dot-product is not valid on integer data. Fixes piglit vs-integer-reduction.shader_test and OpenGL ES conformance test ES2-CTS.gtf.GL2Tests.glGetUniform.glGetUniform. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Christoph Brill <[email protected]>
* glsl: Optimize (v.x + v.y) + (v.z + v.w) into dot(v, 1.0).Matt Turner2014-06-191-0/+46
| | | | Cuts five instructions out of SynMark's Gl32VSInstancing benchmark.
* glsl: Pass in options to do_algebraic().Matt Turner2014-06-191-3/+8
| | | | | | Will be used in the next commit. Reviewed-by: Eric Anholt <[email protected]>
* glsl: Pass ctx->Const.NativeIntegers to do_algebraic.Kenneth Graunke2014-04-081-3/+5
| | | | | | | | | The next patch will introduce an optimization that only works when integers are not represented as floating point values. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize (x + y cmp 0) into (x cmp -y).Matt Turner2014-04-051-0/+22
| | | | | | | | Cuts a small handful of instructions in Serious Sam 3: instructions in affected programs: 4692 -> 4666 (-0.55%) Reviewed-by: Ian Romanick <[email protected]>
* glsl: Optimize pow(x, 2) into x * x.Matt Turner2014-03-181-0/+8
| | | | | | Cuts two instructions out of SynMark's Gl32VSInstancing benchmark. Reviewed-by: Eric Anholt <[email protected]>
* glsl: Fix broken LRP algebraic optimization.Kenneth Graunke2014-03-021-1/+3
| | | | | | | | | | | | | | | | | | | | | opt_algebraic was translating lrp(x, 0, a) into add(x, -mul(x, a)). Unfortunately, this references "x" twice, which is invalid in the IR, leading to assertion failures in the validator. Normally, cloning IR solves this. However, "x" could actually be an arbitrary expression tree, so copying it could result in huge piles of wasted computation. This is why we avoid reusing subexpressions. Instead, transform it into mul(x, add(1.0, -a)), which is equivalent but doesn't need two references to "x". Fixes a regression since d5fa8a95621169, which isn't in any stable branches. Fixes 18 shaders in shader-db (bastion and yofrankie). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize lrp(x, 0, a) into x - (x * a).Matt Turner2014-02-281-0/+2
| | | | | | | | Helps one program in shader-db: instructions in affected programs: 96 -> 92 (-4.17%) Reviewed-by: Ian Romanick <[email protected]>
* glsl: Optimize lrp(0, y, a) into y * a.Matt Turner2014-02-281-0/+2
| | | | | | | | Helps two programs in shader-db: instructions in affected programs: 254 -> 234 (-7.87%) Reviewed-by: Ian Romanick <[email protected]>
* glsl: Optimize triop_csel with all-true or all-false.Eric Anholt2014-02-071-0/+7
| | | | Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize various cases of fma (aka MAD).Eric Anholt2014-02-071-0/+13
| | | | Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize lrp(x, x, coefficient) --> x.Eric Anholt2014-02-071-0/+2
| | | | | | | | | | | total instructions in shared programs: 1627754 -> 1624534 (-0.20%) instructions in affected programs: 45748 -> 42528 (-7.04%) GAINED: 3 LOST: 0 (serious sam, humus domino demo) Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize pow(x, 1) -> x.Eric Anholt2014-02-071-0/+4
| | | | | | | | | | | total instructions in shared programs: 1627826 -> 1627754 (-0.00%) instructions in affected programs: 6640 -> 6568 (-1.08%) GAINED: 0 LOST: 0 (HoN and savage2) Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize log(exp(x)) and exp(log(x)) into x.Eric Anholt2014-02-071-0/+36
| | | | Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize ~~x into x.Eric Anholt2014-02-071-0/+5
| | | | | | | v2: Fix pasteo of an extra abs being inserted (caught by many). Rewrite to drop the silly switch statement. Reviewed-by: Matt Turner <[email protected]> (v1)
* glsl: Optimize open-coded lrp into lrp.Jordan Justen2014-01-211-0/+52
| | | | | | | | | | total instructions in shared programs: 1498191 -> 1487051 (-0.74%) instructions in affected programs: 669388 -> 658248 (-1.66%) GAINED: 1 LOST: 0 Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Jordan Justen <[email protected]>
* glsl: Optimize pow(2, x) --> exp2(x).Kenneth Graunke2014-01-071-0/+11
| | | | | | | | | | | | | | | | | On Haswell, POW takes 24 cycles, while EXP2 only takes 14. Plus, using POW requires putting 2.0 in a register, while EXP2 doesn't. I believe that EXP2 will be faster than POW on basically all GPUs, so it makes sense to optimize it. Looking at the savage2 subset of shader-db: total instructions in shared programs: 113225 -> 113179 (-0.04%) instructions in affected programs: 2139 -> 2093 (-2.15%) instances of 'math pow': 795 -> 749 (-6.14%) instances of 'math exp': 389 -> 435 (11.8%) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize pow(1.0, X) --> 1.0.Kenneth Graunke2014-01-071-0/+6
| | | | | | | Surprisingly, this helps one vertex shader in 3DMMES. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Apply the transformation "1/rsq(x) == sqrt(x)" in opt_algebraic.Eric Anholt2013-11-151-3/+4
| | | | | | | | | | | | | The comment was stale, because the lowering in question wasn't happening in lower_instructions.cpp. Presumably if the lowering ever moves there, we can plumb the lowering mask through to opt_algebraic. total instructions in shared programs: 1618696 -> 1616810 (-0.12%) instructions in affected programs: 243018 -> 241132 (-0.78%) GAINED: 0 LOST: 0 Reviewed-by: Jordan Justen <[email protected]>
* glsl: Apply the transformation "(a ^^ a) -> false" in opt_algebraic.Eric Anholt2013-11-151-1/+3
| | | | Reviewed-by: Jordan Justen <[email protected]>
* glsl: Apply the transformation "(a && a) -> a" in opt_algebraic.Eric Anholt2013-11-151-1/+3
| | | | Reviewed-by: Jordan Justen <[email protected]>
* glsl: Apply the transformation "(a || a) -> a" in opt_algebraic.Eric Anholt2013-11-151-1/+3
| | | | | | | | | | | total instructions in shared programs: 1732385 -> 1732373 (-0.00%) instructions in affected programs: 416 -> 404 (-2.88%) GAINED: 0 LOST: 0 (That's 4 already-short fragment shaders in dota2) Reviewed-by: Jordan Justen <[email protected]>
* glsl: Drop no-op shifts involving 0.Eric Anholt2013-10-281-0/+10
| | | | | | | | | | | | I noticed this in a shader in Unigine Heaven that was spilling. While it doesn't really reduce register pressure, it shaves a few instructions anyway (7955 -> 7882). v2: Fix turning "0 >> x" into "x" instead of "0" (caught by Erik Faye-Lund). Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Use ir_builder more in opt_algebraic.Eric Anholt2013-10-281-30/+10
| | | | | | | | | While ir_builder is slightly less efficient, we're only increasing the work when there's actual optimization being done, and it's way more readable code. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Move common code out of opt_algebraic's handle_expression().Eric Anholt2013-10-281-78/+39
| | | | | | | | | | Matt and I had each screwed up these common required patterns recently, in ways that wouldn't have been noticed for a long time if not for code review. Just enforce it in the caller so that we don't rely on code review catching these bugs. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize (not A) and (not B) into not (A or B).Matt Turner2013-10-251-0/+9
| | | | | | No shader-db changes, but seems like a good idea. Reviewed-by: Eric Anholt <[email protected]>
* glsl: Optimize (not A) or (not B) into not (A and B).Matt Turner2013-10-251-0/+12
| | | | | | | | A few Serious Sam 3 shaders affected: instructions in affected programs: 4384 -> 4344 (-0.91%) Reviewed-by: Eric Anholt <[email protected]>