nir: Lower flrp with Boolean interpolator to bcsel

On Intel platforms that don't set lower_flrp, using bcsel instead of
flrp seems to be a small amount worse. On those platforms, the use of
flrp, bcsel, and multiply of b2f is still an active area of research.

In review, Matt suggested this is because bcsel turns into CMP+SEL, and
because of the flag register we can't schedule instructions well.

shader-db results:

G4X / Ironlake
total instructions in shared programs: 4016538 -> 4012279 (-0.11%)
instructions in affected programs: 161556 -> 157297 (-2.64%)
helped: 1077
HURT: 1

total cycles in shared programs: 84328296 -> 84315862 (-0.01%)
cycles in affected programs: 4174570 -> 4162136 (-0.30%)
helped: 926
HURT: 53

Unsurprisingly, no changes on later platforms.

Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
author: Ian Romanick <[email protected]> 2016-03-07 13:09:30 -0800
committer: Ian Romanick <[email protected]> 2016-03-22 14:42:42 -0700
commit: a4079f1cb2d7142d798f6d7fcb7ef29f34976f5b (patch)
tree: 897f28c9b2d2baf8f919bf39e4e7334d302cd36d /src/compiler/nir/nir_opt_algebraic.py
parent: 9442db4f89156713bcb241803f6de7c0777262c5 (diff)
1 files changed, 5 insertions, 2 deletions
diff --git a/src/compiler/nir/nir_opt_algebraic.py b/src/compiler/nir/nir_opt_algebraic.py
index 39be85f639e..8a44a7a0a66 100644
--- a/src/compiler/nir/nir_opt_algebraic.py
+++ b/src/compiler/nir/nir_opt_algebraic.py
@@ -83,10 +83,13 @@ optimizations = [
    (('flrp', a, b, 1.0), b),
    (('flrp', a, a, b), a),
    (('flrp', 0.0, a, b), ('fmul', a, b)),
+   (('flrp', a, b, ('b2f', c)), ('bcsel', c, b, a), 'options->lower_flrp'),
    (('flrp', a, b, c), ('fadd', ('fmul', c, ('fsub', b, a)), a), 'options->lower_flrp'),
    (('ffract', a), ('fsub', a, ('ffloor', a)), 'options->lower_ffract'),
-   (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg', c))), ('fmul', b, c)), ('flrp', a, b, c), '!options->lower_flrp'),
-   (('fadd', a, ('fmul', c, ('fadd', b, ('fneg', a)))), ('flrp', a, b, c), '!options->lower_flrp'),
+   (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg', ('b2f', c)))), ('fmul', b, ('b2f', c))), ('bcsel', c, b, a), 'options->lower_flrp'),
+   (('fadd', ('fmul', a, ('fadd', 1.0, ('fneg',         c ))), ('fmul', b,         c )), ('flrp', a, b, c), '!options->lower_flrp'),
+   (('fadd', a, ('fmul', ('b2f', c), ('fadd', b, ('fneg', a)))), ('bcsel', c, b, a), 'options->lower_flrp'),
+   (('fadd', a, ('fmul',         c , ('fadd', b, ('fneg', a)))), ('flrp', a, b, c), '!options->lower_flrp'),
    (('ffma', a, b, c), ('fadd', ('fmul', a, b), c), 'options->lower_ffma'),
    (('fadd', ('fmul', a, b), c), ('ffma', a, b, c), '!options->lower_ffma'),
    # Comparison simplifications
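
The new patterns rely on b2f(c) being exactly 0.0 or 1.0, so each interpolation form collapses to one of its endpoints. The sketch below checks this for the three matched expressions. The helper functions are hypothetical scalar models written for illustration, not Mesa code, and the sample values are chosen so that the float arithmetic is exact (in general the fadd/fmul forms can round, while a select cannot).

```python
# Hypothetical scalar models of the NIR opcodes (illustration only, not Mesa code).
def b2f(c: bool) -> float:
    """Boolean to float: True -> 1.0, False -> 0.0."""
    return 1.0 if c else 0.0


def flrp(a: float, b: float, t: float) -> float:
    """Linear interpolation: a * (1 - t) + b * t."""
    return a * (1.0 - t) + b * t


def bcsel(c: bool, x: float, y: float) -> float:
    """Select x when c is true, otherwise y."""
    return x if c else y


def lerp_forms(a: float, b: float, c: bool) -> tuple:
    """Evaluate the three source expressions the new rules match."""
    t = b2f(c)
    return (
        flrp(a, b, t),               # ('flrp', a, b, ('b2f', c))
        a * (1.0 + -t) + b * t,      # fadd(fmul(a, fadd(1.0, fneg(t))), fmul(b, t))
        a + t * (b + -a),            # fadd(a, fmul(t, fadd(b, fneg(a))))
    )


# Each form should agree with ('bcsel', c, b, a) for both values of c.
for c in (False, True):
    assert all(v == bcsel(c, 5.0, 2.0) for v in lerp_forms(2.0, 5.0, c))
```

Note the operand order in the replacement: bcsel(c, b, a) yields b when c is true, matching an interpolator of 1.0.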