i965: Use nir_lower_load_const_to_scalar().

I don't know why, but we never hooked up this pass Eric wrote.
Otherwise, you can end up with stupid scalarized code such as:

   vec4 ssa_7 = load_const (0.0, 0.0, 0.0, 0.0)
   vec4 ssa_8 = ...
   vec1 ssa_9 = feq ssa_8, ssa_7
   vec1 ssa_10 = feq ssa_8.y, ssa_7.y
   vec1 ssa_11 = feq ssa_8, ssa_7.z
   vec1 ssa_12 = feq ssa_8.y, ssa_7.w

ssa_8.xyxy == <0, 0, 0, 0> should only take two feq instructions.

shader-db on Skylake:

total instructions in shared programs: 9121153 -> 9120749 (-0.00%)
instructions in affected programs: 32421 -> 32017 (-1.25%)
helped: 277
HURT: 69

total cycles in shared programs: 69003364 -> 69000912 (-0.00%)
cycles in affected programs: 899186 -> 896734 (-0.27%)
helped: 313
HURT: 403

This also prevents regressions when disabling channel expressions.

v2: Don't call opt_cse afterwards (requested by Matt). It should
    happen in the optimization loop below anyway.

Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
author: Kenneth Graunke <[email protected]> 2016-01-21 16:37:20 -0800
committer: Kenneth Graunke <[email protected]> 2016-02-08 18:10:34 -0800
commit: 74f956c416d5b0b37b4c2d6b957167bb203502c3 (patch)
tree: dd120bc75edb496fedb54ec2ab201bdfb26199a9 /src
parent: 184afd8fd9e7891322224f57a12c2e0fe52b46cb (diff)
1 files changed, 4 insertions, 0 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
index 287f935d539..46b51163579 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -482,6 +482,10 @@ brw_preprocess_nir(nir_shader *nir, bool is_scalar)
 
    nir = nir_optimize(nir, is_scalar);
 
+   if (is_scalar) {
+      OPT_V(nir_lower_load_const_to_scalar);
+   }
+
    /* Lower a bunch of stuff */
    OPT_V(nir_lower_var_copies);
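
The effect described in the commit message can be illustrated with a small standalone sketch. This is not Mesa code and does not use the NIR API: the struct, the operand IDs, and the function names below are hypothetical, and the "CSE" here is just a count of distinct (left operand, right operand) pairs. It models the four scalarized feq instructions for ssa_8.xyxy == <0, 0, 0, 0>, once with the constant kept as a single vec4 (each channel is a distinct operand) and once with the constant split into scalars (equal values share one operand).

```c
/* One scalarized feq: compares channel `lhs_comp` of ssa_8 against a
 * constant operand identified by `rhs_id` (a hypothetical value number). */
struct feq {
    int lhs_comp;
    int rhs_id;
};

/* Count distinct (lhs_comp, rhs_id) pairs, i.e. the instructions that a
 * simple common-subexpression pass could not merge with an earlier one. */
static int count_unique(const struct feq *ops, int n)
{
    int unique = 0;
    for (int i = 0; i < n; i++) {
        int seen = 0;
        for (int j = 0; j < i; j++) {
            if (ops[j].lhs_comp == ops[i].lhs_comp &&
                ops[j].rhs_id == ops[i].rhs_id) {
                seen = 1;
                break;
            }
        }
        if (!seen)
            unique++;
    }
    return unique;
}

/* Constant left as one vec4: ssa_7.x, .y, .z, .w are four different
 * operands even though they hold the same value. */
int feq_count_vector_const(void)
{
    const int swizzle[4] = {0, 1, 0, 1};   /* ssa_8.xyxy */
    struct feq ops[4];
    for (int i = 0; i < 4; i++) {
        ops[i].lhs_comp = swizzle[i];
        ops[i].rhs_id = 100 + i;           /* one ID per vec4 channel */
    }
    return count_unique(ops, 4);
}

/* Constant split into scalars: channels with equal values collapse to the
 * same operand (assumption: equal scalar constants get deduplicated). */
int feq_count_scalar_const(void)
{
    const int swizzle[4] = {0, 1, 0, 1};   /* ssa_8.xyxy */
    const float values[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    struct feq ops[4];
    for (int i = 0; i < 4; i++) {
        int id = i;
        for (int j = 0; j < i; j++) {
            if (values[j] == values[i]) {  /* reuse the first equal constant */
                id = j;
                break;
            }
        }
        ops[i].lhs_comp = swizzle[i];
        ops[i].rhs_id = id;
    }
    return count_unique(ops, 4);
}
```

Under these assumptions, feq_count_vector_const() counts four comparisons that cannot be merged, while feq_count_scalar_const() counts two, matching the "should only take two feq instructions" remark in the commit message.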