author     Jason Ekstrand <[email protected]>   2019-03-04 16:11:57 -0600
committer  Jason Ekstrand <[email protected]>   2019-03-06 17:24:57 +0000
commit     656ace3dd85b2eb8c565383763a00d059519df4c
tree       809e4a320d3c4e8ceb38e1586603daa8f36bf503  /src/intel/compiler/brw_nir.c
parent     e02959f442ed6546fb632a153ffc32848968038f
intel/nir: Move 64-bit lowering later
Now that we have a loop unrolling cost function and loop unrolling isn't
going to kill us the moment we have a 64-bit op in a loop, we can go
ahead and move 64-bit lowering later. This gives us the opportunity to
do more optimizations and actually let the full optimizer run even on
64-bit ops rather than hoping one round of opt_algebraic will fix
everything. This substantially reduces both fp64 shader compile times
and the resulting code size. On the vs-isnan-dvec test from piglit:
Before this commit:
1684.63s user 17.29s system 99% cpu 28:28.24 total
101479 instructions. 0 loops. 802452 cycles. 79:369 spills:fills.
Peak memory usage (according to massif): 1.435 GB
After this commit:
179.64s user 7.75s system 99% cpu 3:07.92 total
57316 instructions. 0 loops. 459287 cycles. 0:0 spills:fills.
Peak memory usage (according to massif): 531.0 MB
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
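The "full optimizer" the message refers to is brw_nir_optimize(), itself a do/while fixed-point loop over NIR's standard optimization passes. A rough sketch of that shape follows; the function name and the short pass list are illustrative stand-ins, not the actual contents of brw_nir.c. Moving 64-bit lowering after this loop means every pass in it now sees fp64/int64 operations in their compact original form rather than their multi-instruction lowerings:

#include "nir.h"  /* NIR core types and pass declarations */

/* Illustrative fixed-point optimization loop in the style of
 * brw_nir_optimize(): keep re-running the passes until none of
 * them reports progress. */
static nir_shader *
optimize_to_fixed_point(nir_shader *nir)
{
   bool progress;
   do {
      progress = false;
      NIR_PASS(progress, nir, nir_copy_prop);             /* propagate copies */
      NIR_PASS(progress, nir, nir_opt_cse);               /* common-subexpression elimination */
      NIR_PASS(progress, nir, nir_opt_algebraic);         /* pattern-based rewrites */
      NIR_PASS(progress, nir, nir_opt_constant_folding);
      NIR_PASS(progress, nir, nir_opt_dce);               /* dead-code elimination */
   } while (progress);
   return nir;
}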
Diffstat (limited to 'src/intel/compiler/brw_nir.c')
-rw-r--r--  src/intel/compiler/brw_nir.c  34
1 file changed, 13 insertions, 21 deletions
diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 34aaa29a5cb..0f929947696 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -665,27 +665,6 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir,
       OPT(nir_lower_alu_to_scalar);
    }
 
-   /* Run opt_algebraic before int64 lowering so we can hopefully get rid
-    * of some int64 instructions.
-    */
-   OPT(nir_opt_algebraic);
-
-   /* Lower 64-bit operations before nir_optimize so that loop unrolling sees
-    * their actual cost.
-    */
-   bool lowered_64bit_ops = false;
-   do {
-      progress = false;
-
-      OPT(nir_lower_int64, nir->options->lower_int64_options);
-      OPT(nir_lower_doubles, softfp64, nir->options->lower_doubles_options);
-
-      /* Necessary to lower add -> sub and div -> mul/rcp */
-      OPT(nir_opt_algebraic);
-
-      lowered_64bit_ops |= progress;
-   } while (progress);
-
    if (nir->info.stage == MESA_SHADER_GEOMETRY)
       OPT(nir_lower_gs_intrinsics);
 
@@ -714,6 +693,19 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir,
 
    nir = brw_nir_optimize(nir, compiler, is_scalar, true);
 
+   bool lowered_64bit_ops = false;
+   do {
+      progress = false;
+
+      OPT(nir_lower_int64, nir->options->lower_int64_options);
+      OPT(nir_lower_doubles, softfp64, nir->options->lower_doubles_options);
+
+      /* Necessary to lower add -> sub and div -> mul/rcp */
+      OPT(nir_opt_algebraic);
+
+      lowered_64bit_ops |= progress;
+   } while (progress);
+
    /* This needs to be run after the first optimization pass but before we
    * lower indirect derefs away
    */
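The OPT() calls above, and the bare `progress` flag they update, come from a helper macro defined near the top of brw_nir.c. A minimal sketch of its shape, modeled on that macro (details may vary between Mesa versions):

/* Run a pass through NIR_PASS, fold its result into the enclosing
 * function's `progress` flag, and evaluate to whether this particular
 * pass changed anything (a GCC statement-expression). */
#define OPT(pass, ...) ({                                  \
   bool this_progress = false;                             \
   NIR_PASS(this_progress, nir, pass, ##__VA_ARGS__);      \
   if (this_progress)                                      \
      progress = true;                                     \
   this_progress;                                          \
})

This is what lets the new do/while hunk simply test `progress`: each OPT() call in the loop body ORs its own result into the flag, so the loop repeats until nir_lower_int64, nir_lower_doubles, and nir_opt_algebraic all stop finding work. The iteration matters because the passes feed each other; for example, lowering a double-precision op can emit fresh 64-bit integer bit manipulation that the next iteration's int64 lowering then has to expand.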