author | Eric Anholt <[email protected]> | 2019-09-25 11:56:06 -0700 |
---|---|---|
committer | Daniel Schürmann <[email protected]> | 2019-09-30 09:44:10 +0000 |
commit | ca1aa5d225ab96bdff5052a46f3a3c22fc6f1646 | |
tree | a8b84b07be66ba10ac95162ca59f32f42789dc65 /src/broadcom | |
parent | 1d29895e5b8f34fefc280964e65f883f7c491dfe | |
v3d: Enable the late algebraic optimizations to get real subs.
This worked better than my original v3d-local pass for just subs, and is a
huge win over not producing subs.
total instructions in shared programs: 6408469 -> 6167932 (-3.75%)
total threads in shared programs: 153784 -> 154104 (0.21%)
total uniforms in shared programs: 2157078 -> 1905823 (-11.65%)
total max-temps in shared programs: 904546 -> 895796 (-0.97%)
total spills in shared programs: 4959 -> 4993 (0.69%)
total fills in shared programs: 6558 -> 6670 (1.71%)
total sfu-stalls in shared programs: 25845 -> 25175 (-2.59%)
total inst-and-stalls in shared programs: 6434314 -> 6193107 (-3.75%)
Reviewed-by: Daniel Schürmann <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
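
The percentages in the shader-db report above are plain relative changes, (new - old) / old. A minimal sketch for re-deriving them follows; the helper name `percent_change` is made up for illustration and is not part of shader-db or Mesa.

```c
#include <assert.h>

/* Hypothetical helper: relative change from old_count to new_count,
 * expressed in percent. Negative values mean the counter went down.
 */
static double
percent_change(long old_count, long new_count)
{
        assert(old_count != 0);
        return 100.0 * (double)(new_count - old_count) / (double)old_count;
}
```

For example, `percent_change(6408469, 6167932)` reproduces the instruction-count row, and `percent_change(4959, 4993)` reproduces the small spill regression.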
Diffstat (limited to 'src/broadcom')
-rw-r--r-- | src/broadcom/compiler/vir.c | 16 |
1 files changed, 16 insertions, 0 deletions
diff --git a/src/broadcom/compiler/vir.c b/src/broadcom/compiler/vir.c
index 917ccfeaef1..802448b1e03 100644
--- a/src/broadcom/compiler/vir.c
+++ b/src/broadcom/compiler/vir.c
@@ -939,6 +939,22 @@ uint64_t *v3d_compile(const struct v3d_compiler *compiler,
         NIR_PASS_V(c->s, nir_lower_idiv);
 
         v3d_optimize_nir(c->s);
+
+        /* Do late algebraic optimization to turn add(a, neg(b)) back into
+         * subs, then the mandatory cleanup after algebraic.  Note that it may
+         * produce fnegs, and if so then we need to keep running to squash
+         * fneg(fneg(a)).
+         */
+        bool more_late_algebraic = true;
+        while (more_late_algebraic) {
+                more_late_algebraic = false;
+                NIR_PASS(more_late_algebraic, c->s, nir_opt_algebraic_late);
+                NIR_PASS_V(c->s, nir_opt_constant_folding);
+                NIR_PASS_V(c->s, nir_copy_prop);
+                NIR_PASS_V(c->s, nir_opt_dce);
+                NIR_PASS_V(c->s, nir_opt_cse);
+        }
+
         NIR_PASS_V(c->s, nir_lower_bool_to_int32);
         NIR_PASS_V(c->s, nir_convert_from_ssa, true);
 
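
The loop in the patch is a run-until-no-progress pattern: one sweep of the late algebraic pass can expose new matches, so it is repeated until a sweep reports no change. The sketch below illustrates the idea on a toy expression tree. Everything named `toy_*` is hypothetical and invented for this example; real NIR is an SSA IR with its own API, and this is not how `nir_opt_algebraic_late` is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy IR, for illustration only. */
enum toy_op { TOY_VAR, TOY_NEG, TOY_ADD, TOY_SUB };

struct toy_expr {
        enum toy_op op;
        struct toy_expr *src[2]; /* unused sources are NULL */
};

/* One top-down sweep. Applies at most one rule at each node it visits:
 *   add(a, neg(b)) -> sub(a, b)
 *   neg(neg(a))    -> a
 * Returns true if anything was rewritten.
 */
static bool
toy_late_algebraic(struct toy_expr **ref)
{
        struct toy_expr *e = *ref;
        bool progress = false;

        if (e->op == TOY_ADD && e->src[1] && e->src[1]->op == TOY_NEG) {
                e->op = TOY_SUB;
                e->src[1] = e->src[1]->src[0];
                progress = true;
        } else if (e->op == TOY_NEG && e->src[0] &&
                   e->src[0]->op == TOY_NEG) {
                /* Replace this node with the doubly-negated operand. The
                 * replacement itself is not re-checked in this sweep, which
                 * is why the caller has to loop.
                 */
                *ref = e->src[0]->src[0];
                e = *ref;
                progress = true;
        }

        for (int i = 0; i < 2; i++) {
                if (e->src[i])
                        progress |= toy_late_algebraic(&e->src[i]);
        }

        return progress;
}

/* Repeat the sweep until it makes no progress, mirroring the shape of the
 * while loop in the patch. Returns the number of sweeps that were run.
 */
static unsigned
toy_optimize(struct toy_expr **root)
{
        unsigned sweeps = 0;
        bool more = true;

        while (more) {
                more = false;
                if (toy_late_algebraic(root))
                        more = true;
                sweeps++;
        }

        return sweeps;
}
```

With this toy sweep, neg(neg(neg(neg(x)))) needs two rewriting sweeps plus a final sweep that confirms nothing is left to do, which is the same reason the patch keeps calling the late pass while it reports progress.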