author     Jason Ekstrand <[email protected]>          2018-03-23 11:05:04 -0700
committer  Juan A. Suarez Romero <[email protected]>   2018-04-12 21:49:31 +0200
commit     e49d7abf87e33520298d99e0fb117b715a1b721a (patch)
tree       da8c9bc59cb9485704278f58d89a5f4f9038a94e
parent     1ec916659831ac1da1b31475bfe18422275ba76d (diff)
nir/lower_vec_to_movs: Only coalesce if the vec had an SSA destination
Otherwise we may end up trying to coalesce in a case such as

    ssa_1 = fadd r1, r2
    r3.x = fneg(r2);
    r3 = vec4(ssa_1, ssa_1.y, ...)

and that would cause us to move the writes to r3 from the vec to the fadd, which would re-order them with respect to the write from the fneg. In order to solve this, we just don't coalesce if the destination of the vec is not SSA. We could try to get clever and still coalesce if there are no writes to the destination of the vec between the vec and the ALU source. However, since registers only come from phi webs and indirects, the chances of having a vec with a register destination that is actually coalescable into its source is very slim.

Shader-db results on Haswell:

    total instructions in shared programs: 13657906 -> 13659101 (<.01%)
    instructions in affected programs: 149291 -> 150486 (0.80%)
    helped: 0
    HURT: 592

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105440
Fixes: 2458ea95c56 "nir/lower_vec_to_movs: Coalesce movs on-the-fly when possible"
Reported-by: Vadym Shovkoplias <[email protected]>
Tested-by: Vadym Shovkoplias <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
(cherry picked from commit 800df942eadc5356840f5cbc2ceaa8a65c01ee91)
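To make the re-ordering hazard concrete, here is a minimal standalone C sketch. It is a toy model of the write sequence above, not Mesa/NIR code, and all variable names are illustrative: folding the vec's write to r3.x back into the producing fadd hoists that write above the fneg, so the fneg's write to r3.x is no longer overwritten and the final value of r3.x changes.

```c
#include <stdio.h>

/* Toy model of the hazard described in the commit message (illustrative only). */
int main(void)
{
    float r1 = 1.0f, r2 = 2.0f;
    float r3x_correct, r3x_coalesced;

    /* Original lowering: the vec's MOV into r3.x runs after the fneg,
     * so it overwrites the fneg result. */
    {
        float ssa_1 = r1 + r2;   /* ssa_1 = fadd r1, r2              */
        float r3x   = -r2;       /* r3.x  = fneg r2                  */
        r3x = ssa_1;             /* r3.x  = mov ssa_1 (from the vec) */
        r3x_correct = r3x;       /* r3.x ends up as the fadd result  */
    }

    /* After (incorrectly) coalescing the MOV into the fadd: the write to
     * r3.x is hoisted above the fneg, which now lands last. */
    {
        float r3x = r1 + r2;     /* r3.x = fadd r1, r2 (coalesced)   */
        r3x = -r2;               /* r3.x = fneg r2 now wins          */
        r3x_coalesced = r3x;     /* r3.x ends up as the fneg result  */
    }

    printf("correct r3.x = %f, after bad coalescing r3.x = %f\n",
           r3x_correct, r3x_coalesced);
    return 0;
}
```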
-rw-r--r--  src/compiler/nir/nir_lower_vec_to_movs.c  7
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/src/compiler/nir/nir_lower_vec_to_movs.c b/src/compiler/nir/nir_lower_vec_to_movs.c
index 711ddd38bda..8b24376b0a5 100644
--- a/src/compiler/nir/nir_lower_vec_to_movs.c
+++ b/src/compiler/nir/nir_lower_vec_to_movs.c
@@ -230,6 +230,7 @@ lower_vec_to_movs_block(nir_block *block, nir_function_impl *impl)
          continue; /* The loop */
       }
 
+      bool vec_had_ssa_dest = vec->dest.dest.is_ssa;
       if (vec->dest.dest.is_ssa) {
          /* Since we insert multiple MOVs, we have a register destination. */
          nir_register *reg = nir_local_reg_create(impl);
@@ -263,7 +264,11 @@ lower_vec_to_movs_block(nir_block *block, nir_function_impl *impl)
       if (!(vec->dest.write_mask & (1 << i)))
          continue;
 
-      if (!(finished_write_mask & (1 << i)))
+      /* Coalescing moves the register writes from the vec up to the ALU
+       * instruction in the source.  We can only do this if the original
+       * vecN had an SSA destination.
+       */
+      if (vec_had_ssa_dest && !(finished_write_mask & (1 << i)))
          finished_write_mask |= try_coalesce(vec, i);
 
       if (!(finished_write_mask & (1 << i)))