author: Kenneth Graunke <[email protected]> 2015-06-10 00:52:07 -0700
committer: Kenneth Graunke <[email protected]> 2015-06-22 14:08:36 -0700
commit: 1762568fd39b9be42d963d335e36daea25df7044
tree: 763dd0431662741306c078b53b11278bf81f8cf8
parent: 94e3864707e48d4b1d5fb5f88a01370a73ddb0cb
nir: Allow vec2/vec3/vec4 instructions in the select peephole pass.

These are basically just moves, so they should be safe as well.

When disabling i965's GLSL IR level scalarizer (channel expressions)
pass, I started seeing NIR code like this:

    if ssa_21 {
        block block_1:
        /* preds: block_0 */
        vec4 ssa_120 = vec4 ssa_82, ssa_83, ssa_84, ssa_30
        /* succs: block_3 */
    } else {
        block block_2:
        /* preds: block_0 */
        /* succs: block_3 */
    }
    block block_3:
    /* preds: block_1 block_2 */
    vec4 ssa_33 = phi block_1: ssa_120, block_2: ssa_2

Previously, the GLSL IR scalarizer pass would break the vec4 into a
series of fmovs, which were allowed by the peephole pass. But with the
vec4 operation, they were not. We want to keep getting selects.

Normal i965 on Broadwell:
instructions in affected programs: 200 -> 176 (-12.00%)
helped: 4

With brw_fs_channel_expressions() disabled:
instructions in affected programs: 1832 -> 1646 (-10.15%)
helped: 30

Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>

-rw-r--r-- src/glsl/nir/nir_opt_peephole_select.c | 4
1 files changed, 3 insertions, 1 deletions
diff --git a/src/glsl/nir/nir_opt_peephole_select.c b/src/glsl/nir/nir_opt_peephole_select.c
index 82c65bb442f..ef7c9775aa3 100644
--- a/src/glsl/nir/nir_opt_peephole_select.c
+++ b/src/glsl/nir/nir_opt_peephole_select.c
@@ -86,7 +86,9 @@ block_check_for_allowed_instrs(nir_block *block)
          nir_alu_instr *mov = nir_instr_as_alu(instr);
          if (mov->op != nir_op_fmov && mov->op != nir_op_imov &&
              mov->op != nir_op_fneg && mov->op != nir_op_ineg &&
-             mov->op != nir_op_fabs && mov->op != nir_op_iabs)
+             mov->op != nir_op_fabs && mov->op != nir_op_iabs &&
+             mov->op != nir_op_vec2 && mov->op != nir_op_vec3 &&
+             mov->op != nir_op_vec4)
             return false;
 
          /* Can't handle saturate */
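
For readers without the Mesa tree at hand, the allow-list test that this hunk extends can be sketched in isolation. The following is a minimal stand-alone illustration, not the code in nir_opt_peephole_select.c: the `sketch_op` enum and the functions `sketch_op_is_allowed()` and `sketch_self_check()` are hypothetical names invented here, and the enum only mirrors the nir_op_* values mentioned in the patch plus one arbitrary disallowed op. It covers the opcode test only; the other checks in block_check_for_allowed_instrs() (such as the saturate check) are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the nir_op values named in the patch.
 * This is NOT the real NIR enum. */
enum sketch_op {
   SKETCH_OP_FMOV, SKETCH_OP_IMOV,
   SKETCH_OP_FNEG, SKETCH_OP_INEG,
   SKETCH_OP_FABS, SKETCH_OP_IABS,
   SKETCH_OP_VEC2, SKETCH_OP_VEC3, SKETCH_OP_VEC4,
   SKETCH_OP_FADD, /* arbitrary example of an op that is not move-like */
};

/* Returns true for the move-like ops that the patched condition lets
 * through: the six ops accepted before the patch plus vec2/vec3/vec4. */
static bool
sketch_op_is_allowed(enum sketch_op op)
{
   switch (op) {
   case SKETCH_OP_FMOV:
   case SKETCH_OP_IMOV:
   case SKETCH_OP_FNEG:
   case SKETCH_OP_INEG:
   case SKETCH_OP_FABS:
   case SKETCH_OP_IABS:
   case SKETCH_OP_VEC2: /* added by this patch */
   case SKETCH_OP_VEC3: /* added by this patch */
   case SKETCH_OP_VEC4: /* added by this patch */
      return true;
   default:
      return false;
   }
}

/* Small usage example: vec4 is accepted, an arithmetic op is rejected. */
static void
sketch_self_check(void)
{
   assert(sketch_op_is_allowed(SKETCH_OP_VEC4));
   assert(sketch_op_is_allowed(SKETCH_OP_FMOV));
   assert(!sketch_op_is_allowed(SKETCH_OP_FADD));
}
```

A switch is used here only because it makes the list easy to scan; the patch itself keeps the chained `!=` comparisons, and the two forms accept the same set of ops.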