From 52c7df1643ec9af119fd66f916f7fbdbcc798d2d Mon Sep 17 00:00:00 2001
From: Ian Romanick
Date: Wed, 21 Feb 2018 18:06:56 -0800
Subject: i965/fs: Merge CMP and SEL into CSEL on Gen8+
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

v2: Fix several problems handling inverted predicates.  Add a much
bigger comment around the BRW_CONDITIONAL_NZ case.

v3: Allow uniforms and shader inputs as sources for the original SEL and
CMP instructions.  This enables a LOT more shaders to receive CSEL
merging (5816 vs 8564 on SKL).

v4: Report progress.

Broadwell and Skylake had similar results. (Broadwell shown)
helped: 8527
HURT: 0
helped stats (abs) min: 1 max: 27 x̄: 2.44 x̃: 1
helped stats (rel) min: 0.03% max: 17.80% x̄: 1.12% x̃: 0.70%
95% mean confidence interval for instructions value: -2.51 -2.36
95% mean confidence interval for instructions %-change: -1.15% -1.10%
Instructions are helped.

total cycles in shared programs: 559442317 -> 558288357 (-0.21%)
cycles in affected programs: 372699860 -> 371545900 (-0.31%)
helped: 6748
HURT: 1450
helped stats (abs) min: 1 max: 32000 x̄: 182.41 x̃: 12
helped stats (rel) min: <.01% max: 66.08% x̄: 3.42% x̃: 0.70%
HURT stats (abs) min: 1 max: 2538 x̄: 53.08 x̃: 14
HURT stats (rel) min: <.01% max: 96.72% x̄: 3.32% x̃: 0.90%
95% mean confidence interval for cycles value: -179.01 -102.51
95% mean confidence interval for cycles %-change: -2.37% -2.08%
Cycles are helped.

LOST: 0
GAINED: 6

No changes on earlier platforms.

Signed-off-by: Ian Romanick
Reviewed-by: Samuel Iglesias Gonsálvez [v1]
Reviewed-by: Kenneth Graunke [v3]
Reviewed-by: Matt Turner
---
 src/intel/compiler/brw_fs.h | 1 +
 1 file changed, 1 insertion(+)

(limited to 'src/intel/compiler/brw_fs.h')

diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
index 1b7df844696..e384db809dc 100644
--- a/src/intel/compiler/brw_fs.h
+++ b/src/intel/compiler/brw_fs.h
@@ -191,6 +191,7 @@ public:
    fs_reg resolve_source_modifiers(const fs_reg &src);
    void emit_discard_jump();
    bool opt_peephole_sel();
+   bool opt_peephole_csel();
    bool opt_peephole_predicated_break();
    bool opt_saturate_propagation();
    bool opt_cmod_propagation();
-- 
cgit v1.2.3