author Ian Romanick <[email protected]> 2018-06-27 11:41:19 -0700
committer Ian Romanick <[email protected]> 2018-12-17 13:47:06 -0800
commit 09b7e1d8e4e07e7c51debb20e85e213ab209985f
tree 6055f454c575bbedf5811c617dbbe1f0f7b71e14 /src/intel
parent 4cd1a0be76883c2b13aae8c97972e8f1404d06f7
nir/opt_peephole_select: Don't try to remove flow control around indirect loads
That flow control may be trying to avoid invalid loads. On at least some
platforms, those loads can also be expensive.

No shader-db changes on any Intel platform (even with the later patch
"intel/compiler: More peephole select").

v2: Add an 'indirect_load_ok' flag to nir_opt_peephole_select. Suggested by
Rob. See also the big comment in src/intel/compiler/brw_nir.c.

v3: Use nir_deref_instr_has_indirect instead of deref_has_indirect (from
nir_lower_io_arrays_to_elements.c).

v4: Fix inverted condition in brw_nir.c. Noticed by Lionel.

Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
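
The hazard described in the message can be illustrated outside NIR. The following is a minimal plain-C sketch, not Mesa code: `table`, `load_indirect`, `guarded`, `flattened`, and the `invalid_loads` counter are all hypothetical names introduced for illustration. It assumes that a select-style rewrite hoists the load out of the branch, so the load is issued even when the index is out of range.

```c
#include <stddef.h>

#define TABLE_LEN 4

/* Hypothetical stand-in for an indirectly addressed array. */
static const float table[TABLE_LEN] = { 1.0f, 2.0f, 3.0f, 4.0f };

/* Counts loads issued with an out-of-range index. */
static unsigned invalid_loads;

/* Models an indirect load. A real out-of-range load could fault or return
 * garbage; this demo only records it and returns 0 so the program stays
 * well defined. */
static float
load_indirect(size_t i)
{
   if (i >= TABLE_LEN) {
      invalid_loads++;
      return 0.0f;
   }
   return table[i];
}

/* Original form: the branch guards the load. */
static float
guarded(size_t i)
{
   if (i < TABLE_LEN)
      return load_indirect(i);
   return 0.0f;
}

/* Flattened form, as a select-style rewrite would produce: the load is
 * hoisted out of the branch and runs unconditionally. */
static float
flattened(size_t i)
{
   const float loaded = load_indirect(i);
   return (i < TABLE_LEN) ? loaded : 0.0f;
}
```

Both functions return the same value for every index, but only `flattened` issues a load when the index is out of range. That extra load is what the guarding flow control was there to prevent, and it is also the cost that matters when the load has to pull from memory.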
Diffstat (limited to 'src/intel')
-rw-r--r-- src/intel/compiler/brw_nir.c 13
1 files changed, 12 insertions, 1 deletions
diff --git a/src/intel/compiler/brw_nir.c b/src/intel/compiler/brw_nir.c
index 594edde5413..e0aa927f2f4 100644
--- a/src/intel/compiler/brw_nir.c
+++ b/src/intel/compiler/brw_nir.c
@@ -568,7 +568,18 @@ brw_nir_optimize(nir_shader *nir, const struct brw_compiler *compiler,
       OPT(nir_copy_prop);
       OPT(nir_opt_dce);
       OPT(nir_opt_cse);
-      OPT(nir_opt_peephole_select, 0);
+
+      /* For indirect loads of uniforms (push constants), we assume that array
+       * indices will nearly always be in bounds and the cost of the load is
+       * low. Therefore there shouldn't be a performance benefit to avoid it.
+       * However, in vec4 tessellation shaders, these loads operate by
+       * actually pulling from memory.
+       */
+      const bool is_vec4_tessellation = !is_scalar &&
+         (nir->info.stage == MESA_SHADER_TESS_CTRL ||
+          nir->info.stage == MESA_SHADER_TESS_EVAL);
+      OPT(nir_opt_peephole_select, 0, !is_vec4_tessellation);
+
       OPT(nir_opt_intrinsics);
       OPT(nir_opt_idiv_const, 32);
       OPT(nir_opt_algebraic);
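
Since v4 of this patch fixed an inverted condition, it may help to see the flag computation in isolation. The sketch below is a standalone approximation, not the Mesa source: `enum demo_stage` and the function `indirect_load_ok` are hypothetical stand-ins (the real code uses `nir->info.stage` and passes the value straight to `nir_opt_peephole_select`). It assumes only the logic visible in the hunk above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for Mesa's shader stage enum; only the
 * distinction between tessellation and other stages matters here. */
enum demo_stage {
   DEMO_VERTEX,
   DEMO_TESS_CTRL,
   DEMO_TESS_EVAL,
   DEMO_FRAGMENT,
};

/* Returns true when the pass may flatten flow control around indirect
 * loads. Mirrors the hunk: only vec4 (non-scalar) tessellation shaders
 * must keep the flow control, because there the loads pull from memory. */
static bool
indirect_load_ok(enum demo_stage stage, bool is_scalar)
{
   const bool is_vec4_tessellation = !is_scalar &&
      (stage == DEMO_TESS_CTRL || stage == DEMO_TESS_EVAL);

   return !is_vec4_tessellation;
}
```

Under these assumptions the flag is false only for the two vec4 tessellation cases and true for every scalar shader and every non-tessellation vec4 shader.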