summaryrefslogtreecommitdiffstats
path: root/src/intel/compiler/brw_fs.h
diff options
context:
space:
mode:
authorIan Romanick <[email protected]>2018-12-03 15:53:36 -0800
committerIan Romanick <[email protected]>2019-03-01 12:42:14 -0800
commit7725d609387a8165ccb71e2d9e0221d9248b1729 (patch)
treeca0283e23749f9149a15de7482afee96a398ca8a /src/intel/compiler/brw_fs.h
parentcb3e21cd1925c9378b4acb869601bbb011d0de97 (diff)
intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))
Since Boolean values are either -1 (true) or 0 (false), b2f(inot(a)) maps -1 => 0.0 and 0 => 1.0. This is equivalent to 1.0 + float(boolBitsToInt(a)). On Intel GPUs, ADD is one of the few instructions that can type-convert during write to destination, so we can achieve this in a single instruction: add g47F, g26D, 1D v2: Fix swizzles. v3: Fix typos in comments. Noticed by Ken. All Gen6+ platforms had similar results. (Skylake shown) Skylake total instructions in shared programs: 15185583 -> 15184683 (<.01%) instructions in affected programs: 239389 -> 238489 (-0.38%) helped: 899 HURT: 1 helped stats (abs) min: 1 max: 2 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.15% max: 1.85% x̄: 0.49% x̃: 0.44% HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 HURT stats (rel) min: 0.09% max: 0.09% x̄: 0.09% x̃: 0.09% 95% mean confidence interval for instructions value: -1.01 -0.99 95% mean confidence interval for instructions %-change: -0.51% -0.48% Instructions are helped. total cycles in shared programs: 370964249 -> 370961508 (<.01%) cycles in affected programs: 1487586 -> 1484845 (-0.18%) helped: 420 HURT: 268 helped stats (abs) min: 1 max: 232 x̄: 22.41 x̃: 6 helped stats (rel) min: 0.05% max: 22.60% x̄: 1.30% x̃: 0.41% HURT stats (abs) min: 1 max: 230 x̄: 24.90 x̃: 10 HURT stats (rel) min: <.01% max: 21.60% x̄: 1.45% x̃: 0.52% 95% mean confidence interval for cycles value: -7.61 -0.36 95% mean confidence interval for cycles %-change: -0.44% -0.02% Cycles are helped. No changes on Iron Lake or GM45. Reviewed-by: Kenneth Graunke <[email protected]>
Diffstat (limited to 'src/intel/compiler/brw_fs.h')
-rw-r--r--src/intel/compiler/brw_fs.h2
1 files changed, 2 insertions, 0 deletions
diff --git a/src/intel/compiler/brw_fs.h b/src/intel/compiler/brw_fs.h
index df34c69663b..97956003973 100644
--- a/src/intel/compiler/brw_fs.h
+++ b/src/intel/compiler/brw_fs.h
@@ -205,6 +205,8 @@ public:
void nir_emit_block(nir_block *block);
void nir_emit_instr(nir_instr *instr);
void nir_emit_alu(const brw::fs_builder &bld, nir_alu_instr *instr);
+ bool try_emit_b2fi_of_inot(const brw::fs_builder &bld, fs_reg result,
+ nir_alu_instr *instr);
void nir_emit_load_const(const brw::fs_builder &bld,
nir_load_const_instr *instr);
void nir_emit_vs_intrinsic(const brw::fs_builder &bld,