diff options
author | Ian Romanick <[email protected]> | 2018-12-03 15:53:36 -0800 |
---|---|---|
committer | Ian Romanick <[email protected]> | 2019-03-01 12:42:14 -0800 |
commit | 7725d609387a8165ccb71e2d9e0221d9248b1729 (patch) | |
tree | ca0283e23749f9149a15de7482afee96a398ca8a /src/mesa/main/sse_minmax.h | |
parent | cb3e21cd1925c9378b4acb869601bbb011d0de97 (diff) |
intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))
Since Boolean values are either -1 (true) or 0 (false), b2f(inot(a))
maps -1 => 0.0 and 0 => 1.0. This is equivalent to 1.0 +
float(boolBitsToInt(a)). On Intel GPUs, ADD is one of the few
instructions that can type-convert during write to destination, so we
can achieve this in a single instruction:
add g47F, g26D, 1D
v2: Fix swizzles.
v3: Fix typos in comments. Noticed by Ken.
All Gen6+ platforms had similar results. (Skylake shown)
Skylake
total instructions in shared programs: 15185583 -> 15184683 (<.01%)
instructions in affected programs: 239389 -> 238489 (-0.38%)
helped: 899
HURT: 1
helped stats (abs) min: 1 max: 2 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.15% max: 1.85% x̄: 0.49% x̃: 0.44%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 0.09% max: 0.09% x̄: 0.09% x̃: 0.09%
95% mean confidence interval for instructions value: -1.01 -0.99
95% mean confidence interval for instructions %-change: -0.51% -0.48%
Instructions are helped.
total cycles in shared programs: 370964249 -> 370961508 (<.01%)
cycles in affected programs: 1487586 -> 1484845 (-0.18%)
helped: 420
HURT: 268
helped stats (abs) min: 1 max: 232 x̄: 22.41 x̃: 6
helped stats (rel) min: 0.05% max: 22.60% x̄: 1.30% x̃: 0.41%
HURT stats (abs) min: 1 max: 230 x̄: 24.90 x̃: 10
HURT stats (rel) min: <.01% max: 21.60% x̄: 1.45% x̃: 0.52%
95% mean confidence interval for cycles value: -7.61 -0.36
95% mean confidence interval for cycles %-change: -0.44% -0.02%
Cycles are helped.
No changes on Iron Lake or GM45.
Reviewed-by: Kenneth Graunke <[email protected]>
Diffstat (limited to 'src/mesa/main/sse_minmax.h')
0 files changed, 0 insertions, 0 deletions