field | value | date |
---|---|---|
author | Francisco Jerez <[email protected]> | 2015-07-09 21:42:28 +0300 |
committer | Francisco Jerez <[email protected]> | 2015-07-16 18:29:32 +0300 |
commit | b00cd6e4a0f9a84d514f428428be348900236e2e | |
tree | 09054d5729657d29ecf52a07ad3497de6416b37b /src/mesa/drivers/dri/i965/brw_shader.cpp | |
parent | 3ee2daf23dc91b8dfc017b5c89c10ab1376ba4df | |
i965: Implement nir_op_uadd_carry and _usub_borrow without accumulator.
This gets rid of two no16() fall-backs and should allow better
scheduling of the generated IR. There are no uses of usubBorrow() or
uaddCarry() in shader-db, so no changes are expected. However, the
"arb_gpu_shader5/execution/built-in-functions/fs-usubBorrow" and
"arb_gpu_shader5/execution/built-in-functions/fs-uaddCarry" piglit
tests go from 40 to 28 instructions. The reason is that the plain ADD
instruction can easily be CSE'ed with the original addition, and the
b2i negation can easily be propagated into the source modifier of
another instruction, so effectively both operations are performed with
just one instruction.
v2: Rely on carry_to_arith() and borrow_to_arith() to lower these
(Ilia Mirkin).
Reviewed-by: Matt Turner <[email protected]>
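
The arithmetic behind the lowering described in the commit message can be sketched as follows. This is only an illustration of the identity, assuming 32-bit unsigned operands; the function names `uadd_carry` and `usub_borrow` are hypothetical and this is not the Mesa lowering code itself. The carry is recovered by comparing the wrapped sum against one operand, and the borrow by comparing the two operands, so neither needs an accumulator register.

```cpp
#include <cstdint>

// Hypothetical helper: carry out of a 32-bit unsigned addition.
// The wrapped sum is smaller than an operand exactly when the
// addition overflowed, so a plain ADD plus a compare is enough.
static uint32_t uadd_carry(uint32_t x, uint32_t y)
{
   const uint32_t sum = x + y;   // wraps modulo 2^32
   return sum < x ? 1u : 0u;
}

// Hypothetical helper: borrow out of a 32-bit unsigned subtraction.
// x - y borrows exactly when x is smaller than y.
static uint32_t usub_borrow(uint32_t x, uint32_t y)
{
   return x < y ? 1u : 0u;
}
```

Because the sum in `uadd_carry` is an ordinary addition, a compiler can share it with the addition that produces the low word of the result, which is consistent with the CSE opportunity the commit message mentions.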
Diffstat (limited to 'src/mesa/drivers/dri/i965/brw_shader.cpp')
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_shader.cpp | 4 |
1 files changed, 3 insertions, 1 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
index 3e3d78b9ad7..d66baf34b38 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -259,7 +259,9 @@ process_glsl_ir(struct brw_context *brw,
                       EXP_TO_EXP2 |
                       LOG_TO_LOG2 |
                       bitfield_insert |
-                      LDEXP_TO_ARITH);
+                      LDEXP_TO_ARITH |
+                      CARRY_TO_ARITH |
+                      BORROW_TO_ARITH);
 
    /* Pre-gen6 HW can only nest if-statements 16 deep. Beyond this,
     * if-statements need to be flattened.