diff options
author | Ian Romanick <[email protected]> | 2011-08-03 15:42:05 -0700 |
---|---|---|
committer | Ian Romanick <[email protected]> | 2011-08-16 14:09:43 -0700 |
commit | ba01df11c4d09c65514a8522cb319e29034ab5a8 (patch) | |
tree | 830204cec99c9eff16dd6dde7c0571accc10437f | |
parent | e7bf096e8b04931996c8c56548ce0b2c0af3a0dc (diff) |
ir_to_mesa: Implement ir_binop_all_equal using DP4 w/SGE
The operation ir_binop_all_equal is !(a.x != b.x || a.y != b.y || a.z
!= b.z || a.w != b.w). Logical-or is implemented using addition
(followed by clampling to [0,1]) on values of 0.0 and 1.0. Replacing
the logical-or operators with addition gives !bool((int(a.x != b.x) +
int(a.y == b.y) + int(a.z == b.z) + int(a.w == b.w)). This can be
implemented using a dot-product with a vector of all 1.0. After the
dot-product, the value will be an integer on the range [0,4].
Previously a SEQ instruction was used to clamp the resulting logic
value to [0,1] and invert the result. Using an SGE instruction on the
negation of the dot-product result has the same effect. Many older
shader architectures do not support the SEQ instruction. It must be
emulated using two SGE instructions and a MUL. On these
architectures, the single SGE saves two instructions.
Reviewed-by: Eric Anholt <[email protected]>
-rw-r--r-- | src/mesa/program/ir_to_mesa.cpp | 13 |
1 files changed, 12 insertions, 1 deletions
diff --git a/src/mesa/program/ir_to_mesa.cpp b/src/mesa/program/ir_to_mesa.cpp index 1c674ea8756..4c8b097de6b 100644 --- a/src/mesa/program/ir_to_mesa.cpp +++ b/src/mesa/program/ir_to_mesa.cpp @@ -1237,8 +1237,19 @@ ir_to_mesa_visitor::visit(ir_expression *ir) ir->operands[1]->type->is_vector()) { src_reg temp = get_temp(glsl_type::vec4_type); emit(ir, OPCODE_SNE, dst_reg(temp), op[0], op[1]); + + /* After the dot-product, the value will be an integer on the + * range [0,4]. Zero becomes 1.0, and positive values become zero. + */ emit_dp(ir, result_dst, temp, temp, vector_elements); - emit(ir, OPCODE_SEQ, result_dst, result_src, src_reg_for_float(0.0)); + + /* Negating the result of the dot-product gives values on the range + * [-4, 0]. Zero becomes 1.0, and negative values become zero. This + * achieved using SGE. + */ + src_reg sge_src = result_src; + sge_src.negate = ~sge_src.negate; + emit(ir, OPCODE_SGE, result_dst, sge_src, src_reg_for_float(0.0)); } else { emit(ir, OPCODE_SEQ, result_dst, op[0], op[1]); } |