author | Kenneth Graunke <[email protected]> | 2019-02-15 11:00:39 -0800 |
---|---|---|
committer | Kenneth Graunke <[email protected]> | 2019-02-19 15:56:19 -0800 |
commit | ba7519ca36ce0de74657b01fe4fa2c97aace538e (patch) | |
tree | e3cc10318f323a2c90641f66d4300f1876524c40 /src/amd/common/ac_nir_to_llvm.c | |
parent | 9c4d5926aa9cf0f8f5e0e163845b20b83f02b515 (diff) |
radeonsi: Go back to using llvm.pow intrinsic for nir_op_fpow
ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL
leaves it undefined). Performing fpow lowering in NIR would break this
behavior, preventing us from using prog_to_nir.
According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common
expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>,
which presumably does a zero-wins multiply.
Lowering in NIR results in a non-legacy multiply, where:
    pow(0, 0) = 2^(log2(0) * 0)
              = 2^(-INF * 0)
              = 2^(-NaN)
              = -NaN
which isn't the desired result.
This reverts:
- commit d6b75392067712908bdc372f1007e085439bf9f5
(ac/nir: remove emission of nir_op_fpow)
- commit 22430224fec31591432d4a3e65c6f457ba1c1653
(radeonsi/nir: enable lowering of fpow)
and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir
after enabling prog_to_nir in st/mesa later in this series.
Reviewed-by: Timothy Arceri <[email protected]>
Diffstat (limited to 'src/amd/common/ac_nir_to_llvm.c')
-rw-r--r-- | src/amd/common/ac_nir_to_llvm.c | 4 |
1 files changed, 4 insertions, 0 deletions
diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c
index 8fafe7639c6..40a35c346e8 100644
--- a/src/amd/common/ac_nir_to_llvm.c
+++ b/src/amd/common/ac_nir_to_llvm.c
@@ -801,6 +801,10 @@ static void visit_alu(struct ac_nir_context *ctx, const nir_alu_instr *instr)
 			result = ac_build_intrinsic(&ctx->ac, "llvm.amdgcn.frexp.mant.f64",
 						    ctx->ac.f64, src, 1, AC_FUNC_ATTR_READNONE);
 		break;
+	case nir_op_fpow:
+		result = emit_intrin_2f_param(&ctx->ac, "llvm.pow",
+		                              ac_to_float_type(&ctx->ac, def_type), src[0], src[1]);
+		break;
 	case nir_op_fmax:
 		result = emit_intrin_2f_param(&ctx->ac, "llvm.maxnum",
 		                              ac_to_float_type(&ctx->ac, def_type), src[0], src[1]);