author | Samuel Pitoiset <[email protected]> | 2020-04-14 09:42:48 +0200 |
---|---|---|
committer | Samuel Pitoiset <[email protected]> | 2020-04-15 10:12:44 +0200 |
commit | 08a396033be1d7ceddf48da0563a7e4d2cb64429 (patch) | |
tree | 4494456822a6c6c1b96e93f30ad80071da513df7 /src | |
parent | 9bf8e923863230914f6bf2a4abcf257cb8778ee7 (diff) |
aco: fix nir_op_frexp_exp with 16-bit floats and negative exponents
v_frexp_exp_i16_f16 returns negative exponents as 16-bit two's
complement values. For example, with 0.333252 it returns 0.666504 for
the mantissa and 65535 (0xFFFF) for the exponent, which is -1 when
read as a signed 16-bit integer.
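The effect described above can be reproduced on the CPU. The following is a minimal sketch, not the compiler's code: the helper names `frexp_exp_bits16`, `zero_extend16`, and `sign_extend16` are hypothetical, and `std::frexp` on a 32-bit float stands in for the hardware instruction. It shows why copying the 16-bit result into a 32-bit value without sign extension yields 65535 instead of -1.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: the exponent std::frexp reports for x, truncated to
// the 16-bit pattern a 16-bit result would hold (two's complement).
inline std::uint16_t frexp_exp_bits16(float x) {
    int exp = 0;
    std::frexp(x, &exp);  // x == mantissa * 2^exp, with |mantissa| in [0.5, 1)
    return static_cast<std::uint16_t>(exp);  // modular conversion: -1 -> 0xFFFF
}

// Zero extension keeps the raw bit pattern: 0xFFFF stays 65535.
inline std::int32_t zero_extend16(std::uint16_t bits) {
    return static_cast<std::int32_t>(bits);
}

// Sign extension reads the 16 bits as two's complement: 0xFFFF becomes -1.
// The xor/subtract form avoids relying on narrowing-cast behavior.
inline std::int32_t sign_extend16(std::uint16_t bits) {
    return static_cast<std::int32_t>(bits ^ 0x8000u) - 0x8000;
}

// Small self-check using the value from the commit message.
inline void frexp_self_check() {
    const std::uint16_t bits = frexp_exp_bits16(0.333252f);
    assert(zero_extend16(bits) == 65535);  // what a plain 32-bit read sees
    assert(sign_extend16(bits) == -1);     // the intended exponent
}
```

Calling `frexp_self_check()` exercises both interpretations of the same 16 bits.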
RADV/LLVM and AMDVLK do a v_bfe_i32 and AMDGPU-PRO uses SDWA with
the sign extension bit set. The latter is probably what we want to
do in the long term, but for now RA doesn't support changing non-SDWA
instructions to SDWA when that would be useful or needed.
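The two strategies mentioned here can be compared with a small CPU model. This is a sketch under stated assumptions: `bfe_i32` and `mov_sbyte0` are hypothetical helpers that only imitate the idea of a signed bitfield extract and of a move with a signed-byte source select; they are not the ISA definitions. The sketch also assumes the upper 16 bits of the 32-bit register holding the result are zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a signed bitfield extract: take `width` bits of
// `value` starting at bit `offset` and sign-extend them to 32 bits.
// Assumes 1 <= width <= 32 and offset + width <= 32.
inline std::int32_t bfe_i32(std::uint32_t value, unsigned offset, unsigned width) {
    const std::uint64_t mask = (std::uint64_t{1} << width) - 1;
    const std::uint64_t field = (value >> offset) & mask;
    const std::uint64_t sign = std::uint64_t{1} << (width - 1);
    // (field ^ sign) - sign maps the top bit of the field to the sign.
    return static_cast<std::int32_t>(static_cast<std::int64_t>(field ^ sign) -
                                     static_cast<std::int64_t>(sign));
}

// Hypothetical model of a move that selects byte 0 of the source as a
// signed value and writes a full 32-bit result.
inline std::int32_t mov_sbyte0(std::uint32_t value) {
    return bfe_i32(value, 0, 8);
}

// Small self-check: both approaches recover -1 from the pattern 0xFFFF.
inline void sign_extend_self_check() {
    const std::uint32_t raw = 0x0000FFFFu;  // 16-bit result, upper bits assumed zero
    assert(bfe_i32(raw, 0, 16) == -1);
    assert(mov_sbyte0(raw) == -1);
}
```

Sign-extending only the low byte gives the same answer as sign-extending all 16 bits as long as the value fits in a signed byte, which holds for the exponents `frexp` can produce for a 16-bit float.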
Fixes dEQP-VK.glsl.builtin.precision_fp16_storage16b.frexp.compute.*.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4546>
Diffstat (limited to 'src')
-rw-r--r-- | src/amd/compiler/aco_instruction_selection.cpp | 7 |
1 files changed, 6 insertions, 1 deletions
diff --git a/src/amd/compiler/aco_instruction_selection.cpp b/src/amd/compiler/aco_instruction_selection.cpp
index bc973a05e5a..61a2d994d84 100644
--- a/src/amd/compiler/aco_instruction_selection.cpp
+++ b/src/amd/compiler/aco_instruction_selection.cpp
@@ -2114,7 +2114,12 @@ void visit_alu_instr(isel_context *ctx, nir_alu_instr *instr)
       Temp src = get_alu_src(ctx, instr->src[0]);
       if (instr->src[0].src.ssa->bit_size == 16) {
          Temp tmp = bld.vop1(aco_opcode::v_frexp_exp_i16_f16, bld.def(v1), src);
-         bld.pseudo(aco_opcode::p_extract_vector, Definition(dst), tmp, Operand(0u));
+         aco_ptr<SDWA_instruction> sdwa{create_instruction<SDWA_instruction>(aco_opcode::v_mov_b32, asSDWA(Format::VOP1), 1, 1)};
+         sdwa->operands[0] = Operand(tmp);
+         sdwa->definitions[0] = Definition(dst);
+         sdwa->sel[0] = sdwa_sbyte;
+         sdwa->dst_sel = sdwa_sdword;
+         ctx->block->instructions.emplace_back(std::move(sdwa));
       } else if (instr->src[0].src.ssa->bit_size == 32) {
          bld.vop1(aco_opcode::v_frexp_exp_i32_f32, Definition(dst), src);
       } else if (instr->src[0].src.ssa->bit_size == 64) {