author | Iago Toral Quiroga <[email protected]> | 2015-08-13 15:36:05 -0700 |
---|---|---|
committer | Samuel Iglesias Gonsálvez <[email protected]> | 2017-01-03 11:26:50 +0100 |
commit | 558f27953101c438747c3e9d3c3f98ce21e79007 (patch) | |
tree | 87eb24c967dbd74cea33e126bee73faaa4c3b296 /src/mesa/drivers/dri/i965/brw_vec4.cpp | |
parent | 2d6eee3144ce16b39909522be466bdb3871f4c1b (diff) |
i965/vec4: add double/float conversion pseudo-opcodes
These need to be emitted as align1 MOVs, since they need to have a
stride of 2 on the float register (whether src or dest) so that data
from another thread doesn't cross the middle of a SIMD8 register.
v2 (Iago):
- The float-to-double needs to align 32-bit data to 64-bit before doing the
conversion. This was doable in align16 when we tried to use an execsize
of 4, but with an execsize of 8 we would need another align1 opcode to do
that (since we need data to cross the middle of a SIMD register). Just
making the opcode handle this internally seems more practical than adding
another opcode just for this purpose and having the caller know about this
before converting.
- The double-to-float conversion produces 32-bit elements aligned to 64-bit
so we make the opcode re-pack the result to 32-bit and fit in one register,
as expected by SIMD4x2 operation. This still requires that callers reserve
two registers for the float data destination because we need to produce
64-bit aligned data first, and repack it later on the same destination
register, but it saves the need for a re-pack opcode only to achieve this,
making the operation complete in a single opcode. Hopefully that is worth
the weirdness of the double register allocation...
Signed-off-by: Connor Abbott <[email protected]>
Signed-off-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
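
The data movement described in the v2 notes can be modelled on the CPU. The following is a minimal sketch, not the code the i965 generator emits: it assumes a register is 8 dwords wide, models the two-register destination as a flat array of 16 dwords, and uses hypothetical names (`RegPair`, `float_to_double`, `double_to_float`, and the `load_*`/`store_*` helpers) that do not exist in Mesa. It only illustrates why the float side ends up with a stride of 2 and why double-to-float needs a final re-pack.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4 && sizeof(double) == 8, "layout assumption");

// Hypothetical model: two adjacent 8-dword registers viewed as 16 dwords.
constexpr std::size_t kDwordsPerReg = 8;
using RegPair = std::array<std::uint32_t, 2 * kDwordsPerReg>;

inline float load_f32(const RegPair &r, std::size_t dword) {
   assert(dword < r.size());
   float f;
   std::memcpy(&f, &r[dword], sizeof f);
   return f;
}

inline void store_f32(RegPair &r, std::size_t dword, float f) {
   assert(dword < r.size());
   std::memcpy(&r[dword], &f, sizeof f);
}

inline double load_f64(const RegPair &r, std::size_t qword) {
   assert(2 * qword + 1 < r.size());
   double d;
   std::memcpy(&d, &r[2 * qword], sizeof d);
   return d;
}

inline void store_f64(RegPair &r, std::size_t qword, double d) {
   assert(2 * qword + 1 < r.size());
   std::memcpy(&r[2 * qword], &d, sizeof d);
}

// Float-to-double: 8 packed floats become 8 doubles spanning two registers.
inline RegPair float_to_double(const std::array<float, 8> &src) {
   RegPair dst{};
   // Step 1: align each 32-bit value to the start of a 64-bit slot (stride 2).
   for (std::size_t i = 0; i < src.size(); ++i)
      store_f32(dst, 2 * i, src[i]);
   // Step 2: widen each value in place; slot i only depends on itself.
   for (std::size_t i = 0; i < src.size(); ++i)
      store_f64(dst, i, static_cast<double>(load_f32(dst, 2 * i)));
   return dst;
}

// Double-to-float: 8 doubles become 8 floats packed into the first register.
inline RegPair double_to_float(const std::array<double, 8> &src) {
   RegPair dst{};
   // Step 1: the conversion leaves each float in a 64-bit aligned slot,
   // which is why two destination registers have to be reserved.
   for (std::size_t i = 0; i < src.size(); ++i)
      store_f32(dst, 2 * i, static_cast<float>(src[i]));
   // Step 2: re-pack to a stride of 1. Ascending order is safe because the
   // read index 2*i is never below the write index i.
   for (std::size_t i = 0; i < src.size(); ++i)
      dst[i] = dst[2 * i];
   return dst;
}
```

After `double_to_float`, only the first 8 dwords of the pair hold meaningful data; the second register still contains leftovers from step 1, which mirrors the "weirdness of the double register allocation" mentioned in the commit message.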
Diffstat (limited to 'src/mesa/drivers/dri/i965/brw_vec4.cpp')
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_vec4.cpp | 8 |
1 files changed, 8 insertions, 0 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index bf36cacb0b7..3f3fd6bbcf3 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -253,6 +253,8 @@ vec4_instruction::can_do_writemask(const struct gen_device_info *devinfo)
 {
    switch (opcode) {
    case SHADER_OPCODE_GEN4_SCRATCH_READ:
+   case VEC4_OPCODE_DOUBLE_TO_FLOAT:
+   case VEC4_OPCODE_FLOAT_TO_DOUBLE:
    case VS_OPCODE_PULL_CONSTANT_LOAD:
    case VS_OPCODE_PULL_CONSTANT_LOAD_GEN7:
    case VS_OPCODE_SET_SIMD4X2_HEADER_GEN9:
@@ -505,6 +507,12 @@ vec4_visitor::opt_reduce_swizzle()
       case BRW_OPCODE_DP2:
          swizzle = brw_swizzle_for_size(2);
          break;
+
+      case VEC4_OPCODE_FLOAT_TO_DOUBLE:
+      case VEC4_OPCODE_DOUBLE_TO_FLOAT:
+         swizzle = brw_swizzle_for_size(4);
+         break;
+
       default:
          swizzle = brw_swizzle_for_mask(inst->dst.writemask);
          break;