diff options
author | Kenneth Graunke <[email protected]> | 2016-04-13 16:58:10 -0700 |
---|---|---|
committer | Kenneth Graunke <[email protected]> | 2016-04-20 15:05:13 -0700 |
commit | 60a17d071825da4a06303cb699e4417edaaa6386 (patch) | |
tree | c663cb1996431193dfb4ed3d2b47306029310a03 /src/mesa | |
parent | 1aa28f3509b033e0f86510a6d4c7993fca650b3b (diff) |
i965: Properly handle integer types in opt_vector_float().
Previously, opt_vector_float() always interpreted MOV sources as
floating point, and always created a MOV with a F-type destination.
This meant that we could mess up sequences of integer loads, such as:
mov vgrf6.0.x:D, 0D
mov vgrf6.0.y:D, 1D
mov vgrf6.0.z:D, 2D
mov vgrf6.0.w:D, 3D
Here, integer 0/1/2/3 become approximately 0.0f, so we generated:
mov vgrf6.0:F, [0F, 0F, 0F, 0F]
which is clearly wrong. We can properly handle this by converting
integer values to float (rather than bitcasting), and emitting a type
converting MOV:
mov vgrf6.0:D, [0F, 1F, 2F, 3F]
To do this, see first see if the integer values (converted to float)
are representable. If so, we use a D-type MOV. If not, we then try
the floating point values and an F-type MOV. We make zero not impose
type restrictions. This is important because 0D would imply a D-type
MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D,
where we want to use an F-type MOV.
Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend. This
recently became visible due to changes in opt_vector_float() which
made it optimize more cases, but it was a pre-existing bug.
Apparently it also manages to turn more integer loads into VFs,
producing the following shader-db statistics on Haswell:
total instructions in shared programs: 7084195 -> 7082191 (-0.03%)
instructions in affected programs: 246027 -> 244023 (-0.81%)
helped: 1937
total cycles in shared programs: 65669642 -> 65651968 (-0.03%)
cycles in affected programs: 531064 -> 513390 (-3.33%)
helped: 1177
v2: Handle the type of zero better.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Diffstat (limited to 'src/mesa')
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_vec4.cpp | 22 |
1 files changed, 18 insertions, 4 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 32f4ef0a8ab..a2b35603b0e 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -361,9 +361,11 @@ vec4_visitor::opt_vector_float() int inst_count = 0; vec4_instruction *imm_inst[4]; unsigned writemask = 0; + enum brw_reg_type dest_type = BRW_REGISTER_TYPE_F; foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) { int vf = -1; + enum brw_reg_type need_type; /* Look for unconditional MOVs from an immediate with a partial * writemask. Skip type-conversion MOVs other than integer 0, @@ -375,23 +377,32 @@ vec4_visitor::opt_vector_float() inst->predicate == BRW_PREDICATE_NONE && inst->dst.writemask != WRITEMASK_XYZW && (inst->src[0].type == inst->dst.type || inst->src[0].d == 0)) { - vf = brw_float_to_vf(inst->src[0].f); + + vf = brw_float_to_vf(inst->src[0].d); + need_type = BRW_REGISTER_TYPE_D; + + if (vf == -1) { + vf = brw_float_to_vf(inst->src[0].f); + need_type = BRW_REGISTER_TYPE_F; + } } else { last_reg = -1; } /* If this wasn't a MOV, or the destination register doesn't match, - * then this breaks our sequence. Combine anything we've accumulated. + * or we have to switch destination types, then this breaks our + * sequence. Combine anything we've accumulated so far. */ if (last_reg != inst->dst.nr || last_reg_offset != inst->dst.reg_offset || - last_reg_file != inst->dst.file) { + last_reg_file != inst->dst.file || + (vf > 0 && dest_type != need_type)) { if (inst_count > 1) { unsigned vf; memcpy(&vf, imm, sizeof(vf)); vec4_instruction *mov = MOV(imm_inst[0]->dst, brw_imm_vf(vf)); - mov->dst.type = BRW_REGISTER_TYPE_F; + mov->dst.type = dest_type; mov->dst.writemask = writemask; inst->insert_before(block, mov); @@ -405,6 +416,7 @@ vec4_visitor::opt_vector_float() inst_count = 0; last_reg = -1; writemask = 0; + dest_type = BRW_REGISTER_TYPE_F; for (int i = 0; i < 4; i++) { imm[i] = 0; @@ -428,6 +440,8 @@ vec4_visitor::opt_vector_float() last_reg = inst->dst.nr; last_reg_offset = inst->dst.reg_offset; last_reg_file = inst->dst.file; + if (vf > 0) + dest_type = need_type; } } |