summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorIago Toral Quiroga <[email protected]>2016-06-17 08:47:29 +0200
committerSamuel Iglesias Gonsálvez <[email protected]>2017-01-03 11:26:51 +0100
commitf4b8649233fa10e89205b6b5f6f334279b198f22 (patch)
tree7cda7818b5cc4c38e64df5e1b43eda55a2a978a4
parent5356d52f31b8285c97f2c920e28cf4eb2a8caa58 (diff)
i965/vec4: split double-precision SEL
There is a hardware bug affecting compressed double-precision SEL instructions in align16 mode by which they won't read predication mask properly. The bug does not affect other predicated instructions and it does not affect SEL in Align1 mode either. This was found empirically and verified by Curro in the simulator. Fix this by splitting double-precision SEL in Align16 mode to use an execution size of 4. v2: Check that the dst type is 64-bit, since we can have 16-wide single precision bcsel instructions that also write 2 registers. v3: Replace bcsel by SEL in all the comments as bcsel is the nir opcode but SEL is the actual assembly instruction (Matt). Reviewed-by: Matt Turner <[email protected]>
-rw-r--r--src/mesa/drivers/dri/i965/brw_vec4.cpp6
1 files changed, 6 insertions, 0 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 5fbd3fd8e88..35fabcfe65b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1997,6 +1997,12 @@ get_lowered_simd_width(const struct gen_device_info *devinfo,
* only hardware that implements fp64 in Align16.
*/
if (devinfo->gen == 7 && inst->size_written > REG_SIZE) {
+ /* Align16 8-wide double-precision SEL does not work well. Verified
+ * empirically.
+ */
+ if (inst->opcode == BRW_OPCODE_SEL && type_sz(inst->dst.type) == 8)
+ lowered_width = MIN2(lowered_width, 4);
+
/* HSW PRM, 3D Media GPGPU Engine, Region Alignment Rules for Direct
* Register Addressing:
*