author Matt Turner <[email protected]> 2017-01-20 13:35:32 -0800
committer Francisco Jerez <[email protected]> 2017-04-14 14:56:09 -0700
commit 21e8e3a8484241508ac2c250fc4367234fa337df
tree ee0d2896c624d88637ef141b791266680171d972 /src/intel/compiler
parent f030aaf2fb558219a43f286e2ea71c928e49b598
i965/vec4: Fix exec size for MOVs {SET,PICK}_{HIGH,LOW}_32BIT.
Otherwise for a pack_double_2x32_split opcode, we emit:

vec1 64 ssa_135 = pack_double_2x32_split ssa_133, ssa_134
mov(8) g5<1>UD g5<4>.xUD { align16 1Q compacted };
mov(8) g7<2>UD g5<4,4,1>UD { align1 1Q };
    ERROR: When the destination spans two registers, the source must span two registers
    (exceptions for scalar source and packed-word to packed-dword expansion)
mov(8) g8<2>UD g5.4<4,4,1>UD { align1 2N };
    ERROR: The offset from the two source registers must be the same
mov(8) g5<1>UD g6<4>.xUD { align16 1Q compacted };
mov(8) g7.1<2>UD g5<4,4,1>UD { align1 1Q };
    ERROR: When the destination spans two registers, the source must span two registers
    (exceptions for scalar source and packed-word to packed-dword expansion)
mov(8) g8.1<2>UD g5.4<4,4,1>UD { align1 2N };
    ERROR: The offset from the two source registers must be the same

The intention was to emit mov(4)s for the instructions that have ERROR
annotations.

See tests/spec/arb_gpu_shader_fp64/execution/vs-isinf-dvec.shader_test
for example.

v2 (Samuel):
- Instead of setting the exec size to a fixed value, don't double it (Curro).
- Add PICK_{HIGH,LOW}_32BIT to the condition.

Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
[ Francisco Jerez: Trivial rebase changes. ]
Reviewed-by: Francisco Jerez <[email protected]>
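The first ERROR annotation above can be checked with simple byte arithmetic. The sketch below is an illustration only, not Mesa code: `regs_spanned` and `kGrfBytes` are hypothetical names, a GRF is taken to be 32 bytes, and the `<4,4,1>` source region is modeled as a contiguous run of dwords (stride 1).

```cpp
// Hypothetical helper, not part of Mesa: counts how many 32-byte GRF
// registers a strided region touches.
constexpr unsigned kGrfBytes = 32;

// exec_size:  number of channels
// stride:     distance between channels, in elements
// type_bytes: element size (4 for UD)
// offset:     starting byte offset within the first register
constexpr unsigned regs_spanned(unsigned exec_size, unsigned stride,
                                unsigned type_bytes, unsigned offset = 0)
{
    // Index of the last byte touched, divided by the register size,
    // gives the index of the last register; add one for the count.
    return (offset + (exec_size - 1) * stride * type_bytes + type_bytes - 1)
           / kGrfBytes + 1;
}

// mov(8) g7<2>UD g5<4,4,1>UD: destination covers two registers,
// the contiguous source only one, which is what the validator rejects.
static_assert(regs_spanned(8, 2, 4) == 2, "dst of mov(8) spans two GRFs");
static_assert(regs_spanned(8, 1, 4) == 1, "src of mov(8) spans one GRF");

// With the intended mov(4), the destination fits in a single register.
static_assert(regs_spanned(4, 2, 4) == 1, "dst of mov(4) spans one GRF");
```

Under the same assumptions, the `g5.4` source starts at byte offset 16, so `regs_spanned(8, 1, 4, 16)` is 2 while `regs_spanned(4, 1, 4, 16)` is 1.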
Diffstat (limited to 'src/intel/compiler')
-rw-r--r-- src/intel/compiler/brw_vec4_generator.cpp 16
1 files changed, 12 insertions, 4 deletions
diff --git a/src/intel/compiler/brw_vec4_generator.cpp b/src/intel/compiler/brw_vec4_generator.cpp
index 09081588400..e786ac6a0ca 100644
--- a/src/intel/compiler/brw_vec4_generator.cpp
+++ b/src/intel/compiler/brw_vec4_generator.cpp
@@ -1526,11 +1526,19 @@ generate_code(struct brw_codegen *p,
       assert(inst->group % inst->exec_size == 0);
       assert(inst->group % 4 == 0);
+      /* There are some instructions where the destination is 64-bit
+       * but we retype it to a smaller type. In that case, we cannot
+       * double the exec_size.
+       */
+      const bool is_df = (get_exec_type_size(inst) == 8 ||
+                          inst->dst.type == BRW_REGISTER_TYPE_DF) &&
+                         inst->opcode != VEC4_OPCODE_PICK_LOW_32BIT &&
+                         inst->opcode != VEC4_OPCODE_PICK_HIGH_32BIT &&
+                         inst->opcode != VEC4_OPCODE_SET_LOW_32BIT &&
+                         inst->opcode != VEC4_OPCODE_SET_HIGH_32BIT;
+
       unsigned exec_size = inst->exec_size;
-      if (devinfo->gen == 7 &&
-          !devinfo->is_haswell &&
-          (get_exec_type_size(inst) == 8 ||
-           inst->dst.type == BRW_REGISTER_TYPE_DF))
+      if (devinfo->gen == 7 && !devinfo->is_haswell && is_df)
          exec_size *= 2;
       brw_set_default_exec_size(p, cvt(exec_size) - 1);
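
The decision made by the hunk above can be restated as a small standalone sketch. Everything here is a hypothetical stand-in for illustration (`Opcode`, `DevInfo`, `Inst`, `retypes_to_32bit`, `effective_exec_size`); the real code reads these values from `inst` and `devinfo` inside `generate_code()`. It assumes only what the diff shows: the exec size is doubled on gen7 non-Haswell for 64-bit instructions, except for the four PICK/SET opcodes.

```cpp
// Hypothetical stand-ins for the Mesa types; only the fields the check reads.
enum class Opcode {
    MOV,
    PICK_LOW_32BIT,
    PICK_HIGH_32BIT,
    SET_LOW_32BIT,
    SET_HIGH_32BIT
};

struct DevInfo {
    int gen;
    bool is_haswell;
};

struct Inst {
    Opcode opcode;
    unsigned exec_size;
    unsigned exec_type_size; // bytes; stands in for get_exec_type_size(inst)
    bool dst_is_df;          // stands in for dst.type == BRW_REGISTER_TYPE_DF
};

// Opcodes that touch a 64-bit value but are emitted with 32-bit types.
constexpr bool retypes_to_32bit(Opcode op)
{
    return op == Opcode::PICK_LOW_32BIT || op == Opcode::PICK_HIGH_32BIT ||
           op == Opcode::SET_LOW_32BIT  || op == Opcode::SET_HIGH_32BIT;
}

// Mirrors the patched condition: double only for true 64-bit execution
// on gen7 parts other than Haswell.
constexpr unsigned effective_exec_size(const DevInfo &dev, const Inst &inst)
{
    return (dev.gen == 7 && !dev.is_haswell &&
            (inst.exec_type_size == 8 || inst.dst_is_df) &&
            !retypes_to_32bit(inst.opcode))
               ? inst.exec_size * 2
               : inst.exec_size;
}

// A DF MOV is still doubled; SET_LOW_32BIT with a DF destination is not.
static_assert(effective_exec_size(DevInfo{7, false},
                                  Inst{Opcode::MOV, 4, 8, true}) == 8,
              "64-bit MOV is doubled on gen7 non-Haswell");
static_assert(effective_exec_size(DevInfo{7, false},
                                  Inst{Opcode::SET_LOW_32BIT, 4, 4, true}) == 4,
              "SET_LOW_32BIT keeps its exec size");
```

The second `static_assert` corresponds to the mov(4) the commit message says was intended for the instructions carrying ERROR annotations.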