author | Lionel Landwerlin <[email protected]> | 2019-02-27 15:53:21 +0000 |
---|---|---|
committer | Lionel Landwerlin <[email protected]> | 2019-02-27 20:06:42 +0000 |
commit | 6e184147ddce11e90c269a47af7d7395f5ed9c12 (patch) | |
tree | e9b34fd9f529bfe6d54644ed15c39189449535ea /src/intel/compiler | |
parent | 61e31886339b167bc85c48521664e456f0cfcf8e (diff) |
intel/compiler: use correct swizzle for replacement
The optimization in 4cd1a0be76883c introduced a replacement of:
cmp(8).z.f0.0 vgrf11.y:D, vgrf10.xxxx:D, vgrf2.xyyy:D
...
cmp(8).nz.f0.0 null.x:D, vgrf11.yyyy:D, 0D
By:
cmp(8).z.f0.0 vgrf15.x:D, vgrf10.xxxx:D, vgrf2.yyyy:D
...
mov(8) vgrf11.y:D, vgrf15.yyyy:D
The first cmp instruction stores its result in x, while the mov that
follows sources from y. We need to take into account where the
replacement of the scan_inst destination is going to store its result,
so that the replacement mov sources from the correct location.
Signed-off-by: Lionel Landwerlin <[email protected]>
Fixes: 4cd1a0be76883c ("i965/vec4: Propagate conditional modifiers from more compares to other compares")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109759
Reviewed-by: Ian Romanick <[email protected]>
Diffstat (limited to 'src/intel/compiler')
-rw-r--r-- | src/intel/compiler/brw_vec4_cmod_propagation.cpp | 10 |
1 files changed, 5 insertions, 5 deletions
diff --git a/src/intel/compiler/brw_vec4_cmod_propagation.cpp b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
index 760327d559d..a7a3bb8fb06 100644
--- a/src/intel/compiler/brw_vec4_cmod_propagation.cpp
+++ b/src/intel/compiler/brw_vec4_cmod_propagation.cpp
@@ -173,19 +173,19 @@ opt_cmod_propagation_local(bblock_t *block, vec4_visitor *v)
 /* Given a sequence like:
  *
- *    cmp.ge.f0(8)  g21<1>.xF  g20<4>.xF  g18<4>.xF
+ *    cmp.ge.f0(8)  g21<1>.zF  g20<4>.xF  g18<4>.xF
  *    ...
- *    cmp.nz.f0(8)  null<1>D  g21<4>.xD  0D
+ *    cmp.nz.f0(8)  null<1>D  g21<4>.zD  0D
  *
  * Replace it with something like:
  *
- *    cmp.ge.f0(8)  g22<1>F  g20<4>.xF  g18<4>.xF
- *    mov(8)  g21<1>.xF  g22<1>.xxxxF
+ *    cmp.ge.f0(8)  g22<1>.zF  g20<4>.xF  g18<4>.xF
+ *    mov(8)  g21<1>.xF  g22<1>.zzzzF
  *
  * The added MOV will most likely be removed later. In the
  * worst case, it should be cheaper to schedule.
  */
-temp.swizzle = inst->src[0].swizzle;
+temp.swizzle = brw_swizzle_for_mask(inst->dst.writemask);
 temp.type = scan_inst->src[0].type;
 vec4_instruction *mov = v->MOV(scan_inst->dst, temp);
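The one-line fix derives the temporary's swizzle from the destination writemask rather than copying the swizzle of the original source. The sketch below illustrates that idea in isolation. It is not the Mesa implementation: `make_swizzle`, `swizzle_for_mask`, and `swizzle_to_string` are hypothetical stand-ins, and the encodings (a 4-bit writemask with bit 0 = x, a swizzle packed as four 2-bit channel selectors) are assumptions made for illustration.

```cpp
#include <string>

// Assumed encoding: pack four 2-bit channel selectors (0 = x .. 3 = w),
// lowest bits first. Hypothetical helper, not a Mesa macro.
inline unsigned make_swizzle(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return a | (b << 2) | (c << 4) | (d << 6);
}

// Build a swizzle that only reads channels enabled in `mask`.
// An enabled channel selects itself; a disabled channel repeats the
// nearest enabled channel below it, or the first enabled channel if
// there is none below. An empty mask falls back to x.
inline unsigned swizzle_for_mask(unsigned mask)
{
    unsigned last = 0;
    for (unsigned i = 0; i < 4; i++) {
        if (mask & (1u << i)) {
            last = i;  // first enabled channel
            break;
        }
    }

    unsigned swz[4] = {0, 0, 0, 0};
    for (unsigned i = 0; i < 4; i++) {
        if (mask & (1u << i))
            last = i;
        swz[i] = last;
    }
    return make_swizzle(swz[0], swz[1], swz[2], swz[3]);
}

// Render a packed swizzle as text, e.g. "zzzz".
inline std::string swizzle_to_string(unsigned swz)
{
    static const char names[] = {'x', 'y', 'z', 'w'};
    std::string out;
    for (unsigned i = 0; i < 4; i++)
        out += names[(swz >> (2 * i)) & 0x3];
    return out;
}
```

With a writemask of x only (`0x1`), `swizzle_to_string(swizzle_for_mask(0x1))` yields `xxxx`, which is the channel the rewritten cmp stores into in the commit message's example (`vgrf15.x`). With a writemask of z only (`0x4`) it yields `zzzz`, matching the updated comment in the diff.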