author | Kenneth Graunke <[email protected]> | 2017-09-24 14:24:53 -0700 |
---|---|---|
committer | Kenneth Graunke <[email protected]> | 2017-09-26 15:35:11 -0700 |
commit | 66342c997fb7892034e537d1456c37e6981c2fdb (patch) | |
tree | 238d25d7c09f02b6c9cd2d7d6d893b250c389fa5 /src | |
parent | a62fe340981b56fe54e49d3a6791e568f7f87554 (diff) |
i965/vec4: Fix swizzles on atomic sources.

Atomic operation sources are scalar values, but we were failing to
select the .x component of the second operand. For example,

   atomicCounterCompSwapARB(counter, 5u, 10u)

would generate

   mov(8) vgrf4.x:D, 5D
   mov(8) vgrf5.x:D, 10D
   mov(8) vgrf9.x:UD, vgrf4.xyzw:D
   mov(8) vgrf9.y:UD, vgrf5.xyzw:D

which wrongly selects the .y component of vgrf5, so the actual 10u value
would get dead code eliminated. The swizzle works for the other source,
but both of them ought to be .xxxx.

Fixes the compare and swap CTS tests in:
KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase

Cc: "17.2 17.1 17.0 13.0" <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Diffstat (limited to 'src')
-rw-r--r-- | src/intel/compiler/brw_vec4_surface_builder.cpp | 13 |
1 files changed, 9 insertions, 4 deletions
diff --git a/src/intel/compiler/brw_vec4_surface_builder.cpp b/src/intel/compiler/brw_vec4_surface_builder.cpp
index 5029cdce558..0e02aaf933a 100644
--- a/src/intel/compiler/brw_vec4_surface_builder.cpp
+++ b/src/intel/compiler/brw_vec4_surface_builder.cpp
@@ -212,10 +212,15 @@ namespace brw {
          const unsigned size = (src0.file != BAD_FILE) + (src1.file != BAD_FILE);
          const dst_reg srcs = bld.vgrf(BRW_REGISTER_TYPE_UD);
 
-         if (size >= 1)
-            bld.MOV(writemask(srcs, WRITEMASK_X), src0);
-         if (size >= 2)
-            bld.MOV(writemask(srcs, WRITEMASK_Y), src1);
+         if (size >= 1) {
+            bld.MOV(writemask(srcs, WRITEMASK_X),
+                    swizzle(src0, BRW_SWIZZLE_XXXX));
+         }
+
+         if (size >= 2) {
+            bld.MOV(writemask(srcs, WRITEMASK_Y),
+                    swizzle(src1, BRW_SWIZZLE_XXXX));
+         }
 
          return emit_send(bld, SHADER_OPCODE_UNTYPED_ATOMIC, src_reg(),
                           emit_insert(bld, addr, dims, has_simd4x2),
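
The interaction between writemask and source swizzle described in the commit message can be illustrated with a small standalone model. This is only a sketch, not Mesa code: `Vec4`, `Swizzle`, `masked_mov`, `pack_sources` and the `MASK_*`/`SWIZZLE_*` constants are hypothetical names invented for the illustration, and it assumes the usual vec4 semantics in which an enabled destination channel i reads the source channel that the swizzle names for position i.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical toy model of a 4-component register; not Mesa's actual types.
using Vec4 = std::array<uint32_t, 4>;

// A swizzle names, for each destination channel, the source channel to read.
using Swizzle = std::array<int, 4>;
constexpr Swizzle SWIZZLE_XYZW = {0, 1, 2, 3};  // identity (the old behaviour)
constexpr Swizzle SWIZZLE_XXXX = {0, 0, 0, 0};  // replicate .x (the fix)

// Writemask bits, one per destination channel.
constexpr unsigned MASK_X = 1u << 0;
constexpr unsigned MASK_Y = 1u << 1;

// Model of a write-masked MOV: each enabled destination channel i receives
// src[swz[i]]; channels that are not in the mask are left untouched.
void masked_mov(Vec4 &dst, unsigned mask, const Vec4 &src, const Swizzle &swz)
{
   for (int i = 0; i < 4; i++) {
      assert(swz[i] >= 0 && swz[i] < 4);
      if (mask & (1u << i))
         dst[i] = src[swz[i]];
   }
}

// Packs two scalar operands into the .x and .y channels of one register,
// mirroring the two MOVs in the patch under the assumptions stated above.
Vec4 pack_sources(const Vec4 &src0, const Vec4 &src1, const Swizzle &swz)
{
   Vec4 srcs = {0, 0, 0, 0};
   masked_mov(srcs, MASK_X, src0, swz);
   masked_mov(srcs, MASK_Y, src1, swz);
   return srcs;
}
```

In this model, if only the .x channel of each operand holds a defined value (5 and 10 in the commit message's example), packing with `SWIZZLE_XYZW` copies the undefined .y channel of the second operand into `srcs.y`, whereas packing with `SWIZZLE_XXXX` places 5 in `srcs.x` and 10 in `srcs.y`.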