author | Ben Widawsky <[email protected]> | 2014-12-04 15:37:17 -0800 |
---|---|---|
committer | Ben Widawsky <[email protected]> | 2014-12-05 12:12:46 -0800 |
commit | f13870db09d7a10141b5ffc24058bb2abceaa035 (patch) | |
tree | 100a5132fdec2bf8581a8ca880d0a109727cd846 /src/mesa/drivers | |
parent | 6f32deb538b1b62ff6d5d1212105bbe8d6adce72 (diff) |
i965/gs: Avoid DW * DW mul
The GS uses mul to compute URB write offsets. Because the GS can emit multiple
vertices per input vertex, and it also has a unique count at the top of the URB
payload, the GS unit needs to be able to dynamically specify URB write offsets
(relative to the global offset). The multiplier is an immediate that fits in 16
bits, so retype it from UD to UW and the multiply is no longer DW * DW. The
documentation in the function has a very good explanation from Paul on the
mechanics.
This fixes around 2000 piglit tests on BSW.
v2:
Reworded commit message (Ben); no mention of CHV (Matt)
Change SHRT_MAX to USHRT_MAX (Ken and Matt)
Update comment in code to reflect the use of UW (Ben)
Add Gen7+ assertion for the relevant GS code, since it won't work on Gen6- (Ken)
Drop the bogus hunk in emit_control_data_bits() (Ken)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84777 (with many dupes)
Cc: "10.4 10.3 10.2" <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Diffstat (limited to 'src/mesa/drivers')
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 8 |
1 files changed, 6 insertions, 2 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index adbb1617374..74fd8c29f28 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -529,13 +529,17 @@ vec4_generator::generate_gs_set_write_offset(struct brw_reg dst,
     *
     * We can do this with the following EU instruction:
     *
-    *     mul(2) dst.3<1>UD src0<8;2,4>UD src1   { Align1 WE_all }
+    *     mul(2) dst.3<1>UD src0<8;2,4>UD src1<...>UW   { Align1 WE_all }
     */
    brw_push_insn_state(p);
    brw_set_default_access_mode(p, BRW_ALIGN_1);
    brw_set_default_mask_control(p, BRW_MASK_DISABLE);
+   assert(brw->gen >= 7 &&
+          src1.file == BRW_IMMEDIATE_VALUE &&
+          src1.type == BRW_REGISTER_TYPE_UD &&
+          src1.dw1.ud <= USHRT_MAX);
    brw_MUL(p, suboffset(stride(dst, 2, 2, 1), 3), stride(src0, 8, 2, 4),
-           src1);
+           retype(src1, BRW_REGISTER_TYPE_UW));
    brw_set_default_access_mode(p, BRW_ALIGN_16);
    brw_pop_insn_state(p);
 }
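
The assertion in the hunk above guards the retype: a UD immediate can only be treated as UW if its value is at most USHRT_MAX, in which case the product is unchanged. A minimal standalone sketch of that reasoning follows. It is not driver code; `Imm`, `fits_in_uw` and `mul_dw_by_w` are hypothetical names invented for illustration, and ordinary C++ integer arithmetic stands in for the EU instruction.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical stand-in for an immediate operand; not the real brw_reg.
struct Imm {
   uint32_t ud;   // value as stored in a 32-bit (UD) immediate
};

// True when the UD immediate can be reinterpreted as a 16-bit (UW)
// operand without changing its value. Mirrors the USHRT_MAX check in
// the patch's assertion.
static bool fits_in_uw(Imm imm)
{
   return imm.ud <= USHRT_MAX;
}

// Models a DW * W multiply: the second operand is narrowed to 16 bits
// before multiplying. The result wraps modulo 2^32, as unsigned
// arithmetic does in C++.
static uint32_t mul_dw_by_w(uint32_t src0, Imm imm)
{
   assert(fits_in_uw(imm));
   uint16_t w = static_cast<uint16_t>(imm.ud);
   return src0 * static_cast<uint32_t>(w);
}
```

For any immediate that passes `fits_in_uw`, `mul_dw_by_w(x, imm)` equals `x * imm.ud`, so narrowing the operand changes the operand type but not the computed offset.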