path: root/src/gallium/drivers/radeonsi/si_perfcounter.c
author: Samuel Pitoiset <[email protected]> 2017-05-04 10:35:29 +0200
committer: Samuel Pitoiset <[email protected]> 2017-05-05 09:48:01 +0200
commit: 92ab06e782c31fe0209e5d0181967a2ff6739c9b (patch)
tree: 3b7a6872549b649428989e665a2bc9a0814cf7c2 /src/gallium/drivers/radeonsi/si_perfcounter.c
parent: 7761cf6d01e97aeb80606e51c11e4885a278ed54 (diff)
st/glsl_to_tgsi: fix renumber_registers() in presence of dead code
The TGSI DCE pass doesn't eliminate dead assignments like
MOV TEMP[0], TEMP[1] in the presence of loops because it assumes
that the visitor doesn't emit dead code. This assumption is
wrong, and the situation does happen.

However, it appears that the merge_registers() pass accidentally
takes care of this. But since that pass has been disabled for
RadeonSI and Nouveau, the renumber_registers() pass, which is
called *after* it, can't do its job correctly.

This is because it assumes that no dead code is present. If a
dead assignment remains, it might re-use the TEMP register id
incorrectly and emit wrong code.

This patch fixes the issue by recording writes instead of reads,
which has the advantage of being faster.

This should fix Unigine Heaven on RadeonSI and Nouveau.

shader-db results with RadeonSI:

47109 shaders in 29632 tests
Totals:
SGPRS: 1923308 -> 1923316 (0.00 %)
VGPRS: 1133843 -> 1133847 (0.00 %)
Spilled SGPRs: 2516 -> 2518 (0.08 %)
Spilled VGPRs: 65 -> 65 (0.00 %)
Private memory VGPRs: 1184 -> 1184 (0.00 %)
Scratch size: 1308 -> 1308 (0.00 %) dwords per thread
Code Size: 60095968 -> 60096256 (0.00 %) bytes
LDS: 1077 -> 1077 (0.00 %) blocks
Max Waves: 431889 -> 431889 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

It's still interesting to disable the merge_registers() pass.

Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
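
For illustration only, here is a minimal sketch of what renumbering "by writes" could look like. It is not the code from this patch: struct inst, renumber_by_writes(), MAX_TEMPS and NO_REG are hypothetical names, and the instruction model (one destination, two sources) is a simplification. It assumes every TEMP that is read is also written somewhere. The point is that a register written by a dead assignment still gets its own compact id, so it cannot collide with a live register.

```c
#include <assert.h>

#define MAX_TEMPS 16   /* hypothetical upper bound for this sketch */
#define NO_REG   (-1)  /* marks an unused operand slot */

/* Hypothetical, simplified instruction: one destination, up to two sources. */
struct inst {
    int dst;     /* TEMP index written, or NO_REG */
    int src[2];  /* TEMP indices read, or NO_REG */
};

/* Compact TEMP indices, keeping every register that is written at least
 * once, including registers written only by dead assignments.
 * Returns the number of registers in use after renumbering. */
static int renumber_by_writes(struct inst *code, int n, int num_temps)
{
    int new_index[MAX_TEMPS];
    int next = 0;

    assert(num_temps <= MAX_TEMPS);

    for (int r = 0; r < num_temps; r++)
        new_index[r] = NO_REG;

    /* Pass 1: mark every register that appears as a destination. */
    for (int i = 0; i < n; i++)
        if (code[i].dst != NO_REG)
            new_index[code[i].dst] = 0;

    /* Pass 2: hand out compact ids in increasing register order. */
    for (int r = 0; r < num_temps; r++)
        if (new_index[r] != NO_REG)
            new_index[r] = next++;

    /* Pass 3: rewrite destinations and sources with the new ids.
     * Assumption: a source that was never written is left untouched. */
    for (int i = 0; i < n; i++) {
        if (code[i].dst != NO_REG)
            code[i].dst = new_index[code[i].dst];
        for (int s = 0; s < 2; s++) {
            int r = code[i].src[s];
            if (r != NO_REG && new_index[r] != NO_REG)
                code[i].src[s] = new_index[r];
        }
    }
    return next;
}
```

With a write to TEMP[5], a dead MOV TEMP[2], TEMP[5], and an ADD TEMP[7], TEMP[5], TEMP[5], the sketch maps TEMP[2], TEMP[5] and TEMP[7] to three distinct ids. A scheme driven only by reads would never see TEMP[2], which is the kind of situation the commit message describes.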
Diffstat (limited to 'src/gallium/drivers/radeonsi/si_perfcounter.c')
0 files changed, 0 insertions, 0 deletions