diff options
author | Kenneth Graunke <[email protected]> | 2015-11-17 18:24:11 -0800 |
---|---|---|
committer | Kenneth Graunke <[email protected]> | 2015-11-30 00:27:16 -0800 |
commit | 1ac1581f3889d5f7e6e231c05651f44fbd80f0b6 (patch) | |
tree | 1029357df4ff9548f608cb9dfcd12a1e1790581c /docs/conform.html | |
parent | d72299c5315b4c9ff40209a14543e3eae9308081 (diff) |
i965: Fix JIP to properly skip over unrelated control flow.
We've apparently always been botching JIP for sequences such as:
do
cmp.f0.0 ...
(+f0.0) break
...
if
...
else
...
endif
...
while
Normally, UIP is supposed to point to the final destination of the jump,
while in nested control flow, JIP is supposed to point to the end of the
current nesting level. It essentially bounces out of the current nested
control flow, to an instruction that has a JIP which bounces out another
level, and so on.
In the above example, when setting JIP for the BREAK, we call
brw_find_next_block_end(), which begins a search after the BREAK for the
next ENDIF, ELSE, WHILE, or HALT. It ignores the IF and finds the ELSE,
setting JIP there.
This makes no sense at all. The break is supposed to skip over the
whole if/else/endif block entirely. They have a sibling relationship,
not a nesting relationship.
This patch fixes brw_find_next_block_end() to track depth as it does
its search, and ignore anything not at depth 0. So when it sees the
IF, it ignores everything until after the ENDIF. That way, it finds
the end of the right block.
I noticed this while reading some assembly code. We believe jumping
earlier is harmless, but makes the EU walk through a bunch of disabled
instructions for no reason. I noticed that GLBenchmark Manhattan had
a shader that contained a BREAK with a bogus JIP, but didn't measure
any performance improvement (it's likely miniscule, if there is any).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Diffstat (limited to 'docs/conform.html')
0 files changed, 0 insertions, 0 deletions