| Commit message (Collapse) | Author | Age | Files | Lines |
| |
These instructions don't have pop count.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
The source can be a register as well as an immediate, and disassembling
a register as an immediate can have some strange results.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
The optimization relies on CMP setting the destination to 0, which is
equivalent to 0.0f. However, early platforms only set the least
significant byte, leaving the other bits undefined. So, we must disable
the optimization on those platforms.
Oddly, Sandybridge wasn't reported as broken. The PRM states that it
only sets the LSB, but the internal documentation says that it follows
the IVB behavior. Since it wasn't reported as broken, we believe it
really does follow the IVB behavior.
v2: Allow the optimization on Sandybridge (requested by Matt).
+32 piglits on Ironlake.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79963
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
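The failure mode described in this message can be illustrated with a small model. This is a sketch in Python, not driver code: the function names, the garbage bit pattern, and the choice of 0x01 as the "true" value in the low byte are all hypothetical. It assumes only what the message states, namely that early platforms define just the least significant byte of the CMP destination.

```python
def cmp_dest_early(condition: bool, garbage: int) -> int:
    """Hypothetical model of an early-platform CMP destination.

    Only the low byte is defined; the upper 24 bits keep whatever
    garbage was there before. The 0x01 "true" value is an assumption.
    """
    low_byte = 0x01 if condition else 0x00
    return (garbage & 0xFFFFFF00) | low_byte


def is_true_whole_register(value: int) -> bool:
    # Valid only if a false result is 0 in all 32 bits (equivalent to 0.0f).
    return value != 0


def is_true_low_byte(value: int) -> bool:
    # Looks only at the bits the early platforms actually define.
    return (value & 0xFF) != 0


false_result = cmp_dest_early(False, 0xDEADBEEF)
print(hex(false_result))                     # 0xdeadbe00
print(is_true_whole_register(false_result))  # True  (misread as "true")
print(is_true_low_byte(false_result))        # False (correct)
```

Any optimization that reads the whole destination as 0 or 0.0f is therefore only safe where every bit of a false result is zero.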
| |
Operating on this code,
  B0: ...
      cmp.ne.f0(8)
      (+f0) if(8)
  B1: break(8)
  B2: endif(8)
We can delete B2 without attempting to merge any blocks, since the
break/continue instruction necessarily ends the previous block.
After deleting the if instruction, we attempt to merge blocks B0 and B1.
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
This pass deletes an IF/ELSE/ENDIF or IF/ENDIF sequence, or the ELSE in
an ELSE/ENDIF sequence.
In the typical case (where the IF and ENDIF aren't the only instructions in
their basic blocks), we can simply remove the instructions (implicitly
deleting the block containing only the ELSE) and attempt to merge
blocks B0 and B2 together.
  B0: ...
      (+f0) if(8)
  B1: else(8)
  B2: endif(8)
      ...
If the IF or ENDIF instructions are the only instructions in their
respective basic blocks (which are deleted by the removal of the
instructions), we'll want to instead merge the next blocks.
Both B0 and B2 are possibly removed by the removal of if & endif.
Same situation for if/endif. E.g., in the following example we'd remove
blocks B1 and B2, and then attempt to combine B0 and B3.
  B0: ...
  B1: (+f0) if(8)
  B2: endif(8)
  B3: ...
Reviewed-by: Topi Pohjolainen <[email protected]>
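The three patterns this pass removes can be sketched on a flat list of opcodes. This is a simplified Python model, not the pass itself: it ignores basic blocks and block merging entirely, and the function name and string opcodes are hypothetical.

```python
def remove_dead_control_flow(insts):
    """Return a copy of `insts` with empty control flow removed.

    Handles the three cases named in the message:
      IF/ELSE/ENDIF -> removed entirely
      IF/ENDIF      -> removed entirely
      ELSE/ENDIF    -> only the ELSE is removed
    Repeats until nothing changes, so nested empty IFs collapse too.
    """
    out = list(insts)
    progress = True
    while progress:
        progress = False
        for i, op in enumerate(out):
            if op != "endif":
                continue
            if i >= 2 and out[i - 1] == "else" and out[i - 2] == "if":
                del out[i - 2:i + 1]   # empty IF/ELSE/ENDIF
            elif i >= 1 and out[i - 1] == "if":
                del out[i - 1:i + 1]   # empty IF/ENDIF
            elif i >= 1 and out[i - 1] == "else":
                del out[i - 1]         # useless ELSE before ENDIF
            else:
                continue
            progress = True
            break                      # list changed; rescan from the start
    return out


print(remove_dead_control_flow(["mov", "if", "else", "endif", "add"]))
# ['mov', 'add']
print(remove_dead_control_flow(["if", "mov", "else", "endif"]))
# ['if', 'mov', 'endif']
```

The real pass additionally has to decide which basic blocks to merge once the instructions are gone, which is the part the message above is concerned with.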
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
... rather than pointing directly to the associated instruction. This
will let us set the block containing the IF statement's else-pointer to
NULL, when we delete a useless ELSE instruction, as in the case
  (+f0) if(8)
  ...
  else(8)
  endif(8)
Also, remove the pointer to the ENDIF, since it's unused, and it was
also potentially wrong, in the case of a basic block containing both an
ENDIF and an IF instruction:
  endif(8)
  cmp.ne.f0(8) ...
  (+f0) if(8)
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
To avoid invalidating and recreating the control flow graph. Also stop
invalidating the CFG in places we didn't add or remove an instruction.
cfg calculations: 202951 -> 80307 (-60.43%)
Reviewed-by: Topi Pohjolainen <[email protected]>
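The percentage in the statistics line follows directly from the two counts. As a quick check, here is a small Python helper; the function name is hypothetical and not part of any Mesa tooling.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# cfg calculations: 202951 -> 80307
print(f"{percent_change(202951, 80307):.2f}%")  # -60.43%
```

The same formula applies to the "instructions in shared programs" and "instructions in affected programs" lines quoted in other messages in this log.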
| |
Will let us avoid invalidating the CFG if the optimization pass has
removed instructions using the new basic block methods.
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82846
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82929
| |
Remnant of the dri1 days.
Cc: Marek Olšák <[email protected]>
Cc: Michel Dänzer <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
| |
Remove the set-but-unused and set-but-empty vtable entries.
Most likely a leftover from the dri1 days.
Cc: Marek Olšák <[email protected]>
Cc: Michel Dänzer <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
| |
Both have been unused for at least a couple of years.
For example the last user of radeon_macros.h was removed with
commit 8c11f0a88300f7bc3f05a12789c781ba0f4b3cc6
Author: Eric Anholt <[email protected]>
Date: Fri Oct 14 13:27:02 2011 -0700
radeon: Drop the legacy BO manager code.
Cc: Marek Olšák <[email protected]>
Cc: Michel Dänzer <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
| |
Unlocking the texture is not safe: another thread could come in and grab
it. Now that we use a recursive mutex, this should work. This also fixes
texture lock deadlocks in the new meta fast clear path.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Tested-by: Chris Forbes <[email protected]>
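The reasoning about the recursive mutex can be shown with a small analogue. The sketch below uses Python's `threading.RLock` as a stand-in for the driver's recursive texture mutex; the lock and function names are hypothetical and do not correspond to Mesa symbols.

```python
import threading

# Hypothetical stand-in for the texture lock described above.
texture_lock = threading.RLock()


def upload_texture(log):
    # Takes the lock itself; safe even if the caller already holds it,
    # because an RLock may be re-acquired by the thread that owns it.
    with texture_lock:
        log.append("upload")


def fast_clear(log):
    with texture_lock:
        log.append("clear begin")
        # No unlock before the nested call, so no other thread can
        # grab the texture in between.
        upload_texture(log)
        log.append("clear end")


log = []
fast_clear(log)
print(log)  # ['clear begin', 'upload', 'clear end']
```

With a non-recursive lock the nested acquisition would block forever on the same thread, which is the kind of deadlock the message refers to.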
| |
total instructions in shared programs: 4288033 -> 4266151 (-0.51%)
instructions in affected programs: 930915 -> 909033 (-2.35%)
| |
The docs say "When performing a render target resolve, PIPE_CONTROL with end
of pipe sync must be delivered.", which doesn't actually tell us whether we
need to do it before or after. Blorp did it before and after, and doing it
before certainly makes sense. The resolve operation needs to read from the
MCS and if we don't flush the render cache it won't get up-to-date data.
On the other hand, doing it after should not be necessary, since we call
brw_render_cache_set_check_flush() after the resolve.
Fixes rendering corruption in kwin's cover switch effect and various Steam
games.
Missing flush spotted by Ken.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Signed-off-by: Kristian Høgsberg <[email protected]>
| |
The extension requires GL 3.0, so enable on just the generations
exposing that.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
| |
total instructions in shared programs: 4344280 -> 4288033 (-1.29%)
instructions in affected programs: 397468 -> 341221 (-14.15%)
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Acked-by: Brian Paul <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
The loop over all instructions is now two-fold, over all of the blocks
and all of the instructions in each block.
Reviewed-by: Topi Pohjolainen <[email protected]>
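The two-level iteration described here can be sketched as follows. This is an illustrative Python model only; the `Block` class and `all_instructions` helper are hypothetical and stand in for the driver's basic-block and instruction lists.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical basic block: a number plus its instructions."""
    num: int
    instructions: list = field(default_factory=list)


def all_instructions(blocks):
    # Outer loop over the blocks, inner loop over each block's instructions.
    for block in blocks:
        for inst in block.instructions:
            yield block.num, inst


cfg = [Block(0, ["cmp", "if"]), Block(1, ["break"]), Block(2, ["endif"])]
print(list(all_instructions(cfg)))
# [(0, 'cmp'), (0, 'if'), (1, 'break'), (2, 'endif')]
```

Iterating this way means each instruction is visited together with the block that owns it, instead of walking one flat instruction list.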
| |
Use this as an opportunity to rename 'block_num' to 'num'. block->num is
clear, and block->block_num has always been redundant.
| |
The next patch adds a foreach_block (block, cfg) macro, which works
better if it provides a direct bblock_t pointer, rather than a
bblock_link pointer that you have to use to find the actual block.
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Doesn't use fewer instructions, but it does avoid writing the flag
register, and if we want to switch the representation of true for Gen4/5
in the future, we can just delete the AND instruction.
| |
total instructions in shared programs: 4288650 -> 4282838 (-0.14%)
instructions in affected programs: 595018 -> 589206 (-0.98%)
Reviewed-by: Anuj Phogat <[email protected]>
| |
total instructions in shared programs: 4292303 -> 4288650 (-0.09%)
instructions in affected programs: 299670 -> 296017 (-1.22%)
Reviewed-by: Anuj Phogat <[email protected]>
| |
AND, OR, and XOR can generate the conditional code directly.
total instructions in shared programs: 4293335 -> 4292303 (-0.02%)
instructions in affected programs: 121408 -> 120376 (-0.85%)
Reviewed-by: Anuj Phogat <[email protected]>
| |
Reviewed-by: Anuj Phogat <[email protected]>
| |
Dead since the call to _mesa_generate_parameters_list_for_uniforms
was removed in commit 12751ef2. So this is why all of that code that
was supposed to fix up the value of a uniform bool wasn't running.
Reviewed-by: Anuj Phogat <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Acked-by: Kenneth Graunke <[email protected]>
| |
JIP/UIP were previously in units of compacted instructions. On Gen8
they're in units of bytes.
Reviewed-by: Kenneth Graunke <[email protected]>
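The unit change can be made concrete with a short conversion sketch. It is written in Python for illustration, with hypothetical helper names, and it assumes that a compacted instruction occupies 8 bytes (64 bits); that size is not stated in the message above.

```python
# Assumption: a compacted instruction is 64 bits, i.e. 8 bytes.
COMPACTED_INSTRUCTION_BYTES = 8


def jump_to_bytes(distance_in_compacted_insts: int) -> int:
    """Convert a JIP/UIP distance from compacted-instruction units to bytes."""
    return distance_in_compacted_insts * COMPACTED_INSTRUCTION_BYTES


def jump_to_compacted_insts(distance_in_bytes: int) -> int:
    """Convert a byte distance back to compacted-instruction units."""
    if distance_in_bytes % COMPACTED_INSTRUCTION_BYTES != 0:
        raise ValueError("jump distance is not instruction-aligned")
    return distance_in_bytes // COMPACTED_INSTRUCTION_BYTES


print(jump_to_bytes(6))              # 48
print(jump_to_compacted_insts(-32))  # -4
```

Under that assumption, any code that adjusts jump targets has to scale by this factor on Gen8 and leave the value unscaled on the earlier generations.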
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
If ->sys is non-null, we might decide that it's where the data is
stored.
Reviewed-by: Francisco Jerez <[email protected]>
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: <[email protected]>