| Commit message (Collapse) | Author | Age | Files | Lines |
| |
CID: 1415909
Fixes: 7a34a0e8903249c41fae "ra: Add a callback for selecting a register
from what's available."
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
VC4 has had a tension, similar to pre-Sandybridge Intel, where we want to
use low-numbered registers (more parallelism on Intel, fewer delay slots
on vc4), but in order to give instruction scheduling the most freedom to
avoid delays we want to round-robin between registers of the same cost.
Our two heuristics so far have chosen one end or the other of that
tradeoff.
The callback, instead, hands the driver the set of registers that are
available, and the driver gets to make its own choice. This will be used
in vc4 to round-robin between registers of the same cost, and might be
used in the future for improving bank selection.
Reviewed-by: Nicolai Hähnle <[email protected]>
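
A minimal sketch of what such a driver-side callback could look like, assuming the set of available registers is passed as a 32-bit mask. The names `select_reg_fn`, `rr_state`, and `rr_select_reg` are hypothetical and are not taken from the patch; the real callback signature may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback type: bit i of `available` is set when register i
 * is free; the callback returns the register it chooses. */
typedef unsigned (*select_reg_fn)(void *data, uint32_t available);

/* Hypothetical driver state for this sketch. */
struct rr_state {
   const uint8_t *cost;   /* cost[i] of register i, lower is better */
   unsigned num_regs;     /* between 1 and 32 in this sketch */
   unsigned last;         /* register handed out by the previous call */
};

/* Pick the cheapest available register. Among candidates of equal cost,
 * prefer the first one found after the previous pick (round-robin). */
static unsigned
rr_select_reg(void *data, uint32_t available)
{
   struct rr_state *s = data;
   unsigned best = s->num_regs; /* sentinel: nothing chosen yet */

   assert(s->num_regs >= 1 && s->num_regs <= 32);
   uint32_t mask = s->num_regs == 32 ? UINT32_MAX
                                     : (UINT32_C(1) << s->num_regs) - 1;
   assert((available & mask) != 0);

   /* Scan every register once, starting just after the previous pick. */
   for (unsigned i = 1; i <= s->num_regs; i++) {
      unsigned r = (s->last + i) % s->num_regs;
      if (!(available & (UINT32_C(1) << r)))
         continue;
      if (best == s->num_regs || s->cost[r] < s->cost[best])
         best = r;
   }

   s->last = best;
   return best;
}
```

Because the scan starts after the previous pick and only a strictly lower cost replaces the current candidate, cheap registers are always preferred, while ties rotate instead of always resolving to the lowest-numbered register.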
| |
All the paths looping over adjacency had guards against considering
themselves (the non-obvious one was ra_any_neighbors_conflict(), which has
in_stack set).
Reviewed-by: Nicolai Hähnle <[email protected]>
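
The kind of guard described above can be sketched as follows. This is an illustration only: `struct node`, `MAX_ADJ`, and `neighbor_uses_reg` are hypothetical names, and the real graph structures differ.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ADJ 8

/* Hypothetical node: the register it holds and its neighbor list,
 * which in this sketch may contain the node itself. */
struct node {
   unsigned reg;
   unsigned adjacency[MAX_ADJ];
   unsigned adjacency_count;
};

/* Returns true if any neighbor of node n, other than n itself,
 * already holds register r. */
static bool
neighbor_uses_reg(const struct node *nodes, unsigned n, unsigned r)
{
   const struct node *self = &nodes[n];

   assert(self->adjacency_count <= MAX_ADJ);

   for (unsigned i = 0; i < self->adjacency_count; i++) {
      unsigned n2 = self->adjacency[i];
      if (n2 == n)
         continue; /* guard: a node never conflicts with itself */
      if (nodes[n2].reg == r)
         return true;
   }
   return false;
}
```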
| |
I was going to indent this code another level, and decided it would be
easier to read as a helper.
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Fails with old msvc otherwise.
| |
Since i965 is now using make_reg_conflicts_transitive and doesn't need
q-value computations, they are disabled on i965. They remain enabled
everywhere else so that other drivers keep the old behavior. This reduces
the time spent in eglInitialize() on BDW by around 10-15%.
Reviewed-by: Eric Anholt <[email protected]>
| |
Reviewed-by: Eric Anholt <[email protected]>
| |
They are never used after the set is finalized so there's no reason to keep
them around.
Reviewed-by: Matt Turner <[email protected]>
| |
All this commit does is change an early return to an if with an else
clause.
Reviewed-by: Matt Turner <[email protected]>
| |
Reviewed-by: Iago Toral Quiroga <[email protected]>
| |
The round-robin allocation strategy is expected to decrease the number
of false dependencies created by the register allocator and give the
post-RA scheduling pass more freedom to move instructions around. On
the other hand, it has the disadvantage of increasing fragmentation and
decreasing the number of equally-colored nearby nodes, which increases
the likelihood of failure in the presence of optimistically colorable
nodes.
This patch disables the round-robin strategy for optimistically
colorable nodes. These typically arise in situations of high register
pressure or for registers with large live intervals. In both cases the
dense packing of those nodes shouldn't constrain the instruction
scheduler excessively, and a spill (or on Intel hardware a fall-back to
SIMD8 mode) is invariably worse than slightly less optimal scheduling.
Shader-db results on the i965 driver:
total instructions in shared programs: 5488539 -> 5488489 (-0.00%)
instructions in affected programs: 1121 -> 1071 (-4.46%)
helped: 1
HURT: 0
GAINED: 49
LOST: 5
v2: Re-enable round-robin already for the lowest one of the nodes
pushed optimistically onto the stack (Connor).
v3: Use UINT_MAX instead of ~0, open-code MIN2 (Jason, Connor).
Reviewed-by: Connor Abbott <[email protected]>
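
The idea can be sketched roughly as below, assuming the allocator records the lowest stack slot that was pushed optimistically and starts its search at register 0 for every node above that slot. All names here (`alloc_state`, `note_optimistic_push`, `pick_reg`, `NUM_REGS`) are hypothetical and the real allocator is structured differently.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NUM_REGS 4

/* Hypothetical allocator state for this sketch. */
struct alloc_state {
   unsigned stack_optimistic_start; /* lowest optimistic stack slot,
                                     * UINT_MAX when there is none */
   unsigned next_rr;                /* where the round-robin search resumes */
};

/* Record an optimistic push, keeping the lowest slot seen so far
 * (an open-coded minimum). */
static void
note_optimistic_push(struct alloc_state *s, unsigned stack_index)
{
   if (stack_index < s->stack_optimistic_start)
      s->stack_optimistic_start = stack_index;
}

/* in_use[r] is true when a neighbor already holds register r.
 * Returns the chosen register, or NUM_REGS if none is free. */
static unsigned
pick_reg(struct alloc_state *s, unsigned stack_index,
         const bool in_use[NUM_REGS])
{
   assert(s->next_rr < NUM_REGS);

   /* Nodes above the lowest optimistic slot are packed densely from
    * register 0. That slot and everything below it use round-robin. */
   bool round_robin = stack_index <= s->stack_optimistic_start;
   unsigned start = round_robin ? s->next_rr : 0;

   for (unsigned i = 0; i < NUM_REGS; i++) {
      unsigned r = (start + i) % NUM_REGS;
      if (!in_use[r]) {
         if (round_robin)
            s->next_rr = (r + 1) % NUM_REGS;
         return r;
      }
   }
   return NUM_REGS;
}
```

Initializing `stack_optimistic_start` to `UINT_MAX` means the comparison succeeds for every node when nothing was pushed optimistically, so round-robin stays fully enabled in that case.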
| |
Reviewed-by: Jose Fonseca <[email protected]>
| |
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
| |
The i965 backends pass something out of 'screen', which is allocated
per-process, so using it as a ralloc context is not thread-safe.
All callers of ra_alloc_interference_graph() already ralloc_free() its
return value.
Reviewed-by: Jason Ekstrand <[email protected]>
| |
This shouldn't be a functional change since reg_belongs_to_class is just a
wrapper around BITSET_TEST. It just makes the code a little easier to
read.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
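
To illustrate why such a wrapper is behavior-neutral, here is a minimal sketch. The `BITSET_TEST` macro below is a simplified stand-in written for this example, not Mesa's actual definition, and `struct reg_class` with its two-word `regs` array is a hypothetical layout.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Simplified stand-ins for the bitset helpers (assumption, not Mesa's code). */
typedef unsigned BITSET_WORD;
#define BITSET_WORDBITS (sizeof(BITSET_WORD) * CHAR_BIT)
#define BITSET_TEST(set, b) \
   (((set)[(b) / BITSET_WORDBITS] >> ((b) % BITSET_WORDBITS)) & 1u)

/* Hypothetical register class: one bit per register it contains. */
struct reg_class {
   BITSET_WORD regs[2];
};

/* Thin wrapper: reads as a question about the class, but does nothing
 * beyond testing one bit. */
static bool
reg_belongs_to_class(unsigned r, const struct reg_class *c)
{
   assert(r < 2 * BITSET_WORDBITS);
   return BITSET_TEST(c->regs, r);
}
```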
|
The r300 gallium driver is using it outside of the Mesa tree, and I wanted
to do so for vc4 as well. Rather than make the multiple-definitions
problem even more complicated, just move it to more-shared code.
v2: Don't forget to delete the symlink in r300 (review by Matt).
Delete more r300-helper references (review by Emil)
Don't prefix util/ header inclusion with "util/" (review by Emil)
Reviewed-by: Matt Turner <[email protected]> (v1)
Reviewed-by: Emil Velikov <[email protected]> (v1)
|