summaryrefslogtreecommitdiffstats
path: root/src/util/register_allocate.c
Commit message (Collapse)AuthorAgeFilesLines
* util/ra: fix memory leakEric Engestrom2017-07-311-0/+2
| | | | | | | | | CID: 1415909 Fixes: 7a34a0e8903249c41fae "ra: Add a callback for selecting a register from what's available." Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ra: Add a callback for selecting a register from what's available.Eric Anholt2017-07-251-14/+76
| | | | | | | | | | | | | | | | VC4 has had a tension, similar to pre-Sandybridge Intel, where we want to use low-numbered registers (more parallelism on Intel, fewer delay slots on vc4), but in order to give instruction scheduling the most freedom to avoid delays we want to round-robin between registers of the same cost. Our two heuristics so far have chosen one end or the other of that tradeoff. The callback, instead, hands the driver the set of registers that are available, and the driver gets to make its own choice. This will be used in vc4 to round-robin between registers of the same cost, and might be used in the future for improving bank selection. Reviewed-by: Nicolai Hähnle <[email protected]>
* ra: Don't put a node in its own adjacency set.Eric Anholt2017-07-251-13/+10
| | | | | | | | All the paths looping over adjacency had guards against considering themselves (the non-obvious one was ra_any_neighbors_conflict(), which has in_stack set). Reviewed-by: Nicolai Hähnle <[email protected]>
* ra: Pull the body of a loop out to a helper function.Eric Anholt2017-07-251-12/+19
| | | | | | | I was going to indent this code another level, and decided it would be easier to read as a helper. Reviewed-by: Nicolai Hähnle <[email protected]>
* util/ra: (trivial) fix c99 loop variable initializationRoland Scheidegger2015-08-191-7/+8
| | | | Fails with old msvc otherwise.
* util/ra: Make allocating conflict lists optionalJason Ekstrand2015-08-181-9/+17
| | | | | | | | | Since i965 is now using make_reg_conflicts_transitive and doesn't need q-value computations, they are disabled on i965. They are enabled everywhere else so that they get the old behavior. This reduces the time spent in eglInitialize() on BDW by around 10-15%. Reviewed-by: Eric Anholt <[email protected]>
* util/ra: Add a function for making all conflicts on a register transitiveJason Ekstrand2015-08-181-0/+23
| | | | Reviewed-by: Eric Anholt <[email protected]>
* ra: Delete the conflict lists in ra_set_finalizeJason Ekstrand2015-08-101-0/+5
| | | | | | | They are never used after the set is finalized so there's no reason to keep them around. Reviewed-by: Matt Turner <[email protected]>
* ra: Refactor ra_set_finalizeJason Ekstrand2015-08-101-26/+25
| | | | | | | All this commit does is change an early return to an if with an else clause. Reviewed-by: Matt Turner <[email protected]>
* util: Avoid double promotion.Matt Turner2015-07-291-1/+1
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* ra: Disable round-robin strategy for optimistically colorable nodes.Francisco Jerez2015-02-231-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The round-robin allocation strategy is expected to decrease the amount of false dependencies created by the register allocator and give the post-RA scheduling pass more freedom to move instructions around. On the other hand it has the disadvantage of increasing fragmentation and decreasing the number of equally-colored nearby nodes, what increases the likelihood of failure in presence of optimistically colorable nodes. This patch disables the round-robin strategy for optimistically colorable nodes. These typically arise in situations of high register pressure or for registers with large live intervals, in both cases the task of the instruction scheduler shouldn't be constrained excessively by the dense packing of those nodes, and a spill (or on Intel hardware a fall-back to SIMD8 mode) is invariably worse than a slightly less optimal scheduling. Shader-db results on the i965 driver: total instructions in shared programs: 5488539 -> 5488489 (-0.00%) instructions in affected programs: 1121 -> 1071 (-4.46%) helped: 1 HURT: 0 GAINED: 49 LOST: 5 v2: Re-enable round-robin already for the lowest one of the nodes pushed optimistically onto the sack (Connor). v3: Use UINT_MAX instead of ~0, open-code MIN2 (Jason, Connor). Reviewed-by: Connor Abbott <[email protected]>
* util: Move Mesa's bitset.h to util/.Eric Anholt2015-02-201-1/+1
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* util: Silence signed-unsigned comparison warningsJan Vesely2014-12-171-6/+6
| | | | | Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* ra: Don't use regs as the ralloc context.Matt Turner2014-12-011-1/+1
| | | | | | | | | | The i965 backends pass something out of 'screen', which is allocated per-process, making using this as a ralloc context not thread-safe. All callers ra_alloc_interference_graph() already ralloc_free() its return value. Reviewed-by: Jason Ekstrand <[email protected]>
* util: Use reg_belongs_to_class instead of BITSET_TESTJason Ekstrand2014-10-241-1/+1
| | | | | | | | | This shouldn't be a functional change since reg_belongs_to_class is just a wrapper around BITSET_TEST. It just makes the code a little easier to read. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: Move register_allocate.c to util.Eric Anholt2014-09-231-0/+654
The r300 gallium driver is using it outside of the Mesa tree, and I wanted to do so for vc4 as well. Rather than make the multiple-definitions problem even more complicated, just move it to more-shared code. v2: Don't forget to delete the symlink in r300 (review by Matt). Delete more r300-helper references (review by Emil) Don't prefix util/ header inclusion with "util/" (review by Emil) Reviewed-by: Matt Turner <[email protected]> (v1) Reviewed-by: Emil Velikov <[email protected]> (v1)