summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/vc4
diff options
context:
space:
mode:
authorJason Ekstrand <[email protected]>2019-05-05 00:13:20 -0500
committerJason Ekstrand <[email protected]>2019-05-10 09:10:17 -0500
commitf8bda81887219d9a56b5427c20be3e63b5c3d136 (patch)
tree9887041161b45cc91f97e432a8fbd2eec21ea2a4 /src/gallium/drivers/vc4
parent20bbc175a4eea5604357bae6690efa1bb1f37feb (diff)
intel/fs/copy-prop: Don't walk all the ACPs for each instruction
In order to set up KILL sets, the dataflow code was walking the entire array of ACPs for every instruction. If you assume the number of ACPs increases roughly with the number of instructions, this is O(n^2). As it turns out, regions_overlap() is not nearly as cheap as one would like and shows up as a significant chunk on perf traces. This commit changes things around and instead first builds an array of exec_lists which it uses like a hash table (keyed off ACP source or destination) similar to what's done in the rest of the copy-prop code. By first walking the list of ACPs and populating the table and then walking instructions and only looking at ACPs which probably have the same VGRF number, we can reduce the complexity to O(n). This takes the execution time of the piglit vs-isnan-dvec test from about 56.4 seconds on an unoptimized debug build (what we run in CI) with NIR_VALIDATE=0 to about 38.7 seconds. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
Diffstat (limited to 'src/gallium/drivers/vc4')
0 files changed, 0 insertions, 0 deletions