author | Jason Ekstrand <[email protected]> | 2019-05-09 14:44:16 -0500 |
---|---|---|
committer | Jason Ekstrand <[email protected]> | 2019-05-14 12:30:22 -0500 |
commit | 621232694176ea83752505643b106c8d1c719893 (patch) | |
tree | 1d38d58f72eff779ea151b52ba4718b528b7ab99 /src/intel | |
parent | 41b310e2196a89a2fdd05509f8160b207d0e4d9b (diff) |
intel/fs: Stop doing extra RA calls
In the last phase of the schedule and RA loop, the RA call is redundant
if we spill. Immediately afterwards, we're going to see that we
couldn't allocate without spilling and call back into RA and tell it to
go ahead and spill. We've known about it for a while but we've always
brushed over it on the theory that, if you're going to spill, you'll be
calling RA a bunch anyway and what does one extra RA hurt? As it turns
out, it hurts more than you'd expect. Because the RA interference graph
gets sparser with each spill and the RA algorithm is more efficient on
sparser graphs, the RA call that we're duplicating is actually the most
expensive call in the RA-and-spill loop.
There's another extra RA call, a bit harder to see, which this also
removes. If we compile a shader at a dispatch width above the minimum
and it fails to allocate without spilling, we call fail() to set an
error but then go ahead and do the first spilling RA pass anyway; only
after that completes do we detect the failure and bail out. By making
the minimum dispatch width part of the spill condition, we side-step
this problem.
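To make the two redundant calls concrete, here is a rough sketch that counts RA invocations under the old and new loop shapes. It is a toy model, not the driver code: ToyAllocator, count_calls_old and count_calls_new are hypothetical names, and the assumption that each failed spilling call spills once and asks for a retry is a simplification of what assign_regs() does.

```cpp
#include <cassert>

// Hypothetical stand-in for the register allocator (an assumption, not the
// real assign_regs()): allocation succeeds only after `spills_needed` spill
// rounds, and a spill round only happens when the caller allows spilling.
struct ToyAllocator {
    int spills_needed;
    int spills_done = 0;
    int ra_calls = 0;  // number of times assign_regs() was invoked

    explicit ToyAllocator(int n) : spills_needed(n) {}

    bool assign_regs(bool allow_spilling) {
        ++ra_calls;
        if (spills_done >= spills_needed)
            return true;      // everything fits
        if (allow_spilling)
            ++spills_done;    // spill once; the caller has to retry
        return false;
    }
};

// Old shape: every schedule mode tries without spilling, then a separate
// loop spills. A non-minimum dispatch width sets `failed` up front but
// still pays for one spilling RA call before the loop notices.
int count_calls_old(int num_modes, int spills_needed, bool at_min_width) {
    assert(num_modes > 0);
    ToyAllocator ra(spills_needed);
    bool allocated = false;
    for (int i = 0; i < num_modes && !allocated; i++)
        allocated = ra.assign_regs(false);
    if (!allocated) {
        bool failed = !at_min_width;
        while (!ra.assign_regs(true)) {
            if (failed)
                break;
        }
    }
    return ra.ra_calls;
}

// New shape: the last schedule mode is allowed to spill directly, and only
// when we are at the minimum dispatch width.
int count_calls_new(int num_modes, int spills_needed, bool at_min_width) {
    assert(num_modes > 0);
    ToyAllocator ra(spills_needed);
    bool allocated = false;
    for (int i = 0; i < num_modes; i++) {
        bool can_spill = (i == num_modes - 1) && at_min_width;
        do {
            allocated = ra.assign_regs(can_spill);
        } while (!allocated && can_spill);
        if (allocated)
            break;
    }
    return ra.ra_calls;
}
```

With three schedule modes and a shader that needs two spill rounds, the old shape makes six RA calls and the new one five at the minimum dispatch width. Above the minimum width the counts are four and three. A shader that allocates on the first try costs one call either way.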
Getting rid of these extra RA calls takes the compile time of a nasty
Aztec Ruins shader from about 28 seconds to about 26 seconds on my
laptop. It also makes a shader-db run about 1.5% faster.
Shader-db results on Kaby Lake:
total instructions in shared programs: 15311100 -> 15311100 (0.00%)
instructions in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 355468050 -> 355468050 (0.00%)
cycles in affected programs: 0 -> 0
helped: 0
HURT: 0
Total CPU time (seconds): 2524.31 -> 2486.63 (-1.49%)
Reviewed-by: Kenneth Graunke <[email protected]>
Diffstat (limited to 'src/intel')
-rw-r--r-- | src/intel/compiler/brw_fs.cpp | 46 |
1 files changed, 27 insertions, 19 deletions
diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
index 66ee7605bea..f9fbffca7ce 100644
--- a/src/intel/compiler/brw_fs.cpp
+++ b/src/intel/compiler/brw_fs.cpp
@@ -7246,7 +7246,7 @@ fs_visitor::fixup_3src_null_dest()
 void
 fs_visitor::allocate_registers(unsigned min_dispatch_width, bool allow_spilling)
 {
-   bool allocated_without_spills;
+   bool allocated;
 
    static const enum instruction_scheduler_mode pre_modes[] = {
       SCHEDULE_PRE,
@@ -7265,15 +7265,28 @@ fs_visitor::allocate_registers(unsigned min_dispatch_width, bool allow_spilling)
 
       if (0) {
          assign_regs_trivial();
-         allocated_without_spills = true;
-      } else {
-         allocated_without_spills = assign_regs(false, spill_all);
+         allocated = true;
+         break;
       }
-      if (allocated_without_spills)
+
+      /* We only allow spilling for the last schedule mode and only if the
+       * allow_spilling parameter and dispatch width work out ok.
+       */
+      bool can_spill = allow_spilling &&
+                       (i == ARRAY_SIZE(pre_modes) - 1) &&
+                       dispatch_width == min_dispatch_width;
+
+      /* We should only spill registers on the last scheduling. */
+      assert(!spilled_any_registers);
+
+      do {
+         allocated = assign_regs(can_spill, spill_all);
+      } while (!allocated && can_spill && !failed);
+      if (allocated)
          break;
    }
 
-   if (!allocated_without_spills) {
+   if (!allocated) {
       if (!allow_spilling)
          fail("Failure to register allocate and spilling is not allowed.");
 
@@ -7284,21 +7297,16 @@ fs_visitor::allocate_registers(unsigned min_dispatch_width, bool allow_spilling)
       if (dispatch_width > min_dispatch_width) {
          fail("Failure to register allocate. Reduce number of "
               "live scalar values to avoid this.");
-      } else {
-         compiler->shader_perf_log(log_data,
-                                   "%s shader triggered register spilling. "
-                                   "Try reducing the number of live scalar "
-                                   "values to improve performance.\n",
-                                   stage_name);
       }
 
-      /* Since we're out of heuristics, just go spill registers until we
-       * get an allocation.
-       */
-      while (!assign_regs(true, spill_all)) {
-         if (failed)
-            break;
-      }
+      /* If we failed to allocate, we must have a reason */
+      assert(failed);
+   } else if (spilled_any_registers) {
+      compiler->shader_perf_log(log_data,
+                                "%s shader triggered register spilling. "
+                                "Try reducing the number of live scalar "
+                                "values to improve performance.\n",
+                                stage_name);
    }
 
    /* This must come after all optimization and register allocation, since