author | Kenneth Graunke <[email protected]> | 2016-02-04 08:10:02 -0800 |
---|---|---|
committer | Kenneth Graunke <[email protected]> | 2016-02-08 16:59:35 -0800 |
commit | d0e1d6b7e27bf5f05436e47080d326d7daa63af2 | |
tree | 8ab70cb8c75d8745ef03c1a7685a66ecf2956e6b /src/mesa | |
parent | 6502b3f60e193b314bd20261a8290709a4a56674 | |
i965: Don't add barrier deps for FB write messages.

There are never render target reads, so there are no scheduling hazards.
Giving the extra flexibility to the scheduler makes it possible to do
FB writes as soon as their sources are available, reducing register
pressure. It also makes it possible to do the payload setup for more
than one FB write message at a time, which could better hide latency.

shader-db results on Skylake:

total instructions in shared programs: 9110254 -> 9110211 (-0.00%)
instructions in affected programs: 2898 -> 2855 (-1.48%)
helped: 3
HURT: 0
LOST: 0
GAINED: 1

A reduction in instruction counts is surprising, but legitimate:
the three shaders helped were spilling, and reducing register
pressure allowed us to issue fewer spills/fills.

total cycles in shared programs: 69035108 -> 68928820 (-0.15%)
cycles in affected programs: 4412402 -> 4306114 (-2.41%)
helped: 4457
HURT: 213

Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>

Diffstat (limited to 'src/mesa')
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp | 7 |
1 files changed, 4 insertions, 3 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 60f7fd9cfcd..4f97577515a 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -939,8 +939,9 @@ fs_instruction_scheduler::calculate_deps()
    foreach_in_list(schedule_node, n, &instructions) {
       fs_inst *inst = (fs_inst *)n->inst;
 
-      if (inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
-          inst->has_side_effects())
+      if ((inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
+           inst->has_side_effects()) &&
+          inst->opcode != FS_OPCODE_FB_WRITE)
          add_barrier_deps(n);
 
       /* read-after-write deps. */
@@ -1195,7 +1196,7 @@ vec4_instruction_scheduler::calculate_deps()
    foreach_in_list(schedule_node, n, &instructions) {
       vec4_instruction *inst = (vec4_instruction *)n->inst;
 
-      if (inst->has_side_effects())
+      if (inst->has_side_effects() && inst->opcode != FS_OPCODE_FB_WRITE)
          add_barrier_deps(n);
 
       /* read-after-write deps. */
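
The effect of the new condition in the fs scheduler can be sketched outside the driver. The following is a toy model, not Mesa code: `Opcode`, `ToyInst`, `needs_barrier_deps`, and `count_barrier_edges` are hypothetical names invented for illustration. It assumes that FB writes count as having side effects, and it simplifies a barrier to "ordered against every other instruction in the block". Only the shape of the predicate is taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-ins for the driver's opcodes (assumption).
enum class Opcode { ALU, PLACEHOLDER_HALT, UNTYPED_WRITE, FB_WRITE };

struct ToyInst {
    Opcode opcode;

    // Assumption for this sketch: memory writes and FB writes both
    // count as having side effects.
    bool has_side_effects() const {
        return opcode == Opcode::UNTYPED_WRITE || opcode == Opcode::FB_WRITE;
    }
};

// Same shape as the patched condition in the fs scheduler hunk:
// halts and side-effecting instructions are barriers, except FB writes.
bool needs_barrier_deps(const ToyInst &inst) {
    return (inst.opcode == Opcode::PLACEHOLDER_HALT ||
            inst.has_side_effects()) &&
           inst.opcode != Opcode::FB_WRITE;
}

// Toy cost model (assumption): each barrier adds one ordering edge to
// every other instruction in the block, so more edges means less
// freedom for the scheduler to reorder.
std::size_t count_barrier_edges(const std::vector<ToyInst> &block) {
    std::size_t edges = 0;
    for (const ToyInst &inst : block) {
        if (needs_barrier_deps(inst))
            edges += block.size() - 1;
    }
    return edges;
}

// Usage: a block whose only side-effecting instructions are FB writes
// carries no barrier edges under the patched condition.
void self_check() {
    const std::vector<ToyInst> block = {
        {Opcode::ALU}, {Opcode::FB_WRITE}, {Opcode::ALU}, {Opcode::FB_WRITE}};
    assert(count_barrier_edges(block) == 0);
}
```

In this model, a block with two FB writes would have been fully serialized around each write before the change. With the exclusion, the writes are constrained only by their ordinary data dependencies, which is the extra flexibility the commit message describes.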