i965: Don't add barrier deps for FB write messages.

There are never render target reads, so there are no scheduling hazards.
Giving the extra flexibility to the scheduler makes it possible to do FB
writes as soon as their sources are available, reducing register pressure.
It also makes it possible to do the payload setup for more than one FB
write message at a time, which could better hide latency.

shader-db results on Skylake:

total instructions in shared programs: 9110254 -> 9110211 (-0.00%)
instructions in affected programs: 2898 -> 2855 (-1.48%)
helped: 3
HURT: 0
LOST: 0
GAINED: 1

A reduction in instruction counts is surprising, but legitimate: the
three shaders helped were spilling, and reducing register pressure
allowed us to issue fewer spills/fills.

total cycles in shared programs: 69035108 -> 68928820 (-0.15%)
cycles in affected programs: 4412402 -> 4306114 (-2.41%)
helped: 4457
HURT: 213

Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
author: Kenneth Graunke <[email protected]> 2016-02-04 08:10:02 -0800
committer: Kenneth Graunke <[email protected]> 2016-02-08 16:59:35 -0800
commit: d0e1d6b7e27bf5f05436e47080d326d7daa63af2 (patch)
tree: 8ab70cb8c75d8745ef03c1a7685a66ecf2956e6b /src/mesa
parent: 6502b3f60e193b314bd20261a8290709a4a56674 (diff)
1 files changed, 4 insertions, 3 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
index 60f7fd9cfcd..4f97577515a 100644
--- a/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_schedule_instructions.cpp
@@ -939,8 +939,9 @@ fs_instruction_scheduler::calculate_deps()
    foreach_in_list(schedule_node, n, &instructions) {
       fs_inst *inst = (fs_inst *)n->inst;
 
-      if (inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
-         inst->has_side_effects())
+      if ((inst->opcode == FS_OPCODE_PLACEHOLDER_HALT ||
+           inst->has_side_effects()) &&
+          inst->opcode != FS_OPCODE_FB_WRITE)
          add_barrier_deps(n);
 
       /* read-after-write deps. */
@@ -1195,7 +1196,7 @@ vec4_instruction_scheduler::calculate_deps()
    foreach_in_list(schedule_node, n, &instructions) {
       vec4_instruction *inst = (vec4_instruction *)n->inst;
 
-      if (inst->has_side_effects())
+      if (inst->has_side_effects() && inst->opcode != FS_OPCODE_FB_WRITE)
          add_barrier_deps(n);
 
       /* read-after-write deps. */
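
To illustrate why exempting FB writes gives the scheduler more freedom, here is a minimal standalone sketch. It is not the driver code: ToyOpcode, ToyInst, needs_barrier and barrier_edges are hypothetical names invented for this example. The model assumes that FB writes report side effects (as the patched condition implies) and that a barrier instruction gets an ordering edge to every other instruction in the block, which may be coarser than what the real add_barrier_deps() does.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the driver's opcodes; not the real enum.
enum class ToyOpcode { Alu, FbWrite, PlaceholderHalt, OtherSideEffect };

struct ToyInst {
    ToyOpcode opcode;

    // Assumption: FB writes count as having side effects, which is why
    // the patch has to exempt them explicitly.
    bool has_side_effects() const {
        return opcode == ToyOpcode::FbWrite ||
               opcode == ToyOpcode::OtherSideEffect;
    }
};

// An ordering edge: the first instruction must stay before the second.
using Edge = std::pair<std::size_t, std::size_t>;

// Mirrors the shape of the patched fs condition. With exempt_fb_writes
// set to false it behaves like the condition before the patch.
bool needs_barrier(const ToyInst &inst, bool exempt_fb_writes) {
    bool barrier = inst.opcode == ToyOpcode::PlaceholderHalt ||
                   inst.has_side_effects();
    if (exempt_fb_writes)
        barrier = barrier && inst.opcode != ToyOpcode::FbWrite;
    return barrier;
}

// Simplified barrier model: every barrier instruction is ordered against
// every other instruction. The same edge can appear twice when two
// barriers are ordered against each other.
std::vector<Edge> barrier_edges(const std::vector<ToyInst> &insts,
                                bool exempt_fb_writes) {
    std::vector<Edge> edges;
    for (std::size_t i = 0; i < insts.size(); i++) {
        if (!needs_barrier(insts[i], exempt_fb_writes))
            continue;
        for (std::size_t j = 0; j < i; j++)
            edges.push_back({j, i});   // everything earlier stays earlier
        for (std::size_t j = i + 1; j < insts.size(); j++)
            edges.push_back({i, j});   // everything later stays later
    }
    return edges;
}
```

In this toy model, a block with two FB writes among ALU instructions gets barrier edges around each write under the old condition and none under the new one. The only ordering left is the ordinary data dependencies, which the model does not track. A placeholder halt still acts as a barrier either way.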