anv: Enable Gen11 Color/Z write merging optimization

TCCNTLREG contains additional L3 cache write merging optimizations.
The default value on my system appears to be:

- URB Partial Write Merging (bit 0)
- L3 Data Partial Write Merging (bit 2)
- TC Disable (bit 3)

Windows drivers appear to set bit 1 as well to enable "Color/Z Partial
Write Merging".

This should solve an issue we were seeing where MRT benchmarks were
using substantially more bandwidth than they ought. However, we have
not observed it to cause measurable FPS gains.

It is unclear whether we should be setting bit 0 or bit 3, so for now
we leave those at the hardware default value.

Acked-by: Jason Ekstrand <[email protected]>
author: Kenneth Graunke <[email protected]> 2019-12-02 17:30:06 -0800
committer: Kenneth Graunke <[email protected]> 2019-12-10 16:19:46 -0800
commit: 0f2f561a1021cd68dcac41f4ca00a5bb40bda6ea (patch)
tree: d2ca47dc7b130fdc9015009be16faf77f3bc0f38 /src/intel
parent: 5cc7636993ca50dd8a602ee5a4fef0f4fbf29cd2 (diff)
1 files changed, 12 insertions, 0 deletions
diff --git a/src/intel/vulkan/genX_state.c b/src/intel/vulkan/genX_state.c
index 81739acf065..8ad048225a6 100644
--- a/src/intel/vulkan/genX_state.c
+++ b/src/intel/vulkan/genX_state.c
@@ -266,6 +266,18 @@ genX(init_device_state)(struct anv_device *device)
       lri.DataDWord      = half_slice_chicken7;
    }
 
+   uint32_t tccntlreg;
+   anv_pack_struct(&tccntlreg, GENX(TCCNTLREG),
+                   .L3DataPartialWriteMergingEnable = true,
+                   .ColorZPartialWriteMergingEnable = true,
+                   .URBPartialWriteMergingEnable = true,
+                   .TCDisable = true);
+
+   anv_batch_emit(&batch, GENX(MI_LOAD_REGISTER_IMM), lri) {
+      lri.RegisterOffset = GENX(TCCNTLREG_num);
+      lri.DataDWord      = tccntlreg;
+   }
+
 #endif
    genX(emit_slice_hashing_state)(device, &batch);
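
For readers who want to see the resulting register value, here is a minimal standalone sketch of the bit arithmetic. It assumes only the bit positions listed in the commit message. The macro and function names are hypothetical; the driver itself packs the value through the genxml-generated TCCNTLREG definition, not through macros like these.

```c
#include <stdint.h>

/* Bit positions as listed in the commit message. These macro names are
 * hypothetical and are not the generated genxml identifiers. */
#define TCCNTLREG_URB_PARTIAL_WRITE_MERGING      (1u << 0)
#define TCCNTLREG_COLOR_Z_PARTIAL_WRITE_MERGING  (1u << 1)
#define TCCNTLREG_L3_DATA_PARTIAL_WRITE_MERGING  (1u << 2)
#define TCCNTLREG_TC_DISABLE                     (1u << 3)

/* Default value the commit author reports observing: bits 0, 2 and 3. */
static uint32_t
tccntlreg_observed_default(void)
{
   return TCCNTLREG_URB_PARTIAL_WRITE_MERGING |
          TCCNTLREG_L3_DATA_PARTIAL_WRITE_MERGING |
          TCCNTLREG_TC_DISABLE;
}

/* Value written by the patch: the observed default plus bit 1. */
static uint32_t
tccntlreg_with_color_z_merging(void)
{
   return tccntlreg_observed_default() |
          TCCNTLREG_COLOR_Z_PARTIAL_WRITE_MERGING;
}
```

Under these assumptions the observed default is 0xD and the value written by the patch is 0xF. MI_LOAD_REGISTER_IMM replaces the entire register, which is why the patch has to restate the three default bits alongside the new one.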