author: Francisco Jerez <[email protected]> 2017-12-12 12:05:04 -0800
committer: Francisco Jerez <[email protected]> 2018-03-02 11:28:56 -0800
commit: c063e88909e630bb4605037eb0fc072f40f8c2a2
tree: b67886f38f7584466647081a28209d65a772378b /src/intel/compiler/brw_vec4.h
parent: e7c9adca5726a8c96de20ae7c5f21a30061db392
intel/fs: Handle surface opcode sample masks via predication.
The main motivation is to enable HDC surface opcodes on ICL which no longer allows the sample mask to be provided in a message header, but this is enabled all the way back to IVB when possible because it decreases the instruction count of some shaders using HDC messages significantly, e.g. one of the SynMark2 CSDof compute shaders decreases instruction count by about 40% due to the removal of header setup boilerplate which in turn makes a number of send message payloads more easily CSE-able.

Shader-db results on SKL:

 total instructions in shared programs: 15325319 -> 15314384 (-0.07%)
 instructions in affected programs: 311532 -> 300597 (-3.51%)
 helped: 491
 HURT: 1

Shader-db results on BDW where the optimization needs to be disabled in some cases due to hardware restrictions:

 total instructions in shared programs: 15604794 -> 15598028 (-0.04%)
 instructions in affected programs: 220863 -> 214097 (-3.06%)
 helped: 351
 HURT: 0

The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL laptop with this change. According to Eero this improves performance of the same test by 9% on BYT and by 7-8% on BXT J4205 and on SKL GT2 desktop.

Reviewed-by: Kenneth Graunke <[email protected]>
Tested-By: Eero Tamminen <[email protected]>
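The shader-db percentages quoted in the commit message can be recomputed from the raw instruction counts. The following is only a minimal sketch for checking that arithmetic: the helper name `pct_change` and the dictionary labels are illustrative assumptions, not part of Mesa or shader-db, and the counts are copied from the message.

```python
def pct_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent (hypothetical helper)."""
    return (after - before) / before * 100.0


# (before, after) instruction counts copied from the commit message.
# The labels are illustrative only.
results = {
    "SKL shared programs": (15325319, 15314384),
    "SKL affected programs": (311532, 300597),
    "BDW shared programs": (15604794, 15598028),
    "BDW affected programs": (220863, 214097),
}

for name, (before, after) in results.items():
    delta = pct_change(before, after)
    print(f"{name}: {before} -> {after} ({delta:.2f}%)")
```

On each platform the absolute reduction is the same for the "shared" and "affected" lines, which is what one would expect if every removed instruction comes from an affected program.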
Diffstat (limited to 'src/intel/compiler/brw_vec4.h')
0 files changed, 0 insertions, 0 deletions