path: root/src/gallium/drivers/virgl/virgl_protocol.h
author: Kenneth Graunke <[email protected]> 2016-02-17 00:37:04 -0800
committer: Kenneth Graunke <[email protected]> 2016-02-29 16:12:50 -0800
commit: 24994ae926629ac8521df3cab4a02eb81de15907 (patch)
tree: 48f50aed74a7e9673c7fe92fddceffccacaac665 /src/gallium/drivers/virgl/virgl_protocol.h
parent: c54f38494cb58f13a9a6cc6d0e2d75a7603ba105 (diff)
i965: Push most TES inputs in vec4 mode.
(This is commit 4a1c8a3037cd29938b2a6e2c680c341e9903cfbe for vec4 mode.)

Using the push model for inputs is much more efficient than pulling
inputs - the hardware can simply copy a large chunk into URB registers
at thread creation time, rather than having the thread send messages to
request data from the L3 cache. Unfortunately, it's possible to have
more TES inputs than fit in registers, so we have to fall back to the
pull model in some cases.

However, it turns out that most tessellation evaluation shaders are
fairly simple, and don't use many inputs. An arbitrary cut-off of
24 vec4 slots (12 registers) should suffice. (I chose this instead of
the 32 vec4 slots used in the scalar backend to avoid regressing a few
Piglit tests due to the vec4 register allocator being too stupid to
figure out what to do. We probably ought to fix that, but it's a
separate issue.)

Improves performance in GPUTest's tessmark_x64 microbenchmark by
41.5394% +/- 0.288519% (n = 115) at 1024x768 on my Clevo W740SU
(with Iris Pro 5200).

Improves performance in Synmark's Gl40TerrainFlyTess microbenchmark
by 38.3576% +/- 0.759748% (n = 42).

v2: Simplify abs/negate handling, as requested by Matt.

Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
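
The push/pull cut-off described in the message can be illustrated with a small sketch. This is not the driver's code: the names MAX_PUSH_VEC4_SLOTS, VEC4_SLOTS_PER_REG, choose_input_mode and push_registers_needed are hypothetical, and the sketch assumes that the decision is made per input slot, with slots below the cut-off pushed and the rest pulled. Only the numbers (24 vec4 slots, 12 registers) come from the message.

```c
/* Illustrative sketch only; all names here are hypothetical and do not
 * come from the i965 driver. */

/* Cut-off from the commit message: 24 vec4 slots, i.e. 12 registers. */
#define MAX_PUSH_VEC4_SLOTS 24u
/* 24 slots in 12 registers implies two vec4 slots per register. */
#define VEC4_SLOTS_PER_REG 2u

enum input_mode { INPUT_PUSH, INPUT_PULL };

/* Assumption: an input is pushed when its vec4 slot index lies below
 * the cut-off, and pulled otherwise. */
static enum input_mode choose_input_mode(unsigned slot)
{
    return slot < MAX_PUSH_VEC4_SLOTS ? INPUT_PUSH : INPUT_PULL;
}

/* Number of registers occupied by pushed inputs for a shader that uses
 * num_input_slots vec4 slots, clamped to the cut-off and rounded up to
 * whole registers. */
static unsigned push_registers_needed(unsigned num_input_slots)
{
    unsigned pushed = num_input_slots < MAX_PUSH_VEC4_SLOTS
                          ? num_input_slots
                          : MAX_PUSH_VEC4_SLOTS;
    return (pushed + VEC4_SLOTS_PER_REG - 1) / VEC4_SLOTS_PER_REG;
}
```

Under these assumptions, a shader using 40 input slots would push the first 24 (12 registers) and pull the remaining 16, while a simple shader using 3 slots would push everything in 2 registers.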
Diffstat (limited to 'src/gallium/drivers/virgl/virgl_protocol.h')
0 files changed, 0 insertions, 0 deletions