summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/llvmpipe/lp_fence.h
diff options
context:
space:
mode:
authorRoland Scheidegger <[email protected]>2016-01-06 01:05:35 +0100
committerRoland Scheidegger <[email protected]>2016-01-13 03:50:57 +0100
commit49ec647c3b724c53f4be45ed97f8d8b1046e7ccd (patch)
treeb1360a9849a29483cdad678e18ed95177cc5374a /src/gallium/drivers/llvmpipe/lp_fence.h
parent16530fdc82cde61495c66abe704e88d7752dfdbf (diff)
llvmpipe: avoid most 64 bit math in rasterization
The trick here is to recognize that in the c + n * dcdx calculations, not only can the lower FIXED_ORDER bits not change (as the dcdx values have those all zero) but that this means the sign bit of the calculations cannot be different as well, that is sign(c + n*dcdx) == sign((c >> FIXED_ORDER) + n*(dcdx >> FIXED_ORDER)). That shaves off more than enough bits to never require 64bit masks. A shifted plane c value could still easily exceed 32 bits, however since we throw out planes which are trivial accept even before binning (and similarly don't even get to see tris for which there was a trivial reject plane)) this is never a problem. The idea isnt't all that revolutionary, in fact something similar was tried ages ago (9773722c2b09d5f0615a47cecf4347859474dc56) back when the values were only 32 bit anyway. I believe now it didn't quite work then because the adjustment needed for testing trivial reject / partial masks wasn't handled correctly. This still keeps the separate 32/64 bit paths for now, as the 32 bit one still looks minimally simpler (and also because if we'd pass in dcdx/dcdy/eo unscaled from setup which would be a good reason to ditch the 32 bit path, we'd need to change the special-purpose rasterization functions for small tris). This passes piglit triangle-rasterization (-fbo -auto -max_size -subpixelbits 8) and triangle-rasterization-overdraw (with some hacks to make it work correctly with large sizes) easily (full piglit as well of course, but most tests wouldn't use triangles large enough to be affected, that is tris with a bounding box over 128x128). The profiler says indeed time spent in rast_tri functions is reduced substantially, BUT of course only if the tris are large. I measured a 3% improvement in mesa gloss demo when supersized to twice the screen size... Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
Diffstat (limited to 'src/gallium/drivers/llvmpipe/lp_fence.h')
0 files changed, 0 insertions, 0 deletions