author: Roland Scheidegger <[email protected]> 2016-11-04 05:13:03 +0100
committer: Roland Scheidegger <[email protected]> 2016-11-08 03:41:26 +0100
commit: 3fa10ffb496cc4e6d1003891cf0381bb5bec2a74
tree: 5a5da2229f263d8c6d853e64616263d5c3d3c286 /docs
parent: 29279f44b3172ef3b84d470e70fc7684695ced4b
draw: use vectorized calculations for fetch
Instead of doing all the math with scalars, use vectors. This means the overflow math needs to be done manually, although that is only really problematic for the stride/index mul. The rest has been pretty much moved outside the shader loop, where things are still scalar (and the mul could actually be optimized away too). Because llvm handles the zero-extend widening mul very poorly, we even roll our own.

To eliminate control flow in the main shader loop fetch, provide fake buffers (so index 0 is always valid to fetch).

This still uses aos fetch in the end, mostly because some more code would be needed to handle unaligned fetches in that path, and because for most formats it won't make a difference anyway (we generate some truly horrendous code for things like R16G16_something, for instance).

Instanced fetch, however, stays roughly the same as before, except that the same element is no longer fetched multiple times (I've seen a reduction of ~3 times in main shader loop size, apparently because llvm is not able to deduce it's really all the same with a couple of instanced elements).

Also, for elts gathering, use vectorized code as well, and provide a fake elt buffer if there's no valid one bound.

The generated shaders are smaller and faster to compile (not entirely sure about execution speed, but generally, unless there are just single vertices to handle, I would expect it to be faster; there are more opportunities for future improvements by using soa fetch).

No piglit change.

Reviewed-by: Jose Fonseca <[email protected]>
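
For illustration only, here is a minimal plain-C sketch of the two ideas described in the message: a manually widened stride/index multiply with per-lane overflow detection, and a fake buffer so that offset 0 is always safe to read. It is not the code from this commit, which emits LLVM IR. The names VEC_LEN, mul_stride_checked, fetch_lanes and fake_buffer are hypothetical, and returning zero for invalid lanes is an assumption of the sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VEC_LEN 4 /* hypothetical vector width, one lane per vertex */

/* Hypothetical stand-in buffer: guarantees that offset 0 is readable. */
static const uint32_t fake_buffer[1] = { 0 };

/* offset[i] = index[i] * stride, computed with a 64-bit widening multiply.
 * Returns a bitmask with bit i set if lane i overflowed 32 bits. */
static uint32_t
mul_stride_checked(const uint32_t index[VEC_LEN], uint32_t stride,
                   uint32_t offset[VEC_LEN])
{
   uint32_t overflow_mask = 0;
   for (int i = 0; i < VEC_LEN; i++) {
      uint64_t wide = (uint64_t)index[i] * (uint64_t)stride;
      offset[i] = (uint32_t)wide;
      if (wide >> 32)
         overflow_mask |= 1u << i;
   }
   return overflow_mask;
}

/* Fetch one 32-bit element per lane. Invalid lanes are redirected to
 * offset 0 instead of being skipped, so every lane performs a load.
 * Assumption: invalid lanes produce 0. */
static void
fetch_lanes(const uint8_t *buf, uint32_t buf_size,
            const uint32_t index[VEC_LEN], uint32_t stride,
            uint32_t out[VEC_LEN])
{
   /* Scalar setup, done once outside the per-lane loop. */
   if (buf == NULL || buf_size < sizeof(uint32_t)) {
      buf = (const uint8_t *)fake_buffer;
      buf_size = sizeof(fake_buffer);
   }

   uint32_t offset[VEC_LEN];
   uint32_t ovf = mul_stride_checked(index, stride, offset);

   for (int i = 0; i < VEC_LEN; i++) {
      int valid = !((ovf >> i) & 1u) &&
                  offset[i] <= buf_size - sizeof(uint32_t);
      uint32_t safe = valid ? offset[i] : 0; /* offset 0 is always valid */
      uint32_t v;
      memcpy(&v, buf + safe, sizeof(v));     /* tolerates unaligned data */
      out[i] = valid ? v : 0;
   }
}
```

The selects in the lane loop correspond to vector select operations in a real vectorized implementation; the only branch left is the one-time buffer substitution outside the loop.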
Diffstat (limited to 'docs')
0 files changed, 0 insertions, 0 deletions