author Roland Scheidegger <[email protected]> 2013-04-03 03:26:22 +0200
committer Roland Scheidegger <[email protected]> 2013-04-04 01:03:42 +0200
commit f621015cb55ed6901f571710c808113129b1b939
tree 633c6c8f5d8c6b3f5890180bf0228a9f3a1b7acc
parent bdfbeb9633eb3f8cf1ad76723f6c3839e57a08a3
gallivm: do per-pixel cube face selection (finally!!!)
This proved to be tricky. The problem is that after selection/mirroring we cannot calculate reasonable derivatives (if not all pixels in a quad end up on the same face, the derivatives could get "randomly" exceedingly large). However, it is actually quite easy to simply calculate the derivatives before selection/mirroring and then transform them similarly to the cube coordinates (they only need selection/projection, but not mirroring, as we're not interested in the sign bit, of course).

While there is a tiny bit more work to do (need to calculate derivs for 3 coords instead of 2, plus additional selects), it also simplifies things somewhat for the coord selection itself (as we save some broadcast aos shuffles, and we don't need to calculate the average vector) - hence if derivatives aren't needed this should actually be faster.

Also, this has the benefit that it will (trivially) work for explicit derivatives too, which we completely ignored before (that will be in a separate commit for better trackability).

Note that while the way of getting rho looks very different, it should result in "nearly" the same values as before. The "nearly" is only because the old code would choose the face based on an "average" vector and calculate the derivatives according to that face, whereas now (for implicit derivatives) the derivatives are projected onto the face selected for the first (top-left) pixel in a quad, so not necessarily the same face.

The transformation done might not quite be state-of-the-art, and calculating length(dx,dy) as max(dx,dy) certainly isn't either, but this stays the same as before (that is, I think a better transform would _somehow_ take the "derivative major axis" into account so that derivative changes in the major axis wouldn't get ignored).
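The idea above can be sketched in scalar Python. This is only an illustration, not the gallivm code, which works on SIMD vectors and is not shown on this page. The face table follows the OpenGL cube map convention; the names `select_face`, `face_coords`, `projected_derivative` and `rho` are hypothetical, and projecting the derivatives with the quotient rule is an assumption about what "selection/projection" means here.

```python
# Axis indices (0=x, 1=y, 2=z) feeding s and t for each major axis,
# before mirroring. Mirroring only flips signs, so magnitudes are unaffected.
S_AXIS = (2, 0, 0)
T_AXIS = (1, 2, 1)


def select_face(x, y, z):
    """Return (face, sc, tc, ma) following the OpenGL cube map table."""
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return (0, -z, -y, x) if x >= 0 else (1, z, -y, x)
    if ay >= az:
        return (2, x, z, y) if y >= 0 else (3, x, -z, y)
    return (4, x, -y, z) if z >= 0 else (5, -x, -y, z)


def face_coords(r):
    """Per-pixel selection, mirroring and projection to (face, s, t) in [0, 1]."""
    face, sc, tc, ma = select_face(*r)
    inv = 0.5 / abs(ma)
    return face, sc * inv + 0.5, tc * inv + 0.5


def projected_derivative(r, dr, face):
    """Magnitudes of ds and dt for a 3-component derivative dr taken
    before face selection (assumed transform: quotient rule on coord/ma)."""
    major = face // 2
    ma, dma = r[major], dr[major]
    scale = 0.5 / (ma * ma)
    ds = abs(dr[S_AXIS[major]] * ma - r[S_AXIS[major]] * dma) * scale
    dt = abs(dr[T_AXIS[major]] * ma - r[T_AXIS[major]] * dma) * scale
    return ds, dt


def rho(r, drdx, drdy, size):
    """Approximate rho as the max of all derivative magnitudes, using the
    face selected for r (for implicit derivatives: the top-left pixel)."""
    face = select_face(*r)[0]
    dsdx, dtdx = projected_derivative(r, drdx, face)
    dsdy, dtdy = projected_derivative(r, drdy, face)
    return size * max(dsdx, dtdx, dsdy, dtdy)


if __name__ == "__main__":
    # A quad straddling the +X/+Z edge: top-left, top-right, bottom-left.
    tl, tr, bl = (1.0, 0.0, 0.98), (0.98, 0.0, 1.0), (1.0, 0.02, 0.98)
    drdx = tuple(b - a for a, b in zip(tl, tr))
    drdy = tuple(b - a for a, b in zip(tl, bl))
    print("faces:", face_coords(tl)[0], face_coords(tr)[0])
    print("rho from pre-selection derivatives:", rho(tl, drdx, drdy, 64))
    # Differencing s after selection mixes two faces and is much larger.
    print("naive ds after selection:", abs(face_coords(tr)[1] - face_coords(tl)[1]))
```

Because the derivatives are taken on the 3-component direction before any face is chosen, the quad in the example yields a small, well-behaved rho even though its pixels land on two different faces.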
Should solve some accuracy problems with cubemaps (can easily be seen with the cubemap demo when switching wrapping/filtering), though we still don't do seamless filtering to fix it completely (so selection is not per-sample, but per-pixel is certainly better than per-quad and already sufficient for accurate results with nearest tex filter).

As for performance, it seems to be a tiny bit faster too (maybe 3% or so with the cubemap demo). I'd have expected that with nearest/nearest filtering, where this needs fewer instructions, but the difference actually seems to be larger with linear/linear_mipmap_linear, where it needs slightly more instructions; probably the code is less serialized, allowing better scheduling (on a Sandy Bridge cpu). It actually now seems to be at least as fast as the old path using a conditional when using 128bit vectors too (though that is probably more a result of testing with a newer cpu). For now that old path is still there but unused.

No piglit regressions.

Reviewed-by: Jose Fonseca <[email protected]>