author | Paul Berry <[email protected]> | 2013-09-20 09:04:31 -0700 |
---|---|---|
committer | Paul Berry <[email protected]> | 2013-10-03 13:49:15 -0700 |
commit | 800610f9eb6ad24b5fefc9206fb700c7ae2f0ec8 (patch) | |
tree | 6e62eed31079f763378240c0cd6c7863686e8e3e /src/gallium/drivers/radeonsi | |
parent | 9267565ee4248f7bc8efebd8c994a93ff1e0683d (diff) |
i965/fs: Improve accuracy of dFdy() to match dFdx().
Previously, we computed dFdy() using the following instruction:
add(8) dst<1>F src<4,4,0>F -src.2<4,4,0>F { align1 1Q }
That had the disadvantage that it computed the same value for all 4
pixels of a 2x2 subspan, which meant that it was less accurate than
dFdx(). This patch changes it to the following instruction when
c->key.high_quality_derivatives is set:
add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q }
This gives it comparable accuracy to dFdx().
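As a rough illustration of the difference between the two instructions, the sketch below models their register regions and swizzles on plain float arrays. It assumes that the four channels of each 2x2 subspan are ordered top-left, top-right, bottom-left, bottom-right; the function names dfdy_coarse and dfdy_fine are hypothetical and do not appear in the driver. The sign simply follows the operand order of the instructions as written above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed channel order within each 2x2 subspan:
 * 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right. */

/* Models: add(8) dst<1>F src<4,4,0>F -src.2<4,4,0>F { align1 1Q }
 * Every channel of a subspan reads element 0 and element 2, so all
 * four pixels receive the left-column difference. */
static void dfdy_coarse(float *dst, const float *src, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i++) {
        size_t base = i & ~(size_t)3;   /* first channel of this subspan */
        dst[i] = src[base + 0] - src[base + 2];
    }
}

/* Models: add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q }
 * Swizzle .xyxy selects elements 0,1,0,1 and .zwzw selects 2,3,2,3, so
 * each pixel receives the difference taken within its own column. */
static void dfdy_fine(float *dst, const float *src, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i++) {
        size_t base = i & ~(size_t)3;
        size_t col = i & 1;             /* 0 = left column, 1 = right column */
        dst[i] = src[base + col] - src[base + 2 + col];
    }
}
```

For a subspan holding the values {1, 2, 4, 8}, dfdy_coarse writes -3 to all four channels, while dfdy_fine writes -3, -6, -3, -6: the right-hand pixels now see the right-column difference.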
Unfortunately, align16 instructions can't be compressed, so in SIMD16
shaders, instead of emitting this instruction:
add(16) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1H }
we need to unroll it into two instructions:
add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q }
add(8) (dst+1)<1>F (src+1)<4,4,1>.xyxyF -(src+1)<4,4,1>.zwzwF { align16 2Q }
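The unrolling can be sketched on plain arrays as two 8-wide passes, one per register. This is only a model under stated assumptions: each register is taken to hold 8 floats, channels within a 2x2 subspan are taken to be ordered top-left, top-right, bottom-left, bottom-right, and the names REG_FLOATS, dfdy_fine_simd8 and dfdy_fine_simd16 are hypothetical.

```c
#include <stddef.h>

enum { REG_FLOATS = 8 };  /* assumption: one register holds 8 floats */

/* One 8-wide pass: .xyxy reads elements 0,1,0,1 of each group of four,
 * .zwzw reads elements 2,3,2,3, and the second is subtracted from the first. */
static void dfdy_fine_simd8(float *dst, const float *src)
{
    for (size_t i = 0; i < REG_FLOATS; i++) {
        size_t base = i & ~(size_t)3;   /* first channel of this subspan */
        size_t col = i & 1;             /* 0 = left column, 1 = right column */
        dst[i] = src[base + col] - src[base + 2 + col];
    }
}

/* A 16-wide operation expressed as two 8-wide halves. */
static void dfdy_fine_simd16(float *dst, const float *src)
{
    dfdy_fine_simd8(dst, src);                            /* 1Q: dst,   src   */
    dfdy_fine_simd8(dst + REG_FLOATS, src + REG_FLOATS);  /* 2Q: dst+1, src+1 */
}
```

Because no subspan straddles the boundary between the two registers, the two halves are independent and the split does not change any per-pixel result.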
Fixes piglit test spec/glsl-1.10/execution/fs-dfdy-accuracy.
Acked-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Diffstat (limited to 'src/gallium/drivers/radeonsi')
0 files changed, 0 insertions, 0 deletions