i965/fs: Improve accuracy of dFdy() to match dFdx(). - mesa.git


author	Paul Berry <[email protected]>	2013-09-20 09:04:31 -0700
committer	Paul Berry <[email protected]>	2013-10-03 13:49:15 -0700
commit	800610f9eb6ad24b5fefc9206fb700c7ae2f0ec8 (patch)
tree	6e62eed31079f763378240c0cd6c7863686e8e3e /src/gallium/drivers/radeonsi
parent	9267565ee4248f7bc8efebd8c994a93ff1e0683d (diff)

i965/fs: Improve accuracy of dFdy() to match dFdx().

Previously, we computed dFdy() using the following instruction:

  add(8) dst<1>F src<4,4,0>F -src.2<4,4,0>F { align1 1Q }

That had the disadvantage that it computed the same value for all 4
pixels of a 2x2 subspan, which meant that it was less accurate than
dFdx().  This patch changes it to the following instruction when
c->key.high_quality_derivatives is set:

  add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q }

This gives it comparable accuracy to dFdx().

Unfortunately, align16 instructions can't be compressed, so in SIMD16
shaders, instead of emitting this instruction:

  add(16) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1H }

we need to unroll to two instructions:

  add(8) dst<1>F src<4,4,1>.xyxyF -src<4,4,1>.zwzwF { align16 1Q }
  add(8) (dst+1)<1>F (src+1)<4,4,1>.xyxyF -(src+1)<4,4,1>.zwzwF { align16 2Q }

Fixes piglit test spec/glsl-1.10/execution/fs-dfdy-accuracy.

Acked-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
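The difference between the two instruction forms can be modeled on the CPU. The following is only an illustrative sketch, not driver code: it assumes each 2x2 subspan occupies four consecutive channels in the order top-left, top-right, bottom-left, bottom-right, and it ignores any sign flip the driver may apply depending on render target orientation. The function names dfdy_coarse, dfdy_fine and dfdy_fine_simd16 are hypothetical.

```c
/* Channels per 2x2 subspan. Assumed layout (not taken from the patch):
 * 0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right. */
#define SUBSPAN 4

/* Models the old align1 form: src<4,4,0> - src.2<4,4,0>.
 * A <4,4,0> region replicates one channel across the whole subspan,
 * so all 4 pixels receive (top-left - bottom-left). */
void dfdy_coarse(const float *src, float *dst, int width)
{
    for (int i = 0; i < width; i++) {
        int base = (i / SUBSPAN) * SUBSPAN;
        dst[i] = src[base + 0] - src[base + 2];
    }
}

/* Models the new align16 form: src.xyxy - src.zwzw.
 * Each pixel now uses the top and bottom values of its own column. */
void dfdy_fine(const float *src, float *dst, int width)
{
    for (int i = 0; i < width; i++) {
        int base = (i / SUBSPAN) * SUBSPAN;
        int col = i % 2;  /* 0 = left column, 1 = right column */
        dst[i] = src[base + col] - src[base + 2 + col];
    }
}

/* Models the SIMD16 unrolling: two 8-wide operations, the second one
 * working on the next register (src+1, dst+1), i.e. channels 8..15. */
void dfdy_fine_simd16(const float *src, float *dst)
{
    dfdy_fine(src, dst, 8);          /* first half, "1Q"  */
    dfdy_fine(src + 8, dst + 8, 8);  /* second half, "2Q" */
}
```

For a subspan holding {1, 2, 4, 8}, the coarse version yields -3 for all four pixels, while the fine version yields -3 for the left column and -6 for the right column, which is the per-column accuracy the commit message describes.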

Diffstat (limited to 'src/gallium/drivers/radeonsi')

0 files changed, 0 insertions, 0 deletions

