diff options
author | Francisco Jerez <[email protected]> | 2017-01-20 15:24:30 -0800 |
---|---|---|
committer | Francisco Jerez <[email protected]> | 2017-01-31 10:33:27 -0800 |
commit | 7215375c445f533e3962a09b8e3b075880c1382f (patch) | |
tree | b90e4566900ccebdd1185fe8cb31beed873d72f4 /src/compiler/glsl/builtin_functions.cpp | |
parent | e9ffd12827ac11a2d2002a42fa8eb1df847153ba (diff) |
nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity.
See "glsl: Rewrite atan2 implementation to fix accuracy and handling
of zero/infinity." for the rationale, but note that the instruction
count benefit discussed there is somewhat less important for the SPIRV
implementation, because the current code already emitted no control
flow instructions -- Still this saves us one hardware instruction per
scalar component on Intel SKL hardware.
Fixes the following Vulkan CTS tests on Intel hardware:
dEQP-VK.glsl.builtin.precision.atan2.highp_compute.scalar
dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec2
dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec3
dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec4
dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec2
dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec4
Note that most of the test-cases above expect IEEE-compliant handling
of atan2(±∞, ±∞), which this patch doesn't explicitly handle, so
except for the last two the test-cases above weren't expected to pass
yet. The reason they do is that the i965 back-end implementation of
the NIR fmin and fmax instructions is not quite GLSL-compliant (it
complies with IEEE 754 recommendations though), because fmin/fmax of a
NaN and a non-NaN argument currently always return the non-NaN
argument, which causes atan() to flush NaN to one and return the
expected value. The front-end should probably not be relying on this
behavior for correctness though because other back-ends are likely to
behave differently -- A follow-up patch will handle the atan2(±∞, ±∞)
corner cases explicitly.
v2: Fix up argument scaling to take into account the range and
precision of exotic FP24 hardware. Flip coordinate system for
arguments along the vertical line as if they were on the left
half-plane in order to avoid division by zero which may give
unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in
some more comments.
Reviewed-by: Ian Romanick <[email protected]>
Diffstat (limited to 'src/compiler/glsl/builtin_functions.cpp')
0 files changed, 0 insertions, 0 deletions