author | Roland Scheidegger <[email protected]> | 2012-09-21 17:03:48 +0200 |
---|---|---|
committer | José Fonseca <[email protected]> | 2012-10-12 18:51:18 +0100 |
commit | d366520e8553f4a16151ee946d6e8136cab3de5e (patch) | |
tree | c2f1df1a2710f2c1e4157a40a9a9623d85912bfa /src/gallium/drivers | |
parent | 2a4105cbc067dcd38057a877d606e9493e9ed53a (diff) |
gallivm: fix rsqrt failures
lp_build_rsqrt initially did not do any Newton-Raphson step. This meant that
precision was only ~11 bits, but inputs 0.0 and +infinity were both handled
correctly. It did not, however, handle input 1.0 accurately, and denormals
always generated an infinity result.
Doing a Newton-Raphson step increased precision significantly (but notably
input 1.0 still doesn't give output 1.0); however, this fails for inputs
0.0 and infinity (both result in NaNs).
Try to fix this up by using cmp/select, but since this is all quite fishy
(and still doesn't handle denormals), disable it for now. Note that even with
the workarounds it should still have been faster, since the fallback uses
sqrt/div (which both use the usually unpipelined and slow divider hw).
Also add some more test values to lp_test_arit and test lp_build_rcp() too while
there.
v2: based on José's feedback, avoid hacky infinity definition which doesn't
work with msvc (unfortunately using INFINITY won't cut it either on non-c99
compilers) in lp_build_rsqrt, and while here fix up the input infinity case
too (it's disabled anyway). Only test infinity input case if we have c99,
and use float cast for calculating reference rsqrt value so we really get
what we expect.
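The message does not say how lp_build_rsqrt ends up obtaining infinity without the C99 macro. One common portable approach, shown here purely as an assumption, is to build the value from its IEEE 754 bit pattern. The helper names are hypothetical, and the sketch assumes a 32-bit IEEE 754 `float` and the availability of `<stdint.h>`.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without violating aliasing rules. */
static float float_from_bits(uint32_t bits)
{
   float f;
   memcpy(&f, &bits, sizeof f);
   return f;
}

/* Exponent all ones, mantissa zero, sign clear: +infinity in IEEE 754. */
static float positive_infinity(void)
{
   return float_from_bits(0x7f800000u);
}
```

This avoids both the `INFINITY` macro and constant expressions such as `1.0f/0.0f`, which some compilers reject or warn about.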
Reviewed-by: José Fonseca <[email protected]>
Diffstat (limited to 'src/gallium/drivers')
-rw-r--r-- | src/gallium/drivers/llvmpipe/lp_test_arit.c | 42 |
1 files changed, 35 insertions, 7 deletions
diff --git a/src/gallium/drivers/llvmpipe/lp_test_arit.c b/src/gallium/drivers/llvmpipe/lp_test_arit.c
index 6e09f7e67b0..99928b8ab6e 100644
--- a/src/gallium/drivers/llvmpipe/lp_test_arit.c
+++ b/src/gallium/drivers/llvmpipe/lp_test_arit.c
@@ -150,19 +150,42 @@ const float log2_values[] = {
 };
 
 
+static float rcpf(float x)
+{
+   return 1.0/x;
+}
+
+
+const float rcp_values[] = {
+   -0.0, 0.0,
+   -1.0, 1.0,
+   -1e-007, 1e-007,
+   -4.0, 4.0,
+   -1e+035, -100000,
+   100000, 1e+035,
+   5.88e-39f, // denormal
+#if (__STDC_VERSION__ >= 199901L)
+   INFINITY, -INFINITY,
+#endif
+};
+
+
 static float rsqrtf(float x)
 {
-   return 1.0/sqrt(x);
+   return 1.0/(float)sqrt(x);
 }
 
 
 const float rsqrt_values[] = {
-   -1, -1e-007,
-   1e-007, 1,
-   -4, -1,
-   1, 4,
-   -1e+035, -100000,
+   // http://msdn.microsoft.com/en-us/library/windows/desktop/bb147346.aspx
+   0.0, // must yield infinity
+   1.0, // must yield 1.0
+   1e-007, 4.0,
    100000, 1e+035,
+   5.88e-39f, // denormal
+#if (__STDC_VERSION__ >= 199901L)
+   INFINITY,
+#endif
 };
 
 
@@ -231,6 +254,7 @@ unary_tests[] = {
    {"log2", &lp_build_log2, &log2f, log2_values, Elements(log2_values), 20.0 },
    {"exp", &lp_build_exp, &expf, exp2_values, Elements(exp2_values), 18.0 },
    {"log", &lp_build_log, &logf, log2_values, Elements(log2_values), 20.0 },
+   {"rcp", &lp_build_rcp, &rcpf, rcp_values, Elements(rcp_values), 20.0 },
    {"rsqrt", &lp_build_rsqrt, &rsqrtf, rsqrt_values, Elements(rsqrt_values), 20.0 },
    {"sin", &lp_build_sin, &sinf, sincos_values, Elements(sincos_values), 20.0 },
    {"cos", &lp_build_cos, &cosf, sincos_values, Elements(sincos_values), 20.0 },
@@ -330,7 +354,11 @@ test_unary(unsigned verbose, FILE *fp, const struct unary_test_t *test)
       double error, precision;
       bool pass;
 
-      error = fabs(out[i] - ref);
+      if (util_inf_sign(ref) && util_inf_sign(out[i]) == util_inf_sign(ref)) {
+         error = 0;
+      } else {
+         error = fabs(out[i] - ref);
+      }
       precision = error ? -log2(error/fabs(ref)) : FLT_MANT_DIG;
       pass = precision >= test->precision;
 