author | Roland Scheidegger <[email protected]> | 2013-11-11 14:29:25 +0000 |
---|---|---|
committer | Roland Scheidegger <[email protected]> | 2013-11-14 12:24:55 +0000 |
commit | 754319490f6946a9ad5ee619822d5fe4254e6759 (patch) | |
tree | aa89800a7fcd530661b8018feaeadba480182f44 /src/gallium/drivers/llvmpipe | |
parent | a15a19f0d1b024f5f18f1dfe878ae8d399e38469 (diff) |
gallivm,llvmpipe: fix float->srgb conversion to handle NaNs
d3d10 requires us to convert NaNs to zero for any float->int conversion.
We don't really do that, but it mostly seems to work. In particular I suspect the
very common float->unorm8 path only really passes because it relies on sse2
pack intrinsics which just happen to work by luck for NaNs (float->int
conversion in hw gives integer indeterminate value, which just happens to be
-0x80000000 hence gets converted to zero in the end after pack intrinsics).
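
The lucky float->unorm8 behavior described above can be modeled in scalar C. This is only a sketch that emulates the documented per-lane results of an SSE2 truncating conversion followed by the two saturating packs. All `emu_*` names are hypothetical, and whether the real path uses the truncating or the rounding conversion is an assumption; the NaN result is the same in both cases.

```c
#include <math.h>    /* NAN macro, handy for callers */
#include <stdint.h>

/* Emulates one lane of an SSE2 truncating float->int32 conversion
 * (CVTTPS2DQ): NaN and out-of-range inputs yield the "integer
 * indeterminate" value 0x80000000, i.e. INT32_MIN. */
static int32_t emu_cvtt_f32_to_i32(float x)
{
   /* Every comparison with NaN is false, so NaN falls through. */
   if (x >= -2147483648.0f && x < 2147483648.0f)
      return (int32_t)x;
   return INT32_MIN;
}

/* Signed saturation int32 -> int16, as PACKSSDW does per lane. */
static int16_t emu_packs_i32_to_i16(int32_t v)
{
   if (v > INT16_MAX) return INT16_MAX;
   if (v < INT16_MIN) return INT16_MIN;
   return (int16_t)v;
}

/* Unsigned saturation int16 -> uint8, as PACKUSWB does per lane. */
static uint8_t emu_packus_i16_to_u8(int16_t v)
{
   if (v > 255) return 255;
   if (v < 0) return 0;
   return (uint8_t)v;
}

/* Scale to [0, 255], convert, then pack down twice. No clamp is done
 * on the float side; the saturating packs do all the clamping. */
static uint8_t emu_float_to_unorm8(float x)
{
   int32_t i = emu_cvtt_f32_to_i32(x * 255.0f);
   return emu_packus_i16_to_u8(emu_packs_i32_to_i16(i));
}
```

In this model `emu_float_to_unorm8(NAN)` goes NaN -> INT32_MIN -> INT16_MIN -> 0, which is the d3d10 result, but only because the packs happen to saturate the indeterminate value towards zero.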
However, float->srgb didn't get so lucky, because we need to clamp before
blending and clamping resulted in NaN behavior being undefined (and actually
got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp
with defined nan behavior as we can handle the NaN for free this way.
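
A scalar sketch of why the clamp order matters for NaNs. It models the SSE min/max rule (the second operand is returned when either operand is NaN) with plain comparisons. The function names are hypothetical, and the min-then-max order of the generic clamp is an assumption chosen to match the "converted to 1.0" result mentioned above, not a copy of the gallivm code.

```c
#include <math.h>   /* NAN macro, handy for callers */

/* Scalar models of SSE MINPS/MAXPS: a false comparison (which includes
 * any comparison with NaN) selects the second operand. */
static float sse_style_min(float a, float b) { return a < b ? a : b; }
static float sse_style_max(float a, float b) { return a > b ? a : b; }

/* Assumed shape of the generic clamp: min against the upper bound
 * first, then max against the lower bound. NaN becomes hi. */
static float clamp_generic_model(float x, float lo, float hi)
{
   return sse_style_max(sse_style_min(x, hi), lo);
}

/* Model of a zero/one clamp with NaN -> 0: max against zero first,
 * with zero as the second operand, so NaN is replaced by 0 before the
 * upper bound is applied. */
static float clamp_zero_one_nanzero_model(float x)
{
   return sse_style_min(sse_style_max(x, 0.0f), 1.0f);
}
```

With these models `clamp_generic_model(NAN, 0.0f, 1.0f)` yields 1.0 while `clamp_zero_one_nanzero_model(NAN)` yields 0.0, and both agree for ordinary inputs. The NaN handling costs nothing extra because it falls out of the operand order.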
I suspect there are more bugs lurking in this area (e.g. converting floats to
snorm) as we don't really use defined NaN behavior everywhere but this seems
to be good enough.
While here respecify nan behavior modes a bit, in particular the return_second
mode didn't really do what we wanted. From the caller's perspective, we really
wanted to say we need the non-nan result, but we already know the second arg
isn't a NaN. So we use this now instead, which means that cpu architectures
which actually implement min/max by always returning non-nan (that is adhering
to ieee754-2008 rules) don't need to bend over backwards for nothing.
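
To illustrate the respecified mode, here is a scalar sketch comparing an x86-style max with an IEEE 754-2008 style max. The names are hypothetical and do not correspond to the gallivm enum values. The point is that when the caller guarantees the second argument is not a NaN, both styles already return the non-NaN result without any extra work.

```c
#include <math.h>   /* NAN macro, handy for callers */

/* NaN is the only value that compares unequal to itself. */
static int is_nan_f32(float x) { return x != x; }

/* x86-style max: a false comparison, including one against NaN,
 * returns the second operand. */
static float max_x86_style(float a, float b) { return a > b ? a : b; }

/* IEEE 754-2008 maxNum-style max: if exactly one operand is NaN,
 * the other operand is returned. */
static float max_ieee_style(float a, float b)
{
   if (is_nan_f32(a)) return b;
   if (is_nan_f32(b)) return a;
   return a > b ? a : b;
}
```

For a NaN first argument and a known non-NaN second argument, both functions return the second argument. They only disagree when the second argument is a NaN, which is exactly the case the caller rules out, so neither style needs fix-up code under this contract.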
Reviewed-by: Jose Fonseca <[email protected]>
Diffstat (limited to 'src/gallium/drivers/llvmpipe')
-rw-r--r-- | src/gallium/drivers/llvmpipe/lp_state_fs.c | 4 |
1 files changed, 2 insertions, 2 deletions
diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c b/src/gallium/drivers/llvmpipe/lp_state_fs.c
index 8223d2ad7eb..b5816e038f1 100644
--- a/src/gallium/drivers/llvmpipe/lp_state_fs.c
+++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c
@@ -1760,11 +1760,11 @@ generate_unswizzled_blend(struct gallivm_state *gallivm,
       assert(row_type.floating);
       lp_build_context_init(&f32_bld, gallivm, row_type);
       for (i = 0; i < src_count; i++) {
-         src[i] = lp_build_clamp(&f32_bld, src[i], f32_bld.zero, f32_bld.one);
+         src[i] = lp_build_clamp_zero_one_nanzero(&f32_bld, src[i]);
       }
       if (dual_source_blend) {
          for (i = 0; i < src_count; i++) {
-            src1[i] = lp_build_clamp(&f32_bld, src1[i], f32_bld.zero, f32_bld.one);
+            src1[i] = lp_build_clamp_zero_one_nanzero(&f32_bld, src1[i]);
          }
       }
       /* probably can't be different than row_type but better safe than sorry... */