author    | Roland Scheidegger <[email protected]> | 2013-08-15 18:40:32 +0200
committer | Roland Scheidegger <[email protected]> | 2013-08-15 18:42:20 +0200
commit    | 5626a84a002cb8565b527ebc1fca73a8497019db
tree      | 545b22ad32851b633dd48ae3920a6f31afa966d9 /src/gallium/auxiliary/gallivm/lp_bld_arit.c
parent    | 3b2f3f90ace68e0a4777661f8cbf07438855edcb
gallivm: do per-sample depth comparison instead of doing it post-filter
Doing the comparisons pre-filter is highly recommended by OpenGL (and d3d9)
and definitely required by d3d10.
Strictly speaking, this doesn't do the comparison pre-filter but rather
"in-filter", since doing it truly pre-filter would require pushing the
comparisons even further down into the fetch code; this placement also
trivially allows using a somewhat cheaper lerp.
Doing it pre-filter would actually have some performance advantage for UNORM
formats (since the comparisons should be done in the texture format, we'd only
need to convert the shadow ref coord to the texture format once, and in turn
would save converting the per-sample texture values to floats), but it gets a
bit messy because it has implications for border color handling as well: that
needs to happen prior to the depth comparison, hence the border color would
also need to be converted to the texture format too, or some other trick used
(such as doing a separate border color / shadow ref comparison and simply
using that result directly when doing border replacement).
Should make no difference for nearest filtering, and performance for linear
filtering should be mostly the same too: there is essentially one more
comparison instruction per sample, and the sub/mul/add lerp is replaced with a
sub/and/and/add special "lerp" (see the sketch below), which all in all
shouldn't make much of a difference.
v2: get rid of old code completely
Reviewed-by: Zack Rusin <[email protected]>
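
For illustration, here is a minimal standalone sketch of why the
sub/and/and/add "lerp" works. This is not the actual gallivm code (which emits
LLVM IR rather than intrinsics); the function name and the SSE lowering are
assumptions. The key point is that the per-sample depth comparisons produce
all-ones/all-zeros bit masks, so ANDing a mask with a float weight yields
either the weight itself or 0.0, and the values being filtered are known to be
exactly 0.0 or 1.0:

#include <stdio.h>
#include <xmmintrin.h>

/* Hypothetical sketch: cmp0/cmp1 are per-sample comparison masks
 * (all-ones bits for pass, all-zeros for fail), e.g. from _mm_cmple_ps.
 * The usual lerp  v0 + w*(v1 - v0)  (sub/mul/add) collapses to
 *   (cmp0 & (1-w)) + (cmp1 & w)     (sub/and/and/add)
 * because the filtered values are exactly 0.0 or 1.0. */
static __m128 shadow_lerp_sketch(__m128 cmp0, __m128 cmp1, __m128 w)
{
   __m128 one   = _mm_set1_ps(1.0f);
   __m128 inv_w = _mm_sub_ps(one, w);       /* sub */
   __m128 t0    = _mm_and_ps(cmp0, inv_w);  /* and: (1-w) where cmp0 passed */
   __m128 t1    = _mm_and_ps(cmp1, w);      /* and: w where cmp1 passed */
   return _mm_add_ps(t0, t1);               /* add */
}

int main(void)
{
   /* ref <= texel comparison for two neighboring samples */
   __m128 ref    = _mm_set1_ps(0.5f);
   __m128 texel0 = _mm_set_ps(0.9f, 0.1f, 0.9f, 0.1f);
   __m128 texel1 = _mm_set_ps(0.9f, 0.9f, 0.1f, 0.1f);
   __m128 cmp0   = _mm_cmple_ps(ref, texel0);
   __m128 cmp1   = _mm_cmple_ps(ref, texel1);
   __m128 w      = _mm_set1_ps(0.25f);

   float out[4];
   _mm_storeu_ps(out, shadow_lerp_sketch(cmp0, cmp1, w));
   printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]);
   /* expected: 0 0.75 0.25 1 */
   return 0;
}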
Diffstat (limited to 'src/gallium/auxiliary/gallivm/lp_bld_arit.c')
-rw-r--r-- | src/gallium/auxiliary/gallivm/lp_bld_arit.c | 13
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
index 98409c3be86..ee30a02d78c 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
@@ -1411,8 +1411,19 @@ lp_build_clamp(struct lp_build_context *bld,
    assert(lp_check_value(bld->type, min));
    assert(lp_check_value(bld->type, max));
 
-   a = lp_build_min(bld, a, max);
+   /*
+    * XXX dark magic warning: The order of min/max here matters (!).
+    * The reason is that a typical use case is clamp(a, 0.0, 1.0)
+    * (for example for float->unorm conversion), and on x86 sse2 doing
+    * max first will give 0.0 for NaNs, whereas doing min first will
+    * give 1.0 for NaN, which makes d3d10 angry...
+    * This is very much not guaranteed behavior though, which just
+    * happens to work on x86 sse2 (and up), and obviously won't do
+    * anything for other non-zero clamps (say -1.0/1.0 in a SNORM
+    * conversion) either, so this needs to be fixed for real...
+    */
    a = lp_build_max(bld, a, min);
+   a = lp_build_min(bld, a, max);
    return a;
 }
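
As a side note on the min/max ordering in the hunk above, here is a minimal
standalone C demonstration (a sketch, not Mesa code) of the documented x86
behavior the comment relies on: minss/maxss return the second source operand
when an input is NaN, so the clamp order decides what NaN maps to:

#include <stdio.h>
#include <math.h>
#include <xmmintrin.h>

/* Order used by the patch: max first, then min. */
static float clamp_max_first(float x)
{
   __m128 v = _mm_set_ss(x);
   v = _mm_max_ss(v, _mm_set_ss(0.0f)); /* NaN -> 0.0 (second operand) */
   v = _mm_min_ss(v, _mm_set_ss(1.0f)); /* 0.0 stays 0.0 */
   return _mm_cvtss_f32(v);
}

/* Old order: min first, then max. */
static float clamp_min_first(float x)
{
   __m128 v = _mm_set_ss(x);
   v = _mm_min_ss(v, _mm_set_ss(1.0f)); /* NaN -> 1.0 (second operand) */
   v = _mm_max_ss(v, _mm_set_ss(0.0f)); /* 1.0 stays 1.0 */
   return _mm_cvtss_f32(v);
}

int main(void)
{
   printf("max-first: %f\n", clamp_max_first(NAN)); /* 0.000000, as d3d10 wants */
   printf("min-first: %f\n", clamp_min_first(NAN)); /* 1.000000 */
   return 0;
}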