| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
The IR for mix(float, float, bool) was missing a write mask, causing the
IR reader to die horribly. Furthermore, I neglected to add any of the
new prototypes to the 1.30 profiles.
Fixes oglconform's glsl-bif-com advanced.mix test cases.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44477
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the samplerCubeShadow support in GLSL shader compiler.
shader compiler was picking the 'r' texture coordinate for shadow comparison
when the expected behaviour is to use 'q' texture coordinate in case of cube
shadow maps.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These simply don't exist in the 1.30 specification---none of the Offset
variants allow samplerCube. This must have been a cut and paste error
from textureGrad, which /does/ allow cubemaps.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
Due to a cut and paste error, these were accidentally misnamed
textureProj() rather than textureProjOffset().
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
From the GLSL 1.30 spec, section 8.7 "Texture Lookup Functions":
"In all functions below, the bias parameter is optional for fragment
shaders. The bias parameter is not accepted in a vertex shader."
This was a cut and paste mistake.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This extension introduces a new sampler type: samplerExternalOES.
texture2D (and texture2DProj) can be used to do a texture look up in an
external texture.
Reviewed-by: Brian Paul <[email protected]>
Acked-by: Jakob Bornecrantz <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The implementations are as follows:
isinf(x) = (abs(x) == +infinity)
isnan(x) = (x != x)
Note: the latter formula is not necessarily obvious. It works because
NaN is the only floating point number that does not equal itself.
Fixes piglit tests "isinf-and-isnan fs_basic" and "isinf-and-isnan
vs_basic".
|
|
|
|
|
|
|
|
| |
This patch adds the extension '.ir' to all the files in
src/glsl/builtins/ir/, and changes generate_builtins.py so that it no
longer globs on '*' to find the files to build. This prevents
spurious files (such as EMACS' infamous *~ backup files) from breaking
the build.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The formula we were previously using for asinh:
asinh x = ln(x + sqrt(x * x + 1))
is numerically unstable: when x is a large negative value, the quantity
x + sqrt(x * x + 1)
is a small positive value (on the order of 1/(2|x|)). Since the
logarithm function is very sensitive in this range, any error in the
computation of the square root manifests as a large error in the
result.
This patch changes to the equivalent formula:
asinh x = sign(x) * ln(abs(x) + sqrt(x * x + 1))
which is only slightly more expensive to compute, and is numerically
stable for all x.
Fixes piglit tests
spec/glsl-1.30/execution/built-in-functions/[fv]s-asinh-*.
Reviewed-by: Chad Versace <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Very simple shaders don't actually use GLSL built-ins. For example:
- gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
- gl_FragColor = vec4(0.0);
Both of the shaders used by _mesa_meta_glsl_Clear() also qualify.
By waiting to initialize the built-ins until the first time we need to
look for a signature, we can avoid the overhead entirely in these cases.
Makes piglit run roughly 18% faster (255 vs. 312 seconds).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Throwing away the extra numbers ought to match the existing behavior.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Each of these vecN constants only provided one component, which is
illegal. The printed IR is meant to contain exactly as many components
as are necessary; the IR reader does not splat single values.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The previous formula for atan(x,y) returned a value of +/- pi whenever
|x|<0.0001, and used a formula based on atan(y/x) otherwise. This
broke in cases where both x and y were small (e.g. atan(1e-5, 1e-5)).
This patch modifies the formula so that it returns a value of +/- pi
whenever |x|<1e-8*|y|, and uses the formula based on atan(y/x)
otherwise.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous formula for asin(x) was algebraically equivalent to:
sign(x)*(pi/2 - sqrt(1-|x|)*(A + B|x| + C|x|^2))
where A, B, and C were arbitrary constants determined by a curve fit.
This formula had a worst case absolute error of 0.00448, an unbounded
worst case relative error, and a discontinuity near x=0.
Changed the formula to:
sign(x)*(pi/2 - sqrt(1-|x|)*(pi/2 + (pi/4-1)|x| + A|x|^2 + B|x|^3))
where A and B are arbitrary constants determined by a curve fit. This
has a worst case absolute error of 0.00039, a worst case relative
error of 0.000405, and no discontinuities.
I don't expect a significant performance degradation, since the extra
multiply-accumulate should be fast compared to the sqrt() computation.
Fixes piglit tests {vs,fs}-asin-float and {vs,fs}-atan-*
|
|
|
|
|
|
|
|
|
|
|
| |
The constant used in the radians() function didn't have enough
precision, causing a relative error of 1.676e-5, which is far worse
than the precision of 32-bit floats. This patch reduces the relative
error to 1.14e-9, which is the best we can do in 32 bits.
Fixes piglit tests {fs,vs}-radians-{float,vec2,vec3,vec4}.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
NOTE: This is a candidate for stable release branches (and don't forget
to re-run "make builtins" after cherry-picking.)
|
|
|
|
|
|
|
|
|
| |
Commit 56ef62d9885f805bbfb2243dc860ff425d5b4d3b
"glsl: Generate readable unique names at print time."
changed ir_print_visitor to not generate @0x1234567 suffixes except
where necessary. So there's no need to manually remove them.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
This is necessary for GLSL 1.30+ shadow sampling functions, which return
a single float rather than splatting the value to a vec4 based on
GL_DEPTH_TEXTURE_MODE.
|
| |
|
| |
|
|
|
|
| |
A copy and paste error.
|
|
|
|
|
|
| |
This has probably existed since e5e34ab18eeaffa465 or so.
NOTE: This is a candidate for the 7.9 and 7.10 branches.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Rather than passing "True", pass a bitfield describing the particular
variant's features - either projection or offset.
This should make the code a bit more readable ("Proj" instead of "True")
and make it easier to support offsets in the future.
|
|
|
|
|
| |
For offsets, we'll want the straight sampler dimensionality, without the
+1 for array types. Create a new function to do that; refactor.
|
|
|
|
|
|
|
|
|
|
|
| |
Having these as actual integer values makes it difficult to implement
the texture*Offset built-in functions, since the offset is actually a
function parameter (which doesn't have a constant value).
The original rationale was that some hardware needs these offset baked
into the instruction opcode. However, at least i965 should be able to
support non-constant offsets. Others should be able to rely on inlining
and constant propagation.
|
| |
|
|
|
|
| |
Also removed unnecessary semicolons.
|
| |
|
|
|
|
| |
This isn't strictly necessary, but is definitely nicer.
|
|
|
|
| |
Import sys for sys.exit.
|
|
|
|
|
|
| |
Python is already necessary for other parts of Mesa, so there's no
reason we can't just generate it. This patch updates both make and
SCons to do so.
|
|
|
|
|
| |
I forgot about this file, and it didn't show up until I tried to do
"make builtins" from a clean build.
|
|
|
|
|
|
| |
I think was used long ago, when we actually read the builtins into the
shader's instruction stream directly, rather than creating a separate
shader and linking the two. It doesn't seem to serve any purpose now.
|
|
|
|
|
|
|
|
| |
These mistakenly computed 't' instead of t * t * (3.0 - 2.0 * t).
Also, properly vectorize the smoothstep(float, float, vec) variants.
NOTE: This is a candidate for the 7.9 and 7.10 branches.
|
|
|
|
|
|
|
|
| |
This makes a very simple 1.30 shader go from 196k of memory to 9k.
NOTE: This -may- be a candidate for the 7.9 branch, as the benefit is
substantial. However, it's not a simple change, so it may be wiser to
wait for 7.10.
|
|
|
|
|
|
| |
We are not aware of any GPU that actually implements the cross product
as a single instruction. Hence, there's no need for it to be an opcode.
Future commits will remove it entirely.
|
| |
|
| |
|
|
|
|
|
|
| |
In particular, calling the abs function is silly, since there's already
an expression opcode for that. Also, assigning to temporaries then
assigning those to the final location is rather redundant.
|
|
|
|
| |
For consistency with the vec2/vec3/vec4 variants.
|
|
|
|
|
|
| |
This works around MSVC's 65535 byte limit, unfortunately at the expense
of any semblance of readability and much larger file size. Hopefully I
can implement a better solution later, but for now this fixes the build.
|
| |
|
|
|
|
|
|
|
| |
This implements round() via the ir_unop_round_even opcode, rather than
adding a new opcode. We may wish to add one in the future, since it
might enable a small performance increase on some hardware, but for now,
this should suffice.
|