| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many GPUs have an instruction to do linear interpolation which is more
efficient than simply performing the algebra necessary (two multiplies,
an add, and a subtract).
Pattern matching or peepholing this is more desirable, but can be
tricky. By using an opcode, we can at least make shaders which use the
mix() built-in get the more efficient behavior.
Currently, all consumers lower ir_triop_lrp. Subsequent patches will
actually generate different code.
v2 [mattst88]:
- Add LRP_TO_ARITH flag to ir_to_mesa.cpp. Will be removed in a
subsequent patch and ir_triop_lrp translated directly.
v3 [mattst88]:
- Move changes from the next patch to opt_algebraic.cpp to accept
3-src operations.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For each of the following functions, add a declaration to
builtins/profiles/300es.glsl and create new file
builtins/ir/${funcname}.ir:
packSnorm2x16 unpackSnorm2x16
packUnorm2x16 unpackUnorm2x16
packHalf2x16 unpackHalf2x16
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Tuner <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Nearly all of the builtin functions in GLSL 3.00 ES are already
implemented in Mesa; this patch enables them.
A few functions are not implemented yet; those have been commented
out, with a FIXME comment to act as a reminder of what still needs to
be implemented. Here is the complete list: packSnorm2x16,
unpackSnorm2x16, packUnorm2x16, unpackUnorm2x16, packHalf2x16,
unpackHalf2x16.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
These functions are defined in GLSL 1.50 and GLES 3.00 ES.
The formulas have been extracted from the existing implementation of
inverse().
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we stored the GLSL language version in the
glsl_symbol_table struct. But this was unnecessary--all
glsl_symbol_table needs to know is whether functions and variables
have separate namespaces (they do in GLSL 1.10 only).
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding this now makes it easier to develop and test GLES3 features, since we
can do initial development and testing using desktop GL. Later GLSL compiler
patches check for either ctx->Extensions.ARB_ES3_compatibility or
_mesa_is_gles3 to allow certain features (i.e., "#version 300 es").
[v2, idr]: Just edits to the commit message.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This should help avoid confusion now that we're using the gl_api enum
to distinguishing between core and compatibility API's. The
corresponding enum value for core API's is API_OPENGL_CORE.
Acked-by: Eric Anholt <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds all the new builtins + the new sampler types,
and hooks them up if the extension is supported.
v2: fix missing signatures for grad/lod
fix missing textureSize clarifications
fix compare vs starts with usage
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This isn't used outside the generated file.
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we advertised the extension but the builtin functions
were enabled only for GLSL and not for ES.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52003
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In single precision, 1.5707963 becomes 1.5707962513 which is too
small. However, 1.5707964 becomes 1.5707963705 which is just right.
The value 1.5707964 is already used in asin.ir.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Olivier Galibert <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
| |
The opcodes are bitcast_f2u, bitcast_f2i, bitcast_i2f and bitcast_u2f.
Signed-off-by: Olivier Galibert <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
That adds support for activating the extension. It doesn't actually
*do* anything yet, of course.
Signed-off-by: Olivier Galibert <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were incorrectly assuming that the coordinate's dimensionality is
equal to the gradient's dimensionality. For array types, the coordinate
has one more component.
Fixes 12 subcases of oglconform's glsl-bif-tex-grad test.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is taken from the ogl-math project, with Inverse renamed to adj
(since it's not actually the inverse), transposed, and our types
plugged in. There are potential CSE opportunities in this code
(particularly for hardware with RCP but not DIV), but we should be
doing CSE anyway, so don't hand-optimize.
Fixes piglit inverse tests.
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This takes advantage of the builtin compiler to generate IR into a
string, the same way we read GLSL for function prototypes for our
profiles.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Deletes a lot of pointless duplication, as well as some run-time effort.
Conveniently, GLSL 1.40 no longer needs a .vert variant, since it
doesn't define any built-ins specific to the vertex shader stage.
ARB_texture_rectangle and OES_EGL_image_external also only need a single
profile, since the .vert and .frag variants were identical.
I didn't bother with EXT_texture_array and OES_texture_3D because
they're so tiny that the savings would be miniscule.
Cuts the generated builtin_function.cpp from 1.7MB to 1.0MB (41%).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The built-in subsystem uses "profiles," or GLSL shaders containing
prototypes for all built-ins supported within a particular language
version (or extension) and shader stage.
Since profiles were stage-specific, we had to cut and paste almost all
the prototypes between (e.g.) 110.vert and 110.frag. Naturally, this
led to sundry cut and paste bugs, where someone fixed an issue in .frag
but neglected to update .vert, or vice-versa. Geometry shaders would
have only made this worse.
This patch introduces support for a new '.glsl' profile suffix which
contains prototypes common to all shader stages. The existing '.frag'
and '.vert' profiles need only contain the few stage-specific built-ins.
Not only does this remove duplication, it makes built-in setup slightly
faster: we don't need to re-read the common prototypes and function
bodies for both the vertex and fragment shader stage.
Internally, this was trivial. We already create a list of gl_shader
objects to search through for built-ins: one for the core language
version/stage, and additional shaders for any extensions in use. This
patch simply adds another shader to the list: core/common, core/stage,
and extensions.
The next patch will update the profiles to remove the duplication.
It's separated out purely to make review easier.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GLSL 1.30 -> 4.10 specs all erroneously say "vec2" for a few
overloads of textureProjGradOffset, while most overloads and all other
texturing functions use ivec types.
The GLSL 4.20 specification corrects these to "ivec2", but doesn't
mention this as being a conscious change in behavior. Nor does the
ARB_shading_language_420pack extension. So presumably it was a typo.
At any rate, our builtin functions all use ivec already, so the fact
that these prototypes use plain vecs will only lead to applications
dying in a fire when trying to use them.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Fixes the new piglit texelFetch() tests on these. Note that the rest
of the new functions are not tested (same as the non-2DRect versions
of most of them).
|
|
|
|
|
|
|
| |
Indirectly caught by Ken's review of my GLSL 1.40 changes where I
copy-and-pasted this line.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Fixes the corresponding new tests in piglit.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
The builtins we have are generally optimized, having been
hand-written. This avoids generating bad code when an optimization
pass prints debug output.
|
|
|
|
|
|
|
|
| |
Fix texelFetch(sampler2DRect) and textureSize(samplerBuffer)
generation to not reference a LOD at the same time because it's easier
than not fixing it.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Aside from ir_call, our IR is cleanly split into two classes:
- Statements (typeless; used for side effects, control flow)
- Values (deeply nestable, pure, typed expression trees)
Unfortunately, ir_call confused all this:
- For void functions, we placed ir_call directly in the instruction
stream, treating it as an untyped statement. Yet, it was a subclass
of ir_rvalue, and no other ir_rvalue could be used in this way.
- For functions with a return value, ir_call could be placed in
arbitrary expression trees. While this fit naturally with the source
language, it meant that expressions might not be pure, making it
difficult to transform and optimize them. To combat this, we always
emitted ir_call directly in the RHS of an ir_assignment, only using
a temporary variable in expression trees. Many passes relied on this
assumption; the acos and atan built-ins violated it.
This patch makes ir_call a statement (ir_instruction) rather than a
value (ir_rvalue). Non-void calls now take a ir_dereference of a
variable, and store the return value there---effectively a call and
assignment rolled into one. They cannot be embedded in expressions.
All expression trees are now pure, without exception.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
valgrind complained about an uninitialised value being used in
glsl_parser_extras.cpp, and this was the one it was giving out about.
Just initialise the value in the fakectx.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By setting lod to 0 in the builtin function implementation, we avoid
needing to update all the visitors to ignore LOD in this case, when
the hardware drivers actually want to ask for LOD 0 for rectangular
textures.
Fixes piglit spec/GLSL-1.40/textureSize-*Rect.
v2: Change style of looking for substrings.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise, when we go to use ir_reader on the generated code, we won't
have the types present.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is the one builtin function claimed to be dropped due to the
ARB_compatibility split.
Fixes piglit spec/GLSL-1.40/compiler/ftransform.vert
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
All that's changed is the #version changing to 140.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This makes the process slightly more debuggable, though it would be
nice if the build just failed immediately instead.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The IR for mix(float, float, bool) was missing a write mask, causing the
IR reader to die horribly. Furthermore, I neglected to add any of the
new prototypes to the 1.30 profiles.
Fixes oglconform's glsl-bif-com advanced.mix test cases.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44477
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the samplerCubeShadow support in GLSL shader compiler.
shader compiler was picking the 'r' texture coordinate for shadow comparison
when the expected behaviour is to use 'q' texture coordinate in case of cube
shadow maps.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These simply don't exist in the 1.30 specification---none of the Offset
variants allow samplerCube. This must have been a cut and paste error
from textureGrad, which /does/ allow cubemaps.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
Due to a cut and paste error, these were accidentally misnamed
textureProj() rather than textureProjOffset().
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
From the GLSL 1.30 spec, section 8.7 "Texture Lookup Functions":
"In all functions below, the bias parameter is optional for fragment
shaders. The bias parameter is not accepted in a vertex shader."
This was a cut and paste mistake.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This extension introduces a new sampler type: samplerExternalOES.
texture2D (and texture2DProj) can be used to do a texture look up in an
external texture.
Reviewed-by: Brian Paul <[email protected]>
Acked-by: Jakob Bornecrantz <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The implementations are as follows:
isinf(x) = (abs(x) == +infinity)
isnan(x) = (x != x)
Note: the latter formula is not necessarily obvious. It works because
NaN is the only floating point number that does not equal itself.
Fixes piglit tests "isinf-and-isnan fs_basic" and "isinf-and-isnan
vs_basic".
|
|
|
|
|
|
|
|
| |
This patch adds the extension '.ir' to all the files in
src/glsl/builtins/ir/, and changes generate_builtins.py so that it no
longer globs on '*' to find the files to build. This prevents
spurious files (such as EMACS' infamous *~ backup files) from breaking
the build.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The formula we were previously using for asinh:
asinh x = ln(x + sqrt(x * x + 1))
is numerically unstable: when x is a large negative value, the quantity
x + sqrt(x * x + 1)
is a small positive value (on the order of 1/(2|x|)). Since the
logarithm function is very sensitive in this range, any error in the
computation of the square root manifests as a large error in the
result.
This patch changes to the equivalent formula:
asinh x = sign(x) * ln(abs(x) + sqrt(x * x + 1))
which is only slightly more expensive to compute, and is numerically
stable for all x.
Fixes piglit tests
spec/glsl-1.30/execution/built-in-functions/[fv]s-asinh-*.
Reviewed-by: Chad Versace <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Very simple shaders don't actually use GLSL built-ins. For example:
- gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
- gl_FragColor = vec4(0.0);
Both of the shaders used by _mesa_meta_glsl_Clear() also qualify.
By waiting to initialize the built-ins until the first time we need to
look for a signature, we can avoid the overhead entirely in these cases.
Makes piglit run roughly 18% faster (255 vs. 312 seconds).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Throwing away the extra numbers ought to match the existing behavior.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Each of these vecN constants only provided one component, which is
illegal. The printed IR is meant to contain exactly as many components
as are necessary; the IR reader does not splat single values.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The previous formula for atan(x,y) returned a value of +/- pi whenever
|x|<0.0001, and used a formula based on atan(y/x) otherwise. This
broke in cases where both x and y were small (e.g. atan(1e-5, 1e-5)).
This patch modifies the formula so that it returns a value of +/- pi
whenever |x|<1e-8*|y|, and uses the formula based on atan(y/x)
otherwise.
|