| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
MaxGeometryOutputComponents is used as the value
for gl_MaxGeometryVaryingComponents
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
It seems to have been forgotten during viewports array implementation time.
Cc: "10.4 10.3" <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the hash_table API required the user to do all of the hashing
of keys as it passed them in. Since the hashing function is intrinsically
tied to the comparison function, it makes sense for the hash table to know
about it. Also, it makes for a somewhat clumsy API as the user is
constantly calling hashing functions many of which have long names. This
is especially bad when the standard call looks something like
_mesa_hash_table_insert(ht, _mesa_pointer_hash(key), key, data);
In the above case, there is no reason why the hash table shouldn't do the
hashing for you. We leave the option for you to do your own hashing if
it's more efficient, but it's no longer needed. Also, if you do do your
own hashing, the hash table will assert that your hash matches what it
expects out of the hashing function. This should make it harder to mess up
your hashing.
v2: change to call the old entrypoint "pre_hashed" rather than
"with_hash", like cworth's equivalent change upstream (change by
anholt, acked-in-general by Jason).
Signed-off-by: Jason Ekstrand <[email protected]>
Signed-off-by: Eric Anholt <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Since we have manual build rules and list the .c/.cpp files in SOURCES,
we need to explicitly list these for distribution.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLSL ES 3.00 spec, chapter 4.6.1 "The Invariant Qualifier",
Only variables output from a shader can be candidates for invariance. This
includes user-defined output variables and the built-in output variables.
As only outputs can be declared as invariant, an invariant output from one
shader stage will still match an input of a subsequent stage without the
input being declared as invariant.
This patch fixes the following dEQP tests:
dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_interp_storage_precision
dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_interp_storage
dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_storage_precision
dEQP-GLES3.functional.shaders.qualification_order.variables.valid.invariant_storage
dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_interp_storage_precision_invariant_input
dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_interp_storage_invariant_input
dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_storage_precision_invariant_input
dEQP-GLES3.functional.shaders.qualification_order.variables.invalid.invariant_storage_invariant_input
No piglit regressions observed.
v2:
- Add spec content in the code
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
v3: Fix multi-line comment format (Ian)
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes the piglit test: spec/glsl-es-3.00/compiler/undef-GL_ES.vert
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Cuts an instruction from two shaders in Tesseract, by allowing the
(x+y) cmp 0 -> x cmp -y optimization to take place.
instructions in affected programs: 1198 -> 1194 (-0.33%)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Previously only geometry shader outputs would be assigned locations if
the geometry shader was the only stage in the linked program.
Signed-off-by: Ian Romanick <[email protected]>
Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82585
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
producer_var could be NULL if consumer_var is not NULL and
consumer_is_fs is false. This will occur when the producer is NULL and
the consumer is the geometry shader for a program that contains only a
geometry shader. This will occur starting with the next patch.
Signed-off-by: Ian Romanick <[email protected]>
Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82585
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
Note that some of the GLSL specifications explicitly state this as
compile error, some simply state that 'it is an error'.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
As documented in:
https://www.opengl.org/registry/specs/ARB/compute_shader.txt
const uvec3 gl_WorkGroupSize;
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Flex and lex have a special action ‘|’ which means to use the same action as
the next rule. We can use this to reduce a bit of code duplication in the
rules for the various float literal formats.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
According to the GLSL spec float literals like ‘1f’ shouldn't be allowed
without adding a decimal point or an exponent. Apparently the AMD driver also
disallows this so it seems unlikely that anything would be relying on it.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ilia noticed that my lowering pass was converting the constant array
used by textureGatherOffsets' offsets parameter to a uniform. This
broke textureGather for Nouveau, and is generally a horrible plan,
since it violates the GLSL constraint that offsets must be an
immediate constant.
When I wrote this pass, I neglected to consider whole array assignment.
I figured opt_array_splitting would handle constant indexing, so this
pass was really about fixing variable indexing.
textureGatherOffsets is an example of whole array access that we really
don't want to touch. Whole array copies don't appear to benefit from
this either - they're most likely initializers for temporary arrays
which are going to be mutated anyway. Since you're copying, you may
as well copy from immediates, not uniforms.
This patch makes the pass look for ir_dereference_arrays of
ir_constants, rather than looking for any ir_constant directly.
This way, it ignores whole array assignment.
No shader-db changes or Piglit regressions on Haswell. Some Piglit
tests generate different code (fixing textureGatherOffsets on Nouveau).
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Ilia Mirkin <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Caught by clang.
warning: comparison of constant -1 with expression of type
'ir_texture_opcode' is always false
[-Wtautological-constant-out-of-range-compare]
if (op == -1)
~~ ^ ~~
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These were added in commits a760c738 and 43757135 to be used in
implementing C-style aggregate initializers (commit 1b0d6aef). Paul
rewrote that code in commit 0da1a2cc to use GLSL types, rather than
AST types, leaving these copy constructors unused.
Tested by making them private and providing no definition.
|
|
|
|
|
|
|
|
|
|
|
| |
Uniform names (even for hidden uniforms) are required to be unique; some
parts of the compiler assume they can be looked up by name.
Fixes the piglit test: tests/spec/glsl-1.20/linker/array-initializers-1
Signed-off-by: Chris Forbes <[email protected]>
Cc: "10.4" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When using the stand alone compiler, if we try to link a shader with vertex
attributes it will segfault on linking as the binding hash tables are not
included in the shader program. Obviously, we cannot make the linking stage
succeed without the bound attributes but we can prevent the crash and just
let the linker spit its own error.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
On x86-64 this saves 8 bytes of padding in the structure, and this
reduces the size of the structure to 32 bytes.
v2: Fix constructor so that GCC won't warn about the order of
initialization.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Due to the total number of bits used in the bitfield, this does not
increase the size of the structure.
It does, however, reduce the number of instructions required each time
one of these fields is accessed. To access ::matrix_columns with the
bitfield, three instructions were required:
movzbl 0x9(%rdx),%eax
shr %al
and $0x7,%eax
As a uint8_t, only one instruction is required.
movzbl 0xa(%rdx),%eax
These fields are accessed *a lot*.
Valgrind callgrind results for a trace of Tesseract:
_mesa_Uniform4fv _mesa_Uniform4f _mesa_Uniform1i
Before (64-bit): 48,103,497 16,556,096 676,447
After (64-bit): 45,722,616 15,737,964 670,607
_mesa_Uniform4fv _mesa_Uniform4f _mesa_Uniform1i
Before (32-bit): 61,472,611 21,051,222 821,361
After (32-bit): 57,987,421 19,872,226 811,609
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
| |
Signed-off-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes MSVC build error.
shaderapi.c
src\glsl\list.h(535) : error C2143: syntax error : missing ';' before 'type'
src\glsl\list.h(535) : error C2143: syntax error : missing ')' before 'type'
src\glsl\list.h(536) : error C2065: 'node' : undeclared identifier
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86025
Signed-off-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
| |
This can be very useful for trying to debug list corruptions.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A pattern in certain shaders is:
uniform vec4 colors[NUM_LIGHTS];
for (int i = 0; i < NUM_LIGHTS; i++) {
...use colors[i]...
}
In this case, the application author expects the shader compiler to
unroll the loop. By doing so, it replaces variable indexing of the
array with constant indexing, which is more efficient.
This patch extends the heuristic to see if arrays accessed within the
loop are indexed by an induction variable, and if the array size exactly
matches the number of loop iterations. If so, the application author
probably intended us to unroll it. If not, we rely on the existing
loop-too-large heuristic.
Improves performance in a phong shading microbenchmark by 2.88x, and a
shadow mapping microbenchmark by 1.63x. Without variable indexing, we
can upload the small uniform arrays as push constants instead of pull
constants, avoiding shader memory access. Affects several games, but
doesn't appear to impact their performance.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider GLSL code such as:
const ivec2 offsets[] =
ivec2[](ivec2(-1, -1), ivec2(-1, 0), ivec2(-1, 1),
ivec2(0, -1), ivec2(0, 0), ivec2(0, 1),
ivec2(1, -1), ivec2(1, 0), ivec2(1, 1));
ivec2 offset = offsets[<non-constant expression>];
Both i965 and nv50 currently handle this very poorly. On i965, this
becomes a pile of MOVs to load the immediate constants into registers,
a pile of scratch writes to move the whole array to memory, and one
scratch read to actually access the value - effectively the same as if
it were a non-constant array.
We'd much rather upload large blocks of constant data as uniform data,
so drivers can simply upload the data via constbufs, and not have to
populate it via shader instructions.
This is currently non-optional because both i965 and nouveau benefit
from it, and according to Marek radeonsi would benefit today as well.
(According to Tom, radeonsi may want to handle this itself in the long
term, but we can always add a flag when it becomes useful.)
Improves performance in a terrain rendering microbenchmark by about 2x,
and cuts the number of instructions in about half. Helps a lot of
"Natural Selection 2" shaders, as well as one "HOARD" shader.
total instructions in shared programs: 5473459 -> 5471765 (-0.03%)
instructions in affected programs: 5880 -> 4186 (-28.81%)
v2: Use ir_var_hidden to avoid exposing the new uniform via the GL
uniform introspection API.
v3: Alphabetize Makefile.sources properly.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77957
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the compiler, we'd like to generate implicit uniforms for internal
use. These should not be visible via the GL uniform introspection API.
To support that, we add a new ir_variable::how_declared value of
ir_var_hidden, and plumb that through to gl_uniform_storage.
v2 (idr): Fix some memory management issues in
move_hidden_uniforms_to_end. The comment block on the function has more
details.
Signed-off-by: Kenneth Graunke <[email protected]>
Signed-off-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
The CSE pass now prints out why it thinks a value is not a candidate for
adding to the AE set.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
The optimization in commit d056863b covers these cases, which were the
first optimizations I added to the GLSL compiler.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85683
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85691
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
glsl_type has several static hash tables and a static ralloc context. They
need to be protected by a mutex as they are not thread-safe.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69200
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
There may be two contexts compiling shaders at the same time, and we want the
anonymous struct id to be globally unique.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Both core mesa and glsl have their own wrappers for strtof_l. Merge
and move them to util/. They are compiled with a C++ compiler so that
we can make them thread-safe in a following commit.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Helps a small number of vertex shaders in the games Dungeon Defenders
and Shank, as well as an internal benchmark.
instructions in affected programs: 2801 -> 2719 (-2.93%)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Just remove the parameter.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
_mesa_link_shader_program already calls _mesa_clear_shader_program_data
before calling link_shaders, so this is already done.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|