| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the ctx->Const.ForceGLSLVersion setting only worked if
the shader lacked a #version directive. Now, the ForceGLSLVersion
setting will override the #version directive too.
This change should be safe since it should be rare to have an app
that has a mix of shader versions and we only wanted to override
the #version for shaders which lacked the #version directive.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLSL ES 3.00 spec, 4.3.10 (Linking of Vertex Outputs and Fragment Inputs),
page 45 says the following:
"The type of vertex outputs and fragment input with the same name must match,
otherwise the link command will fail. The precision does not need to match.
Only those fragment inputs statically used (i.e. read) in the fragment shader
must be declared as outputs in the vertex shader; declaring superfluous vertex
shader outputs is permissible."
[...]
"The term static use means that after preprocessing the shader includes at
least one statement that accesses the input or output, even if that statement
is never actually executed."
And it includes a table with all the possibilities.
Similar table or content is present in other GLSL specs: GLSL 4.40, GLSL 1.50,
etc but for more stages (vertex and geometry shaders, etc).
This patch detects that case and returns a link error. It fixes the following
dEQP test:
dEQP-GLES3.functional.shaders.linkage.varying.rules.illegal_usage_1
However, it adds a new regression in piglit because the test hasn't a
vertex shader and it checks the link status.
bin/glslparsertest \
tests/spec/glsl-1.50/compiler/gs-also-uses-smooth-flat-noperspective.geom pass \
1.50 --check-link
This piglit test is wrong according to the spec wording above, so if this patch
is merged it should be updated.
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
| |
Correct error with commit 151fb1e where assert was renamed
to unreachable without removing ! from string argument.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These are nir_cf_nodes, not ALU instructions.
Also, use unreachable() to preempt said review feedback.
v2: Do it right (thanks Ilia).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
prog->nir will generate fsub opcodes, but i965 doesn't implement them.
We may as well lower them at the NIR level, since it's trivial to do.
Suggested by Connor Abbott.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These will be useful for prog->nir and tgsi->nir.
v2: Don't forget to mark nir_swizzle as inline (Eric).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Both prog->nir and tgsi->nir will want to use these.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Don't be lazy. Constify the as_foo functions and use those instead
of ugly casts. Suggested by Curro.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that they're all implemented using macros, this is trivial.
v2: Remove redundant parenthesis. Suggested by Curro.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The downcast functions for non-leaf classes were previously implemented
"by hand." Now they are implemented using macros based on the is_foo
functions added in the previous patch.
v2: Remove redundant parenthesis. Suggested by Curro (on the next
patch).
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These functions deteremine when an IR node is one of the non-leaf
classes.
v2: Adjust indentation to line up. Suggested by Matt.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Transform this into b2f(or(a, b)).
instructions in affected programs: 432 -> 430 (-0.46%)
helped: 2
Acked-by: Ian Romanick <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Transform this into b2f(and(a, b)).
total instructions in shared programs: 6205448 -> 6204391 (-0.02%)
instructions in affected programs: 284030 -> 282973 (-0.37%)
helped: 903
HURT: 6
Acked-by: Ian Romanick <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
| |
Transform this into b2f(or(a, b)).
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Transform this into b2f(and(a, b)).
total instructions in shared programs: 6190291 -> 6189225 (-0.02%)
instructions in affected programs: 267247 -> 266181 (-0.40%)
helped: 866
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
They're not accessible from the source language, but optimizations are
allowed to generate them.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
in different fragment shaders. This also applies to a case when gl_FragCoord
is redeclared with no layout qualifiers in one fragment shader and not
declared but used in other fragment shader.
Signed-off-by: Anuj Phogat <[email protected]>
Khronos Bug#12957
Cc: "10.5" <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier commit 53bf7c8fd2e changed the logic to always call
base_alignment on structs. 1ec715ce8b12 hacked the function to return 0
for sampler fields, but didn't handle sampler arrays. Instead of
extending the hack, avoid calling base_alignment in the first place on
non-UBO uniforms.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89726
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Tapani Palli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch changes lowering pass to use unique name for each uniform
so that arrays from different stages cannot end up having same
name.
v2: instead of global counter, use pointer to achieve
unique name (Kenneth Graunke)
Signed-off-by: Tapani Pälli <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89590
Reviewed-by: Chris Forbes <[email protected]>
Cc: 10.5 10.4 <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This addresses
...\glsl_parser.cpp(...) : warning C4065: switch statement contains 'default' but no 'case' labels
This is on code generated by bison, which we have little control.
It seems useful to have this warning otherwise enabled.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note that GLboolean is an alias for unsigned char, which lacks the
implicit true/false semantics that C++/C99 bool have.
Reviewed-by: Brian Paul <[email protected]>
v2: Change gl_shader::IsES and gl_shader_program::IsES to be bool as
recommended by Ian Romanick.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We use the idiom
ir_foo *x = y->as_foo();
if (x == NULL)
return;
all over the place. GCC generates some quite lovely code for this.
One such example:
340a5b: 83 7d 18 04 cmpl $0x4,0x18(%rbp)
340a5f: 0f 85 06 04 00 00 jne 340e6b
340a65: 48 85 ed test %rbp,%rbp
340a68: 0f 84 fd 03 00 00 je 340e6b
This case used as_expression() (ir_type_expression is 4). Note that it
checks the ir_type, then checks that the pointer isn't NULL. There is
some disconnect in GCC around the condition in the as_foo functions.
return ir_type == ir_type_##TYPE ? (ir_##TYPE *) this : NULL; \
It believes "this" could be NULL, so it emits check outside the function
just for fun.
This patch uses assume() to tell GCC that it need not bother with extra
NULL checking of the pointer returned by the as_foo functions.
text data bss dec hex filename
4836430 158688 26248 5021366 4c9eb6 i965_dri-before.so
4836173 158688 26248 5021109 4c9db5 i965_dri-after.so
v2: Replace 'if (this == NULL) unreachable("this cannot be NULL")' with
assume(this != NULL). Suggested by Ilia Mirkin.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
v2: Delete the set of indirectly accessed variables when we're done with it
v3: Rename from _packed to _scalar
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Previously, we just assigned variable locations in nir_lower_io. Now, we
force the user to assign variable locations for us. This gives the backend
a bit more control over where variables are placed.
v2: Rename from _packed to _scalar
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We never did a single hash table lookup in the entire NIR code base that I
found so there was no real benifit to doing it that way. I suppose that
for linking, we'll probably want to be able to lookup by name but we can
leave building that hash table to the linker. In the mean time this was
causing problems with GLSL IR -> NIR because GLSL IR doesn't guarantee us
unique names of uniforms, etc. This was causing massive rendering isues in
the unreal4 Sun Temple demo.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Eric's initial patch adding constant expression evaluation for
ir_unop_round_even used nearbyint. The open-coded _mesa_round_to_even
implementation came about without much explanation after a reviewer
asked whether nearbyint depended on the application not modifying the
rounding mode. Of course (as Eric commented) we rely on the application
not changing the rounding mode from its default (round-to-nearest) in
many other places, including the IROUND function used by
_mesa_round_to_even!
Worse, IROUND() is implemented using the trunc(x + 0.5) trick which
fails for x = nextafterf(0.5, 0.0).
Still worse, _mesa_round_to_even unexpectedly returns an int. I suspect
that could cause problems when rounding large integral values not
representable as an int in ir_constant_expression.cpp's
ir_unop_round_even evaluation. Its use of _mesa_round_to_even is clearly
broken for doubles (as noted during review).
The constant expression evaluation code for the packing built-in
functions also mistakenly assumed that _mesa_round_to_even returned a
float, as can be seen by the cast through a signed integer type to an
unsigned (since negative float -> unsigned conversions are undefined).
rint() and nearbyint() implement the round-half-to-even behavior we want
when the rounding mode is set to the default round-to-nearest. The only
difference between them is that nearbyint() raises the inexact
exception.
This patch implements _mesa_roundeven{f,}, a function similar to the
roundeven function added by a yet unimplemented technical specification
(ISO/IEC TS 18661-1:2014), with a small difference in behavior -- we
don't bother raising the inexact exception, which I don't think we care
about anyway.
At least recent Intel CPUs can quickly change a subset of the bits in
the x87 floating-point control register, but the exception mask bits are
not included. rint() does not need to change these bits, but nearbyint()
does (twice: save old, set new, and restore old) in order to raise the
inexact exception, which would incur some penalty.
Reviewed-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader-db results on HSW:
total instructions in shared programs: 4174156 -> 4157291 (-0.40%)
instructions in affected programs: 145397 -> 128532 (-11.60%)
helped: 383
HURT: 0
GAINED: 20
LOST: 22
There are two more tests lost than gained. However, comparing this with
GLSL IR vs. NIR results, the overall delta is reduced from 85/44
gained/lost on current master to 71/32 with this commit. Therefore, I
think it's probably a boon since we are getting "closer" to where we were
before.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Previously we tried to do poor-man's copy propagation as we created the
select instructions. Instead, this commit just moves the instructions from
the blocks inside the if into the block before. Copy propagation will take
care of making sure we don't have any extra mov's in there for us.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARB_shading_language_packing is part of GLSL 4.2, not 4.0 as I
mistakenly believed. The following functions are available only with
ARB_shading_language_packing, GLSL 4.2 (not GLSL 4.0), or ES 3.0:
- packSnorm2x16
- unpackSnorm2x16
- packHalf2x16
- unpackHalf2x16
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The optimization done by commit 34ec1a24d did not take it into account.
Fixes:
dEQP-GLES3.functional.shaders.random.all_features.fragment.20
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.4 10.5" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we stored derefs in a hash table, using the malloc'd pointer
as the key. Then, we walked through the hash table and generated code,
based on the order of the hash table's elements.
Memory addresses returned by malloc are pretty much random, which meant
that the hash was random, and the hash table's elements would be walked
in some random order. This led to successive compiles of the same
shader using different variable names and slightly different orderings
of phi-nodes. Code could not be diff'd, and the final assembly would
sometimes change slightly too.
It turns out the only point of the hash table was to avoid inserting
the same node multiple times for different dereferences. We never
actually searched the hash table! This patch uses an intrusive
linked list instead. Since exec_list uses head and tail sentinels,
checking prev or next against NULL will tell us whether the node is
already in the list.
Pair programming with Jason Ekstrand.
Signed-off-by: Jason Ekstrand <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
__next and __prev are pointers to the structure containing the exec_node
link, not the embedded exec_node. NULL checks would fail unless the
embedded exec_node happened to be at offset 0 in the parent struct.
v2: Jason Ekstrand <[email protected]>:
Use "(__node)->__field.next != NULL" to check for the end of the list
instead of the "&__next->__field != NULL". The former is far more
obviously correct as it matches what the non-safe versions do. The
original code tried to avoid any use of __next as the client code may
delete it during its execution. However, since the looping condition is
checked after the iteration clause but before the client code is
executed, we know that __node is valid during the looping condition.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Ian and I added these around the time Connor was developing NIR. Now
that both exist, we should make them work together!
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Mark Janes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader-db i965 instructions:
total instructions in shared programs: 1711180 -> 1711159 (-0.00%)
instructions in affected programs: 825 -> 804 (-2.55%)
helped: 9
HURT: 0
GAINED: 3
LOST: 3
Shader-db NIR instructions:
total instructions in shared programs: 606187 -> 606179 (-0.00%)
instructions in affected programs: 298 -> 290 (-2.68%)
helped: 4
HURT: 0
GAINED: 0
LOST: 0
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader-db i965 instructions:
total instructions in shared programs: 1715894 -> 1710802 (-0.30%)
instructions in affected programs: 443080 -> 437988 (-1.15%)
helped: 1502
HURT: 13
GAINED: 4
LOST: 4
Shader-db NIR instructions:
total instructions in shared programs: 607710 -> 606187 (-0.25%)
instructions in affected programs: 208285 -> 206762 (-0.73%)
helped: 769
HURT: 8
GAINED: 0
LOST: 0
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Being able to see both location and driver_location can be useful when
debugging IO mistakes.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vertex shaders can have shader inputs where location happens to be
VARYING_SLOT_FACE. Without predicating this on the shader stage,
we suddenly end up with load_front_face intrinsics in vertex shaders,
which is nonsensical.
Fixes spec/arb_vertex_buffer_object/pos-array when using NIR for VS.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The next commit needs to know the shader stage in glsl_to_nir().
To facilitate that, we pass the gl_shader rather than the raw exec_list
of instructions. This has both the exec_list and the stage.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
glsl_to_nir, tgsi_to_nir, and prog_to_nir all want to know whether the
driver supports native integers. Presumably other passes may as well.
Adding this to nir_shader_compiler_options is an easy way to provide
that information, as it's accessible via nir_shader::options.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code in glsl_to_nir is entirely dead, as we translate from GLSL to
NIR at link time, when there isn't a _mesa_glsl_parse_state to pass,
so every caller passes NULL.
glsl_to_nir seems like the wrong place to try and create the shader
compiler options structure anyway - tgsi_to_nir, prog_to_nir, and other
translators all would have to duplicate that code. The driver should
set this up once with whatever settings it wants, and pass it in.
Eric also added a NirOptions field to ctx->Const.ShaderCompilerOptions[]
and left a comment saying: "The memory for the options is expected to be
kept in a single static copy by the driver." This suggests the plan was
to do exactly that. That pointer was not marked const, however, and the
dead code used a mix of static structures and ralloced ones.
This patch deletes the dead code in glsl_to_nir, instead making it take
the shader compiler options as a mandatory argument. It creates an
(empty) options struct in the i965 driver, and makes NirOptions point
to that. It marks the pointer const so that we can actually do so
without generating "discards const qualifier" compiler warnings.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|