| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
instructions in affected programs: 2858 -> 2808 (-1.75%)
helped: 12
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The rcp(log(x)) pattern affects instruction counts.
instructions in affected programs: 144 -> 138 (-4.17%)
helped: 6
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
No changes in shader-db.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 6195924 -> 6195768 (-0.00%)
instructions in affected programs: 4876 -> 4720 (-3.20%)
helped: 58
HURT: 10
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 6197614 -> 6195924 (-0.03%)
instructions in affected programs: 34773 -> 33083 (-4.86%)
helped: 147
HURT: 6
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
shader-db results for fragment shaders on Haswell:
total instructions in shared programs: 4395688 -> 4389623 (-0.14%)
instructions in affected programs: 355876 -> 349811 (-1.70%)
helped: 1455
HURT: 14
GAINED: 5
LOST: 0
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
i965/nir: Use the dedicated ffma peephole
total instructions in shared programs: 4418748 -> 4394618 (-0.55%)
instructions in affected programs: 1292790 -> 1268660 (-1.87%)
helped: 5999
HURT: 457
GAINED: 4
LOST: 9
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 4422307 -> 4422363 (0.00%)
instructions in affected programs: 4230 -> 4286 (1.32%)
helped: 0
HURT: 12
While this does hurt some things, the losses are minor and it prevents the
compare-with-zero optimization from fighting with ffma which is much more
important.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
i965/nir: Use the late optimizations
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
This optimization is repeated verbatim above
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Previously, we couldn't generate two algebraic passes in the same file
because of multiple structure definitions. To solve this, we play the
age-old header file trick and just #define around it.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Previously, NIR would just print 4 swizzle components if the swizzle was
anything other than foo.xyzw. This creates lots of noise if, for example,
you have a one-component element with a swizzle of foo.xxxx.
Reviewed-by: Kenneth Grunke <[email protected]>
|
|
|
|
| |
Found by Coverity.
|
|
|
|
|
|
|
|
|
|
|
|
| |
TGSI's conditional discards take float arg and negate it, so GLSL to TGSI
generates a b2f and negates that value. Only, in NIR we want a proper
bool once again, so we compare with 0. This is a lot of pointless extra
instructions.
total instructions in shared programs: 39735 -> 39702 (-0.08%)
instructions in affected programs: 1342 -> 1309 (-2.46%)
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
Since we have patterns based on b2f, generate them if we see the b2f
equivalent using an iand. This is common when generating NIR from TGSI.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NIR uses these enums/#defines in nir_variables and associated intrinsics,
but I want to be able to use them from TGSI->NIR and NIR->TGSI.
Otherwise, we had to pull in all of mtypes.h.
This doesn't cover all of the enums we might want from a shared compiler
core (like varying slots or vert attribs), but it at least covers what I
need at the moment (system values and interp qualifiers).
v2: Move to src/glsl since util/ is really vague. Include in Makefile.am
list. Use plain bitshifts and stdint types instead of undefined
BITFIELD64_BIT.
v3: Rename to shader_enums.h. Move it into Makefile.sources.
Reviewed-by: Kenneth Graunke <[email protected]> (v2, with
recommendation to rename)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The header was added with commit 2a135c470e3(nir: Add an ALU op builder
kind of like ir_builder.h) but did not made it into to the sources list.
Fortunately it remained unused until a recent commit faf6106c6f6(nir:
Implement a Mesa IR -> NIR translator.)
v2: Remove the bogus dependency. Tweak commit message.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The typical case of mat4*mat4*vec4 is 80 scalar multiplications, but
mat4*(mat4*vec4) is only 32.
On HSW (with vec4 vertex shaders):
instructions in affected programs: 4420 -> 3194 (-27.74%)
On BDW (with scalar vertex shaders):
instructions in affected programs: 12756 -> 6726 (-47.27%)
Implementing a general matrix chain ordering is harder (or at least
tedious) because of having to walk the GLSL IR to create a list of
multiplicands. I'm guessing that this patch handles 90+% of cases, but
of course to tell definitively you'd have to implement the general
thing.
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
| |
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
| |
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the ctx->Const.ForceGLSLVersion setting only worked if
the shader lacked a #version directive. Now, the ForceGLSLVersion
setting will override the #version directive too.
This change should be safe since it should be rare to have an app
that has a mix of shader versions and we only wanted to override
the #version for shaders which lacked the #version directive.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLSL ES 3.00 spec, 4.3.10 (Linking of Vertex Outputs and Fragment Inputs),
page 45 says the following:
"The type of vertex outputs and fragment input with the same name must match,
otherwise the link command will fail. The precision does not need to match.
Only those fragment inputs statically used (i.e. read) in the fragment shader
must be declared as outputs in the vertex shader; declaring superfluous vertex
shader outputs is permissible."
[...]
"The term static use means that after preprocessing the shader includes at
least one statement that accesses the input or output, even if that statement
is never actually executed."
And it includes a table with all the possibilities.
Similar table or content is present in other GLSL specs: GLSL 4.40, GLSL 1.50,
etc but for more stages (vertex and geometry shaders, etc).
This patch detects that case and returns a link error. It fixes the following
dEQP test:
dEQP-GLES3.functional.shaders.linkage.varying.rules.illegal_usage_1
However, it adds a new regression in piglit because the test hasn't a
vertex shader and it checks the link status.
bin/glslparsertest \
tests/spec/glsl-1.50/compiler/gs-also-uses-smooth-flat-noperspective.geom pass \
1.50 --check-link
This piglit test is wrong according to the spec wording above, so if this patch
is merged it should be updated.
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
| |
Correct error with commit 151fb1e where assert was renamed
to unreachable without removing ! from string argument.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These are nir_cf_nodes, not ALU instructions.
Also, use unreachable() to preempt said review feedback.
v2: Do it right (thanks Ilia).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
prog->nir will generate fsub opcodes, but i965 doesn't implement them.
We may as well lower them at the NIR level, since it's trivial to do.
Suggested by Connor Abbott.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These will be useful for prog->nir and tgsi->nir.
v2: Don't forget to mark nir_swizzle as inline (Eric).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Both prog->nir and tgsi->nir will want to use these.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Don't be lazy. Constify the as_foo functions and use those instead
of ugly casts. Suggested by Curro.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that they're all implemented using macros, this is trivial.
v2: Remove redundant parenthesis. Suggested by Curro.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The downcast functions for non-leaf classes were previously implemented
"by hand." Now they are implemented using macros based on the is_foo
functions added in the previous patch.
v2: Remove redundant parenthesis. Suggested by Curro (on the next
patch).
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These functions deteremine when an IR node is one of the non-leaf
classes.
v2: Adjust indentation to line up. Suggested by Matt.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Transform this into b2f(or(a, b)).
instructions in affected programs: 432 -> 430 (-0.46%)
helped: 2
Acked-by: Ian Romanick <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Transform this into b2f(and(a, b)).
total instructions in shared programs: 6205448 -> 6204391 (-0.02%)
instructions in affected programs: 284030 -> 282973 (-0.37%)
helped: 903
HURT: 6
Acked-by: Ian Romanick <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
| |
Transform this into b2f(or(a, b)).
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Transform this into b2f(and(a, b)).
total instructions in shared programs: 6190291 -> 6189225 (-0.02%)
instructions in affected programs: 267247 -> 266181 (-0.40%)
helped: 866
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
They're not accessible from the source language, but optimizations are
allowed to generate them.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
in different fragment shaders. This also applies to a case when gl_FragCoord
is redeclared with no layout qualifiers in one fragment shader and not
declared but used in other fragment shader.
Signed-off-by: Anuj Phogat <[email protected]>
Khronos Bug#12957
Cc: "10.5" <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier commit 53bf7c8fd2e changed the logic to always call
base_alignment on structs. 1ec715ce8b12 hacked the function to return 0
for sampler fields, but didn't handle sampler arrays. Instead of
extending the hack, avoid calling base_alignment in the first place on
non-UBO uniforms.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89726
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Tapani Palli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch changes lowering pass to use unique name for each uniform
so that arrays from different stages cannot end up having same
name.
v2: instead of global counter, use pointer to achieve
unique name (Kenneth Graunke)
Signed-off-by: Tapani Pälli <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89590
Reviewed-by: Chris Forbes <[email protected]>
Cc: 10.5 10.4 <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This addresses
...\glsl_parser.cpp(...) : warning C4065: switch statement contains 'default' but no 'case' labels
This is on code generated by bison, which we have little control.
It seems useful to have this warning otherwise enabled.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note that GLboolean is an alias for unsigned char, which lacks the
implicit true/false semantics that C++/C99 bool have.
Reviewed-by: Brian Paul <[email protected]>
v2: Change gl_shader::IsES and gl_shader_program::IsES to be bool as
recommended by Ian Romanick.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|