| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On BDW:
total instructions in shared programs: 13061877 -> 13060965 (-0.01%)
instructions in affected programs: 133569 -> 132657 (-0.68%)
helped: 566
HURT: 0
total cycles in shared programs: 256611784 -> 256599536 (-0.00%)
cycles in affected programs: 861016 -> 848768 (-1.42%)
helped: 379
HURT: 73
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On BDW:
total instructions in shared programs: 13074882 -> 13068703 (-0.05%)
instructions in affected programs: 1823116 -> 1816937 (-0.34%)
helped: 4187
HURT: 537
total cycles in shared programs: 256622718 -> 256425382 (-0.08%)
cycles in affected programs: 123790120 -> 123592784 (-0.16%)
helped: 3823
HURT: 2037
total spills in shared programs: 15276 -> 14929 (-2.27%)
spills in affected programs: 9446 -> 9099 (-3.67%)
helped: 352
HURT: 1
total fills in shared programs: 20496 -> 20144 (-1.72%)
fills in affected programs: 13040 -> 12688 (-2.70%)
helped: 352
HURT: 1
LOST: 2
GAINED: 21
v2: Rely on 'a' being a well-formed boolean (Connor, Eric).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On BDW:
total instructions in shared programs: 13071119 -> 13070371 (-0.01%)
instructions in affected programs: 83424 -> 82676 (-0.90%)
helped: 505
HURT: 45 (all TCS, all hurt by a single instruction)
total cycles in shared programs: 256601322 -> 256588932 (-0.00%)
cycles in affected programs: 819410 -> 807020 (-1.51%)
helped: 450
HURT: 57
total loops in shared programs: 2950 -> 2942 (-0.27%)
loops in affected programs: 8 -> 0
helped: 7
HURT: 0
v2: Drop unnecessary 'a@bool' annotation (Connor, Eric).
Add a comment explaining the rule (Ian).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]> [v1]
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It feels weird using GL_* enums in a Vulkan driver.
v2: Fix the TESS_SPACING -> PIPE_TESS_SPACING conversion.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The vertex order is either clockwise or counterclockwise. We can just
store a "ccw" boolean rather than GLenum values. I don't want to use
GLenums in a Vulkan driver, and even in GL a simple boolean works fine.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I apparently broke mark_whole_variable in ir_set_program_inouts.
It was passing a type that wasn't var->type, so the wrapper didn't
work out. It's all broken, revert it and start over.
Fixes all kinds of things on other drivers.
Revert "glsl: Make is_fixed_function_array actually check for varyings."
This reverts commit 42699e12711668a142b7acf11c168cf4301c1295.
Revert "glsl: Mark whole variable used for ClipDistance and TessLevel*."
This reverts commit 5c580e64cc206ab160e1767c42e4d6c81f67da4d.
Revert "glsl: Override the # of varying slots for ClipDistance and TessLevel*."
This reverts commit 8b5749f65ac434961308ccb579fb8a816e4f29d5.
Revert "glsl: Create and use a new ir_variable::count_attribute_slots() wrapper."
This reverts commit 6aa5cb34d03765b7be8611aa516bc201bd337f73.
|
|
|
|
|
|
|
|
|
|
|
| |
We can't check VARYING_SLOT_* locations until we've determined that
the variable is actually a varying.
Fixes assert failures in drivers which actually use this path,
such as radeonsi and i915.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99314
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Marking operations as redundant if they are equal to the base
range is fine when the tree structure is something like this:
max
/ \
max b
/ \
3 max
/ \
3 a
But the opt falls apart with a tree like this:
max
/ \
max max
/ \ / \
3 a b 3
The problem is that both branches are treated the same: descending in
the left branch will prune the constant, and then descending the right
branch will prune the constant there as well, because limits[0] wasn't
updated to take the change on the left branch into account, and so we
still get [3,\infty) as baserange.
In order to fix the bug we just disable the marking of redundant expressions
when they match the baserange.
NIR algebraic opt will clean up the first tree for anyway, hopefully
other backends are smart enough to do this also.
Cc: "13.0" <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Upcoming reworks in i965 are going to make it easy to handle this
like any other input. Having it as a system value will just require
additional code for no benefit.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's no point in trying to mark partial array access for
gl_ClipDistance, gl_TessLevelOuter, or gl_TessLevelInner - they're
special built-in variables that control fixed function hardware,
and will likely be used in an all-or-nothing fashion.
Since these arrays only occupy 1-2 varying slots, we have to avoid
our normal processing which increments the slot value by the array
index.
(I wrote this code before i965 switched from ir_set_program_inouts
to nir_shader_gather_info. It's not used by anyone today, and I'm
not sure how valuable it is...the alternative to GLSL IR lowering
is NIR compact arrays, at which point you should use nir_gather_info.)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now, this shouldn't have any effect, as all drivers use
LowerClipDist and LowerTessFactors to turn the float[] arrays into
vectors.
However, it should help make it possible for drivers to avoid that
lowering.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This wraps glsl_type::count_attribute_slots(), but will soon contain a
couple of overrides for a couple of GLSL built-ins variables.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
This will help allow us to simplify the handling of samplers by
storing them in a single location rather than duplicating them in
both gl_linked_shader and gl_program.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Now that we create gl_program earlier there is no need to mess about
copying things to gl_linked_shader then to gl_program.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Now that we have the is_arb_asm flag we can just skip the
initialisation.
V2: remove hack from standalone compiler where it was never
needed since it only compiles glsl shaders.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set the flag via the _mesa_init_gl_program() and NewProgram()
helpers.
In i965 we currently check for the existance of gl_shader_program
to decide if this is an ARB assembly style program or not.
Adding a flag makes the code clearer and will help removes a
dependency on gl_shader_program in the i965 codegen functions.
Also this will allow use to skip initialising sampler units for
linked shaders, we currently memset it to zero again during linking.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having it here rather than in gl_linked_shader allows us to simplify
the code.
Also it is error prone to depend on the gl_linked_shader for programs
in current use because a failed linking attempt will free infomation
about the current program. In i965 we could be trying to recompile
a shader variant but may have lost some required fields.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Here we also remove the duplicate field in gl_linked_shader and always
get the value from shader_info instead.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
This will help allow us to store pointers to gl_program structs in the
CurrentProgram array resulting in a bunch of code simplifications.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
This also removes the duplicate field in gl_linked_shader, and
gets num_ubos from shader_info instead.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having it here rather than in gl_linked_shader allows us to simplify
the code.
Also it is error prone to depend on the gl_linked_shader for programs
in current use because a failed linking attempt will free infomation
about the current program. In i965 we could be trying to recompile
a shader variant but may have lost some required fields.
We drop the memset on ImageUnits because gl_program is already
created using rzalloc().
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
to reduce the amount of GLSL optimizations for drivers that can do better.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
so that backends don't have to run it manually
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Some of the existing tests were using '@' and '"' incidentally within the test
body. Neither of these characters are actually legal for GLSL. And since we
are planning to start generating errors for illegal characters, we need to
first make the test suite clean.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here, each legal character (as defined by GLSL Language Specification version
4.30.6, section 3.1) appears at least once in the input file. Obviously,
characters with special meaning (like '#' and '\') aren't treated exhaustively
with respect to all their possible uses. We have many other tests for that.
Here, we're simply ensuring that the test suite sees every legal character at
least once.
v2 (by Ken): Fix expectations, move to src/compiler, renumber tests.
Carl's .expected: Updated .expected:
.. ..
. . . .
. . . .
. . . .
. . . .
. ..
. .
. .
.
(For some reason, the original test expected ".." to produce two lines.
glcpp, cpp, and mcpp all follow my updated behavior, so I believe it to
be correct.)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Of course, these aren't really useful for anything, but the GLSL language
specification does allow them:
The source character set used for the OpenGL shading languages,
outside of comments, is a subset of UTF-8. It includes the following
characters:
...
White space: the space character, horizontal tab, vertical tab, form
feed, carriage-return, and line- feed.
[GLSL Language Specification 4.30.6, section 3.1]
So treat vertical tab ('\v' or ^K) and form-feed ('\f' or ^L) as horizontal
space characters.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GCC's preprocessor accepts a macro definition where there is no space between
the macro's identifier name and the replacementlist. (GCC does emit a "missing
space" warning that we don't, but that's fine.)
This is an exhaustive test that verifies that all legal GLSL characters that
could possibly be interpreted as separating the macro name from the
replacement list are interpreted as such. So the testing here includes all
valid GLSL symbols except for:
* Characters that can be part of an identifier (a-z, A-Z, 0-9, _)
* Backslash, (allowed only as line continuation)
* Hash, (allowed only to introduce pre-processor directive, or as part of a
paste operator in a replacement list---but not as first token of
replacement list)
* Space characters (since the point of the testing is to have missing space)
* Left parenthesis (which would indicate a function-like macro)
v2 (Ken): Move to src/compiler, renumber tests.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Move relative push constant relative offset computation down to
_vtn_load_store_tail() (Jason)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes performance regression in SynMark PSPom caused by loops with float
counters not always unrolling.
For example:
for (float i = 0.02; i < 0.9; i += 0.11)
...
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This disallows fancy varyings in tessellation and geometry shaders,
as required by ES 3.2.
Fixes:
dEQP-GLES31.functional.tessellation.user_defined_io.negative.per_patch_array_of_structs
dEQP-GLES31.functional.tessellation.user_defined_io.negative.per_patch_structs_containing_arrays
(Not a candidate for stable branches as it only disallows things which
should be working as desktop GL allows them.)
v2: Update error messages to not say "vertex shader" (caught by Iago).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We also add the stubs for the standalone compiler in this change.
By adding a reference here we can now refactor some code to use
gl_program where we were previously awkwardly using gl_shader_program.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We were passing around a void *mem_ctx and using that to initialize the
builder which was wrong since that pointed to ralloc_parent(impl) which
is the shader but the builder is supposed to be initialized with the
nir_function_impl.
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
This lets us get rid of the void *mem_ctx parameter and make things a
bit more type safe.
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We rename it to nir_deref_clone, re-order the sources to match the other
clone functions, and expose nir_deref_var_clone. This past part, in
particular, lets us get rid of quite a few lines since we no longer have
to call nir_copy_deref and wrap it in deref_as_var.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
I forgot to do this in commit 76b97d544e ("anv: enable storage image
extended formats"). Since both drivers support this now, no need for the
conditional enable.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
This keeps some of Connor's original code. However, while I was at it,
I updated this very old pass to a bit more modern NIR.
|
|
|
|
|
|
|
|
|
| |
After we figure out the value that we are going to return, we have a
loop that walks up the dominance tree and sets the value in each of the
blocks that doesn't have one yet. In the case of the phi, the def is
set to NEEDS_PHI not NULL, so the last one where the phi node actually
goes never gets filled out. This can lead to duplicating the phi node
unnecessarily.
|