| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects."
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we had an enum called gl_shader_type which represented
pipeline stages in the order they occur in the pipeline
(i.e. MESA_SHADER_VERTEX=0, MESA_SHADER_GEOMETRY=1, etc), and several
inconsistently named functions for converting between it and other
representations:
- _mesa_shader_type_to_string: gl_shader_type -> string
- _mesa_shader_type_to_index: GLenum (GL_*_SHADER) -> gl_shader_type
- _mesa_program_target_to_index: GLenum (GL_*_PROGRAM) -> gl_shader_type
- _mesa_shader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string
This patch tries to clean things up so that we use more consistent
terminology: the enum is now called gl_shader_stage (to emphasize that
it is in the order of pipeline stages), and the conversion functions are:
- _mesa_shader_stage_to_string: gl_shader_stage -> string
- _mesa_shader_enum_to_shader_stage: GLenum (GL_*_SHADER) -> gl_shader_stage
- _mesa_program_enum_to_shader_stage: GLenum (GL_*_PROGRAM) -> gl_shader_stage
- _mesa_progshader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string
In addition, MESA_SHADER_TYPES has been renamed to MESA_SHADER_STAGES,
for consistency with the new name for the enum.
Reviewed-by: Kenneth Graunke <[email protected]>
v2: Also rename the "target" field of _mesa_glsl_parse_state and the
"target" parameter of _mesa_shader_stage_to_string to "stage".
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Haswell, POW takes 24 cycles, while EXP2 only takes 14. Plus, using
POW requires putting 2.0 in a register, while EXP2 doesn't.
I believe that EXP2 will be faster than POW on basically all GPUs, so
it makes sense to optimize it.
Looking at the savage2 subset of shader-db:
total instructions in shared programs: 113225 -> 113179 (-0.04%)
instructions in affected programs: 2139 -> 2093 (-2.15%)
instances of 'math pow': 795 -> 749 (-6.14%)
instances of 'math exp': 389 -> 435 (11.8%)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch creates a new generic is_value() method, which checks if an
ir_constant has a particular value. (For vectors, it must have the
single value repeated across all components.)
It then rewrites the is_zero/is_one/is_negative_one methods to use this
generic helper. All three were basically identical except for the value
they checked for. The other difference is that is_negative_one rejects
boolean types. The new is_value function maintains this behavior, only
allowing boolean types when checking for 0 or 1.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
Surprisingly, this helps one vertex shader in 3DMMES.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Evidently, there's some other definition of "min" and "max" that
causes MSVC to choke on these function names. Renaming to min2()
and max2() fixes things.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The preprocessor currently accepts multiple else/elif-groups
per if-section. The GLSL-preprocessor is defined by the C++
specification, which defines the following parse-rule:
if-section:
if-group elif-groups(opt) else-group(opt) endif-line
This clearly only allows a single else-group, that has to come
after any elif-groups.
So let's modify the code to follow the specification. Add test
to prevent regressions.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Cc: 10.0 <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
definition)
The preprocessor has always replaced multi-line comments with a single space
character, (as required by the specification), but as of commit
bd55ba568b301d0f764cd1ca015e84e1ae932c8b the lexer also emitted a NEWLINE
token for each newline within the comment, (in order to preserve line
numbers).
The emitting of NEWLINE tokens within the comment broke the rule of "replace a
multi-line comment with a single space" as could be exposed by code like the
following:
#define FOO a/*
*/b
FOO
Prior to commit bd55ba568b301d0f764cd1ca015e84e1ae932c8b, this code defined
the macro FOO as "a b" as desired. Since that commit, this code instead
defines FOO as "a" and leaves a stray "b" in the output.
In this commit, we fix this by not emitting the NEWLINE tokens while lexing
the comment, but instead merely counting them in the commented_newlines
variable. Then, when the lexer next encounters a non-commented newline it
switches to a NEWLINE_CATCHUP state to emit as many NEWLINE tokens as
necessary (so that subsequent parsing stages still generate correct line
numbers).
Of course, it would have been more clear if we could have written a loop to
emit all the newlines, but flex conventions prevent that, (we must use
"return" for each token we emit).
It similarly would have been clear to have a new rule restricted to the
<NEWLINE_CATCHUP> state with an action much like the body of this if
condition. The problem with that is that this rule must not consume any
characters. It might be possible to write a rule that matches a single
lookahead of any character, but then we would also need an additional rule to
ensure for the <EOF> case where there are no additional characters available
for the lookahead to match.
Given those considerations, and given that the SKIP-state manipulation already
involves a code block at the top of the lexer function, before any rules, it
seems best to me to go with the implementation here which adds a similar
pre-rule code block for the NEWLINE_CATCHUP.
Finally, this commit also changes the expected output of a few, existing glcpp
tests. The change here is that the space character resulting from the
multi-line comment is now emitted before the newlines corresponding to that
comment. (Previously, the newlines were emitted first, and the space character
afterward.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two things make this code confusing:
1. The uncharacteristic manipulation of lexer start state outside of
flex rules.
2. The confusing semantics of the skip_stack (including the
"lexing_if" override and the SKIP_NO_SKIP state).
This new comment is intended to bring a bit more clarity for any readers.
There is no intended beahvioral change to the code here. The actual code
changes include better indentation to avoid an excessively-long line, and
using the more descriptive INITIAL rather than 0.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
I'm not aware of any piglit tests that this fixes, but the old code
was obviously wrong.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch replaces the following pattern:
foo bar[MESA_SHADER_TYPES] = {
...
};
With:
foo bar[] = {
...
};
STATIC_ASSERT(Elements(bar) == MESA_SHADER_TYPES);
This way, when a new shader type is added in a future version of Mesa,
we will get a compile error to remind us that the array needs to be
updated.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
This argument was carrying the name of the shader target (as a
string). We can get this just as easily by calling
_mesa_shader_enum_to_string().
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
We already have a function for converting a shader type index to a
string: _mesa_shader_type_to_string().
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, _mesa_glsl_shader_target_name() had an overload for GLenum
and an overload for the gl_shader_type enum, each of which behaved
differently. However, since GLenum is a synonym for unsigned int, and
unsigned ints are often used in place of gl_shader_type (e.g. in loop
indices), there was a big risk of calling the wrong overload by
mistake. This patch gives the two overloads different names so that
it's always clear which one we mean to call.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
If no function signature is found for a function name, report that the
function is not found instead of printing an empty list of candidates.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes the error reporting behavior for incorrect function
invocation (triggered by match_function_by_name() unable to find a
matching function call) from using the line number information
associated to the function name term to using the line number
information of the entire function expression. Fixes bug #72264.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72264
Reviewed-by: Ian Romanick <[email protected]>
Cc: "10.0" <[email protected]>
|
|
|
|
|
|
| |
These enums were redundant.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This will avoid compiler warnings in the patch that follows. There
should be no user-visible effect because the change only affects the
behaviour when an invalid enum is passed to
_mesa_shader_type_to_index(), and that can only happen if there is a
bug elsewhere in Mesa.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Patch copies the whole data structure at once instead of
assigning individual variables.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch moves following bitfields and variables to the data
structure:
explicit_location, explicit_index, explicit_binding, has_initializer,
is_unmatched_generic_inout, location_frac, from_named_ifc_block_nonarray,
from_named_ifc_block_array, depth_layout, location, index, binding,
max_array_access, atomic
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This patch moves following bitfields in to the data structure:
used, assigned, how_declared, mode, interpolation,
origin_upper_left, pixel_center_integer
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
Data section helps serialization and cloning of a ir_variable. This
patch includes the helper bits used for read only ir_variables.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that loop_controls no longer creates normatively bound loops,
there is no need for ir_loop::normative_bound or the
lower_bounded_loops pass.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, when loop_controls analyzed a loop and found that it had a
fixed bound (known at compile time), it would remove all of the loop
terminators and instead set the loop's normative_bound field to force
the loop to execute the correct number of times.
This made loop unrolling easy, but it had a serious disadvantage.
Since most GPU's don't have a native mechanism for executing a loop a
fixed number of times, in order to implement the normative bound, the
back-ends would have to synthesize a new loop induction variable. As
a result, many loops wound up having two induction variables instead
of one. This caused extra register pressure and unnecessary
instructions.
This patch modifies loop_controls so that it doesn't set the loop's
normative_bound anymore. Instead it leaves one of the terminators in
the loop (the limiting terminator), so the back-end doesn't have to go
to any extra work to ensure the loop terminates at the right time.
This complicates loop unrolling slightly: when deciding whether a loop
can be unrolled, we have to account for the presence of the limiting
terminator. And when we do unroll the loop, we have to remove the
limiting terminator first.
For an example of how this results in more efficient back end code,
consider the loop:
for (int i = 0; i < 100; i++) {
total += i;
}
Previous to this patch, on i965, this loop would compile down to this
(vec4) native code:
mov(8) g4<1>.xD 0D
mov(8) g8<1>.xD 0D
loop:
cmp.ge.f0(8) null g8<4;4,1>.xD 100D
(+f0) if(8)
break(8)
endif(8)
add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
add(8) g8<1>.xD g8<4;4,1>.xD 1D
add(8) g4<1>.xD g4<4;4,1>.xD 1D
while(8) loop
(notice that both g8 and g4 are loop induction variables; one is used
to terminate the loop, and the other is used to accumulate the total).
After this patch, the same loop compiles to:
mov(8) g4<1>.xD 0D
loop:
cmp.ge.f0(8) null g4<4;4,1>.xD 100D
(+f0) if(8)
break(8)
endif(8)
add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
add(8) g4<1>.xD g4<4;4,1>.xD 1D
while(8) loop
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
This value is now redundant with
loop_variable_state::limiting_terminator->iterations and
ir_loop::normative_bound.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old logic of loop_unroll_visitor::visit_leave(ir_loop *) was:
heuristics to skip unrolling in various circumstances;
if (loop contains more than one jump)
return;
else if (loop contains one jump) {
if (the jump is an unconditional "break" at the end of the loop) {
remove the break and set iteration count to 1;
fall through to simple loop unrolling code;
} else {
for (each "if" statement in the loop body)
see if the jump is a "break" at the end of one of its forks;
if (the "break" wasn't found)
return;
splice the remainder of the loop into the other fork of the "if";
remove the "break";
complex loop unrolling code;
return;
}
}
simple loop unrolling code;
return;
These tasks have been moved to their own functions:
- splice the remainder of the loop into the other fork of the "if"
- simple loop unrolling code
- complex loop unrolling code
And the logic has been flattened to:
heuristics to skip unrolling in various circumstances;
if (loop contains more than one jump)
return;
if (loop contains no jumps) {
simple loop unroll;
return;
}
if (the jump is an unconditional "break" at the end of the loop) {
remove the break;
simple loop unroll with iteration count of 1;
return;
}
for (each "if" statement in the loop body) {
if (the jump is a "break" at the end of one of its forks) {
splice the remainder of the loop into the other fork of the "if";
remove the "break";
complex loop unroll;
return;
}
}
This will make it easier to modify the loop unrolling algorithm in a
future patch.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the sole responsibility of loop_analysis was to find all
the variables referenced in the loop that are either loop constant or
induction variables, and find all of the simple if statements that
might terminate the loop. The remainder of the analysis necessary to
determine how many times a loop executed was performed by
loop_controls.
This patch makes loop_analysis also responsible for determining the
number of iterations after which each loop terminator will terminate
the loop, and for figuring out which terminator will terminate the
loop first (I'm calling this the "limiting terminator").
This will allow loop unrolling to make use of information that was
previously only visible from loop_controls, namely the identity of the
limiting terminator.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Patches to follow will introduce code into the loop_terminator
constructor. Allocating loop_terminator using new(mem_ctx) syntax
will ensure that the constructor runs.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
When loop_control_visitor::visit_leave(ir_loop *) is analyzing a loop
terminator that acts on a certain ir_variable, it doesn't need to walk
the list of induction variables to find the loop_variable entry
corresponding to the variable. It can just look it up in the
loop_variable_state hashtable and verify that the loop_variable entry
represents an induction variable.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These fields were part of some planned optimizations that never
materialized. Remove them for now to simplify things; if we ever get
round to adding the optimizations that would require them, we can
always re-introduce them.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch replaces the ir_loop fields "from", "to", "increment",
"counter", and "cmp" with a single integer ("normative_bound") that
serves the same purpose.
I've used the name "normative_bound" to emphasize the fact that the
back-end is required to emit code to prevent the loop from running
more than normative_bound times. (By contrast, an "informative" bound
would be a bound that is informational only).
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, all of the back-ends (ir_to_mesa, st_glsl_to_tgsi, and the
i965 fs and vec4 visitors) had nearly identical logic for handling
bounded loops. This replaces the duplicate logic with an equivalent
lowering pass that is used by all the back-ends.
Note: on i965, there is a slight increase in instruction count. For
example, a loop like this:
for (int i = 0; i < 100; i++) {
total += i;
}
would previously compile down to this (vec4) native code:
mov(8) g4<1>.xD 0D
mov(8) g8<1>.xD 0D
loop:
cmp.ge.f0(8) null g8<4;4,1>.xD 100D
(+f0) break(8)
add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
add(8) g8<1>.xD g8<4;4,1>.xD 1D
add(8) g4<1>.xD g4<4;4,1>.xD 1D
while(8) loop
After this patch, the "(+f0) break(8)" turns into:
(+f0) if(8)
break(8)
endif(8)
because the back-end isn't smart enough to recognize that "if
(condition) break;" can be done using a conditional break instruction.
However, it should be relatively easy for a future peephole
optimization to properly optimize this.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, loop analysis would set
this->conditional_or_nested_assignment based on the most recently
visited assignment to the variable. As a result, if a vaiable was
assigned to more than once in a loop, the flag might be set
incorrectly. For example, in a loop like this:
int x;
for (int i = 0; i < 3; i++) {
if (i == 0)
x = 10;
...
x = 20;
...
}
loop analysis would have incorrectly concluded that all assignments to
x were unconditional.
In practice this was a benign bug, because
conditional_or_nested_assignment is only used to disqualify variables
from being considered as loop induction variables or loop constant
variables, and having multiple assignments also disqualifies a
variable from being considered as either of those things.
Still, we should get the analysis correct to avoid future confusion.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, when visiting an ir_call, loop analysis would only mark
the innermost enclosing loop as containing a call. As a result, when
encountering a loop like this:
for (i = 0; i < 3; i++) {
for (int j = 0; j < 3; j++) {
foo();
}
}
it would incorrectly conclude that the outer loop ran three times.
(This is not certain; if foo() modifies i, then the outer loop might
run more or fewer times).
Fixes piglit test "vs-call-in-nested-loop.shader_test".
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, when visiting a variable dereference, loop analysis would
only consider its effect on the innermost enclosing loop. As a
result, when encountering a loop like this:
for (int i = 0; i < 3; i++) {
for (int j = 0; j < 3; j++) {
...
i = 2;
}
}
it would incorrectly conclude that the outer loop ran three times.
Fixes piglit test "vs-inner-loop-modifies-outer-loop-var.shader_test".
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
This function is about to get more complex.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intention is that things like
int;
will generate a warning. However, we were also accidentally emitting
the same warning for things like
struct Foo { int x; };
Signed-off-by: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68838
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: Aras Pranckevicius <[email protected]>
Cc: "9.2 10.0" <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|