| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Previously, a line such as:
    #else garbage
would flag an error if it followed "#if 0", but not if it followed "#if 1".
We fix this by setting a new bit of state (lexing_else) that allows the lexer
to defer switching to the <SKIP> start state until after the NEWLINE following
the #else directive.
A new test case is added for:
    #if 1
    #else garbage
    #endif
which was untested before (and did not generate the desired error).
This fixes the following Khronos GLES3 CTS tests:
    tokens_after_else_vertex
    tokens_after_else_fragment
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
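The deferral described above can be sketched with a toy, line-based model. This is not the real flex lexer: the function name count_else_garbage_errors is hypothetical, nesting is ignored, and "#if" only inspects a leading '0'. The point is only that the skip state is chosen at "#else" but applied after the rest of the line has been examined.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Toy model (hypothetical, not glcpp itself): counts "#else" lines that
 * carry trailing tokens, regardless of which branch is being skipped. */
int count_else_garbage_errors(const char *src)
{
    bool skipping = false;   /* stands in for the <SKIP> start state */
    int errors = 0;

    assert(src != NULL);
    while (*src != '\0') {
        size_t len = strcspn(src, "\n");   /* length of the current line */
        size_t i;

        if (strncmp(src, "#if", 3) == 0) {
            i = 3;
            while (i < len && isspace((unsigned char)src[i]))
                i++;
            skipping = (i < len && src[i] == '0');
        } else if (strncmp(src, "#else", 5) == 0) {
            /* Decide the next state now, but do not enter it yet. This
             * window is what the lexing_else bit represents. */
            bool skip_after_newline = !skipping;

            for (i = 5; i < len; i++) {
                if (!isspace((unsigned char)src[i])) {
                    errors++;              /* garbage after #else */
                    break;
                }
            }
            skipping = skip_after_newline; /* applied at the NEWLINE */
        } else if (strncmp(src, "#endif", 6) == 0) {
            skipping = false;
        }

        src += len;
        if (*src == '\n')
            src++;
    }
    return errors;
}
```

With this model, "#if 1" and "#if 0" followed by "#else garbage" both report one error.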
| |
Previously, the test suite expected the compiler to allow a redefinition
of a macro with whitespace added. gcc is stricter: it allows changes only in
the amount of whitespace, and insists that whitespace be present or absent
in exactly the same places.
See: https://gcc.gnu.org/onlinedocs/cpp/Undefining-and-Redefining-Macros.html:
These definitions are effectively the same:
    #define FOUR (2 + 2)
    #define FOUR         (2    +    2)
    #define FOUR (2 /* two */ + 2)
but these are not:
    #define FOUR (2 + 2)
    #define FOUR ( 2+2 )
    #define FOUR (2 * 2)
    #define FOUR(score,and,seven,years,ago) (2 + 2)
This change adjusts the existing "redefine-macro-legitimate" test to work with
the more strict understanding, and adds a new "redefine-whitespace" test to
verify that changes in the position of whitespace are flagged as errors.
Reviewed-by: Anuj Phogat <[email protected]>
| |
This patch fixes the macro redefinition check for whitespace changes.
#define and #undef functionality in GLSL follows the standard for C++
preprocessors for macro definitions.
From https://gcc.gnu.org/onlinedocs/cpp/Undefining-and-Redefining-Macros.html:
These definitions are effectively the same:
    #define FOUR (2 + 2)
    #define FOUR         (2    +    2)
    #define FOUR (2 /* two */ + 2)
but these are not:
    #define FOUR (2 + 2)
    #define FOUR ( 2+2 )
    #define FOUR (2 * 2)
    #define FOUR(score,and,seven,years,ago) (2 + 2)
Fixes Khronos GLES3 CTS tests:
    invalid_object_whitespace_vertex
    invalid_object_whitespace_fragment
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
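A minimal sketch of the whitespace rule quoted above, assuming replacement lists are compared as plain strings. The helper names normalize and same_replacement_list are hypothetical, and string literals are not handled specially. Each run of whitespace or comments collapses to one space and the ends are trimmed, so only the positions of whitespace matter, not the amounts.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Copy src into dst, turning each run of whitespace and C comments into a
 * single space and dropping leading/trailing whitespace. */
static void normalize(const char *src, char *dst, size_t cap)
{
    size_t n = 0;
    bool pending_space = false;

    assert(src != NULL && dst != NULL && cap > 0);
    while (*src != '\0') {
        if (isspace((unsigned char)*src)) {
            pending_space = true;
            src++;
        } else if (src[0] == '/' && src[1] == '*') {
            const char *end = strstr(src + 2, "*/");
            pending_space = true;          /* a comment counts as whitespace */
            src = end ? end + 2 : src + strlen(src);
        } else {
            if (pending_space && n > 0 && n + 1 < cap)
                dst[n++] = ' ';
            pending_space = false;
            if (n + 1 < cap)
                dst[n++] = *src;
            src++;
        }
    }
    dst[n] = '\0';
}

/* True when two replacement lists differ only in the amount of whitespace. */
bool same_replacement_list(const char *a, const char *b)
{
    char na[256], nb[256];

    normalize(a, na, sizeof na);
    normalize(b, nb, sizeof nb);
    return strcmp(na, nb) == 0;
}
```

Applied to the FOUR examples, the first three bodies compare equal, while "( 2+2 )" and "(2 * 2)" do not match "(2 + 2)".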
| |
Currently verifying that an #undef of __FILE__, __LINE__, or __VERSION__ will
generate an error.
Reviewed-by: Anuj Phogat <[email protected]>
| |
Fixes piglit tests in spec/glsl-es-3.00/compile:
    undef-__FILE__.vert
    undef-GL_ES.vert
    undef-__LINE__.vert
    undef-__VERSION__.vert
Also fixes Khronos GLES3 CTS tests:
    undefine_invalid_object_1_vertex
    undefine_invalid_object_1_fragment
    undefine_invalid_object_2_vertex
    undefine_invalid_object_2_fragment
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
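The check implied by the test names above could look roughly like this. It is a sketch: the function name undef_is_error is hypothetical, and the list of names is taken only from the piglit tests mentioned in this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true when "#undef name" should be rejected because the name is
 * one of the built-in macros covered by the tests listed above. */
bool undef_is_error(const char *name)
{
    static const char *const builtin[] = {
        "__LINE__", "__FILE__", "__VERSION__", "GL_ES"
    };
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof builtin / sizeof builtin[0]; i++) {
        if (strcmp(name, builtin[i]) == 0)
            return true;
    }
    return false;
}
```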
| |
The _msaa shaders weren't getting freed.
Cc: "10.2" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
| |
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
Spotted by Charmaine Lee.
Cc: "10.2" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
| |
Spotted by Charmaine Lee.
Cc: "10.2" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
If the driver doesn't implement get_sample_position(), let's return
some non-garbage values.
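A sketch of the optional-callback fallback described above. The struct and function names are hypothetical, and the fallback value of 0.5 (the pixel centre) is an assumption; the message only says "non-garbage values".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver vtable with an optional hook. */
struct driver {
    void (*get_sample_position)(unsigned sample_index, float out[2]);
};

void query_sample_position(const struct driver *drv, unsigned sample_index,
                           float out[2])
{
    assert(drv != NULL && out != NULL);
    if (drv->get_sample_position != NULL) {
        drv->get_sample_position(sample_index, out);
    } else {
        /* Assumed fallback: report the pixel centre rather than leaving
         * the output uninitialized. */
        out[0] = 0.5f;
        out[1] = 0.5f;
    }
}
```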
| |
On nvc0, a counter can have up to 6 sources instead of only one
for nve4+. This fixes a crash when a counter uses more than
one source.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
| |
The set of variable uses does not need to be ordered in any way, and
removing/adding elements is a fairly common operation in various
optimization passes.
This shortens the runtime of the piglit test fp-long-alu from ~4h to ~22s.
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
| |
BRW_PREDICATE_ALIGN1_ANY16H was incorrectly being disassembled as
"all16h", and ALL16H would probably print as "(null)".
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
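Mislabelled or missing entries like this are easy to introduce when a name table is filled in positionally. One way to make such a table harder to get wrong is to key each entry by its enum value with designated initializers. The enum below is hypothetical and much shorter than the real predicate list; it is an illustration, not the i965 disassembler's actual table.

```c
#include <assert.h>

/* Hypothetical predicate encodings, for illustration only. */
enum predicate {
    PRED_NONE,
    PRED_NORMAL,
    PRED_ANY16H,
    PRED_ALL16H,
    PRED_COUNT
};

/* Each name is tied to its enum value, so a missing entry cannot shift
 * the later ones by one slot. */
static const char *const predicate_names[PRED_COUNT] = {
    [PRED_NONE]   = "",
    [PRED_NORMAL] = "",
    [PRED_ANY16H] = "any16h",
    [PRED_ALL16H] = "all16h",
};

const char *predicate_name(enum predicate p)
{
    assert(p < PRED_COUNT);
    return predicate_names[p];
}
```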
| |
We clearly don't want to start at the head and walk backwards; we want
to start at the last real element before the tail sentinel. If the list
is empty, tail_pred will be the head sentinel, and we'll stop.
Nothing uses this function, so I guess nobody noticed it was broken.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
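The reverse walk described above can be sketched on a simplified sentinel list. The layout here uses two separate sentinel nodes and hypothetical names; it does not reproduce exec_list's actual fields. Iteration starts at the node before the tail sentinel and stops on reaching the head sentinel, so an empty list yields nothing.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next, *prev;
    int value;
};

/* Simplified list with explicit head and tail sentinels. */
struct list {
    struct node head;
    struct node tail;
};

void list_init(struct list *l)
{
    l->head.prev = NULL;
    l->head.next = &l->tail;
    l->tail.prev = &l->head;
    l->tail.next = NULL;
}

void list_push_tail(struct list *l, struct node *n)
{
    n->next = &l->tail;
    n->prev = l->tail.prev;
    l->tail.prev->next = n;
    l->tail.prev = n;
}

/* Walk from the last real element back to the first, copying values into
 * out (up to cap entries). Returns the number of values written. */
size_t list_collect_reverse(const struct list *l, int *out, size_t cap)
{
    const struct node *n;
    size_t count = 0;

    assert(l != NULL && out != NULL);
    for (n = l->tail.prev; n != &l->head && count < cap; n = n->prev)
        out[count++] = n->value;
    return count;
}
```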
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81020
Reviewed-by: Alex Deucher <[email protected]>
| |
This doesn't fix any known issue. In fact, radeon drivers ignore all
the discard flags for textures and implicitly do "discard range"
for any write transfer.
Cc: [email protected]
Reviewed-by: Roland Scheidegger <[email protected]>
| |
... on Gen6+. I'm not actually sure which class Gen6 fits into.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Previously instruction scheduling tracked dependencies on a per-register
basis. This meant that there was an artificial dependency between
interpolation instructions writing into the same virtual register.
Instruction scheduling would insert a number of instructions between the
two instructions in this example, when they are actually independent.
linterp vgrf8+0.0:F, hw_reg2:F, hw_reg3:F, hw_reg6:F
linterp vgrf8+1.0:F, hw_reg2:F, hw_reg3:F, hw_reg6+16:F
This led to cases where the first texture coordinate is interpolated at
the beginning of the shader, but the second is done immediately before
the texture operation that uses it as a source.
After this change, the artificial dependency is removed and the
interpolation instructions are scheduled together.
Reviewed-by: Kenneth Graunke <[email protected]>
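The difference between the two tracking granularities can be sketched as two predicates. This assumes the finer key is the pair (virtual register, register offset), as the vgrf8+0 / vgrf8+1 example suggests; the struct and function names are hypothetical and this is not the scheduler's real data structure.

```c
#include <assert.h>
#include <stdbool.h>

/* Destination of a write: virtual register number plus register offset. */
struct vgrf_write {
    int vgrf;
    int reg_offset;
};

/* Old behaviour: any two writes to the same virtual register conflict. */
bool conflicts_per_register(struct vgrf_write a, struct vgrf_write b)
{
    assert(a.reg_offset >= 0 && b.reg_offset >= 0);
    return a.vgrf == b.vgrf;
}

/* Finer tracking: writes conflict only when they hit the same offset. */
bool conflicts_per_offset(struct vgrf_write a, struct vgrf_write b)
{
    assert(a.reg_offset >= 0 && b.reg_offset >= 0);
    return a.vgrf == b.vgrf && a.reg_offset == b.reg_offset;
}
```

For the two linterp instructions above (vgrf8+0 and vgrf8+1), the first predicate reports a dependency and the second does not.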
| |
The old code was complicated, and was wrong when *ptr is NULL.
| |
The current compute-to-mrf pass doesn't handle blocks of MOVs. Shaders
that end with a texture fetch followed by an fb write are left like this:
0x00000000: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted };
0x00000008: pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted };
0x00000010: send(8) g2<1>UW g6<8,8,1>F
sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 WE_normal 1Q };
0x00000020: mov(8) g113<1>F g2<8,8,1>F { align1 WE_normal 1Q compacted };
0x00000028: mov(8) g114<1>F g3<8,8,1>F { align1 WE_normal 1Q compacted };
0x00000030: mov(8) g115<1>F g4<8,8,1>F { align1 WE_normal 1Q compacted };
0x00000038: mov(8) g116<1>F g5<8,8,1>F { align1 WE_normal 1Q compacted };
0x00000040: sendc(8) null g113<8,8,1>F
render ( RT write, 0, 4, 12) mlen 4 rlen 0 { align1 WE_normal 1Q EOT };
This patch lets compute-to-mrf recognize blocks of MOVs and match them to
instructions (typically SEND) that write multiple registers. With this,
the above shader becomes:
0x00000000: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted };
0x00000008: pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted };
0x00000010: send(8) g113<1>UW g6<8,8,1>F
sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 WE_normal 1Q };
0x00000020: sendc(8) null g113<8,8,1>F
render ( RT write, 0, 20, 12) mlen 4 rlen 0 { align1 WE_normal 1Q EOT };
which is the bulk of the shader db results:
total instructions in shared programs: 987040 -> 986720 (-0.03%)
instructions in affected programs: 844 -> 524 (-37.91%)
GAINED: 0
LOST: 0
The optimization also applies to MRT shaders that write the same
color value to multiple RTs, in which case we can eliminate four MOVs in
a similar fashion. See fbo-drawbuffers2-blend in piglit for an example.
No measurable performance impact. No piglit regressions.
Signed-off-by: Kristian Høgsberg <[email protected]>
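The matching step can be sketched on a simplified instruction record. The types and the function match_mov_block are hypothetical; a real pass would also need to verify that the intermediate registers are not read elsewhere, which this sketch omits. It only checks that the MOVs after a SEND copy the SEND's result registers, in order, into consecutive destinations.

```c
#include <assert.h>
#include <stddef.h>

enum opcode { OP_SEND, OP_MOV, OP_OTHER };

/* Simplified instruction: register numbers only, no regions or types. */
struct inst {
    enum opcode op;
    int dst;
    int src;
    int regs_written;
};

/* If insts[send] is a SEND writing N registers and the next N instructions
 * are MOVs copying those registers, in order, to N consecutive registers,
 * return N. Otherwise return 0. */
size_t match_mov_block(const struct inst *insts, size_t count, size_t send)
{
    const struct inst *s;
    size_t n, i;

    assert(insts != NULL && send < count);
    s = &insts[send];
    if (s->op != OP_SEND || s->regs_written <= 0)
        return 0;
    n = (size_t)s->regs_written;
    if (send + n >= count)
        return 0;                      /* not enough instructions follow */
    for (i = 0; i < n; i++) {
        const struct inst *m = &insts[send + 1 + i];
        if (m->op != OP_MOV ||
            m->src != s->dst + (int)i ||
            m->dst != insts[send + 1].dst + (int)i)
            return 0;
    }
    return n;
}
```

On the first listing above (send writing g2..g5, then four MOVs into g113..g116) the function returns 4, which is the block the pass can fold into the SEND's destination.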
| |
Apparently TXD wants its offset differently than TEX, accepting it in
the upper bits of the layer index. Unclear what happens when this is
combined with indirect sampler indexing.
Signed-off-by: Ilia Mirkin <[email protected]>
| |
Something about how we're implementing offsets for TXD is wrong; just
flip to the generic quadop-based implementation in that case.
This is the minimal fix appropriate for backporting.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: <[email protected]>
| |
handleTEX moves the layer as the first argument. This makes sure that
the quadops deal with the texture coordinates.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: <[email protected]>
| |
Unfortunately there's no good way to do this on the nv50 shader ISA.
Dropping the bias seems preferable to doing the compare post-filtering.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: <[email protected]>
| |
This can only happen with texture(samplerCubeShadow, bias), where the
compare will be in the first argument.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: <[email protected]>
| |
Although the HSW PRM shows it, the BSpec lists this workaround as being
for Ivybridge only.
total instructions in shared programs: 1994951 -> 1993675 (-0.06%)
instructions in affected programs: 27325 -> 26049 (-4.67%)
| |
Port of commit b16b3c87 to the vec4 code.
No shader-db improvements, but might as well. The fs backend saw an
improvement because it's scalar and multiple identical CMP instructions
were generated by the SEL peepholes.
| |
Port of commit 219b43c6 to the vec4 code.
| |
Port of commit 5daf867f to the vec4 code.
| |
[mattst88]: Modified to perform CSE on instructions with
the same writemask. Offered no improvement before.
total instructions in shared programs: 1995633 -> 1995185 (-0.02%)
instructions in affected programs: 14410 -> 13962 (-3.11%)
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
We want hex values here, not decimals.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
It's C. Compile it as such.
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
The #ifndef include guards already said the right thing :)
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
With a hack that places an exec_node in the C struct at the same
location as the inherited exec_node in C++.
Acked-by: Topi Pohjolainen <[email protected]>
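The C side of such a layout trick might look like the sketch below. The member name link and the helper inst_from_node are assumptions for illustration. Putting the node first means a pointer to the node and a pointer to the containing struct share an address, which mirrors where a single inherited base subobject normally sits in C++.

```c
#include <assert.h>
#include <stddef.h>

struct exec_node {
    struct exec_node *next;
    struct exec_node *prev;
};

/* C view of the instruction: the list node is the first member, so its
 * offset is zero. */
struct backend_instruction {
    struct exec_node link;
    int opcode;
};

_Static_assert(offsetof(struct backend_instruction, link) == 0,
               "list node must be the first member");

/* Recover the instruction from its embedded list node. */
struct backend_instruction *inst_from_node(struct exec_node *n)
{
    assert(n != NULL);
    return (struct backend_instruction *)n;
}
```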
| |
Acked-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
If another layout qualifier appeared to the left of `invocations` in the
GS input layout declaration, the invocation count would be dropped on
the floor.
Fixes the piglit tests:
spec/ARB_transform_feedback3/arb_transform_feedback3-ext_interleaved_two_bufs_gs_max
spec/ARB_gpu_shader5/arb_gpu_shader5-invocation-id
spec/ARB_gpu_shader5/compiler/correct-multiple-layout-qualifier-invocations.geom
spec/ARB_gpu_shader5/execution/invocations-conflicting
Signed-off-by: Chris Forbes <[email protected]>
Tested-by: Ilia Mirkin <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
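A sketch of a qualifier merge that keeps the invocation count no matter where it appears in the list. The struct, flag names and merge function are hypothetical, not the GLSL compiler's own types, and treating a mismatched repeat as an error is an assumption suggested by the "invocations-conflicting" test name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
    QUAL_PRIM_TYPE   = 1 << 0,
    QUAL_INVOCATIONS = 1 << 1
};

/* Accumulated state of a GS input layout declaration. */
struct layout_qualifier {
    unsigned flags;
    int prim_type;
    int invocations;
};

/* Fold one qualifier into the accumulated set. Every field present in
 * "from" is carried over; returns false on a conflicting value. */
bool merge_layout_qualifier(struct layout_qualifier *into,
                            const struct layout_qualifier *from)
{
    assert(into != NULL && from != NULL);
    if (from->flags & QUAL_PRIM_TYPE) {
        if ((into->flags & QUAL_PRIM_TYPE) &&
            into->prim_type != from->prim_type)
            return false;
        into->prim_type = from->prim_type;
    }
    if (from->flags & QUAL_INVOCATIONS) {
        if ((into->flags & QUAL_INVOCATIONS) &&
            into->invocations != from->invocations)
            return false;
        into->invocations = from->invocations;
    }
    into->flags |= from->flags;
    return true;
}
```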
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>