| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
At a later stage we might want to split out the NIR specific [XXX:
which one was it], as to make things move obvious and rename the files
appropriately. This patch aims to split it out of nir.
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Allows us to remove the SCons workaround :-)
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This way one can reuse it in glsl, nir or other infrastructure without
pulling nir as dependency.
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently it's an empty library, although it'll be used to store common
code between GLSL and NIR that is compiler specific (rather than generic
as the one in src/util).
XXX: strictly speaking we could add a python/mako parser to generate the
relevant files instead including builtin_type_macros.h in such a manner.
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the old hand-writen implementation of atan2, the calculation of
atan(y/x) was performed conditionally in the "then" block of the
outermost if statement. I believe I accidentally lifted this out
into unconditional code when converting to IR builder.
For reference, the original hand-written IR is visible in commit
722eff674b832e2321f791c68358ef52d2a1ff25.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: Erik Faye-Lund <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This adds glsl support of GL_OES_geometry_shader for
OpenGL ES 3.1.
Signed-off-by: Marta Lofstedt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally there's a producer and consumer, and the producer var gets
picked. In both the vertex->gs and tes->gs cases, that's the un-arrayed
version.
In the SSO case, however, there is no producer. So we picked the arrayed
GS variable, and as a result, used more slots than we should. More
critically, these slots would also no longer line up with the producer's
calculation. To fix this, we need to fix up the type of the variable
based on stage no matter what.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93650
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Cc: "11.0 11.1" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The commit b4e198f47f842 changed the offset and bits parameters of the
bitfield insert operation from scalars to vectors. However, the lowering
of ldexp on doubles operates on each vector component and emits scalar
code (since it has to deal with the lower and upper 32-bit chunks of
each double component), so it needs its bits and offset parameters to
be scalars.
Fixes fp64 regression (crash) in:
spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-ldexp-dvec4.shader_test
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
This reverts commit 4475d8f9169195baefa893b9b147fe20414cda7c.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch moves uniform calculation to happen during link_uniforms, this
is possible with help of UniformRemapTable that has all the reserved
locations.
Location assignment for implicit locations is changed so that we
utilize also the 'holes' that explicit uniform location assignment
might have left in UniformRemapTable, this makes it possible to fit
more uniforms as previously we were lazy here and wasting space.
Fixes following CTS tests:
ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max
ES31-CTS.explicit_uniform_location.uniform-loc-mix-with-implicit-max-array
v2: code cleanups, increment NumUniformRemapTable correctly, fix
find_empty_block to work properly and add some more comments.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Marta Lofstedt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes piglit regression after fixes to duplicate layout rules.
Previously catching multiple layouts was relying on the code
meant to catch duplicates within a single layout(...), this
change triggers the rules for multiple layouts.
Cc: Mark Janes <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I have a patch that writes shaders as .shader_test files, and it uses
this function to create the headers (i.e. [vertex shader]).
[tess ctrl shader] isn't a valid shader_runner header - it's spelled
out as [tessellation control shader].
There's no real reason to abbreviate it, so spell it out.
v2: Rebase on Rob's patches to move the code.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the ARB_shading_language_420pack spec:
"More than one layout qualifier may appear in a single
declaration. If the same layout-qualifier-name occurs in
multiple layout qualifiers for the same declaration, the
last one overrides the former ones."
The parser was already failing correctly when the extension is
not available but testing for duplicates within a single layout
qualifier was still causing this to fail when available as both
cases share the same function for merging.
Here we add a parameter to differentiate between the two uses
and apply it to the duplicate test.
Acked-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to only create a single node for each default declaration
we add a new boolean parameter to the in/out merge function to
only create one once we reach the rightmost layout qualifier.
From the ARB_shading_language_420pack spec:
"More than one layout qualifier may appear in a single
declaration. If the same layout-qualifier-name occurs in
multiple layout qualifiers for the same declaration, the
last one overrides the former ones."
Acked-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
| |
Acked-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
This will allow merging of duplicate layout qualifiers as allowed
by ARB_shading_language_420pack
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is added by ARB_enhanced_layouts although it doesn't fit
into any of the six main changes so we enable this independently.
From the ARB_enhanced_layouts spec:
"More than one layout qualifier may appear in a single
declaration. Additionally, the same layout-qualifier-name
can occur multiple times within a layout qualifier or across
multiple layout qualifiers in the same declaration"
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
Print the stream value not the pointer to the expression,
also use the unsigned format specifier.
Cc: 11.1 <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One of the oglconform tests was crashing here, and it was
due to not cloning the actual parameters before creating the
new call. This makes a call clone function that does the right
things to make sure we clone all the needed info, and points
the callee at it. (It differs from ->clone due to this).
this may fix https://bugs.freedesktop.org/show_bug.cgi?id=93722, I had this
patch in my cts fixes tree, but hadn't had time to make sure I liked it.
Cc: "11.0 11.1" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
Any duplicates in a single declaration will already fail the
generic duplicates test due to the explicit_stream flag being set.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This will allow the ARB_shading_language_420pack rules in
glsl_parser.yy for catching duplicate layout qualifiers to be
triggered for the stream identifier rather than relying on the
code meant to catch duplicates within a single layout(...)
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
Cc: "11.0 11.1" [email protected]
|
|
|
|
|
|
|
| |
Noticed this with $piglit/bin/vp-address-01
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nir.h is a bit inconsistent about 'typedef struct {} nir_foo' vs
'typedef struct nir_foo {} nir_foo'. But missing struct name tags is
inconvenient when you need a fwd declaration without pulling in all
of nir.
So add missing struct name tag for nir_variable, and a couple other
spots where it would likely be useful.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ARB has decided that implicit conversions should be performed for
bitwise operators in future language revisions. Implementations of
current language revisions may or may not perform them.
This patch makes Mesa apply implicti conversions even on current
language versions. Applications appear to expect this behavior,
and there's really no downside to doing so.
Fixes shader compilation in Shadow of Mordor.
Bugzilla: https://www.khronos.org/bugzilla/show_bug.cgi?id=1405
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Only modify interpolation type for integer-based varyings or when the
consumer is known and different than fragment shader.
If we are linking separate shader programs and the consumer is unknown,
the consumer could be added later and be a fragment shader. If we
modify the interpolation type in this case, we could read wrong
values in the fragment shader inputs, as shown in bug 93320.
Fixes the following CTS test:
ES31-CTS.vertex_attrib_binding.advanced-bindingUpdate
Fixes the following dEQP tests:
dEQP-GLES31.functional.separate_shader.random.102
dEQP-GLES31.functional.separate_shader.random.111
dEQP-GLES31.functional.separate_shader.random.115
dEQP-GLES31.functional.separate_shader.random.17
dEQP-GLES31.functional.separate_shader.random.22
dEQP-GLES31.functional.separate_shader.random.23
dEQP-GLES31.functional.separate_shader.random.3
dEQP-GLES31.functional.separate_shader.random.32
dEQP-GLES31.functional.separate_shader.random.39
dEQP-GLES31.functional.separate_shader.random.64
dEQP-GLES31.functional.separate_shader.random.73
dEQP-GLES31.functional.separate_shader.random.91
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93320
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
| |
nir_build_ivec4 is more readable and succinct than using nir_build_imm
directly, even if you have C99.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If shader declares uniform explicit location in one stage but
implicit in another, explicit location should be used. Patch marks
implicit uniforms as explicit if they were explicit in previous stage.
This makes sure that we don't treat them implicit later when assigning
locations.
Fixes following CTS test:
ES31-CTS.explicit_uniform_location.uniform-loc-implicit-in-some-stages3
v2: move check to cross_validate_globals (Timothy)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OpenGL specifications for bitfieldExtract() says:
The result will be undefined if <offset> or <bits> is negative, or if
the sum of <offset> and <bits> is greater than the number of bits
used to store the operand.
Therefore passing bits=32, offset=0 is legal and defined in GLSL.
But the earlier SM5 ubfe/ibfe opcodes are specified to accept a bitfield width
ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits
of the width operand, making them not able to implement the GLSL-specified
behavior directly.
This commit adds ubfe/ibfe operations from SM5 and a lowering pass for
bitfield_extract to to handle the trivial case of <bits> = 32 as
bitfieldExtract:
bits > 31 ? value : bfe(value, offset, bits)
Fixes:
ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595
Reviewed-by: Connor Abbott <[email protected]>
Tested-by: Marta Lofstedt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OpenGL specifications for bitfieldInsert() says:
The result will be undefined if <offset> or <bits> is negative, or if
the sum of <offset> and <bits> is greater than the number of bits
used to store the operand.
Therefore passing bits=32, offset=0 is legal and defined in GLSL.
But the earlier SM5 bfi opcode is specified to accept a bitfield width
ranging from 0-31. As such, Intel and AMD instructions read only the low
5 bits of the width operand, making them not able to implement the
GLSL-specified behavior directly.
This commit fixes the lowering of bitfield_insert to handle the trivial
case of <bits> = 32 as
bitfieldInsert:
bits > 31 ? insert : bfi(bfm(bits, offset), insert, base)
Fixes:
ES31-CTS.shader_bitfield_operation.bitfieldInsert.uint_2
ES31-CTS.shader_bitfield_operation.bitfieldInsert.uvec4_3
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595
Reviewed-by: Connor Abbott <[email protected]>
Tested-by: Marta Lofstedt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Intel/AMD's hardware instructions do not handle arguments of 32.
Constant evaluation should not produce a result different from the
hardware instruction.
The s/1ull/1u/ change is intentional: previously we wanted defined
behavior for the "1 << 32" case, but we're making this case undefined so
we can make it 1u and save ourselves a 64-bit operation.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
Shifting into the sign bit is undefined, as is shifting by 32.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
If a Python codegen script failed, it would write a zero-byte file,
which on subsequent invocations of make would trick it into thinking the
file was appropriately generated.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We would like to be able to combine
result.x = bitfieldExtract(src0.x, src1.x, src2.x);
result.y = bitfieldExtract(src0.y, src1.y, src2.y);
result.z = bitfieldExtract(src0.z, src1.z, src2.z);
result.w = bitfieldExtract(src0.w, src1.w, src2.w);
into a single ivec4 bitfieldInsert operation. This should be possible
with most drivers.
This patch changes the offset and bits parameters from scalar ints
to ivecN or uvecN. The type of all three operands will be the same,
for simplicity.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We would like to be able to combine
result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x);
result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y);
result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z);
result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w);
into a single ivec4 bitfieldInsert operation. This should be possible
with most drivers.
This patch changes the offset and bits parameters from scalar ints
to ivecN or uvecN. The type of all four operands will be the same,
for simplicity.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TGSI doesn't use these - it just translates ir_quadop_bitfield_insert
directly. NIR can handle ir_quadop_bitfield_insert as well.
These opcodes were only used for i965, and with Jason's recent patches,
we can do this lowering in NIR (which also gains us SPIR-V handling).
So there's not much point to retaining this GLSL IR lowering code.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
NIR's bfm, like Intel/AMD's hardware instructions and GLSL IR's
ir_binop_bfm takes <bits> as src0 and <offset> as src1.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes CTS test:
ES31-CTS.shader_image_load_store.negative-linkErrors
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93410
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 8926dc8 added a check where we add packed varyings of output
stage only when we have multiple stages, however duplicates are already
handled by changes in commit 0508d950 and we want to add outputs also in
case where we have only one stage.
Fixes regression caused by 8926dc8 for following test:
ES31-CTS.program_interface_query.separate-programs-vertex
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Marta Lofstedt <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
.length() on an unsized SSBO variable doesn't actually read any data
from the SSBO, and is allowed on variables marked 'writeonly'.
Fixes compute shader compilation in Shadow of Mordor.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Patch changes linker to allocate gl_shader_variable instead of using
ir_variable. This makes it possible to get rid of ir_variables and ir
in memory after linking.
v2: check that we do not create duplicate entries with
packed varyings
v3: document 'patch' bit (Ilia Mirkin)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Linker missed a check for situation where we exceed max amount of
uniform locations with explicit + implicit locations. Patch adds this
check to already existing iteration over uniforms in linker.
Fixes following CTS test:
ES31-CTS.explicit_uniform_location.uniform-loc-negative-link-max-num-of-locations
v2: use var->type->uniform_locations() (Timothy)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
The lower_named_interface_blocks() pass is called before we try
assign locations to varyings so this shouldn't be reachable.
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
| |
This reverts commit 98270fd20d4d58db8ae5af3b6f10ed6a81c058a6.
Something went terribly wrong the commit is not what the commit
message says.
|
|
|
|
|
|
|
| |
The lower_named_interface_blocks() pass is called before we try
assign locations to varyings so this shouldn't be reachable.
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, opt_vectorize() tries to combine:
result.x = bitfieldInsert(src0.x, src1.x, src2.x, src3.x);
result.y = bitfieldInsert(src0.y, src1.y, src2.y, src3.y);
result.z = bitfieldInsert(src0.z, src1.z, src2.z, src3.z);
result.w = bitfieldInsert(src0.w, src1.w, src2.w, src3.w);
into a single ir_quadop_bitfield_insert opcode, which operates on
ivec4s. However, GLSL IR's opcodes currently require the bits and
offset parameters to be scalar integers. So, this breaks.
We want to be able to vectorize this eventually, but for now, just
chicken out and make opt_vectorize() bail by marking all the bitfield
insert/extract related opcodes as horizontal. This is a relatively
uncommon case today, so we'll do the simple fix for stable branches,
and fix it properly on master.
Fixes assertion failures when compiling Shadow of Mordor vertex shaders
on i965 in vec4 mode (where OptimizeForAOS enables opt_vectorize()).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
|