| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit introduces a new function nir_assign_linked_io_var_locations
which is intended to help with assigning driver locations to shaders
during linking, primarily aimed at the VS->TCS->TES->GS stages.
It ensures that the linked shaders have the same driver locations,
and it also packs these as close to each other as possible.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4388>
|
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4767>
|
|
|
|
|
|
|
|
| |
We can't remove volatile writes and we can't combine them with other
volatile writes so all we can do is clear the unused bits.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4767>
|
|
|
|
|
|
|
| |
Fixes: 62332d139c8f6 "nir: Add a local variable-based copy prop..."
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4767>
|
|
|
|
|
|
|
|
|
| |
For deref_store, we can still delete invalid stores that write to
statically OOB data. For everything, we need to make sure that we kill
aliases of destinations even if it's volatile.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4767>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the mask value 0x80000000, the other operand must be 32-bit. This
fixes failures in
dEQP-VK.subgroups.ballot_mask.ext_shader_subgroup_ballot.*.gl_subgroupgemaskarb_*
tests from Vulkan 1.2.2 CTS.
Checking one of the tests, it appears that the tests are doing 64-bit
iand with 0x0000000080000000, then comparing the result with zero.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2834
Fixes: 88eb8f190bd ("nir/algebraic: Simplify logic to detect sign of an integer")
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4770>
|
|
|
|
|
|
|
|
|
|
|
| |
The new option replaces the two other _split lowering options, since
there's no need for separate options.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4738>
|
|
|
|
|
|
|
|
| |
(Original code from ir3)
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4716>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I spent over an hour trying to debug a problem if a condition on a
variable not being applied. The problem turned out to be
"a(is_not_negative" instead of "a(is_not_negative)". This commit would
have detected that problem and failed to build.
v2: Just add $ to the end of the existing regex, and it will fail to
match a malformed string. Suggested by Jason.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4720>
|
|
|
|
|
|
|
|
| |
So that the GLSL compiler doesn't have to use the GLenum conversion
functions.
Reviewed-by: Timothy Arceri <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4756>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit completely rewrites the way we extract a structured CFG from
SPIR-V. The new approach is different in a few ways:
1. It does a breadth-first search instead of depth-first. This means
that we've visited the merge node for a construct before we visit
any of the nodes inside the construct. This makes it easier to
validate things like loop and switch nesting.
2. We record more information in the CFG. Earlier commits added a
parent pointer to vtn_cf_node but we now record all of the merge and
other special blocks for each CFG node. This lets us validate
things more precisely.
3. It makes heavy use of merge blocks for walking the CFG. Previously,
we sort of used them as hints for trying to guess the CFG structure
but things got dicey whenever a merge was missing. We had some
heuristics for how to handle short-circuiting if statements but it
was a bunch of special cases.
Now, we make them a fundamental part of walking the CFG. When we
encounter a control-flow construct, we add the body components of
the construct to the BFS work list and then jump to the merge block
if one exists to continue scanning the current CFG nesting level.
If no merge block exists, we assume that means that control-flow
never re-converges in a normal way and that the only way to get back
to normality is with a direct jump such as a loop break or continue.
This should make things far more robust when trying to deal with the
more creative placement (or lack thereof) of merge instructions.
Reviewed-by: Alan Baker <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3820>
Closes: #2760
Acked-by: Samuel Pitoiset <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4446>
|
|
|
|
|
|
|
|
|
| |
Thanks to VK_EXT_subgroup_size_control, we can end up with
gl_SubgroupSize being as low as 8 on Intel.
Fixes: d10de253097 "anv: Implement VK_EXT_subgroup_size_control"
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4694>
|
|
|
|
|
|
|
|
|
| |
Fixes dEQP-VK.spirv_assembly.instruction.function_params.sampler_param
cc: [email protected]
Reviewed-by: Dave Airlie <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4684>
|
|
|
|
|
|
|
|
|
|
|
| |
The SPIR-V parser sometimes generates casts from specific sampler types
like sampler2D to the bare sampler type. This results in a cast which
causes heartburn for drivers but is harmless to remove.
cc: [email protected]
Reviewed-by: Dave Airlie <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4684>
|
|
|
|
|
|
|
|
|
|
|
| |
When we originally wrote spirv_to_nir we didn't have a good scalar value
union to handily use so we rolled our own thing for spec constants. Now
that we have nir_const_value, we can use that and simplify a bunch of
the spec constant logic.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
|
|
|
|
|
|
|
|
|
|
| |
We were accidentally asserting that the value had to be a vtn_ssa_value
which isn't true if it, for instance, comes from a spec constant.
Fixes: fb282a68bc46 "spirv: Implement OpConvertPtrToU and OpConvertUToPtr"
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
|
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4721>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes find_and_update_named_uniform_storage() for subroutines
and also updates num_shader_uniform_components for non opaque
uniforms.
The following patch will ensure this type of bug won't happen again.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4721>
|
|
|
|
|
|
|
|
|
| |
This corresponds to 2ad0492fb00919d99500f1da74abf5ad3c870e4e ("Discuss
generator magic number reservations.") in
https://github.com/KhronosGroup/SPIRV-Headers.
Acked-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4682>
|
|
|
|
|
|
|
| |
Same solution as done in spirv_info generation.
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4682>
|
|
|
|
|
|
|
|
| |
This fixes things when the last input/output is a struct or matrix.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jose Maria Casanova Crespo <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4670>
|
|
|
|
|
|
|
|
|
| |
We have shader->num_inputs for "last used input + 1" already, which
respects struct/matrix varyings.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jose Maria Casanova Crespo <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4670>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
So far only the singed versions are defined.
v2: Make umad24 and umul24 non-driver specific (Eric Anholt)
v3: Take care of nir_builder and automatic lowering of the
opcodes if they are not supported by the backend.
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4610>
|
|
|
|
|
|
| |
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4610>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Erik Faye-Lund <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4387>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Erik Faye-Lund <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4387>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Erik Faye-Lund <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4387>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nir_opt_load_store_vectorize has an is_strided_vector() function that
looks for types with weird explicit strides. It does so by comparing
the explicit stride against the type-size-derived typical stride.
This had a subtle bug. Simple vector types (vec2/3/4) have no explicit
stride, so glsl_get_explicit_stride() returns 0. This never matches the
typical stride for a vector, so is_strided_vector() would return true
for basically any vector type, causing the vectorizer to bail.
I found this by looking at a compute shader with scalar SSBO loads at
offsets 0x220, 0x224, 0x228, 0x22c. nir_opt_load_store_vectorize would
properly vectorize the first two into a vec2 load, but would refuse to
extend it to a vec3 and ultimately vec4 load because is_strided_vector()
saw a vec2 and freaked out.
Neither ACO nor ANV do load/store vectorization before lowering derefs,
so this shouldn't affect them. However, I'd like to fix this bug to
avoid the trap for anyone who decides to in the future. In a branch
where anv used this lowering, this cut an additional 38% of the send
messages in the shader by properly vectorizing more things.
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4255>
|
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3204>
|
|
|
|
|
|
|
|
| |
That is basically nir_tex_instr sampler_index documentation comment
expressed as a helper.
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4677>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In SPIRV of compute shader in Aztec Ruins benchmark there is:
OpControlBarrier %uint_1 %uint_1 %uint_0
// ControlBarrier(Device, Device, rdcspv::MemorySemantics(0));
which is an incorrect translation of glsl barrier().
GLSLang, prior to c3f1cdfa, emitted the OpControlBarrier with
Device instead of Workgroup for execution scope.
2365520c covers similar case but isn't applied when execution_scope
is SpvScopeDevice.
Cc: <[email protected]>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2742
Signed-off-by: Danylo Piliaiev <[email protected]>
Tested-by: Rafael Antognolli <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4660>
|
|
|
|
|
|
|
|
|
|
|
| |
This moves the fi_types to a new mesa_private.h and removes the
imports.c file. The vast majority of this patch is just removing
pound includes of imports.h and fixing up the recursive includes.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3024>
|
|
|
|
|
|
|
|
|
| |
As of the previous commit, they are never used.
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4624>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Version 4.4 of the GLSL spec changed the definition of noise*() to
always return zero and earlier versions of the spec allowed zero as a
valid implementation.
All drivers, as far as I can tell, unconditionally call lower_noise()
today which turns ir_unop_noise into zero. We've got a 10-year-old
comment in there saying "In the future, ir_unop_noise may be replaced by
a call to a function that implements noise." Well, it's the future now
and we've not yet gotten around to that. In the mean time, the GLSL
spec has made doing so illegal.
To make things worse, we then pretend to handle the opcode in
glsl_to_nir, ir_to_mesa, and st_glsl_to_tgsi even though it should never
get there given the lowering. The lowering in st_glsl_to_tgsi defines
noise*() to be 0.5 which is an illegal implementation of the noise
functions according to pre-4.4 specs. We also have opcodes for this in
NIR which are never used because, again, we always call lower_noise().
Let's just kill the whole opcode and make builtin_builder.cpp build a
bunch of functions that just return zero.
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4624>
|
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4395>
|
|
|
|
|
|
|
|
| |
We need to skip opaque variables inside blocks, this is handled
elsewhere and will cause a crash here.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4395>
|
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4395>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: D Scott Phillips <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4528>
|
|
|
|
|
|
|
|
|
|
|
| |
Copied from anv, replaced state with passing model/range directly.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Reviewed-by: D Scott Phillips <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4528>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we have a clip dist float[1] compact followed by a tess factor
float[2] we don't want to overlap them, but the partial check
only happens for non-compact vars.
This fixes some issues seen with my sw vulkan layer with
dEQP-VK.clipping.user_defined.clip_distance*
v2: v1 failed with clip/cull mixtures, since in that
case the cull has a location_frac to follow after the clip
so only reset if we get a location_frac of 0 in a subsequent
clip var
Reviewed-by: Timothy Arceri <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4635>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After the decorations of a variable are evaluated, propagate the
access flag to the associated vtn_pointer. This was done when
creating the pointer but at that point there was no access flags for
the variable.
Inline the pointer creation to make this point clearer, in isolation
the helper made the impression that the value was being propagated.
Issue found by Ken.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4620>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This unconditionally lowers 64-bit fmin3/fmax3/fmed3 because
AMD hardware doesn't have native instructions, and no drivers
except RADV uses these instructions.
Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.*.f64.*
with ACO.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4570>
|
|
|
|
|
|
|
|
|
|
| |
Fixes dEQP-VK.spirv_assembly.instruction.amd_trinary_minmax.*.i64.*
with ACO because this backend compiler expects most of the 64-bit
operations to be lowered.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4570>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This helps us avoid moving the movs outside if branches when there
src can't be scalarized.
For example it avoids:
vec4 32 ssa_7 = tex ssa_6 (coord), 0 (texture), 0 (sampler),
if ... {
r0 = imov ssa_7.z
r1 = imov ssa_7.y
r2 = imov ssa_7.x
r3 = imov ssa_7.w
...
} else {
...
if ... {
r0 = imov ssa_7.x
r1 = imov ssa_7.w
...
else {
r0 = imov ssa_7.z
r1 = imov ssa_7.y
...
}
r2 = imov ssa_7.x
r3 = imov ssa_7.w
}
...
vec4 32 ssa_36 = vec4 r0, r1, r2, r3
Becoming something like:
vec4 32 ssa_7 = tex ssa_6 (coord), 0 (texture), 0 (sampler),
r0 = imov ssa_7.z
r1 = imov ssa_7.y
r2 = imov ssa_7.x
r3 = imov ssa_7.w
if ... {
...
} else {
if ... {
r0 = imov r2
r1 = imov r3
...
else {
...
}
...
}
While this is has a smaller instruction count it requires more work
for the same result. With more complex examples we can also end up
shuffling the registers around in a way that requires more registers
to use as temps so that we don't overwrite our original values along
the way.
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
|
|
|
|
|
|
|
|
|
| |
Here we only pull instructions further up control flow if they are
constant or texture instructions. See the code comment for more
information.
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't move them later as we could move them into non-uniform
control flow, but moving them earlier should be fine.
This helps avoid a bunch of spilling in unigine shaders due to
moving the tex instructions sources earlier (outside if branches)
but not the instruction itself.
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Classically, global code motion is also a dead code pass. However, in
the initial implementation, the decision was made to place every
instruction and let conventional DCE clean up the dead ones. Because
any uses of a dead instruction are unreachable, we have no late block
and the dead instructions are always scheduled early. The problem is
that, because we place the dead instruction early, it pushes the
placement of any dependencies of the dead instruction earlier than they
may need to be placed. In order prevent dead instructions from
affecting the placement of live ones, we need to delete them.
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
|
|
|
|
|
|
|
|
|
| |
Now that the GCM pass is more conservative and only moves instructions
to different blocks when it's advantageous to do so, we can have a
proper notion of what it means to make progress.
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4636>
|