| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
These come from the disasm tests, and fix our disasm of blob's
uniform/nonuniform cat6 operands. We also now include human-readable names
for all the modes we know about (though bindless gets distinguished by its
.baseN, like Connor's original disasm).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
|
|
|
|
|
|
|
|
| |
With this I also brought in a few new control flow instruction disasm
tests that I'd made back when I wrote the disasm test, but which were too
far from correct to include until now.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
|
|
|
|
|
|
| |
I put these together from traces I had while trying to do LDC for GL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
|
|
|
|
|
|
|
|
| |
Necessary to avoid compiler assertion failures in:
dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
| |
It's unused and overwritten by ir3_compile_shader_nir().
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can remove a bunch of conditional code at key comparison time by
computing a bitmask of used key bits at ir3_shader creation time. This
also gives us a nice place to put additional key simplification to reduce
how many variants we create (like skipping rastflat if we don't read
colors in the FS, or skipping vclamp_color if we don't write colors).
It does mean walking the whole key to AND it, but the key is just 28 bytes
so far so that seems pretty fine.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
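A minimal sketch of the key-mask idea described above, not the actual ir3 code. The struct `example_key` and the helpers `example_key_mask()`, `example_key_apply_mask()` and `example_key_equal()` are hypothetical names made up for illustration, and the real key has more fields than the three shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, cut-down key; the real ir3 key has more fields. */
struct example_key {
	uint32_t rastflat;      /* only matters if the FS reads colors */
	uint32_t vclamp_color;  /* only matters if the shader writes colors */
	uint32_t has_gs;        /* always significant in this sketch */
};

/* Computed once at shader creation: all-ones in every field that matters. */
static struct example_key
example_key_mask(bool reads_color, bool writes_color)
{
	struct example_key mask;

	memset(&mask, 0, sizeof(mask));
	mask.has_gs = ~0u;
	if (reads_color)
		mask.rastflat = ~0u;
	if (writes_color)
		mask.vclamp_color = ~0u;
	return mask;
}

/* Walk the whole key byte by byte and AND it with the mask. */
static struct example_key
example_key_apply_mask(const struct example_key *key,
		const struct example_key *mask)
{
	struct example_key out;
	unsigned char *o = (unsigned char *)&out;
	const unsigned char *k = (const unsigned char *)key;
	const unsigned char *m = (const unsigned char *)mask;

	assert(key && mask);
	for (size_t i = 0; i < sizeof(out); i++)
		o[i] = k[i] & m[i];
	return out;
}

/* With both keys masked, comparison needs no per-field conditionals. */
static bool
example_key_equal(const struct example_key *a, const struct example_key *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}
```

Two keys that differ only in a masked-out field (say, `rastflat` for a shader that reads no colors) compare equal after masking, so they would share one variant.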
|
|
|
|
|
|
| |
Right now that's "always" unless you have shaderdb set.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
| |
The code using it was removed in 4af86bd0b933 ("freedreno/ir3: remove
half-precision output")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We weren't filling in the tess mode of the key, or setting has_gs on GS
shaders, resulting in assertion failures when NIR intrinsics didn't get
lowered.
We have to make a guess at prim mode for TCS, but it should be better to
have some shader-db coverage than none, and it will avoid these failures
happening when we start precompiling shaders.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
|
| |
Some of the negative API tests make shaders for tess stages that don't do
all the stores they need to. Once we start precompiling (or doing
shader-db of tess), we need to at least not segfault when generating them.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were failing to tell the allocator about the restriction that scalar
texture instructions (allocated as scalar regs) couldn't be allocated such
that the full unwritemasked vector would start before r0. There
was a patch in select_reg_callback on a6xx that tried to work around that,
but you could still end up backed into a corner you shouldn't be because
we didn't tell the RA what it needed.
Fixes compiler assertion failures on a300-a400's blit_z shader, used for
Z32F gmem blits.
Looks like as a result we get tighter register allocation but more nops:
instructions in affected programs: 757945 -> 760356 (0.32%)
nops in affected programs: 317983 -> 320468 (0.78%)
non-nops in affected programs: 27525 -> 27451 (-0.27%)
mov in affected programs: 3098 -> 3023 (-2.42%)
dwords in affected programs: 109664 -> 110656 (0.90%)
last-baryf in affected programs: 112701 -> 112847 (0.13%)
full in affected programs: 4326 -> 4011 (-7.28%)
sstall in affected programs: 120550 -> 120836 (0.24%)
(ss) in affected programs: 13939 -> 13918 (-0.15%)
(sy) in affected programs: 3006 -> 2786 (-7.32%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
|
|
| |
When the GS lowering was working on store_output intrinsics, we had to
clean up the split vars to avoid getting confused. Now that we shadow
the output vars instead, there's no confusion and we can drop this
hack.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We mostly got away with replacing a store_output with a store_var, but
for complex types like structs, that doesn't work. Once the IO has
been lowered from vars to intrinsics, we've lost the deref chains and
can't properly shadow the outputs.
This commit moves the GS lowering up so we do it before the output
variables get lowered to store_output. This way the pass works much
like nir_lower_io_to_temporaries() and cleanly shadows the outputs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
|
|
|
| |
This pass lowers per-vertex input intrinsics to load_shared_ir3. This
was open coded in the TCS and GS lowering passes before - this way we
can share it. Furthermore, we'll need to run the rest of the GS
lowering earlier (before lowering IO) so we need to split off this
part that operates on the IO intrinsics first.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
| |
We rename it to ir3_nir_lower_to_explicit_output, since it only
handles output and we'll add a lowering pass for input next.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
|
| |
We need shader->stream_output filled out when we lay out the push
constants in ir3_setup_const_state(). Otherwise
const_state->offsets.tfbo ends up as ~0, which doesn't work.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
| |
We were trying to deref the vector-collected outputs[] array before it's
been set up, but we want the per-component outputs anyway.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
| |
Fixes a segfault in ir3_legalize.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we use NIR's 1-bit bool representation, we get exactly the bool behavior
the hardware provides: CMPS produces true or false, AND/OR/XOR work as
intended without extra absnegs, and we can pass those half values directly
to other CMPS. We emit an absneg for b2b1 ("turn a memory load into a
1-bit NIR boolean"), but we would have done so for the ir3_n2b() on the
use of that value anyway. The most awkward bit is that inot(a@1) is now a
sub(1, a), but we can encode the 1 as an immediate so it's fine.
No significant changes to GL_TIME_ELAPSED on my set of traces (n=21).
instructions in affected programs: 1570638 -> 1548702 (-1.40%)
nops in affected programs: 624053 -> 611381 (-2.03%)
non-nops in affected programs: 959061 -> 949797 (-0.97%)
mov in affected programs: 5258 -> 5252 (-0.11%)
cov in affected programs: 15099 -> 15902 (5.32%)
dwords in affected programs: 469600 -> 452768 (-3.58%)
last-baryf in affected programs: 162211 -> 154726 (-4.61%)
full in affected programs: 4881 -> 4797 (-1.72%)
sstall in affected programs: 173953 -> 174545 (0.34%)
(ss) in affected programs: 10922 -> 10934 (0.11%)
(sy) in affected programs: 728 -> 745 (2.34%)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4518>
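A small sketch of the 0/1 boolean arithmetic described above, modelled in plain C rather than ir3 instructions. All function names here are hypothetical; the only facts taken from the message are that comparisons yield true or false, that AND/OR/XOR work directly on those values, and that inot(a@1) becomes sub(1, a).

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a compare that yields a 0/1 boolean. */
static uint32_t
bool1_lt(int32_t a, int32_t b)
{
	return a < b ? 1u : 0u;
}

/* Bitwise ops on 0/1 values need no extra fixup. */
static uint32_t bool1_and(uint32_t a, uint32_t b) { return a & b; }
static uint32_t bool1_or(uint32_t a, uint32_t b)  { return a | b; }
static uint32_t bool1_xor(uint32_t a, uint32_t b) { return a ^ b; }

/* A bitwise NOT would turn 1 into 0xfffffffe, so logical NOT of a
 * 0/1 value is written as 1 - a, with the 1 as an immediate. */
static uint32_t
bool1_inot(uint32_t a)
{
	assert(a <= 1u);
	return 1u - a;
}
```

Because every result stays in {0, 1}, the output of one operation can feed the next without renormalizing.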
|
|
|
|
|
|
| |
It's set by ir3_put_dst() immediately after.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4518>
|
|
|
|
|
|
|
|
| |
Pull in the `SP_xS_BRANCH_COND` regs to keep the mesa and envytools
copies from getting out of sync.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to OUT_REG(), this has the benefits of:
1. No more messing up pkt size
2. Detects errors of mixing up the order of dwords in the packet
3. Optimizes to more efficient code
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>
|
|
|
|
|
|
|
|
|
| |
The existing structure dates back to when this code was part of libdrm,
and we wanted some of this not to be exposed as ABI between libdrm and
mesa. Now that this is no longer a constraint, inline things.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>
|
|
|
|
|
|
|
|
| |
Since we dropped the async flush_queue, we no longer need the refcnts to
be atomic.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>
|
|
|
|
|
|
|
| |
Avoids having spurious differences (and weird values to look at!) in
traces from uninitialized memory.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4824>
|
|
|
|
|
|
|
|
|
| |
Missed updating this when adding tu6_emit_sample_locations.
Fixes: a92d2e11095 ("turnip: implement VK_EXT_sample_locations")
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4795>
|
|
|
|
|
|
|
|
| |
These are the draw-stream and primitive-stream, so let's give them more
descriptive names.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4750>
|
|
|
|
|
|
|
|
|
|
| |
In particular, we specifically don't want to let the base change between
passes, as it could end up conflicting with registers assigned in the
first pass.
Mostly-closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2838
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
|
|
|
|
|
|
|
|
| |
Just something I hacked together to help figure out which instructions
can fold in a widening/narrowing conversion.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
|
|
|
|
|
|
| |
More lines of code, but they're much more intelligible.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
|
|
|
|
|
|
|
| |
Makes sure that we can maintain consistent output from our disassembly as
we refactor. I've only included stuff that matches qcom's disasm so far.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
|
|
|
|
|
|
|
|
| |
This lowers pack_32_2x16/unpack_32_2x16 into the scalar versions of those
instructions.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4738>
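For reference, the relationship between the vector and scalar ("split") forms can be modelled in C as below. This is a sketch of the arithmetic only, not the NIR lowering pass itself; the C function names mirror the NIR opcode names, with component x in the low 16 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar forms: two 16-bit values in, one 32-bit value out, and back. */
static uint32_t
pack_32_2x16_split(uint16_t x, uint16_t y)
{
	return (uint32_t)x | ((uint32_t)y << 16);
}

static uint16_t
unpack_32_2x16_split_x(uint32_t v)
{
	return (uint16_t)(v & 0xffffu);
}

static uint16_t
unpack_32_2x16_split_y(uint32_t v)
{
	return (uint16_t)(v >> 16);
}

/* Vector forms, written purely in terms of the scalar ones. */
static uint32_t
pack_32_2x16(const uint16_t v[2])
{
	assert(v);
	return pack_32_2x16_split(v[0], v[1]);
}

static void
unpack_32_2x16(uint32_t v, uint16_t out[2])
{
	assert(out);
	out[0] = unpack_32_2x16_split_x(v);
	out[1] = unpack_32_2x16_split_y(v);
}
```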
|
|
|
|
|
|
|
|
|
|
|
| |
The new option replaces the two other _split lowering options, since
there's no need for separate options.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4738>
|
|
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4716>
|
|
|
|
|
|
| |
We shouldn't need to invert them, and the Vulkan blob doesn't either.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4733>
|
|
|
|
|
|
| |
The st now does this for us.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4732>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4704>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hardware is capable of automatically filling in certain values in
the VPC without writing them from the last geometry stage, like
gl_PointCoord or gl_PrimitiveID when there is no GS. However, we *do*
have to enable these outputs (i.e. set the VPC_VAR_DISABLE bit to 0) as
VPC_VAR_DISABLE is really about FS inputs rather than VS outputs. To do
this, we move the computation of the enable bits to ir3_link_add(),
which is also a nice refactor anyway. In addition we detect the PrimID
case specifically so that the driver can program the location.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4704>
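A rough sketch of how enable bits could be accumulated per FS input and then inverted into disable words. The struct `example_linkage`, the helpers `example_link_add()` and `example_var_disable()`, and the four-word size are assumptions for illustration; they are not the real ir3_link_add() signature or the real register layout.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_VAR_WORDS 4	/* assumed size, for illustration only */

/* Hypothetical linkage state: one enable bit per varying component. */
struct example_linkage {
	uint32_t varmask[EXAMPLE_VAR_WORDS];
};

/* Record an FS input at component location `loc`. The enable bits are
 * driven by what the FS reads, so they get set even when the hardware,
 * not the last geometry stage, supplies the value. */
static void
example_link_add(struct example_linkage *l, unsigned loc, unsigned compmask)
{
	for (unsigned i = 0; i < 4; i++) {
		if (!(compmask & (1u << i)))
			continue;
		unsigned bit = loc + i;
		assert(bit < 32 * EXAMPLE_VAR_WORDS);
		l->varmask[bit / 32] |= 1u << (bit % 32);
	}
}

/* A disable word is the inverse of the enable word: 0 means enabled. */
static uint32_t
example_var_disable(const struct example_linkage *l, unsigned word)
{
	assert(word < EXAMPLE_VAR_WORDS);
	return ~l->varmask[word];
}
```

Keeping the mask in the linkage step means every input that goes through it gets its bit set in one place, regardless of who produces the value.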
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4704>
|
|
|
|
|
|
| |
That way we can parse it again with the assembler.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>
|
|
|
|
|
|
| |
For easier reuse.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>
|