| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
|
|
|
|
|
|
|
| |
v2: fix usage in ACO (by Daniel Schürmann)
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
|
|
|
|
|
|
| |
Trivial.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
|
|
|
|
|
|
|
| |
Acked-by: Eric Anholt <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
|
|
|
|
|
|
|
| |
Acked-by: Eric Anholt <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
|
|
|
|
|
|
|
| |
Acked-by: Eric Anholt <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
|
|
|
|
|
|
|
|
|
|
|
| |
Also, rewrite the format decision code so that we correctly decide when
the linear fallback is needed, even if UBWC is disabled. As part of
that, I also moved around some of the code to handle compressed formats
to make sure that copying compressed formats with a linear staging blit
works (this is now possible since we started allowing tiled compressed
textures).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
|
|
|
|
|
|
|
|
|
| |
This is simpler than a full-blown memory reuse mechanism, but is good
enough to make sure that repeatedly doing a copy that requires the
linear staging buffer workaround won't use excessive memory or be slowed
down due to repeated allocations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Totals from 5860 (4.59% of 127638) affected shaders:
VGPRs: 460212 -> 460216 (+0.00%)
CodeSize: 65554356 -> 65464816 (-0.14%)
Instrs: 12655972 -> 12633578 (-0.18%)
Copies: 1309994 -> 1292163 (-1.36%)
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Totals from 8487 (6.65% of 127638) affected shaders:
CodeSize: 62061988 -> 62058020 (-0.01%); split: -0.01%, +0.01%
Instrs: 11910757 -> 11885409 (-0.21%); split: -0.21%, +0.00%
Copies: 1065244 -> 1040945 (-2.28%); split: -2.30%, +0.02%
Branches: 349665 -> 348914 (-0.21%)
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Totals from 14340 (11.23% of 127638) affected shaders:
SGPRs: 1251648 -> 1251512 (-0.01%)
VGPRs: 994556 -> 994104 (-0.05%); split: -0.06%, +0.01%
CodeSize: 122894528 -> 121099604 (-1.46%); split: -1.49%, +0.03%
MaxWaves: 106039 -> 106103 (+0.06%); split: +0.06%, -0.00%
Instrs: 23860066 -> 23414317 (-1.87%); split: -1.90%, +0.03%
Copies: 2448228 -> 2049305 (-16.29%); split: -16.37%, +0.07%
Branches: 789381 -> 757921 (-3.99%); split: -4.62%, +0.64%
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If one VMEM instruction uses a sampler and the other doesn't, we can't do
this optimization.
Totals from 47 (0.04% of 127638) affected shaders:
CodeSize: 271744 -> 271656 (-0.03%); split: -0.04%, +0.01%
Instrs: 52783 -> 52761 (-0.04%); split: -0.05%, +0.01%
Cycles: 5547040 -> 5546952 (-0.00%); split: -0.00%, +0.00%
VMEM: 10022 -> 9887 (-1.35%)
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>
|
|
|
|
|
|
|
|
| |
This was unnecessary and messed with statistics
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[dump_trace_images] Info: Dumping trace /tmp/tracie.test.ap5pshYcsg/traces-db/trace1/magenta.testtrace... ERROR
[dump_trace_images] Debug: === Failure log start ===
invalid literal for int() with base 16: 'in'
[dump_trace_images] Debug: === Failure log end ===
[check_image] Trace /tmp/tracie.test.ap5pshYcsg/traces-db/trace1/magenta.testtrace couldn't be replayed. See above logs for more information.
Traceback (most recent call last):
File "/tmp/tracie.test.ap5pshYcsg/tracie.py", line 176, in <module>
main()
File "/tmp/tracie.test.ap5pshYcsg/tracie.py", line 164, in main
ok, result = gitlab_check_trace(project_url, commit_id, args.device_name, trace, expectation)
TypeError: cannot unpack non-iterable bool object
Fixes: efbbf8bb81e ("tracie: Print results in a machine readable format")
Signed-off-by: Andres Gomez <[email protected]>
Reviewed-by: Rohan Garg <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4839>
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, we will fail when the traces description file doesn't
contain any checksum for the specified device.
Fixes: efbbf8bb81e ("tracie: Print results in a machine readable format")
Signed-off-by: Andres Gomez <[email protected]>
Reviewed-by: Rohan Garg <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4839>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the LLVM version is too old or missing, SotTR applies shader
workarounds and that reduces performance by 2-5% with ACO.
SotTR workarounds are applied with LLVM 8 and older, so reporting
LLVM 9.0.1 should be fine.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4984>
|
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
|
|
|
|
|
|
|
|
|
|
|
| |
ANV and RADV have similar Python code for generating extensions
and dispatch tables.
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GC880 on iMX6DL indicates in it's minorFeatures2 register that it
does support SEAMLESS_CUBE_MAP, however when the TE.SAMPLER_CONFIG1
VIVS_TE_SAMPLER_CONFIG1_SEAMLESS_CUBE_MAP bit is set on GC880 on iMX6DL,
the result is corrupted image. In particular, the following ~112 dEQPs
are affected and fail:
dEQP-GLES2.functional.texture.filtering.cube.*
This only happens on MX6DL GC880, MX6Q GC2000 and STM32MP1 GC400(GCnano)
do not report the minorFeatures2 SEAMLESS_CUBE_MAP bit and ignore the
TE_SAMPLER_CONFIG1 VIVS_TE_SAMPLER_CONFIG1_SEAMLESS_CUBE_MAP bit (note
that ss->seamless_cube_map is unconditionally set by mesa at times even
PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE returns 0), so there is no visible
problem and there are no failing dEQP tests on the GC2000 and GCnano.
This might imply that the minorFeatures2 SEAMLESS_CUBE_MAP has some
different meaning on GC880 or the SEAMLESS_CUBE_MAP behaves differently
on the GC880.
This patch does not set the SEAMLESS_CUBE_MAP bit on hardware which does
not indicate support for seamless cube map and on GC880, which results
in reduction in failed dEQPs: 635 to 186 on GC880, 274 to 270 on GC2000
and no change on GC400(GCnano).
Fixes: 8dd26fa2f06 ("etnaviv: support GL_ARB_seamless_cubemap_per_texture")
Reviewed-by: Christian Gmeiner <[email protected]>
Signed-off-by: Marek Vasut <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4865>
|
|
|
|
|
|
|
|
|
|
| |
On a6xx we need a 0,0 based scissor in the binning pass, but can use the
blit-scissor to avoid restore/resolve of untouched pixels, and use the
conditional execution if the IB to bin to skip bins with no geometry
(due to the scissor).
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5021>
|
|
|
|
|
|
|
|
|
|
| |
Similar to what we do in postsched. It is useful for pre-RA sched to be
a bit aware of things that would cause syncs. In particular for the tex
fetches, since the vecN src/dst tends to limit postsched's ability to
re-order them.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an instruction's only use is as an output, and it increases register
pressure, then try to avoid scheduling it until there are no other
options.
A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these
immed loads to the end of the shader.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
|
|
|
|
|
|
|
|
|
|
| |
Similar to avoidance of `(ss)` syncs, it turns out to be helpful to
avoid `(sy)` syncs as well. This helps us turn an tex, (sy)alu, tex,
(sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in
gfxbench gl_fill2.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
|
|
|
|
|
|
|
|
|
| |
Once we schedule an instruction that will require an `(ss)` sync flag,
there is no need to delay any further instructions that consume an
SFU result (until the next SFU instruction is scheduled).
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems for short frag shaders, too much prefetch can be detrimental.
I think what we *really* want to do is decide after pre-RA sched, when
we also know about nop's and what the actual ir3 instruction count is.
But that will require re-working how prefetch lowering works. For now
this is a super crude heuristic to attempt to approximate a good
solution.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can no longer assume that `state->ranges[0]` is block 0. It *often*
is, but when we encounter a "real" ubo that we lower to `load_uniform`
before a block 0 `load_ubo`, it could end up another entry in the table.
Resulting in the second pass after gathering ubo ranges, not finding a
valid range. Which results in a `load_ubo` for a thing that is not
actually a ubo making it's way into ir3 frontend. Resulting in grabbing
what we think is a ubo address out of some unrelated const register, and
trying to dereference that. Which as you can imagine, fails in amusing
ways.
Fixes: fc850080ee3 ("ir3: Rewrite UBO push analysis to support bindless")
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
|
|
|
|
|
|
|
|
| |
Should be stable now, and should pass except for MSAA tests
(multisampling is still a todo overall).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
|
|
|
|
|
|
|
| |
Closes #2900
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
|
|
|
|
|
|
|
|
| |
Was used as a crude approximation of the terminate flag, which we now
can do properly.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Corresponds roughly to what we analyze. Note that "terminate AND
execute" is a contradiction (rather: it's equivalent to just
terminating), hence why there are only three possibilities for the
states of the flags:
.cont = continue, don't execute
.last = don't continue, don't execute
.cont.last = continue and execute
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
|
|
|
|
|
|
|
|
| |
Fixes: c643979228 (intel/fs: Choose memory message type based on bit size)
Fixes: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_i8vec2
Reviewed-by: Jason Ekstrand <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5000>
|
|
|
|
|
|
|
| |
Improves nohw drawoverhead 8-ubos update throughput by 13.493% +/-
0.391444% (n=15).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5011>
|
|
|
|
|
|
|
|
|
| |
Looking at perf, the drawoverhead test case was now spending 13% CPU (89%
in that function) on stack management.
nohw drawoverhead throughput 1.03902% +/- 0.380257% (n=13).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
|
|
|
|
|
|
| |
nohw drawoverhead 8 UBOs test throughput 1.06093% +/- 0.363376% (n=10).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
|
|
|
|
|
|
|
|
|
|
| |
This is for an optimization I plan in a following commit. I found I had
to add likely()s to avoid a perf regression from branch prediction.
On the drawoverhead 8 UBOs test, the HW can't quite keep up with the CPU,
but if I set nohw then this change is 1.32023% +/- 0.373053% (n=10).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
|
|
|
|
|
|
|
|
| |
For some CPU-side-only optimizations, it can be nice to disable rendering
so that we can see what the impact is even on cases where the GPU can't
quite keep up.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit fixes a GS regression introduced in !4562 where
ir3's GS lowering pass was moved from common code (ir3_nir) to
freedreno-specific code (ir3_shader). For GS support in turnip, we
need to add the GS lowering pass back in, this time in tu_shader.
As for the nir_gather_info change, the GS lowering pass has always
introduced a discard_if intrinsic into the GS. Previously, we simply
ran nir_shader_gather_info before GS lowering, but now since we lower
the GS before we need to remove the assertion that only a FS can use
the discard_if intrinsic.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
|
|
|
|
|
|
|
|
|
| |
And try a bit harder to find an optimal layout. Improves on a sub-
optimal layout we arrive at in the 4 MRT pass in manhattan, picking
up a bit more than 3%.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>
|