| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Later when we do this pass iteratively, we can drop some of the internal
iteration and just rely on this pass getting run until there is no more
progress.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
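
The driver loop this message anticipates can be sketched as follows. This is a minimal illustration in Python, not the actual ir3 code: `run_until_no_progress` and the toy `drop_one_nop` pass are hypothetical names, and the only assumption is that each pass returns whether it changed anything.

```python
def run_until_no_progress(ir, passes):
    """Run every pass repeatedly until a full round changes nothing.

    Each pass is assumed to mutate `ir` in place and return True if it
    made progress. Returns the number of rounds executed.
    """
    rounds = 0
    progress = True
    while progress:
        progress = False
        for run_pass in passes:
            # Call the pass first so it always runs, then record progress.
            changed = run_pass(ir)
            progress = progress or changed
        rounds += 1
    return rounds


def drop_one_nop(ir):
    """Toy pass: remove a single "nop" per invocation and report progress."""
    if "nop" in ir:
        ir.remove("nop")
        return True
    return False
```

With this shape, a pass no longer needs its own internal loop: it can do one sweep, report progress, and let the caller decide whether to run it again.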
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
|
|
|
|
|
|
|
|
| |
Eventually we'll pull the iteration out of the pass itself, but the
first step is to just report progress.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
|
|
|
|
|
|
|
|
|
| |
In a later patch, this will get folded into an IR3_PASS() macro, at
least for most passes. But to do that, it is better to standardize
on printing the ir3 after the pass.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
|
|
|
|
|
|
|
| |
We haven't used this for a while.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Older kernels did not support `MSM_INFO_GET_IOVA`. But this is only
required for (a) clover (i.e. `fd_set_global_binding()`) and (b) drm paths
that are limited to newer kernels. So move the location of the assert
to fix new userspace on old kernels.
Fixes: c9e8df61dc8 ("freedreno: Initialize the bo's iova at creation time.")
Signed-off-by: Rob Clark <[email protected]>
Tested-by: Ilia Mirkin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5081>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"softpin" mode was introduced in the same kernel as the 'DUMP' flag. So
if we are using the legacy non-softpin path, clear the dump flag. OTOH
the 'DUMP' flag isn't quite so needed on older kernels, since we would
get all cmdstream, even SDS stateobjs, dumped regardless, as they would
have cmd table entries.
Fixes: b2c23b1e48f ("freedreno: Mark all ringbuffer BOs as to be dumped on crash.")
Signed-off-by: Rob Clark <[email protected]>
Tested-by: Ilia Mirkin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5081>
|
|
|
|
|
|
|
|
|
| |
To fix an issue reported here:
https://bugs.chromium.org/p/chromium/issues/detail?id=1083815
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5088>
|
|
|
|
|
|
|
|
|
|
|
| |
A3xx apparently has higher alignment requirements than later gens for
indirect const uploads. It also has fewer of them. Add compiler
parameters for both settings, and set accordingly for a3xx and a4xx+.
This fixes all the ubo test failures caused by this optimization.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5077>
|
|
|
|
|
|
|
|
|
|
| |
This causes failures on a3xx, resulting in the nonsensical dEQP failures
on packUnorm2x16. The same test uses ldlv on a4xx+, so just disallow
(sat) on bary.f on all generations.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5074>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a number of dEQP tests:
dEQP-GLES3.functional.fbo.blit.conversion.r8*
dEQP-GLES3.texture.specification.basic_teximage2d.r8*
and others. The reason why this enum showed up in traces for R8 is that
it was an "upgraded" texture to R8G8.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5073>
|
|
|
|
|
|
|
|
|
|
|
| |
Device UUID becomes SHA1('freedreno' + gpu_id).
Driver UUID becomes SHA1(mesa-version + git-head-sha1).
v2: Don't use build_id for driver UUID since it generates different
values for vulkan and gl shared objects. (Kristian)
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4847>
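
A rough sketch of the two derivations described above, in Python. The function names, the 16-byte truncation of the 20-byte SHA-1 digest, and the way `gpu_id` is serialized are all assumptions made for illustration; the message only states which inputs are hashed.

```python
import hashlib

UUID_SIZE = 16  # assumed UUID length in bytes; a SHA-1 digest is 20 bytes


def device_uuid(gpu_id: int) -> bytes:
    """SHA1('freedreno' + gpu_id), truncated to UUID_SIZE bytes."""
    h = hashlib.sha1()
    h.update(b"freedreno")
    # Assumed encoding: gpu_id as a 4-byte little-endian integer.
    h.update(gpu_id.to_bytes(4, "little"))
    return h.digest()[:UUID_SIZE]


def driver_uuid(mesa_version: str, git_head_sha1: str) -> bytes:
    """SHA1(mesa-version + git-head-sha1), truncated to UUID_SIZE bytes."""
    h = hashlib.sha1()
    h.update(mesa_version.encode("utf-8"))
    h.update(git_head_sha1.encode("utf-8"))
    return h.digest()[:UUID_SIZE]
```

Because the driver UUID depends only on the version string and the git head, the GL and Vulkan shared objects built from the same tree would produce the same value, which is the point of the v2 change.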
|
|
|
|
|
|
|
|
|
| |
The new files are created under a 'common' folder under 'src/freedreno',
where shared functionality between GL and Vulkan drivers (that is not
registers, layout or compiler) will be placed.
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4847>
|
|
|
|
|
|
|
|
| |
Whoops. After fixing dual-source blending, dEQP-VK.pipeline.blend.* all
go from skipped to pass, and a bunch of
dEQP-VK.api.info.format_properties.* tests where blending is required
are fixed.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
|
|
|
|
|
|
|
| |
This needs to be pipeline state because it can change when dual-source
blending is active.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
|
|
|
|
|
|
|
|
| |
The hardware expects that where MRT0 and MRT1 would normally go are the
dual sources for MRT0, whereas GLSL has an extra "index" parameter that
indicates which source it is. Remap it when handling FS outputs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
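
The remapping can be illustrated with a small Python sketch. `remap_fs_output` is a hypothetical helper, not the driver's function; it only encodes the rule stated above, that the second dual-source output (location 0, index 1) lands where MRT1 would normally go.

```python
def remap_fs_output(location: int, index: int) -> int:
    """Map a GLSL (location, index) pair to a hardware output slot.

    Assumption for this sketch: index 1 is only valid together with
    location 0, as is the case for dual-source blending.
    """
    if index == 0:
        # Ordinary output: the slot is simply the MRT number.
        return location
    if index == 1 and location == 0:
        # Second dual source for MRT0 occupies the slot MRT1 would use.
        return 1
    raise ValueError(f"unsupported output: location={location}, index={index}")
```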
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
|
|
|
|
|
|
|
|
|
|
|
| |
For the piglit drawoverhead case, 5/18 of the objects' relocs were
duplicated. We can dedupe them at object create time (since objects are
long-lived) and avoid repeated relocation work at emit time.
nohw drawoverhead program statechange throughput 2.34082% +/- 0.645832%
(n=10).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5020>
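
The idea of deduplicating once at create time can be sketched like this. It is an order-preserving dedupe in Python with a hypothetical `dedupe_relocs` name; buffer objects are represented by any hashable handle, which is an assumption of the sketch.

```python
def dedupe_relocs(relocs):
    """Return relocs with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for bo in relocs:
        if bo not in seen:
            seen.add(bo)
            unique.append(bo)
    return unique
```

Since the object is long-lived, the cost of building the `seen` set is paid once, while every later emit walks the shorter list.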
|
|
|
|
|
|
|
| |
Apparently I've never dumped a fully populated slices array, so the 0-init
always terminated the loop.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5020>
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It seems less brittle to not assume they are in the same order for src
and dst instructions.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Cleans up a couple spots that were still open-coding this.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
These intrinsics are supposed to map to the underlying hardware
instructions, which don't have wrmask. We use them when we lower
store_output in the geometry pipeline and since store_output gets
lowered to temps, we always see full wrmasks there.
|
|
|
|
|
|
|
|
|
|
|
|
| |
It saves addressing math, but may cause multiple loads to be done and
bcseled due to NIR not giving us good address alignment information
currently. I don't have any workloads I know of using non-const-uploaded
UBOs, so I don't have perf numbers for it.
This makes us match the GLES blob's behavior, and turnip (other than being
bindful).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
|
|
|
|
|
|
|
|
|
| |
With the upcoming LDC usage in the GL driver, we don't want to be
uploading descriptors for every UBO when they aren't actually in use.
Trimming NIR's num_ubos will avoid that, and cleans up num_ubo handling
elsewhere right now.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
|
|
|
|
|
|
|
|
| |
I found that when moving more UBOs to load_ubo_ir3, analyze_ubo_ranges
would move things back in a broken way. We can just run this pass later
and drop the _ir3 path.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
|
|
|
|
|
|
|
| |
Otherwise, we might end up inserting the nir_intrinsic_load_ubo_ir3()
after the non-offset src's definition, leading to nir_validate() failures.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
|
|
|
|
|
|
| |
Just copy the src through.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
|
|
|
|
|
|
|
| |
After fixing the power of two sizing, pitches worked, but 1-pixel high and
unaligned height miplevels were off.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
|
|
|
|
|
|
|
|
|
| |
The HW requires a log2 width/height of the level 0 meta_* size in the
descriptors, making it pretty clear that UBWC mipmapping is all
power-of-two sized. Fixes a bunch of failures in the upcoming UBWC
layout unit tests.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
|
|
|
|
|
|
|
|
| |
Using texturator on a P3A at 1024x1024, RG8 has log2w/h of 6x7 instead of
R16I/UI's 6x8. I verified the other blockw/h values, except for cpp=1
(R8/R8I/R8UI didn't use UBWC) and 32 (would need a bigger type).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
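
The quoted log2 values can be reproduced with a short calculation. The Python sketch below is illustrative only: `ubwc_meta_log2` is a hypothetical helper, and the block sizes used in the usage note (16x8 for RG8, 16x4 for R16I/UI) are assumptions chosen because they yield the 6x7 and 6x8 figures above.

```python
def ceil_log2(n: int) -> int:
    """Smallest k such that 2**k >= n, for n >= 1."""
    return (n - 1).bit_length()


def ubwc_meta_log2(width: int, height: int, block_w: int, block_h: int):
    """log2 of the power-of-two meta size, in UBWC blocks, for level 0."""
    meta_w = -(-width // block_w)   # ceiling division
    meta_h = -(-height // block_h)
    return ceil_log2(meta_w), ceil_log2(meta_h)
```

For a 1024x1024 surface, `ubwc_meta_log2(1024, 1024, 16, 8)` gives `(6, 7)` and `ubwc_meta_log2(1024, 1024, 16, 4)` gives `(6, 8)`.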
|
|
|
|
|
|
|
| |
The r8g8 case UBWC alignment will be changing in the next commit, so
fdl6_get_ubwc_blockwidth needs to start paying attention to r8g8 too.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
|
|
|
|
|
|
|
| |
These offsets are hand-computed referencing msm_media_info.h, and match
our driver's current behavior.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
|
|
|
|
|
|
|
|
| |
Noticed when poking around with texture layouts and found that my big
texture layout from the blob buffer overflowed. Values come from
http://vulkan.gpuinfo.org for Adreno 418, 512, 630.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
|
|
|
|
|
|
|
|
|
|
|
| |
Also, rewrite the format decision code so that we correctly decide when
the linear fallback is needed, even if UBWC is disabled. As part of
that, I also moved around some of the code to handle compressed formats
to make sure that copying compressed formats with a linear staging blit
works (this is now possible since we started allowing tiled compressed
textures).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
|
|
|
|
|
|
|
|
|
| |
This is simpler than a full-blown memory reuse mechanism, but is good
enough to make sure that repeatedly doing a copy that requires the
linear staging buffer workaround won't use excessive memory or be slowed
down due to repeated allocations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
|
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
|
|
|
|
|
|
|
|
|
|
| |
Similar to what we do in postsched. It is useful for pre-RA sched to be
a bit aware of things that would cause syncs. In particular for the tex
fetches, since the vecN src/dst tends to limit postsched's ability to
re-order them.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an instruction's only use is as an output, and it increases register
pressure, then try to avoid scheduling it until there are no other
options.
A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these
immed loads to the end of the shader.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
|
|
|
|
|
|
|
|
|
|
| |
Similar to avoidance of `(ss)` syncs, it turns out to be helpful to
avoid `(sy)` syncs as well. This helps us turn a tex, (sy)alu, tex,
(sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in
gfxbench gl_fill2.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
|
|
|
|
|
|
|
|
|
| |
Once we schedule an instruction that will require an `(ss)` sync flag,
there is no need to delay any further instructions that consume an
SFU result (until the next SFU instruction is scheduled).
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
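
A toy model of the bookkeeping this describes, in Python. `SyncTracker` and its methods are hypothetical and much simpler than a real scheduler; the sketch assumes a single flag is enough to represent "there are SFU results that have not been synced yet".

```python
class SyncTracker:
    """Track whether consuming an SFU result would still need an (ss) sync."""

    def __init__(self):
        self.outstanding_sfu = False

    def should_delay(self, reads_sfu_result: bool) -> bool:
        """A consumer is only worth delaying while an SFU result is pending."""
        return reads_sfu_result and self.outstanding_sfu

    def schedule(self, is_sfu: bool, reads_sfu_result: bool) -> bool:
        """Schedule one instruction; return True if it needs the (ss) flag."""
        needs_ss = reads_sfu_result and self.outstanding_sfu
        if needs_ss:
            # The sync waits for all pending SFU results at once.
            self.outstanding_sfu = False
        if is_sfu:
            # A new SFU instruction makes later consumers sync again.
            self.outstanding_sfu = True
        return needs_ss
```

After one consumer has paid for the sync, `should_delay` returns False for further consumers until the next SFU instruction is scheduled.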
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems for short frag shaders, too much prefetch can be detrimental.
I think what we *really* want to do is decide after pre-RA sched, when
we also know about nop's and what the actual ir3 instruction count is.
But that will require re-working how prefetch lowering works. For now
this is a super crude heuristic to attempt to approximate a good
solution.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can no longer assume that `state->ranges[0]` is block 0. It *often*
is, but when we encounter a "real" ubo that we lower to `load_uniform`
before a block 0 `load_ubo`, block 0 could end up as another entry in the
table. As a result, the second pass, after gathering ubo ranges, does not
find a valid range, and a `load_ubo` for a thing that is not actually a
ubo makes its way into the ir3 frontend. That in turn results in grabbing
what we think is a ubo address out of some unrelated const register, and
trying to dereference that. Which, as you can imagine, fails in amusing
ways.
Fixes: fc850080ee3 ("ir3: Rewrite UBO push analysis to support bindless")
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
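
The difference between indexing and searching can be shown with a small Python sketch. The `find_range` helper and the dictionary layout of a range entry are hypothetical; the point is only that the entry for block 0 is looked up by block number, not by table position.

```python
def find_range(ranges, block):
    """Return the range entry for `block`, or None if it was not gathered."""
    for entry in ranges:
        if entry["block"] == block:
            return entry
    return None


# A table where a "real" ubo (block 2) was recorded before block 0.
ranges = [
    {"block": 2, "start": 0, "end": 64},
    {"block": 0, "start": 0, "end": 128},
]
block0 = find_range(ranges, 0)
```

Here `ranges[0]` belongs to block 2, so code that assumed index 0 was block 0 would pick the wrong entry, while `find_range(ranges, 0)` returns the second one.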
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit fixes a GS regression introduced in !4562 where
ir3's GS lowering pass was moved from common code (ir3_nir) to
freedreno-specific code (ir3_shader). For GS support in turnip, we
need to add the GS lowering pass back in, this time in tu_shader.
As for the nir_gather_info change, the GS lowering pass has always
introduced a discard_if intrinsic into the GS. Previously, we simply
ran nir_shader_gather_info before GS lowering, but now that we lower
the GS first, we need to remove the assertion that only a FS can use
the discard_if intrinsic.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
|
|
|
|
|
|
|
| |
Now that layout code supports this, we can enable it.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems this is actually a "minimum pitch" value. For example
TFETCH6_2_BYTE means a minimum pitch of 128 bytes for mipmap levels.
This fixes breakage with compressed formats. For example this test:
dEQP-VK.pipeline.sampler.view_type.2d.format.eac_r11_snorm_block.mipmap.linear.lod.equal_min_3_max_3
Fixes: a34b3fa198a4f ("freedreno/fdl: Align after dividing by block size")
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>
|