| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
The routines in this file may be shared with Vulkan.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
This is a blob quirk; in so much as I know, the hardware doesn't care.
But we're trying to be bit-identical to take as much entropy out of
traces as possible, so let's introduce the quirk.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
The routines in this file have no dependency on Gallium. Let's share
them so they can be used for a theoretical future Vulkan driver or, more
immediately, consulted when tracing.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
To help make sure we are running tests in the ideal number of threads,
print load stats to make obvious when there's a problem with
utilization.
This will be specially useful when we run tests on a wider variety of
devices.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Some runners may be configured such that the qemu binary might not be
available by the time we need to start running commands within the
chroot.
So make sure that it's there to avoid suprising problems in that case.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
There's lots of locking changes going into the Panfrost kernel driver,
so better be prepared.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
A number of things can go wrong when building the rootfs from within a
non-native chroot, so make sure to print the bootstrap.log so we can
tell what's going on.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's able to run tests in parallel, fully utilizing the HW and
shortening considerable the time it takes.
Needed to disable tests in RK3288 for now because Volt doesn't support
armhf yet, though this should be fixed soon.
Tests are now run with --deqp-gl-config-name=rgba8888d24s8ms0, so we are
hitting a few more failures in tests that previously were being skipped.
The time to run the tests decreases from around 8 minutes to 1:45
minutes, allowing for extending coverage without increasing CI times too
much.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The kernel now supports madvise ioctl to indicate which BOs can be freed
when there is memory pressure. Mark BOs purgeable when they are in the
BO cache. The BOs must also be munmapped when they are in the cache or
they cannot be purged.
We could optimize avoiding the madvise ioctl on older kernels once the
driver version bump lands, but probably not worth it given the other
driver features also being added.
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Our hardware supports independent (per-RT) blending, but we need to
route those settings through from Gallium.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes DATA_INVALID_FAULTs with multiple render targets.
We do always allocate space for 4 cbufs just to keep things sane. This
may not be strictly necessary.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
It's a chicken bit, as far as I can tell. Buck buck.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Otherwise we'll get memory junk.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
I don't think the hardware cares, but this adds a lot of noise to traces
that we would rather not need to look at.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
It doesn't... make a ton of sense to need to assert and this routine is
hotter than you might expect. Doesn't matter for release builds, of
course.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
We can't intersect with empty regions.
Fixes: 65ae86b8542 ("panfrost: Add support for KHR_partial_update()")
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Midgard has no hardware support for transform feedback, so we simulate
it in software. Lucky us.
What Midgard does do is write out vertex shader outputs to main memory
unconditonally. Fragment shaders read varyings back from main memory;
there's no on-chip storage for varyings. Whether this was a reasonable
design is a question I will not be engaging in this commit message.
What that does mean is that, in some sense, Midgard *always* does
transform feedback uncondtionally, and there's no way to turn off
transform feedback. Normally, we would allocate some scratch memory
every frame to store the varyings in an arbitrary format (interleaved
for simplicity), and then feed that scratch to the fragment shader and
discard when the rendering completes.
The only difference now is that sometimes, for some buffers, we use a BO
provided to us by Gallium and a format provided by Gallium, instead of
allocating the memory and choosing the format ourselves. This has some
limitations -- in particular, it only works at vec4 granularity, so a
corresponding GLSL linkage patch is needed to correctly implement
transform feedback for non-vec4 types. Nevertheless, given the hardware
already works in this admittedly-bizarre fashion, transform feedback is
"free". Or, at least, it's no more expensive than any other rendering.
Specifically not implemented is dynamically-sized transform feedback
(i.e. with geometry/tesselation shaders).
Spoiler alert: Midgard has no support for geometry *or* tessellation
shaders, despite advertising support. They get compiled to *massive*
compute shaders. How's that for checkbox compliance?
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
We have to maintain the internal offset ourselves. Per v3d.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
| |
We could probably get away with doing this once per pipe_shader_state
but let's not jump down that rabbit hole quite yet.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
It's there in shader_info, but we need to access it from pan_context.c
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We'll need this in a moment. Ken's implementation, lightly edited for
Panfrost.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Suggested-by: Kenneth Graunke <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is a huge hack to workaround incomplete BO flushing logic, but it's
enough for the dEQP transform feedback tests, and doing the resource
management to get this right is out-of-scope for this patch series.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It doesn't really make sense, since we don't have special texture
coordinate varyings, but it'll make some code simpler for XFB and it
doesn't hurt us, even if I lose a bit of my soul setting it.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN should now be handled.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
| |
We're just going to compute them in the driver but let's get the
structures setup to handle them. Implementation from v3d.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implement ->set_damage_region() region to support partial updates.
This is a dummy implementation in that it does not try to merge
damage rects. It also does not deal with distinct regions and instead
pick the largest quad as the only damage rect and generate up to 4
reload rects out of it (the left/right/top/bottom regions surrounding
the biggest damage rect).
We also do not try to reduce the number of draws by passing all quad
vertices to the blit request (would require extending u_blitter)
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Just a sysval to route through.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is of course suboptimal for performance, forcing each
glDispatchCompute call to be submitted separately to the kernel and
finish to completion. However, for the initial bring-up of compute jobs,
this simplifies quite a bit.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For each SSBO index we get from Gallium/NIR, we need two pieces of
information in the shader:
1. The address of the SSBO in GPU memory. Within the shader, we'll be
accessing it with raw memory load/store, so we need the actual address,
not just an index.
2. The size of the SSBO. This is not strictly necessary, but at some
point, we may like to do bounds checking on SSBO accesses.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Rather than hardcoding certain varying buffer indices "by convention",
work it out at draw time. This added flexibility is needed for
futureproofing and will be enable streamout.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
This will allow us to shuffle buffers.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This code is fairly self-contained, so let's factor it out of the giant
pan_context.c monster.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Just as easy/hard as the rest of XFB.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Pretty much copypasted from v3d to jumpstart us.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
What we call GROWABLE in Mesa corresponds to the HEAP BO flag in the
kernel. These buffers cannot be memory mapped in the CPU side at the
moment, so make sure they are also marked INVISIBLE.
This allows us to allocate a big heap upfront (16MB) without actually
reserving space unless it's needed.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Unless a BO has the EXECUTABLE flag, mark it as NOEXEC.
v2: - Rework version detection (Alyssa).
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will be useful right now so we avoid retrieving a non-executable
buffer when a executable one is needed.
As we support more flags, this logic will need to be extended to
consider the different trade-offs to be made when matching BO
specifications to BOs in the cache.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of all shaders being stored in a single BO, have each shader in
its own.
This removes the need for a 16MB allocation per context, and allows us
to place transient blend shaders in BOs marked as executable (before
they were allocated in the transient pool, which shouldn't be
executable).
v2: - Store compiled blend shaders in a malloc'ed buffer, to avoid
reading from GPU-accessible memory when patching (Alyssa).
- Free struct panfrost_blend_shader (Alyssa).
- Give the job a reference to regular shaders when emitting
(Alyssa).
v3: - Split out the allocation flags change (Rob).
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Midgard does not accept a index_bias directly and relies instead on a
bias correction offset (offset_bias_correction) in order to calculate
the unbiased vertex index.
We need to make sure we adjust offset_start and vertex_count in order to
take into account the index_bias as required by a
glDrawElementsBaseVertex call and then supply a additional
offset_bias_correction to the hardware.
Signed-off-by: Rohan Garg <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
These tests have been fixed by:
b514f411837b ("glcpp: use pre-expansion line number for __LINE__")
Signed-off-by: Tomeu Vizoso <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In preparation for an initial 19.2 release, add a blacklist for apps
known to be buggy under Panfrost to protect users. Panfrost is NOT a
conformant implementation at this time.
Distros: please do not revert this patch. If blacklisted apps are run
using Panfrost, dragons will bite you. Thanks :)
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Acked-by: Tomeu Vizoso <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than alloacting a huge (64MB) polygon list on context creation
and sharing it across framebuffers, we instead allocate polygon lists as
BOs (which consistently hit the cache) sized appropriately; for about a
month, we've known how to calculate the polygon list size so this has
only recently become possible.
The good news is we can render to truly massive framebuffers without
crashing and, more importantly, we eliminate the 64MB upfront overhead.
If a list that size isn't actually needed, it's not allocated.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
|
|
|
|
|
|
|
| |
Allows us to pass BOs without checking if they're NULL or not.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
The only user of this function always passes true.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
The FB desc will be emitted/attached on the first draw targetting
this new FB.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The wallpaper blit is a bit special in that the operation is targetting
the current FB, but the u_blitter logic creates a new surface for it
which makes util_framebuffer_state_equal() return false. In that case
we don't want a new FB descriptor to be emitted/attached, so let's just
copy the new state into ctx->pipe_framebuffer and exit the function.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If the current FB matches the new one there's nothing to be done in
panfrost_set_framebuffer_state(). By bailing out early in that case we
avoid emitting new FB descriptors (the old ones are still valid).
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
No need to emit SFBD/MFBD at frame invalidation. They can be emitted
when the framebuffer is attached, which saves us a potential FB desc
re-allocation if a new FB is bound after the swap.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|