| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Flip the FD_MESA_DEBUG flag to a disable rather than enable, drop the
obsolete comment (and bonus, drop unused softpin debug flag)
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is slightly annoying because it *mostly* works.. but we have some
issues to sort out about how to blit z24s8/x24s8/z24x8 with UBWC before
we can enable UBWC by default. For now it is a step forward to at least
enable it for non-z/s while we figure out how to blit z24s8+UBWC.
(The basic issue is that pretending z24s8 is an equivalently sized rgba
format for the purpose of blitting falls apart when UBWC is in the
picture.)
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
No functional change, the registers have the same layout as MRT flags
pitch reg.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
An older blob claims to support UBWC w/ r32ui an r32i, but not r32f.
Results from deqp indicate that it doesn't work with r32ui and r32i.
This *could* also just mean that use as "IBO" (image) is more limited
than as texture, although blob also doesn't seem to bother to try to use
UBWC with images at all, so hard to know for sure.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We'll need this for a few edge cases, like image/sampler view that uses
a format that UBWC does not support with a resource originally created
in a format that UBWC does support.
NOTE we *could* in some cases do an in-place uncompress. But that has
a couple potential sharp edges:
1) the uncompressed buffer could have different layout, ie. a5xx
with meta and pixel data of layers/levels interleaved.
2) if it comes mid-batch, it would force flush, or somehow fixing
up cmdstream for draws already emitted. But with the resource
shadowing approach we can rely on batch re-ordering to avoid
splitting things.. older draws see the older compressed version,
newer draws see the new uncompressed version of the rsc.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
When uncompressing a UBWC buffer, we don't want to discard anything.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It doesn't come up yet, as so far we only hit this path with linear
buffers. But it will when we start re-using the shadow path for
uncompressing UBWC buffers.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
To uncompress UBWC, I want to re-use the shadow path, but we'll need a
way to request that the new buffer is not compressed.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
A newly created resource can be regarded as idle. We don't care if
the RESOURCE_CREATE command has been retired, unless it is used for
fencing.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Alexandros Frantzis <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The round trip to the kernel is expensive. Add a local cache to
avoid it when possible.
There is a race condition when two contexts access the same resource
at the same time (e.g., ctx1 submits a cmdbuf that accesses a
resource while ctx2 maps the resource). But that is probably an app
bug in the first place.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Alexandros Frantzis <[email protected]>
|
|
|
|
|
|
|
|
| |
Helpers to work with resource list. virgl_drm_release_all_res is
removed.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Alexandros Frantzis <[email protected]>
|
|
|
|
|
|
|
|
| |
We should not reuse a resource for other purposes when it can still
be accessed by another process or device.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Alexandros Frantzis <[email protected]>
|
|
|
|
|
|
|
|
| |
This seems to be a performance win, but more rigorous testing is
necessary to figure out the exact circumstances when this is good/bad.
Incidentally, this fixes non-aligned ZS.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
We might render to it.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
For constant LODs/biases, we can use an immediate embedded in the
texture (already decoded); for non-constant, we have to use a register
squeezed into the usual immediate field, which is decoded here.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
It's not at all clear why this work for texelFetch but not texture.
Maybe the top bits are dual-purpose on other texturing ops...?
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Its encoding differs slightly from the LOD used in normal texture calls.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This is clear for texelFetch, hence the confusion with Bifrost's filter
field, but it's much more general in reality.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
This allows us to show a call to textureLod in a reasonable way.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
With an extra flag, we're able to do a perspective division "for free"
while loading a varying.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
...on the load/store unit, not the ALUs. Looks goofy but hey.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch identifies the two modes of offsets in a texture instruction
(immediate and register, disambiguated by the bit-once-known-as
"has_offset") and implements disassembly for both.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This eliminates some unknowns, clarifies 3D textures, and will maybe
help with array/shadow textures?
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
(cherry picked from commit 2a5b4e2b9ffc07f32a7ff5f89176cb892b179c5f)
|
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
(cherry picked from commit 1517811f4f75cd628dd7122d63092f3954a81a7d)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the Vulkan spec, inline uniform blocks are not allowed
to be updated through vkCmdPushDescriptorSetKHR().
These are the spec quotes from "13.2.1. Descriptor Set Layout"
that are relevant for this case:
"VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies
that descriptor sets must not be allocated using this layout, and
descriptors are instead pushed by vkCmdPushDescriptorSetKHR."
"If flags contains
VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all
elements of pBindings must not have a descriptorType of
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT".
There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid
this case but it is implied in the creation of the descriptor set
layout as aforementioned.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the Vulkan spec, inline uniform blocks are not allowed
to be updated through vkCmdPushDescriptorSetKHR().
These are the spec quotes from "13.2.1. Descriptor Set Layout"
that are relevant for this case:
"VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies
that descriptor sets must not be allocated using this layout, and
descriptors are instead pushed by vkCmdPushDescriptorSetKHR."
"If flags contains
VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all
elements of pBindings must not have a descriptorType of
VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT".
There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid
this case but it is implied in the creation of the descriptor set
layout as aforementioned.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
`memcmp()` compares a given number of bytes, but `EGLAttrib` is larger than a byte.
Fixes: 8e991ce5397598ceb422 "egl: handle the full attrib list in display::options"
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The number of elements to draw should not be affected by the offset.
A similar fix was submitted for a6xx at 79180a05.
Fixes these dEQP tests on a5xx:
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_8
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_separate_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_combined_grid_500x500_drawcount_2500
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_8
dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_2500
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
baseArrayLayer is defined twice, trivial.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
When decompressing resolve source images, we should rely on the
framebuffer layer count instead of resolving all images layers.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Use an explicit pipeline barrier for doing layout transitions
instead of duplicating some code.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
When resolving inside a subpass, we should rely on the framebuffer
layer count instead of resolving all images layers. This should
improve performance of layered resolves a bit.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-By: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This skips GLSL IR lowering of pack/unpackHalf operations, allowing
the NIR optimizer to see them
Improves performance in Synmark2's OglCSDof by about 2x, by cutting
about 90% of the cycles from one of the compute shaders.
shader-db statistics on Skylake:
4 compute shaders went from SIMD8 to SIMD16.
total instructions in shared programs: 15598871 -> 15542568 (-0.36%)
instructions in affected programs: 143016 -> 86713 (-39.37%)
helped: 144
HURT: 0
helped stats (abs) min: 17 max: 4669 x̄: 390.99 x̃: 164
helped stats (rel) min: 7.48% max: 85.28% x̄: 30.17% x̃: 24.22%
95% mean confidence interval for instructions value: -510.50 -271.49
95% mean confidence interval for instructions %-change: -32.70% -27.65%
Instructions are helped.
total cycles in shared programs: 371973958 -> 368902103 (-0.83%)
cycles in affected programs: 5557722 -> 2485867 (-55.27%)
helped: 144
HURT: 0
helped stats (abs) min: 106 max: 1026600 x̄: 21332.33 x̃: 1697
helped stats (rel) min: 0.53% max: 88.98% x̄: 36.12% x̃: 34.67%
95% mean confidence interval for cycles value: -41570.02 -1094.64
95% mean confidence interval for cycles %-change: -38.44% -33.80%
Cycles are helped.
total spills in shared programs: 11936 -> 11903 (-0.28%)
spills in affected programs: 110 -> 77 (-30.00%)
helped: 3
HURT: 2
total fills in shared programs: 25644 -> 25178 (-1.82%)
fills in affected programs: 677 -> 211 (-68.83%)
helped: 5
HURT: 0
total loops in shared programs: 4830 -> 4829 (-0.02%)
loops in affected programs: 1 -> 0
helped: 1
HURT: 0
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Was watching a presentation on YT where this was used and it turns
out it is not invalid.
The only case it is actually valid as format in the creation of an
image or image view is with Android Hardware Buffers which have
their format specified externally.
So we can just ignore all entries with VK_FORMAT_UNDEFINED.
Reviewed-by: Samuel Pitoiset <[email protected]>
|