| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
This just makes the logic of the checks for this enum the same for
gl{Enable,Disable} and for glIsEnabled. They are already functionally
the same, so this is just a minor code-cleanup.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
{En,Dis}ableClientState(PRIMITIVE_RESTART_NV) should only work on
compatibility contextxs. While we're at it, modernize the code a bit,
by using the extension helpers instead of open-coding.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It makes sense to use the image view formats when resolving
inside subpasses, while we have to use the image formats for
normal resolves.
Original patch by Philip Rebohle.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110348
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Make sure to sync all previous work if the given command buffer
has pending active queries. Otherwise the GPU might write queries
data after the reset operation.
This fixes a bunch of new dEQP-VK.query_pool.* CTS failures.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the following packets use actual driver provided sizes rather
than guessing an arbitrary number:
- CC_VIEWPORT
- SF_CLIP_VIEWPORT
- BLEND_STATE
- COLOR_CALC_STATE
- SCISSOR_RECT
Reviewed-by: Sagar Ghuge <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Link all Gallium drivers with ld_args_build_id to prevent failures in
Iris that uses GNU_BUILD_ID
Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=110757
Fixes: 4756864cdc5f "iris: Start wiring up on-disk shader cache"
Signed-off-by: Mike Lothian <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a problem where the same instruction gets replaced twice.
This was happening when the replaced instruction would be at the end
of a block.
Replacement of :
if ssa_8 {
....
intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
}
Would be :
if ssa_8 {
loop {
vec1 32 ssa_47 = intrinsic read_first_invocation (ssa_44) ()
vec1 1 ssa_48 = ieq ssa_47, ssa_44
if ssa_48 {
loop {
vec1 32 ssa_49 = intrinsic read_first_invocation (ssa_44) ()
vec1 1 ssa_50 = ieq ssa_49, ssa_44
if ssa_50 {
intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
break
} else {
....
}
Signed-off-by: Lionel Landwerlin <[email protected]>
Fixes: 3bd545764151 ("nir: Add a lowering pass for non-uniform resource access")
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If the driver waits for CP DMA to be idle and emit an EOP event
we need more space.
This fixes a crash with Quake Champions.
Cc: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
(Technically this is common code, but it doesn't affect i965 or anv.)
Improves performance of GFXBench5/gl_tess_off on Skylake GT4e at 1080p
by 9.3933% +/- 0.0305157% by eliminating all spilling in the GS.
Improves performance of GFXBench5/gl_4_off (Car Chase) on Skylake GT4e
at 1080p by 0.325208% +/- 0.0842233% (n=18).
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We scalarize IO to enable further optimizations, such as propagating
constant components across shaders, eliminating dead components, and
so on. This patch attempts to re-vectorize those operations after
the varying optimizations are done.
Intel GPUs are a scalar architecture, but IO operations work on whole
vec4's at a time, so we'd prefer to have a single IO load per vector
rather than 4 scalar IO loads. This re-vectorization can help a lot.
Broadcom GPUs, however, really do want scalar IO. radeonsi may want
this, or may want to leave it to LLVM. So, we make a new flag in the
NIR compiler options struct, and key it off of that, allowing drivers
to pick. (It's a bit awkward because we have per-stage settings, but
this is about IO between two stages...but I expect drivers to globally
prefer one way or the other. We can adjust later if needed.)
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes the egl_mesa_platform_surfaceless piglit test as well
as the new egl_ext_device_base piglit test on classic swrast.
v2: Fix swrast surfaceless contexts on the driver side.
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Mathias Fröhlich <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes use of radv_meta_resolve_compute_image() by filling
a VkImageResolve region instead of duplicating code.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110711
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 55376cb31e2f495a4d872b4ffce2135c3365b873.
It's been over a year and both QT 5.9.5 and 5.11.0 contained a fix for the
original issue. It seems i965 only ever applied this workaround to the
18.0 branch.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using the binding tables to access arrays of YCbCr descriptors we
did not consider the offset of the accessed element. We can't do a
simple multiple because the binding table entries are tightly packed.
For example element 0 of the array could use 2 entries/planes and
element 1 could use 2 entries/planes.
Signed-off-by: Lionel Landwerlin <[email protected]>
Fixes: 3bb8768b9d62 ("anv: toggle on support for VK_EXT_ycbcr_image_arrays")
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
| |
- unify the code
- choose radeon or amdgpu based on the DRM version, not based on which one
succeeds first
|
| |
|
|
|
|
| |
it's the same design
|
|
|
|
|
|
|
|
| |
Fixes following piglit and does not introduce any regressions.
spec@ext_packed_depth_stencil@fbo-depth-gl_depth24_stencil8-blit
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Lucas Stach <[email protected]>
|
|
|
|
|
|
|
|
| |
This commit also adds codegen for branch since we need it
for discard_if.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
| |
Based on ANV.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The old code was not wrong because the transitions performed
after the resolves should re-emit the framebuffer if needed.
This change is mostly a no-op but it improves consistency
regarding other meta operations that need to save/restore subpasses.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This helper will be useful for clearing HTILE after some
depth/stencil resolves.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The libmesa_anv_gen* modules require anv_extensions.h, patch makes sure
it gets generated as a dependency before building them.
Signed-off-by: Chenglei Ren <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Cc: <[email protected]>
|
|
|
|
|
|
|
|
| |
Just move the block that checks the availability bit into the
switch like other query types.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
We already flag IRIS_DIRTY_URB when we change it, but we were
additionally flagging it on every BLORP operation, even if we didn't.
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 12bf7cfecf52083c484602f971738475edfe497e.
This commits caused lots of problems:
https://bugs.freedesktop.org/show_bug.cgi?id=110721
https://bugs.freedesktop.org/show_bug.cgi?id=110761
Fixes: 12bf7cfecf52 ("mesa: unreference current winsys buffers when unbinding winsys buffers")
Pushing without review as we need to get it into next stable.
|
|
|
|
|
|
|
|
|
| |
Fix a regression I inadvertently caused by acking typeless movs before
implementing/pushing this *whistles*
Nothing to see here, move along folks.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
lima_blit will do blit between resources with different levels.
When blit from a level!=0 source, it will sample from that level
of resource as texture.
Current texture setup won't respect level when not mipmap filter.
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
| |
Current implementation won't respect level of surface to render.
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Imagine this
resource_copy_region(ctx, dst, ..., src, ...);
transfer_map(ctx, src, 0, PIPE_TRANSFER_WRITE, ...);
at the beginning of a cmdbuf. We need to flush in transfer_map so
that the transfer is not reordered before the resource copy. The
check for "vctx->num_draws == 0 && vctx->num_compute == 0" is not
enough. Removing the optimization entirely.
Because of the more precise resource tracking in the previous
commit, I hope the performance impact is minimized. We will have to
go with perfect resource tracking, or attempt a more limited
optimization, if there are specific cases we really need to optimize
for.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Gurchetan Singh <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This gives us more precise resource tracking. It can be beneficial
because glFlush is often followed by state changes. We don't want
to reemit resources that are going to be unbound.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Gurchetan Singh <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Gurchetan Singh <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The default null_output really needs to be static, otherwise the values
we'll eventually get later are doubly random (they are not initialized,
and even if they were it's a pointer to a local stack variable).
VMware bug 2349556.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
We are currently leaking resources if they were sampled from. Once we
are done with a sampler, we should dereference that resource.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Jump over the container stage if we haven't changed any of the files
that involved in building the container images.
This saves 1-2 minutes in each run and helps conserve resources.
Signed-off-by: Tomeu Vizoso <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The difference between imov and fmov has been a constant source of
confusion in NIR for years. No one really knows why we have two or when
to use one vs. the other. The real reason is that they do different
things in the presence of source and destination modifiers. However,
without modifiers (which many back-ends don't have), they are identical.
Now that we've reworked nir_lower_to_source_mods to leave one abs/neg
instruction in place rather than replacing them with imov or fmov
instructions, we don't need two different instructions at all anymore.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Acked-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Unless source modifiers are present, fmov and imov are the same.
There's no good reason for having two helpers.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
It's potentially a tiny bit less efficient but the helpers make it much
easier to sort out the rules for updating source modifiers.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
A few of our very late passes can end up generating vectors accidentally
so we need to get rid of them. The only known case of this is the ffma
peephole which generates fneg and fabs as vectors. Currently, they're
not a problem because they get turned into fmov which the back-end
compiler knows how to handle as a vector. That's about to change.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This flag has caused more confusion than good in most cases. You can
validly use imov for floats or fmov for integers because, without source
modifiers, neither modify their input in any way. Using imov for floats
is more reliable so we go that direction.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Both of these passes predate the nir_channel helper. We should just use
it instead of hand-rolling it in both passes.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Found during code inspection.
Cc: [email protected]
Signed-off-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If descriptorType is VK_DESCRIPTOR_TYPE_STORAGE_IMAGE
or VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT, the imageView member of each
element of pImageInfo must have been created with the identity swizzle.
Fixes: d2aa65eb
Signed-off-by: Danylo Piliaiev <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Verified to work properly with Iris driver on Android Celadon. Cache
files get generated as 'com.android.opengl.shaders_cache' for each
application.
v2: check that cache was returned (Eric Engestrom)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that the online compiler and pandecode are reliable and upstreamed,
nobody is using this. If somebody does need it, it should be easy enough
to bring back, I suppose. At the moment, it's just a maintenance hazard,
since meson is silly and does double builds for compiler updates (triple
for disassembler changes).
If people need the standalone _disassembler_, that can be added
trivially into pandecode (pandecode already includes the disassembler).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Ryan Houdek <[email protected]>
|