| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Andreas Baierl <[email protected]>
Signed-off-by: Qiang Yu <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3264>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When setting the vertex buffers, lima calls
util_set_vertex_buffers_mask() to reference and copy buffers. That
function
function adds dst with start_slot internally, so lima should not offset
the destination address again.
This is discovered when comparing with other drivers, and fixed by
removing the extra offset in lima_set_vertex_buffers().
This fixes draws that get translated in u_vbuf, because u_vbuf adds
extra vertex buffers when translating.
Signed-off-by: Icenowy Zheng <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3620>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3620>
|
|
|
|
|
|
|
|
|
| |
Gets some basic tests under "dEQP-VK.synchronization.*event*" passing
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3123>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3123>
|
|
|
|
|
|
|
|
|
| |
It gets emitted when needed.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3631>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3631>
|
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3627>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3627>
|
|
|
|
|
|
|
|
|
|
| |
Based on ANV. This removes a bunch of duplicated code for properties.
Fixes: 1b8d99e2885 ("radv: bump conformance version to 1.2.0.0")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3626>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3626>
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The gmem state is split out now, so it does not require synchronization.
But gmem rendering still accesses vsc state from the context.
TODO maybe there is a better way? For gen's that don't do vsc resizing,
this is probably easier.. but for a6xx there isn't really a great
position for more fine grained locking. Maybe it doesn't matter since
in practice the lock shouldn't be contended.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
|
|
| |
Which also has the benefit of getting rid of fd_context::gmem.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
|
|
| |
Prep work to reduce churn in next patch.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
|
|
|
|
| |
In a following patch, when we cache the gmem state, we will want to
treat the gmem state as immuatable. So start converting things to
const to make this more clear.. fd_tile is a good place to start.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
|
|
|
| |
The tile and vsc_pipe arrays are really part of the GMEM configuration.
So pull these out of fd_context and into fd_gmem_stateobj.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
|
|
|
|
| |
Prep work for reorganizing GMEM state and extracting out of fd_context.
The vsc pipe bo was the one thing that doesn't change with GMEM/tile
config.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3503>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was only used to be initialized to zero. Not even updated as
descriptor sets are bind.
As far as I understand, setting the bit TU_CMD_DIRTY_DESCRIPTOR_SET on
tu_cmd_state.dirty is used instead.
Reviewed-by: Jonathan Marek <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3624>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3624>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, is_atomic really meant "is not atomic", contrary to its name.
This commit fixes it to mean what one would think it means.
Fixes: 69bed1c9186c3e24ad54089218d58c5f7b83befe
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3618>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3618>
|
|
|
|
|
|
|
|
|
|
|
| |
There was never much point in artificially limiting chaining to two
batches - we can trivially support arbitrary length chains.
Currently, we should only ever have 1 or 2, but this may change.
Reviewed-by: Tapani Pälli <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>
|
|
|
|
|
|
|
| |
No need to pass it, we can just use batch->screen->devinfo.
Reviewed-by: Tapani Pälli <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the moment, everything is I915_EXEC_RENDER, so this isn't necessary.
But even should that change, I don't think we want to handle multiple
engines in this manner.
Nowadays, we have batch->name (IRIS_BATCH_RENDER, IRIS_BATCH_COMPUTE,
possibly an IRIS_BATCH_BLIT for blorp batches someday), which describes
the functional usage of the batch. We can simply check that and select
an engine for that class of work (assuming there ever is more than one).
Reviewed-by: Tapani Pälli <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3613>
|
|
|
|
|
|
|
|
|
|
| |
This fixes a regression in "vkcube -m headless" rendering, but upsettingly
none of my CTS tests I've been using.
Fixes: 59f29fc845ce ("turnip: Convert the rest of tu_cmd_buffer.c over to the new pack macros.")
Caught-by: Jonathan Marek <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3609>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3609>
|
|
|
|
|
|
|
|
|
|
|
| |
This commit enables the occlusionQueryPrecise feature. No additonal
work is required as occlusion queries are already implemented to
track exact sample counts.
Also enables a number of extra tests on the Vulkan CTS.
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3605>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3605>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch has a slight effect on pipelinedb:
Totals from affected shaders:
SGPRS: 23616 -> 21504 (-8.94 %)
VGPRS: 15088 -> 14444 (-4.27 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 662660 -> 664600 (0.29 %) bytes
LDS: 49 -> 49 (0.00 %) blocks
Max Waves: 3079 -> 3204 (4.06 %)
Reviewed-by: Rhys Perry <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
|
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
|
|
|
|
|
|
|
|
| |
This patch fixes register allocation if multiple live-range splits
occur to the same variable within one instruction.
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
|
|
|
|
|
|
|
|
|
|
|
| |
For all VMEM instructions, the resource constant is now
in operands[0]. For MIMG instructions, the sampler shares
operands[1] with write data in case this instruction writes memory.
Moving the VADDR to be the last operand for MIMG is the first step to
support Navi NSA encoding.
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3602>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a NIR generated using SPIR-V initializers to variables, copy
propagation can end up transforming
vec1 32 ssa_33 = deref_var &@1 (shared mat2x4)
vec1 32 ssa_35 = mov ssa_33
vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_35 (shared mat2x4) /* ptr_stride=0 */
into
vec1 32 ssa_33 = deref_var &@1 (shared mat2x4)
vec1 32 ssa_7 = deref_cast (mat2x4 *)ssa_33 (shared mat2x4) /* ptr_stride=0 */
Before the optimization, the "head" of a path of deref that uses ssa_7
will be the cast. After, it will be the variable in ssa_33. Since
the types are the same, this is a trivial cast that would be picked up
by nir_opt_deref.
If we need to compare such deref-chain after optimization with another
deref-chain for the same variable, the compare function will get
confused by the cast in the middle.
One alternative would be to add nir_opt_deref to places that compare
derefs, but that might not scale well, so skip the trivial casts when
generating the paths instead.
Motivated by the discussion in
https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3047#note_383660.
Reviewed-by: Jason Ekstrand <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3420>
|
|
|
|
|
|
|
|
|
|
| |
There seems to be more, these are just the ones found in
Detroit: Become Human shaders.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
|
|
|
|
| |
It can be used later and we want any uses to not be fixed to exec, so it's
definition can't be fixed to exec because of how exec masks interact with
register demand calculation.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
|
|
|
|
|
| |
process_block() will use this to determine the register demand of the
before the current instruction. Previously, it was filled with zeroes
which could result in process_block() only using the register demand
of after the current instruction.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
|
| |
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
|
|
| |
This would have caught the liveness error fixed in the previous commit.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise, code like this will be broken:
loop {
if (...) {
break;
} else {
break;
}
}
The continue_or_break block doesn't have any logical predecessors but it's
a logical predecessor of the header block. This liveness error breaks the
spiller in init_live_in_vars() (under "keep variables spilled on all
incoming paths") and eventually creates garbage reloads.
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The operand isn't fixed to exec, which can mess up the spiller. This also
adds a new situation where a phi is needed.
Fixes dEQP-VK.ssbo.layout.random.descriptor_indexing.2 and an assertion
when compiling a Detroit: Become Human shader.
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
|
|
| |
We don't need to update it since it won't be used later.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
|
|
|
| |
Loops without continues create header blocks with only 1 predecessor.
CC: <[email protected]>
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A shader might require vgpr spilling but not require sgpr spilling. In
that case, the spiller lowers the sgpr target by 5 which could mean sgpr
spilling is then required. Then the vgpr target has to be lowered to make
space for the linear vgprs. Previously, space wasn't make for the linear
vgprs.
Found while testing the spiller on the pipeline-db with a lowered limit
Fixes: a7ff1bb5b9a78cf57073b5e2e136daf0c85078d6
('aco: simplify calculation of target register pressure when spilling')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3257>
|
|
|
|
|
|
|
|
|
|
|
| |
Now that NGG GS queries are implemented, it should be safe enough
to enable NGG GS by default. It can be disabled with RADV_DEBUG=nongg
if necessary.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
|
|
|
|
|
|
|
|
|
|
|
| |
The number of generated primitives is only counted by the hardware
if GS uses the legacy path. For NGG GS, we need to accumulate that
value in the NGG GS itself. To achieve that, we use a plain GDS
atomic operation.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
|
|
|
|
|
|
|
|
|
| |
For implementing NGG GS queries, we decided to use GDS but GDS OA
is only required for NGG streamout.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a BO or amdgpu_screen_winsys is destroyed.
Should fix leaking such BOs in other DRM file descriptions.
v2:
* Pass the correct file descriptor to drmIoctl (Pierre-Eric
Pelloux-Prayer)
* Use _mesa_hash_table_remove
v3:
* Close handles in amdgpu_winsys_unref as well
v4:
* Adapt to amdgpu_winsys::sws_list_lock.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2270
Fixes: 11a3679e3aba "winsys/amdgpu: Make KMS handles valid for original
DRM file descriptor"
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3582>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3582>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Namely, if os_same_file_description determined that the DRM file
descriptor references the same file description.
v2:
* Adapt to amdgpu_winsys::sws_list_lock.
v3:
* Fix comparison of amdgpu_screen_winsys file descriptions, see
https://gitlab.freedesktop.org/mesa/mesa/issues/2413 .
* Lock amdgpu_winsys::sws_list_lock for traversing the sws_list in
amdgpu_winsys_create.
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3582>
|
|
|
|
|
| |
The name "desc" shadows another variable. Name it "desc_data" like all
of the other descriptor data variables in this file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because softpin block pools are made up of a set of BOs with different
maps, it was possible for a single state to end up straddling blocks.
To fix this, we pass a contiguous size to anv_block_pool_grow and it
ensures that the next allocation in the pool will have at least that
size.
We also add an assert in anv_block_pool_map to ensure we always get
contiguous maps. Prior to the changes to anv_block_pool_grow, the unit
tests failed with this assert. With this patch, the tests pass.
This was causing problems on Gen12 where we allocate the pages for the
AUX table from the dynamic state pool. The first chunk, which gets
allocated very early in the pool's history, is 1MB which was enough that
it was getting multiple BOs. This caused the gen_aux_map code to write
outside of the map and overwrite the instruction state pool buffer which
lead to GPU hangs.
Fixes: 731c4adcf9b "anv/allocator: Add support for non-userptr"
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We intentionally throw away all but one BT block but then we set
cmd_buffer->bt_block to ANV_STATE_NULL instead of the one we hung on to.
This causes the command buffer to immediately re-emit STATE_BASE_ADDRESS
the first time a BT is needed for no good reason.
Reviewed-by: Lionel Landwerlin <[email protected]>
|