| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
ARB_shader_ballot introduces 7 new system values that can be used
in all shader stages.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: add documentation (Nicolai)
Signed-off-by: Ilia Mirkin <[email protected]>
Signed-off-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2 (Nicolai):
- BALLOT isn't per-channel
- expand the documentation (also for VOTE_*)
v3:
- only BALLOT returns a 64-bit lanemask (Boyan)
- relax the requirement on READ_INVOC: the invocation number to read
from must be uniform within a sub-group. This matches the
GL_ARB_shader_ballot spec (and the v_readlane instruction of AMD
GCN)
v4:
- hopefully really fix the doc of VOTE_* returns (Ilia)
Signed-off-by: Ilia Mirkin <[email protected]>
Signed-off-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]> (v2)
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Fixes: 7f160efcde4 ("amd/addrlib: import gfx9 support")
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
- fill in DRM version requirement
- disable on SI due to CP DMA faults
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
VM faults cannot be disabled for SDMA on <= VI.
We could still use SDMA by asking the winsys about which parts of the
buffers are committed. This is left as a potential future improvement.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Sparse buffers can never be mapped by the CPU.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We never add fences to backing buffers during submit. When we free a
backing buffer, it must inherit the sparse buffer's fences, so that it
doesn't get re-used prematurely via the cache.
v2:
- remove pipe_mutex_*
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
... and implement the corresponding fence handling.
v2:
- add missing bit in amdgpu_bo_is_referenced_by_cs_with_usage
- remove pipe_mutex_*
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the bulk of the buffer allocation logic. It is fairly simple and
stupid. We'll probably want to use e.g. interval trees at some point to
keep track of commitments, but Mesa doesn't have an implementation of those
yet.
v2:
- remove pipe_mutex_*
- fix total_backing_pages accounting
- simplify by using the new VA_OP_CLEAR/REPLACE kernel interface
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
- remove pipe_mutex_*
- use a simple page commitment array
Reviewed-by: Marek Olšák <[email protected]>
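
For illustration only, a "simple page commitment array" as mentioned in the v2 note above could look roughly like the following. This is a minimal sketch, not the winsys code: the struct, its fields, and sparse_set_range are hypothetical names, and it assumes one boolean per page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical structure; not the winsys' actual data layout. */
struct sparse_commitments {
   bool *committed;   /* one entry per page of the sparse buffer */
   size_t num_pages;
};

/* Mark pages [first, first + count) as committed or uncommitted.
 * Returns the number of pages whose state actually changed. */
static size_t
sparse_set_range(struct sparse_commitments *s, size_t first, size_t count,
                 bool commit)
{
   /* The range must lie inside the buffer (written to avoid overflow). */
   assert(first <= s->num_pages && count <= s->num_pages - first);

   size_t changed = 0;
   for (size_t i = first; i < first + count; i++) {
      if (s->committed[i] != commit) {
         s->committed[i] = commit;
         changed++;
      }
   }
   return changed;
}
```

A flat array costs a linear scan per (un)commit, which is the trade-off an interval tree would avoid.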
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This probably has only minor performance effects, but it simplifies some
subsequent code slightly.
Ideally, it could also be used to simplify the handling of slab buffers
in the same way, but unfortunately that's not possible as long as we need
indices for relocations.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
We will use it for delayed adding of sparse buffers' backing buffers.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
v2: fix return type to bool (Marek)
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
v2: fix return type to bool (Marek)
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
v2:
- explain the resource_commit interface in more detail
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
v2:
- spec quote and style (Ian)
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
The sparse buffer implementation requires amdgpu_bo_va_op_raw.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of just advertising the aperture size, we do something more
intelligent. On systems with a full 48-bit PPGTT, we can address 100%
of the available system RAM from the GPU. In order to keep clients from
burning 100% of your available RAM for graphics resources, we have a
nice little heuristic (which has received exactly zero tuning) to keep
things under a reasonable level of control.
Reviewed-by: Kristian H. Kristensen <[email protected]>
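
As a rough illustration of the kind of heuristic described above, the sketch below caps the advertised heap at a fraction of system RAM and never exceeds the GPU address space. The function name and the specific thresholds (half of RAM up to 4 GiB, three quarters above that) are assumptions made for this example; the commit message does not state them.

```c
#include <stdint.h>

/* Hypothetical helper: pick a heap size from total system RAM and the
 * size of the GPU address space. The thresholds are illustrative only. */
static uint64_t
sketch_heap_size(uint64_t total_ram, uint64_t gtt_size)
{
   const uint64_t four_gib = 4ull << 30;

   /* Leave more headroom on small systems than on large ones. */
   uint64_t available = total_ram <= four_gib ? total_ram / 2
                                              : total_ram / 4 * 3;

   /* Never advertise more than the GPU can address. */
   return available < gtt_size ? available : gtt_size;
}
```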
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds support for using the full 48-bit address space on
Broadwell and newer hardware. Due to certain limitations, not all
objects can be placed above the 32-bit boundary. In particular, general
and state base address need to live within 32 bits. (See also
Wa32bitGeneralStateOffset and Wa32bitInstructionBaseOffset.) In order
to handle this, we add a supports_48bit_address field to anv_bo and only
set EXEC_OBJECT_SUPPORTS_48B_ADDRESS if that bit is set. We set the bit
for all client-allocated memory objects but leave it false for
driver-allocated objects. While this is more conservative than needed,
all driver allocations should easily fit in the first 32 bits of address
space, and this keeps things simple because we don't have to think about
whether or not any given one of our allocation data structures will be
used in a 48-bit-unsafe way.
Reviewed-by: Kristian H. Kristensen <[email protected]>
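
The per-BO opt-in described above can be sketched as follows. This is an assumption-laden illustration, not the driver code: sketch_bo, sketch_exec_flags, and the SKETCH_ macro are hypothetical stand-ins so the example compiles without kernel headers; only the supports_48bit_address field name and the EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag name come from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag; the
 * value is local to this sketch. */
#define SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS (1u << 3)

/* Hypothetical, reduced version of a buffer object. */
struct sketch_bo {
   bool supports_48bit_address; /* true for client-allocated memory */
};

/* Only request high addresses for BOs that opted in; driver-allocated
 * BOs keep the default and stay below the 32-bit boundary. */
static uint64_t
sketch_exec_flags(const struct sketch_bo *bo, uint64_t base_flags)
{
   return bo->supports_48bit_address
      ? base_flags | SKETCH_EXEC_OBJECT_SUPPORTS_48B_ADDRESS
      : base_flags;
}
```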
|
|
|
|
| |
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "13.0 17.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This fixes issues seen when adding support for full 48-bit addresses.
The 48-bit addresses themselves have nothing to do with it other than
that it caused the kernel to place buffers slightly differently so they
interacted differently with the caches.
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "13.0 17.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible. In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be
able to trust the results of that rendering. In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
It's possible that the device could have been lost while we were
waiting. We should let the user know if this has happened.
Reviewed-by: Kenneth Graunke <[email protected]>
|