| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
This path will be eventually improved later but as it's only
used on SI (or with RADV_DEBUG=noibs), I'm not sure if that
matters much.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
The chained submission is the fastest path and it should now
be used more often than before. This removes some EOP events.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
I don't think we want to wait for something that hasn't been
correctly submitted. This is similar to the fallback path.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
In case we failed to submit the CS correctly.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the number of unique BO is 0, we optimize the list creation
by copying all buffers of the current CS directly into it. But
this is only valid if the CS doesn't have virtual buffers,
otherwise they are not added and hw might report VM faults.
This fixes VM faults with:
dEQP-VK.sparse_resources.image_sparse_binding.2d.rgba8ui.1024_128_1
CC: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We mostly use the same priority for all buffer objects, so
I don't think that matter much. This should reduce CPU
overhead a little bit.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pthread types are used in some files without explicitely including pthread.h.
This leads to compile errors on Android 7.x nougat-x86
e.g. in src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h
In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c:31:
In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h:32:
external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:52:2: error: unknown type name 'pthread_mutex_t'
pthread_mutex_t global_bo_list_lock;
^
1 error generated.
Including pthread.h explicitely solves the building error
Signed-off-by: Mauro Rossi <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Acked-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This introduces a new flag called RADEON_FLAG_32BIT.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This is needed for 32-bit GPU pointers. Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It doesn't support GFX9.
Acked-by: Dave Airlie <[email protected]>
Acked-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
A bo's ref_count was not being initialized when imported from an fd.
Therefore, we would fail to free the resource during VkFreeMemory().
This patch fixes applications like hifi VR in threaded mode, which
perform frequent imports/releases of IPC shared memory.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
CC: 18.0 18.1 <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SI family doesn't support chaining which means the maximum
size in dwords per CS is limited. When that limit was reached
we failed to submit the CS and the application crashed.
This patch allows to submit up to 4 IBs which is currently the
limit, but recent amdgpu supports more than that.
Please note that we can reach the limit of 4 IBs per submit
but currently we can't improve that. The only solution is to
upgrade libdrm. That will be improved later but for now this
should fix crashes on SI or when using RADV_DEBUG=noibs.
Fixes: 36cb5508e89 ("radv/winsys: Fail early on overgrown cs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105775
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Ported from RadeonSI.
Local BOs ignore BO priorities, and we don't need those on APUs.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With update after bind we can't attach bo's to the command buffer
from the descriptor set anymore, so we have to have a global BO
list.
I am somewhat surprised this works really well even though we have
implicit synchronization in the WSI based on the bo list associations
and with the new behavior every command buffer is associated with
every swapchain image. But I could not find slowdowns in games because
of it.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
- remove mtypes.h from most header files
- add main/menums.h for often used definitions
- remove main/core.h
v2: fix radv build
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Handles the !waitAll and signal after the start of the wait cases correctly.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Ported from the radeonsi GL_AMD_pinned_memory implementation.
Signed-off-by: Fredrik Höglund <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This also used boolean, so nice to kill that.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Needed for the following commit.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These seem mildly unstable on vega, crashing CTS in various fun ways,
and looks like leaking memory.
Disable for now, but leave the option to enable them.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It uses slightly more memory (though still bounded by the number
of mapped ranges), but gives less quadratic behavior.
Cuts 4 minutes from the runtime of the CTS *.sparse.* tests.
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Might be useful to know the VRAM/GTT usage, the number of VRAM
CPU page faults, etc. Nothing is currently using that new
interface, but it's a first step.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This happens when all BOs have the RADEON_FLAG_NO_INTERPROCESS_SHARING
(DRM version >= 3.23) flag set. This flag is mainly used for reducing
overhead on the userspace side because we don't have to put those BOs
inside the list.
Though, if the driver tries to create a list with 0 buffers inside it,
libdrm returns -EINVAL and the app just crashes.
This fixes a bunch of CTS dEQP-VK.sparse_resources.* fails (~100).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This uses C++11 initializer lists.
I just overwrote all Mesa files with internal addrlib and discarded
hunks that we should probably keep, but I might have missed something.
The code depending on ADDR_AM_BUILD is removed. We can add it back next
time if needed.
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
We can avoid adding the buffer in the non-local case, this will
avoid all the overhead of the indirect call.
Reviewed-by: Samuel Pitoiset <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This uses the new kernel interfaces for reduced cs overhead,
We only set the local flag for memory allocations that don't have
a dedicated allocation and ones that aren't imports.
v2: add to all the internal buffer creation paths.
v3: missed some command submission paths, handle 0/empty bo lists.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implicit sync kicks in when a buffer is used by two different amdgpu
contexts simultaneously. Jobs that use explicit synchronization
mechanisms end up needlessly waiting to be scheduled for long periods
of time in order to achieve serialized execution.
This patch disables implicit synchronization for all radv allocations
except for wsi bos. The only systems that require implicit
synchronization are DRI2/3 and PRIME.
v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC
v3: Add drm version check (Bas)
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This extension allows the caller to change a queue's system wide
priority. This is useful for applications with specific
latency constraints.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Older kernels fail the va_op with this flag set. If the kernel
supports GFX9 usefully, it will also support this flag.
Fixes: e8d57802fea "radv/gfx9: allocate events from uncached VA space"
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We are really not going to use a winsys which does not need to store
the va, so might as well store it in a standard field.
Not sure this helps perf much though, as most of the cost is in the
cache miss accessing the bo anyway, which we stil need to do.
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
To dump some status MMIO registers when a hang is detected.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We are seeing apps that sometimes rely on Windows behaviour, add
a flag to rule out vram zeroing.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Fixes: e8d57802f (radv/gfx9: allocate events from uncached VA space)
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This copies what amdgpu-pro does, and allocates the memory
for an event with an uncached mtype.
This fixes hangs with:
dEQP-VK.api.command_buffers.record_simul_use_primary
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is a precursor to the gfx9 fix to use uncached for the event
memory. Move to the interface which allows setting the flags,
but wrap it to avoid having to copy it around the place.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
1168->1160.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 611076a41aac3095a82dff2432943d7f8d429822.
With the two previous commits, vega shouldn't be unstable,
doesn't pass CTS, but can do a complete run, and games shouldn't
hang anymore, so bring it back online.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "17.2" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|