author | Nicolai Hähnle <[email protected]> | 2018-11-21 18:17:02 +0100 |
---|---|---|
committer | Nicolai Hähnle <[email protected]> | 2018-11-28 18:24:14 +0100 |
commit | eb94b6bd5c99ef9540f16d1ea8d19c3ac54aed84 (patch) | |
tree | bc3327e11073e3c7db801f6492651ce84638afdd /src/gallium/drivers/r600/evergreen_compute.c | |
parent | 35eb81987c0f93215680138fab6595602b7c49a4 (diff) |
winsys/amdgpu: explicitly declare whether buffer_map is permanent or not

Introduce a new driver-private transfer flag RADEON_TRANSFER_TEMPORARY
that specifies whether the caller will use buffer_unmap or not. The
default behavior is set to permanent maps, because that's what drivers
do for Gallium buffer maps.

This should eliminate the need for hacks in libdrm. Assertions are added
to catch when the buffer_unmap calls don't match the (temporary)
buffer_map calls.

I did my best to update r600 for consistency (r300 needs no changes
because it never calls buffer_unmap), even though the radeon winsys
ignores the new flag.

As an added bonus, this should actually improve the performance of
the normal fast path, because we no longer call into libdrm at all
after the first map, and there's one less atomic in the winsys itself
(there are now no atomics left in the UNSYNCHRONIZED fast path).

Cc: Leo Liu <[email protected]>

v2:
- remove comment about visible VRAM (Marek)
- don't rely on amdgpu_bo_cpu_map doing an atomic write

Reviewed-by: Marek Olšák <[email protected]>
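
The map contract described in the commit message can be sketched roughly as follows. This is a minimal single-threaded illustration, not the winsys code: `sketch_bo`, `sketch_buffer_map`, `sketch_buffer_unmap`, `sketch_buffer_destroy`, and the value of `SKETCH_TRANSFER_TEMPORARY` are all hypothetical names invented for the example, and `malloc` stands in for the call into libdrm. A real implementation would also need locking around the first map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for RADEON_TRANSFER_TEMPORARY; the real value
 * is defined by the driver and is not reproduced here. */
#define SKETCH_TRANSFER_TEMPORARY (1u << 0)

/* Hypothetical buffer object, single-threaded for simplicity. */
struct sketch_bo {
    void  *cpu_ptr;       /* cached CPU address, NULL while unmapped */
    size_t size;          /* size of the backing storage in bytes */
    int    temp_maps;     /* outstanding temporary maps */
    int    is_permanent;  /* set once a permanent map was requested */
    int    backend_calls; /* how often the slow path was taken */
};

/* Map the buffer. Without the temporary flag the map is permanent:
 * the caller promises never to call sketch_buffer_unmap for it. */
static void *sketch_buffer_map(struct sketch_bo *bo, unsigned flags)
{
    if (!bo->cpu_ptr) {
        /* Slow path: stands in for the one call into the backend. */
        bo->cpu_ptr = malloc(bo->size);
        bo->backend_calls++;
    }

    if (flags & SKETCH_TRANSFER_TEMPORARY)
        bo->temp_maps++;      /* must be balanced by an unmap */
    else
        bo->is_permanent = 1; /* pointer stays valid until destroy */

    return bo->cpu_ptr;
}

/* Unmap is only legal for temporary maps; the assertion catches
 * callers whose unmap calls do not match their temporary maps. */
static void sketch_buffer_unmap(struct sketch_bo *bo)
{
    assert(bo->temp_maps > 0);

    if (--bo->temp_maps == 0 && !bo->is_permanent) {
        free(bo->cpu_ptr);
        bo->cpu_ptr = NULL;
    }
}

/* Release the mapping when the buffer itself goes away. */
static void sketch_buffer_destroy(struct sketch_bo *bo)
{
    assert(bo->temp_maps == 0);
    free(bo->cpu_ptr);
    bo->cpu_ptr = NULL;
}
```

In this sketch, repeated permanent maps return the cached pointer without touching the backend again, which is the shape of the fast path the message describes. A call site such as the r600 one in this patch, which maps, copies, and unmaps immediately, would pass the temporary flag.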
Diffstat (limited to 'src/gallium/drivers/r600/evergreen_compute.c')
-rw-r--r-- | src/gallium/drivers/r600/evergreen_compute.c | 4 |
1 files changed, 3 insertions, 1 deletions
diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c
index a77f58242e3..9085be4e2f3 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -438,7 +438,9 @@ static void *evergreen_create_compute_state(struct pipe_context *ctx,
 	/* Upload code + ROdata */
 	shader->code_bo = r600_compute_buffer_alloc_vram(rctx->screen,
 							shader->bc.ndw * 4);
-	p = r600_buffer_map_sync_with_rings(&rctx->b, shader->code_bo, PIPE_TRANSFER_WRITE);
+	p = r600_buffer_map_sync_with_rings(
+		&rctx->b, shader->code_bo,
+		PIPE_TRANSFER_WRITE | RADEON_TRANSFER_TEMPORARY);
 	//TODO: use util_memcpy_cpu_to_le32 ?
 	memcpy(p, shader->bc.bytecode, shader->bc.ndw * 4);
 	rctx->b.ws->buffer_unmap(shader->code_bo->buf);