author Nicolai Hähnle <[email protected]> 2018-11-21 18:17:02 +0100
committer Nicolai Hähnle <[email protected]> 2018-11-28 18:24:14 +0100
commit eb94b6bd5c99ef9540f16d1ea8d19c3ac54aed84
tree bc3327e11073e3c7db801f6492651ce84638afdd /src/gallium/drivers/r600/evergreen_compute.c
parent 35eb81987c0f93215680138fab6595602b7c49a4
winsys/amdgpu: explicitly declare whether buffer_map is permanent or not

Introduce a new driver-private transfer flag RADEON_TRANSFER_TEMPORARY
that specifies whether the caller will use buffer_unmap or not. The
default behavior is set to permanent maps, because that's what drivers
do for Gallium buffer maps.

This should eliminate the need for hacks in libdrm. Assertions are added
to catch when the buffer_unmap calls don't match the (temporary)
buffer_map calls.

I did my best to update r600 for consistency (r300 needs no changes
because it never calls buffer_unmap), even though the radeon winsys
ignores the new flag.

As an added bonus, this should actually improve the performance of
the normal fast path, because we no longer call into libdrm at all
after the first map, and there's one less atomic in the winsys itself
(there are now no atomics left in the UNSYNCHRONIZED fast path).

Cc: Leo Liu <[email protected]>
v2:
- remove comment about visible VRAM (Marek)
- don't rely on amdgpu_bo_cpu_map doing an atomic write

Reviewed-by: Marek Olšák <[email protected]>
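
The map/unmap pairing described in the message can be sketched roughly as follows. This is not the winsys code from this commit: every `sketch_` name, both flag values, and the use of malloc as a stand-in for the kernel mapping are assumptions made for illustration. It only shows the idea that a permanent map is cached and never released, while a temporary map must be balanced by exactly one unmap, with an assertion catching mismatches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag values; the real ones live in the Mesa headers. */
#define SKETCH_TRANSFER_WRITE     (1u << 0)
#define SKETCH_TRANSFER_TEMPORARY (1u << 1)

/* Hypothetical buffer object with just enough state for the bookkeeping. */
struct sketch_bo {
    void *cpu_ptr;              /* cached CPU mapping, NULL until first map */
    size_t size;                /* size of the buffer in bytes */
    unsigned temp_maps;         /* outstanding temporary maps */
    unsigned backend_map_calls; /* how often the "backend" was asked to map */
    bool permanent;             /* set once any permanent map was requested */
};

/* Stand-in for the expensive call into the kernel driver library. */
static void *sketch_backend_map(struct sketch_bo *bo)
{
    bo->backend_map_calls++;
    return malloc(bo->size);
}

/* Map the buffer. Only the first map reaches the backend; later maps
 * return the cached pointer. Temporary maps are counted so that the
 * matching unmap can be checked. */
static void *sketch_buffer_map(struct sketch_bo *bo, unsigned flags)
{
    if (!bo->cpu_ptr)
        bo->cpu_ptr = sketch_backend_map(bo);
    if (!bo->cpu_ptr)
        return NULL;

    if (flags & SKETCH_TRANSFER_TEMPORARY)
        bo->temp_maps++;
    else
        bo->permanent = true;

    return bo->cpu_ptr;
}

/* Unmap is only legal for temporary maps. The mapping is released when
 * the last temporary map goes away and nobody holds a permanent one. */
static void sketch_buffer_unmap(struct sketch_bo *bo)
{
    assert(bo->temp_maps > 0 && "buffer_unmap without temporary buffer_map");

    if (--bo->temp_maps == 0 && !bo->permanent) {
        free(bo->cpu_ptr);
        bo->cpu_ptr = NULL;
    }
}
```

In this sketch a caller that intends to unmap passes the temporary flag, as the r600 hunk in this patch does; a caller that omits the flag must never call the unmap function.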
Diffstat (limited to 'src/gallium/drivers/r600/evergreen_compute.c')
-rw-r--r-- src/gallium/drivers/r600/evergreen_compute.c | 4
1 file changed, 3 insertions, 1 deletion
diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c
index a77f58242e3..9085be4e2f3 100644
--- a/src/gallium/drivers/r600/evergreen_compute.c
+++ b/src/gallium/drivers/r600/evergreen_compute.c
@@ -438,7 +438,9 @@ static void *evergreen_create_compute_state(struct pipe_context *ctx,
 	/* Upload code + ROdata */
 	shader->code_bo = r600_compute_buffer_alloc_vram(rctx->screen,
 							shader->bc.ndw * 4);
-	p = r600_buffer_map_sync_with_rings(&rctx->b, shader->code_bo, PIPE_TRANSFER_WRITE);
+	p = r600_buffer_map_sync_with_rings(
+		&rctx->b, shader->code_bo,
+		PIPE_TRANSFER_WRITE | RADEON_TRANSFER_TEMPORARY);
 	//TODO: use util_memcpy_cpu_to_le32 ?
 	memcpy(p, shader->bc.bytecode, shader->bc.ndw * 4);
 	rctx->b.ws->buffer_unmap(shader->code_bo->buf);