summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorKenneth Graunke <[email protected]>2017-07-17 12:46:58 -0700
committerKenneth Graunke <[email protected]>2017-07-22 19:34:42 -0700
commit38e2142f392f9b6ac78eab72a1f92dd37553e1d8 (patch)
tree313c1b1b015e3429b6a522f3743252d2646dbc99
parenteb1497e968bd4a0edc1606e8a6f708fab3248828 (diff)
i965/bufmgr: Explicitly wait instead of using I915_GEM_SET_DOMAIN.
With the advent of asynchronous maps, domain tracking doesn't make a whole lot of sense. Buffers can be in use on both the CPU and GPU at the same time. In order to avoid blocking, we stopped using set_domain for asynchronous mappings, which means that the kernel's tracking has lies. We can't properly track it in userspace either, as the kernel can change domains on us spontaneously (for example, when un-swapping). According to Chris Wilson, I915_GEM_SET_DOMAIN does the following: 1. pins the backing storage (acquiring pages outside of the struct_mutex) 2. waits either for read/write access, including inter-device waits 3. updates the domain, clflushing as required 4. marks the object as used (for swapping) 5. turns off FBC/PSR/fancy scanout caching Item (1) is not terribly important. Most BOs are recycled via the BO cache, so they already have pages. Regardless, we fixed this via an initial set_domain in the previous patch. We implement item (2) with I915_GEM_WAIT. This has one downside: we'll stall unnecessarily if we do a read-only mapping of a buffer that the GPU is reading. I believe this is pretty uncommon. We may want to extend the wait ioctl at some point. Mesa already does item (3) itself. For cache-coherent buffers (most on LLC systems), we don't need to do any clflushing - the CPU and GPU views are coherent. For non-coherent buffers (most on non-LLC systems), we currently only use the CPU for read-only maps, and we explicitly clflush when necessary. We don't care about item (4)...swapping has already killed performance. Plus, with async maps, the kernel's domain tracking is already bogus, so it can't do this accurately regardless. Item (5) should be okay because we avoid cached maps of scanout buffers. Reviewed-by: Matt Turner <[email protected]>
-rw-r--r--src/mesa/drivers/dri/i965/brw_bufmgr.c23
1 files changed, 6 insertions, 17 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c
index cd85874ea52..fc2a2276236 100644
--- a/src/mesa/drivers/dri/i965/brw_bufmgr.c
+++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c
@@ -670,22 +670,13 @@ brw_bo_unreference(struct brw_bo *bo)
}
static void
-set_domain(struct brw_context *brw, const char *action,
- struct brw_bo *bo, uint32_t read_domains, uint32_t write_domain)
+bo_wait_with_stall_warning(struct brw_context *brw,
+ struct brw_bo *bo,
+ const char *action)
{
- struct drm_i915_gem_set_domain sd = {
- .handle = bo->gem_handle,
- .read_domains = read_domains,
- .write_domain = write_domain,
- };
-
double elapsed = unlikely(brw && brw->perf_debug) ? -get_time() : 0.0;
- if (drmIoctl(bo->bufmgr->fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &sd) != 0) {
- DBG("%s:%d: Error setting memory domains %d (%08x %08x): %s.\n",
- __FILE__, __LINE__, bo->gem_handle, read_domains, write_domain,
- strerror(errno));
- }
+ brw_bo_wait_rendering(bo);
if (unlikely(brw && brw->perf_debug)) {
elapsed += get_time();
@@ -755,8 +746,7 @@ brw_bo_map_cpu(struct brw_context *brw, struct brw_bo *bo, unsigned flags)
print_flags(flags);
if (!(flags & MAP_ASYNC)) {
- set_domain(brw, "CPU mapping", bo, I915_GEM_DOMAIN_CPU,
- flags & MAP_WRITE ? I915_GEM_DOMAIN_CPU : 0);
+ bo_wait_with_stall_warning(brw, bo, "CPU mapping");
}
if (!bo->cache_coherent) {
@@ -826,8 +816,7 @@ brw_bo_map_gtt(struct brw_context *brw, struct brw_bo *bo, unsigned flags)
print_flags(flags);
if (!(flags & MAP_ASYNC)) {
- set_domain(brw, "GTT mapping", bo,
- I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);
+ bo_wait_with_stall_warning(brw, bo, "GTT mapping");
}
return bo->map_gtt;