diff options
author | Kenneth Graunke <[email protected]> | 2017-07-17 12:46:58 -0700 |
---|---|---|
committer | Kenneth Graunke <[email protected]> | 2017-07-22 19:34:42 -0700 |
commit | 38e2142f392f9b6ac78eab72a1f92dd37553e1d8 (patch) | |
tree | 313c1b1b015e3429b6a522f3743252d2646dbc99 | |
parent | eb1497e968bd4a0edc1606e8a6f708fab3248828 (diff) |
i965/bufmgr: Explicitly wait instead of using I915_GEM_SET_DOMAIN.
With the advent of asynchronous maps, domain tracking doesn't make a
whole lot of sense. Buffers can be in use on both the CPU and GPU at
the same time. In order to avoid blocking, we stopped using set_domain
for asynchronous mappings, which means that the kernel's tracking has
lies. We can't properly track it in userspace either, as the kernel
can change domains on us spontaneously (for example, when un-swapping).
According to Chris Wilson, I915_GEM_SET_DOMAIN does the following:
1. pins the backing storage (acquiring pages outside of the
struct_mutex)
2. waits either for read/write access, including inter-device waits
3. updates the domain, clflushing as required
4. marks the object as used (for swapping)
5. turns off FBC/PSR/fancy scanout caching
Item (1) is not terribly important. Most BOs are recycled via the
BO cache, so they already have pages. Regardless, we fixed this
via an initial set_domain in the previous patch.
We implement item (2) with I915_GEM_WAIT. This has one downside:
we'll stall unnecessarily if we do a read-only mapping of a buffer
that the GPU is reading. I believe this is pretty uncommon. We
may want to extend the wait ioctl at some point.
Mesa already does item (3) itself. For cache-coherent buffers (most on
LLC systems), we don't need to do any clflushing - the CPU and GPU views
are coherent. For non-coherent buffers (most on non-LLC systems), we
currently only use the CPU for read-only maps, and we explicitly clflush
when necessary.
We don't care about item (4)...swapping has already killed performance.
Plus, with async maps, the kernel's domain tracking is already bogus,
so it can't do this accurately regardless.
Item (5) should be okay because we avoid cached maps of scanout buffers.
Reviewed-by: Matt Turner <[email protected]>
-rw-r--r-- | src/mesa/drivers/dri/i965/brw_bufmgr.c | 23 |
1 files changed, 6 insertions, 17 deletions
diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c b/src/mesa/drivers/dri/i965/brw_bufmgr.c index cd85874ea52..fc2a2276236 100644 --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c @@ -670,22 +670,13 @@ brw_bo_unreference(struct brw_bo *bo) } static void -set_domain(struct brw_context *brw, const char *action, - struct brw_bo *bo, uint32_t read_domains, uint32_t write_domain) +bo_wait_with_stall_warning(struct brw_context *brw, + struct brw_bo *bo, + const char *action) { - struct drm_i915_gem_set_domain sd = { - .handle = bo->gem_handle, - .read_domains = read_domains, - .write_domain = write_domain, - }; - double elapsed = unlikely(brw && brw->perf_debug) ? -get_time() : 0.0; - if (drmIoctl(bo->bufmgr->fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &sd) != 0) { - DBG("%s:%d: Error setting memory domains %d (%08x %08x): %s.\n", - __FILE__, __LINE__, bo->gem_handle, read_domains, write_domain, - strerror(errno)); - } + brw_bo_wait_rendering(bo); if (unlikely(brw && brw->perf_debug)) { elapsed += get_time(); @@ -755,8 +746,7 @@ brw_bo_map_cpu(struct brw_context *brw, struct brw_bo *bo, unsigned flags) print_flags(flags); if (!(flags & MAP_ASYNC)) { - set_domain(brw, "CPU mapping", bo, I915_GEM_DOMAIN_CPU, - flags & MAP_WRITE ? I915_GEM_DOMAIN_CPU : 0); + bo_wait_with_stall_warning(brw, bo, "CPU mapping"); } if (!bo->cache_coherent) { @@ -826,8 +816,7 @@ brw_bo_map_gtt(struct brw_context *brw, struct brw_bo *bo, unsigned flags) print_flags(flags); if (!(flags & MAP_ASYNC)) { - set_domain(brw, "GTT mapping", bo, - I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT); + bo_wait_with_stall_warning(brw, bo, "GTT mapping"); } return bo->map_gtt; |