path: root/src/gallium/drivers/iris/iris_bufmgr.c

Commit log (most recent first). Each entry shows the commit subject, author, date, and files/lines changed.

* iris: fix aux buf map failure in 32-bit apps on Android (Tapani Pälli, 2020-02-13; 1 file, -8/+9)

  Cc: [email protected]
  Reported-by: Zhifang Long <[email protected]>
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3784>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3784>

* util/hash_table: update users to use new optimal integer hash functions (Anthony Pesch, 2020-01-23; 1 file, -14/+2)

  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>

* iris: Allow max dynamic pool size of 2GB for gen12 (Jordan Justen, 2019-12-02; 1 file, -1/+9)

  Reworks:
  * Adjust comment to list the state packets that curro found to be affected.

  Fixes: 8125d7960b6 ("intel/dev: Add preliminary device info for Tigerlake")
  Cc: 19.3 <[email protected]>
  Signed-off-by: Jordan Justen <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>

* iris: try to set the specified tiling when importing a dmabuf (James Xiong, 2019-11-04; 1 file, -4/+11)

  When importing a dmabuf with a specified tiling, the dmabuf user should
  always try to set the tiling mode, because:
  1) the exporter can set tiling AFTER exporting/importing;
  2) a dmabuf could be exported from a kernel driver other than i915, in
     which case the dmabuf user and exporter need to set tiling separately.

  This fixes a problem when running vkmark under weston with iris on ICL:
  it crashed to the console with the assert below. i965 doesn't have this
  problem, as it always tries to set the specified tiling mode.

  weston: ../src/gallium/drivers/iris/iris_resource.c:990: iris_resource_from_handle: Assertion `res->bo->tiling_mode == isl_tiling_to_i915_tiling(res->surf.tiling)' failed.

  Signed-off-by: James Xiong <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>

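  As a rough illustration of what "setting the specified tiling" involves: a
  minimal sketch of an i915 SET_TILING call with the customary retry-on-signal
  loop, in the style i965/iris use. The helper name and surrounding flow are
  assumptions, not the verbatim patch.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include "drm-uapi/i915_drm.h"   /* i915 uAPI header, path as in the Mesa tree */

static int
bo_set_tiling_internal(int fd, uint32_t gem_handle,
                       uint32_t tiling_mode, uint32_t stride)
{
   struct drm_i915_gem_set_tiling set_tiling;
   int ret;

   do {
      /* The ioctl may modify the struct, so rebuild it on every retry. */
      memset(&set_tiling, 0, sizeof(set_tiling));
      set_tiling.handle = gem_handle;
      set_tiling.tiling_mode = tiling_mode;
      set_tiling.stride = stride;

      ret = ioctl(fd, DRM_IOCTL_I915_GEM_SET_TILING, &set_tiling);
   } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

   return ret;
}
```
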
* iris: Map each surf to its aux-surf in the aux-map tables (Jordan Justen, 2019-10-28; 1 file, -0/+19)

  Rework: Nanley Chery
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>

* iris/bufmgr: Initialize aux map context for gen12 (Jordan Justen, 2019-10-28; 1 file, -0/+53)

  Reworks:
  * free gen_buffer in gen_aux_map_buffer_free. (Rafael)
  * lock around aux_map_bos accesses. (Ken)

  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>

* iris: Set bo->reusable = false in iris_bo_make_external_locked (Kenneth Graunke, 2019-09-11; 1 file, -5/+4)

  This fixes a missing bo->reusable = false in iris_bo_export_gem_handle.

  Reviewed-by: Chris Wilson <[email protected]>

* iris: Finish initializing the BO before stuffing it in the hash table (Kenneth Graunke, 2019-09-11; 1 file, -4/+2)

  Other threads may pick it up once it's in the hash table. Not known to
  fix anything currently.

  Reviewed-by: Chris Wilson <[email protected]>

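  A minimal sketch of the ordering this commit enforces, using Mesa's util
  hash table API; the helper and field names are illustrative, not the exact
  iris code.

```c
/* Fully initialize the BO before publishing it in the shared hash table,
 * so another thread that looks up the handle never sees a half-built BO.
 * Caller holds the bufmgr lock (hence the _locked suffix). */
static struct iris_bo *
import_bo_locked(struct iris_bufmgr *bufmgr, uint32_t handle, uint64_t size)
{
   struct iris_bo *bo = calloc(1, sizeof(*bo));

   /* Set every field first... */
   bo->gem_handle = handle;
   bo->size = size;
   bo->refcount = 1;
   bo->external = true;
   bo->reusable = false;

   /* ...and only then make it visible to other threads. */
   _mesa_hash_table_insert(bufmgr->handle_table, &bo->gem_handle, bo);

   return bo;
}
```
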
* iris: use driconf for 'bo_reuse' parameter (Tapani Pälli, 2019-08-29; 1 file, -4/+2)

  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>

* meson: define ETIME to ETIMEDOUT if not present (Greg V, 2019-08-08; 1 file, -3/+0)

  Reviewed-by: Eric Engestrom <[email protected]>

* iris: Fix bad external BO hash table and zombie list interactions (Kenneth Graunke, 2019-08-05; 1 file, -12/+21)

  A while ago, we started deferring GEM object closure and VMA release
  until buffers were idle. This had some unforeseen interactions with
  external buffers.

  We keep imported buffers in hash tables, so if we have repeated imports
  of the same GEM object, we map those to the same iris_bo structure.
  This is critical for several reasons. Unfortunately, we broke this
  assumption. When freeing a non-idle external buffer, we would drop it
  from the hash tables, then move it to the zombie list. If someone
  reimported the same GEM object, we would not find it in the hash
  tables, and would go ahead and make a second iris_bo for that GEM
  object. But the old iris_bo would still be in the zombie list, and so
  we would eventually call GEM_CLOSE on it - closing a BO that should
  have still been live.

  To work around this, we defer removing a BO from the hash tables until
  it's actually fully closed. This has the strange effect that an
  external BO may be on the zombie list, and yet be resurrected before it
  can be properly cleaned up. In this case, we remove it from the list so
  it won't be freed.

  Fixes severe instability in Weston, which was hitting EINVALs and
  ENOENTs from execbuf2, due to batches referring to a GEM object that
  had been closed, or at least had its VMA torched.

  Fixes: 457a55716ea ("iris: Defer closing and freeing VMA until buffers are idle.")

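  A sketch of the resurrect path described above, under the assumption that
  a zombie BO has refcount zero while awaiting GEM_CLOSE; the helper
  create_bo_for_handle and the field names are hypothetical.

```c
static struct iris_bo *
import_gem_handle(struct iris_bufmgr *bufmgr, uint32_t handle)
{
   mtx_lock(&bufmgr->lock);

   /* The BO stays in the handle table until it is truly closed, so a
    * re-import of the same GEM handle always finds the original BO. */
   struct hash_entry *entry =
      _mesa_hash_table_search(bufmgr->handle_table, &handle);
   if (entry) {
      struct iris_bo *bo = entry->data;
      if (p_atomic_inc_return(&bo->refcount) == 1) {
         /* refcount was zero: it was parked on the zombie list awaiting
          * GEM_CLOSE. Pull it off so we never close a live handle. */
         list_del(&bo->head);
      }
      mtx_unlock(&bufmgr->lock);
      return bo;
   }

   mtx_unlock(&bufmgr->lock);
   return create_bo_for_handle(bufmgr, handle);  /* hypothetical helper */
}
```
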
* iris/bufmgr: Move iris_bo_reference into hash_find_bo, rename it (Kenneth Graunke, 2019-08-05; 1 file, -14/+16)

  Everybody importing an external buffer was looking it up in the hash
  table, then referencing it. We can just do that in the helper instead,
  which also gives us a convenient spot to stash extra code shortly.

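  A plausible shape for the renamed helper (the exact name and signature are
  assumptions): look up by key and take the reference in one place, under the
  bufmgr lock.

```c
static struct iris_bo *
find_and_ref_external_bo(struct hash_table *ht, unsigned key)
{
   struct hash_entry *entry = _mesa_hash_table_search(ht, &key);
   struct iris_bo *bo = entry ? entry->data : NULL;

   if (bo) {
      /* The "extra code" the message mentions (zombie-list resurrection,
       * see the entry above) later lands here. */
      iris_bo_reference(bo);
   }

   return bo;
}
```
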
* intel/common: provide common ioctl routine (Mark Janes, 2019-08-01; 1 file, -36/+22)

  i965 links against libdrm for drmIoctl, but anv and iris both
  re-implement this routine to avoid the dependency. intel/dev also needs
  an ioctl wrapper, so let's share the same implementation everywhere.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>

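  The shared wrapper is essentially drmIoctl's retry loop: restart the ioctl
  when a signal interrupts it. A sketch, assuming it matches the gen_ioctl()
  helper in intel/common.

```c
#include <errno.h>
#include <sys/ioctl.h>

static inline int
gen_ioctl(int fd, unsigned long request, void *arg)
{
   int ret;

   /* EINTR/EAGAIN mean "try again", not failure. */
   do {
      ret = ioctl(fd, request, arg);
   } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

   return ret;
}
```
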
* iris: Defer closing and freeing VMA until buffers are idle. (Kenneth Graunke, 2019-07-02; 1 file, -10/+51)

  There will unfortunately be circumstances where we cannot re-use a
  virtual memory address until it's no longer active on the GPU. To
  facilitate this, we instead move BOs to a "dead" list, and defer
  closing them and returning their VMA until they are idle. We
  periodically sweep these away in cleanup_bo_cache, which triggers every
  time a new object's refcount hits zero.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Tested-by: Jordan Justen <[email protected]>

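  A sketch of what such a sweep could look like; the function and field names
  are illustrative assumptions, not the verbatim iris code.

```c
static void
cleanup_dead_bos(struct iris_bufmgr *bufmgr)
{
   list_for_each_entry_safe(struct iris_bo, bo, &bufmgr->dead_list, head) {
      if (iris_bo_busy(bo))
         continue;   /* still active on the GPU; retry on the next sweep */

      /* Idle at last: return the address space and close the handle. */
      list_del(&bo->head);
      vma_free(bufmgr, bo->gtt_offset, bo->size);

      struct drm_gem_close close = { .handle = bo->gem_handle };
      gen_ioctl(bufmgr->fd, DRM_IOCTL_GEM_CLOSE, &close);
      free(bo);
   }
}
```
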
* iris: Add an explicit alignment parameter to iris_bo_alloc_tiled(). (Kenneth Graunke, 2019-07-02; 1 file, -10/+16)

  In the future, some images will need to be aligned to a larger value
  than 4096. Most buffers, however, don't have any such requirement, so
  for now we only add the parameter to iris_bo_alloc_tiled() and leave
  the others with the simpler interface.

  v2: Fix missing alignment in vma_alloc, caught by Caio!

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Tested-by: Jordan Justen <[email protected]>

* iris: Avoid holding the lock while allocating pages. (Kenneth Graunke, 2019-05-30; 1 file, -5/+5)

  We only need the lock for:
  1. Rummaging through the cache
  2. Allocating VMA

  We don't need it for alloc_fresh_bo(), which does GEM_CREATE, and also
  SET_DOMAIN to allocate the underlying pages. The idea behind calling
  SET_DOMAIN was to avoid a lock in the kernel while allocating pages;
  now we avoid our own global lock as well.

  We do have to re-lock around VMA. Hopefully this shouldn't happen too
  much in practice, because we'll find a cached BO in the right memzone
  and not have to reallocate it.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

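  A sketch of the resulting lock scope (illustrative control flow, not the
  verbatim patch): the lock covers only the cache search and the VMA
  assignment, never the kernel's page allocation.

```c
mtx_lock(&bufmgr->lock);
struct iris_bo *bo = alloc_bo_from_cache(bufmgr, bucket, memzone);
mtx_unlock(&bufmgr->lock);

if (!bo) {
   /* GEM_CREATE + SET_DOMAIN run unlocked; the kernel allocates pages. */
   bo = alloc_fresh_bo(bufmgr, bo_size);

   /* Re-lock only for address-space bookkeeping. */
   mtx_lock(&bufmgr->lock);
   bo->gtt_offset = vma_alloc(bufmgr, memzone, bo->size, 1);
   mtx_unlock(&bufmgr->lock);
}
```
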
* iris: Move SET_DOMAIN to alloc_fresh_bo() (Kenneth Graunke, 2019-05-30; 1 file, -17/+15)

  Chris pointed out that the order between SET_DOMAIN and SET_TILING
  doesn't matter, so we can just do the page allocation when creating a
  new BO. Simplifies the flow a bit.

  Reviewed-by: Chris Wilson <[email protected]>

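  A sketch of alloc_fresh_bo() after this change, assuming the i915 uAPI
  structs; error handling is trimmed and the BO fields are illustrative.

```c
static struct iris_bo *
alloc_fresh_bo(struct iris_bufmgr *bufmgr, uint64_t bo_size)
{
   struct iris_bo *bo = calloc(1, sizeof(*bo));

   struct drm_i915_gem_create create = { .size = bo_size };
   if (gen_ioctl(bufmgr->fd, DRM_IOCTL_I915_GEM_CREATE, &create) != 0) {
      free(bo);
      return NULL;
   }

   bo->gem_handle = create.handle;
   bo->size = bo_size;

   /* SET_DOMAIN(CPU) makes the kernel fault in the backing pages now,
    * without holding any driver-side lock. */
   struct drm_i915_gem_set_domain sd = {
      .handle = bo->gem_handle,
      .read_domains = I915_GEM_DOMAIN_CPU,
   };
   gen_ioctl(bufmgr->fd, DRM_IOCTL_I915_GEM_SET_DOMAIN, &sd);

   return bo;
}
```
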
* iris: Be lazy about cleaning up purged BOs in the cache. (Kenneth Graunke, 2019-05-29; 1 file, -17/+1)

  Mathias Fröhlich reported that commit 6244da8e23e5470d067680 crashes.
  list_for_each_entry_safe is safe against removing the current entry,
  but iris_bo_cache_purge_bucket was potentially removing next entries
  too, which broke our saved next pointer.

  To fix this, don't bother with the iris_bo_cache_purge_bucket step. We
  just detected a single entry where the kernel has purged the BO's
  memory, and so it isn't a usable entry for our cache. We're about to
  continue the search with the next BO. If that one's purged, we'll clean
  it up too. And so on.

  We may miss cleaning up purged BOs that are further down the list after
  non-purged BOs... but that's probably fine. We still have the
  time-based cleaner (cleanup_bo_cache) which will take care of them
  eventually, and the kernel's already freed their memory, so it's not
  that harmful to have a few kicking around a little longer.

  Fixes: 6244da8e23e ("iris: Dig through the cache to find a BO in the right memzone")
  Reviewed-by: Chris Wilson <[email protected]>

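  A sketch of the cache search after this fix, combined with the memzone dig
  from the next entry: only the entry just inspected is ever removed, so
  list_for_each_entry_safe's saved next pointer stays valid. It assumes an
  iris_bo_madvise-style wrapper around I915_GEM_MADV; names are illustrative.

```c
static struct iris_bo *
alloc_bo_from_cache(struct iris_bufmgr *bufmgr,
                    struct bo_cache_bucket *bucket,
                    enum iris_memory_zone memzone)
{
   list_for_each_entry_safe(struct iris_bo, cur, &bucket->head, head) {
      /* Skip BOs whose address lives in the wrong memory zone. */
      if (memzone != memzone_for_address(cur->gtt_offset))
         continue;

      /* MADV(WILLNEED) reports whether the kernel purged the pages. If
       * so, this single entry is useless: free it and keep digging. */
      if (!iris_bo_madvise(cur, I915_MADV_WILLNEED)) {
         list_del(&cur->head);
         bo_free(cur);
         continue;
      }

      list_del(&cur->head);
      return cur;
   }

   return NULL;
}
```
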
* iris: Dig through the cache to find a BO in the right memzone (Kenneth Graunke, 2019-05-29; 1 file, -7/+17)

  This saves some util_vma thrash when the first entry in the cache
  happens to be in a different memory zone, but one just a tiny bit ahead
  is already there and instantly reusable. Hopefully the cost of a little
  extra searching won't break the bank - if it does, we can consider
  having separate list heads or keeping a separate VMA cache.

  Improves OglDrvRes performance by 22%, restoring a regression from
  deleting the bucket allocators in 694d1a08d3e5883d97d5352895f8431f.

  Thanks to Clayton Craft for alerting me to the regression.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

* iris: Tidy BO sizing code and comments (Kenneth Graunke, 2019-05-29; 1 file, -12/+5)

  Buckets haven't been power-of-two sized in over a decade.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

* iris: Move some field setting after we drop the lock. (Kenneth Graunke, 2019-05-29; 1 file, -13/+13)

  It's not much, but we may as well hold the lock for a bit less time.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

* iris: Move cached BO allocation into a helper function. (Kenneth Graunke, 2019-05-29; 1 file, -44/+64)

  There's enough going on here to warrant a helper. This also simplifies
  the control flow and eliminates the last non-error-case goto.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

* iris: Fall back to fresh allocations if mapping for zero-memset fails. (Kenneth Graunke, 2019-05-29; 1 file, -3/+4)

  It is unlikely that we would fail to map a cached BO in order to zero
  its contents. When we did, we would free the first BO in the cache and
  try again with the second. It's possible that this next BO already had
  a map set up, in which case we'd succeed. But if it didn't, we'd likely
  fail again in the same manner.

  There's not much point in optimizing this case (and frankly, if we're
  out of CPU-side VMA we should probably dump the cache entirely)... so
  instead, just fall back to allocating a fresh BO from the kernel, which
  will already be zeroed so we don't have to try to map it.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

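  A sketch of the fallback; BO_ALLOC_ZEROED and the map flags mirror iris'
  naming, but the control flow is a simplified assumption.

```c
if (bo && (flags & BO_ALLOC_ZEROED)) {
   void *map = iris_bo_map(NULL, bo, MAP_WRITE | MAP_RAW);
   if (map) {
      memset(map, 0, bo_size);
   } else {
      /* Couldn't map the cached BO: don't retry other cache entries,
       * just take a fresh kernel allocation, which is already zeroed. */
      bo_free(bo);
      bo = NULL;
   }
}

if (!bo)
   bo = alloc_fresh_bo(bufmgr, bo_size);
```
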
* iris: Move fresh BO allocation into a helper function. (Kenneth Graunke, 2019-05-29; 1 file, -26/+30)

  There's enough going on here to warrant a helper. More cleaning coming.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

* iris: Do SET_TILING at a single point rather than in two places. (Kenneth Graunke, 2019-05-29; 1 file, -20/+20)

  Both the from-cache and fresh-from-GEM cases were calling SET_TILING.
  In the cached case, we would retry the allocation on failure, pitching
  one BO from the cache each time. This is silly, because the only time
  it should fail is if the tiling or stride parameters are unacceptable,
  which has nothing to do with the particular BO in question. So there's
  no point in retrying - we should simply fail the allocation.

  This patch moves both calls to bo_set_tiling_internal() below the
  cache/fresh split, so we have it at a single point in time instead of
  two. To preserve the ordering between SET_TILING and SET_DOMAIN, we
  move that below as well. (I am unsure if the order matters.)

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

* iris: Use the BO cache even for coherent buffers on non-LLC. (Kenneth Graunke, 2019-05-29; 1 file, -3/+0)

  We mark snooped BOs as non-reusable, so we never return them to the
  cache. This means that we'd need to call I915_GEM_SET_CACHING to make
  any BO we find in the cache snooped. But then again, any BO we freshly
  allocate from the kernel will also be non-snooped, so it has the same
  issue. There's really no reason to skip the cache - we may as well use
  it to avoid the I915_GEM_CREATE overhead.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

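  A sketch of snooping a BO on non-LLC hardware, whichever path it came from
  (cache or fresh GEM_CREATE); the BO field names are illustrative.

```c
struct drm_i915_gem_caching arg = {
   .handle = bo->gem_handle,
   .caching = I915_CACHING_CACHED,   /* GPU snoops the CPU caches */
};
if (gen_ioctl(bufmgr->fd, DRM_IOCTL_I915_GEM_SET_CACHING, &arg) == 0) {
   bo->cache_coherent = true;
   bo->reusable = false;   /* snooped BOs never return to the cache */
}
```
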
* iris: Fix locking around vma_alloc in iris_bo_create_userptr (Kenneth Graunke, 2019-05-29; 1 file, -0/+4)

  util_vma needs to be protected by a lock. All other callers of
  vma_alloc and vma_free appear to be holding a lock already.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

* iris: Fix lock/unlock mismatch for non-LLC coherent BO allocation. (Kenneth Graunke, 2019-05-29; 1 file, -7/+3)

  The goto jumped over the mtx_lock, but proceeded to hit the mtx_unlock.
  We can simply set the bucket to NULL, and it will skip the cache
  without the goto and without messing up locking.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>

* iris: Add helpers to clone a hardware context. (Chris Wilson, 2019-05-09; 1 file, -0/+24)

  (Chris Wilson wrote this code in a patch titled "i965: Be resilient in
  the face of GPU hangs"; Ken fixed a bug and copied it to iris.)

* iris: Mark render batches as non-recoverable. (Kenneth Graunke, 2019-05-09; 1 file, -0/+22)

  Adapted from Chris Wilson's patch. The comment is largely his.

  Currently, when iris hangs the GPU, it will continue sending batches
  which incrementally update the state, assuming it's preserved across
  batches. However, the kernel's GPU reset support reinitializes the
  guilty context to the default GPU state (reasonably not wanting to
  trust the current state). This ends up resetting critical things like
  STATE_BASE_ADDRESS, causing memory accesses in all subsequent batches
  to be garbage, and almost certainly resulting in more hangs until we're
  banned or we kill the machine.

  We now ask the kernel to ban our render context immediately, so we
  notice we've gone off the rails as fast as possible. Eventually, we'll
  attempt to recover and continue. For now, we just avoid torching the
  GPU over and over.

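  A sketch of opting a context out of post-hang recovery via i915's
  RECOVERABLE context parameter: with value 0, a hang bans the context rather
  than resetting it to default state and letting us run on garbage. The
  surrounding variables are illustrative.

```c
struct drm_i915_gem_context_param p = {
   .ctx_id = ctx_id,
   .param = I915_CONTEXT_PARAM_RECOVERABLE,
   .value = 0,   /* ban on hang instead of resetting to default state */
};
gen_ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM, &p);
```
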
* iris: Delete bucketing allocators (Kenneth Graunke, 2019-05-03; 1 file, -167/+3)

  These add a lot of complexity, and I currently can't measure any
  performance benefit from having them. In the past, I seem to recall
  seeing a benefit in drawoverhead scores, but currently it looks like
  dropping them is either a wash or 1-2% faster. Drop them to simplify
  allocations.

* iris: Force VMA alignment to be a multiple of the page size. (Kenneth Graunke, 2019-05-03; 1 file, -0/+3)

  This should happen regardless, but let's be paranoid.

* iris: leave the top 4Gb of the high heap VMA unused (Kenneth Graunke, 2019-05-03; 1 file, -1/+5)

  This ports commit 9e7b0988d6e98690eb8902e477b51713a6ef9cae from anv to
  iris. Thanks to Lionel for noticing that it was missing!

* iris: Fix 4GB memory zone heap sizes. (Kenneth Graunke, 2019-05-03; 1 file, -3/+6)

  The STATE_BASE_ADDRESS "Size" fields can only hold 0xfffff in pages,
  and 0xfffff * 4096 = 4294963200, which is 1 page shy of 4GB. So we
  can't use the top page.

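  The arithmetic from the message, spelled out; the constant names are
  illustrative, not necessarily the ones iris uses.

```c
/* A 20-bit page count tops out one page short of 4GB, so each 4GB
 * memory zone must surrender its last page. */
#define PAGE_SIZE 4096
#define _4GB (1ull << 32)

const uint64_t max_base_size = 0xfffffull * PAGE_SIZE;  /* 4294963200 */
const uint64_t zone_size = _4GB - PAGE_SIZE;            /* == max_base_size */
```
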
* iris: Make memzone_for_address non-static (Kenneth Graunke, 2019-04-23; 1 file, -5/+5)

  I want to use this in iris_resource.c.

* intel/common: move gen_debug to intel/dev (Mark Janes, 2019-04-10; 1 file, -1/+1)

  libintel_common depends on libintel_compiler, but it contains debug
  functionality that is needed by libintel_compiler. Break the circular
  dependency by moving gen_debug files to libintel_dev.

  Suggested-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>

* iris: Adapt to variable ppGTT size (Chris Wilson, 2019-04-01; 1 file, -1/+20)

  Not all hardware is made equal, and some does not have the full
  complement of 48b of address space. Ask for the actual size of virtual
  address space allocated for contexts, and bail if that is not enough to
  satisfy our static partitioning needs.

  Reviewed-by: Kenneth Graunke <[email protected]>

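  A sketch of the ppGTT size query using i915's GTT_SIZE context parameter;
  the partitioning check against IRIS_MEMZONE_OTHER_START is an illustrative
  assumption.

```c
struct drm_i915_gem_context_param p = {
   .param = I915_CONTEXT_PARAM_GTT_SIZE,
};
uint64_t gtt_size = 0;
if (gen_ioctl(fd, DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM, &p) == 0)
   gtt_size = p.value;

/* Bail if the static memory-zone layout doesn't fit in this address
 * space (the last zone starts at IRIS_MEMZONE_OTHER_START). */
if (gtt_size <= IRIS_MEMZONE_OTHER_START)
   return NULL;
```
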
* iris: Print the memzone name when allocating BOs with INTEL_DEBUG=buf (Kenneth Graunke, 2019-03-28; 1 file, -2/+17)

  This gives me an idea of what kinds of buffers are being allocated on
  the fly, which could help inform our cache decisions.

* iris: Fix util_vma_heap_init size for IRIS_MEMZONE_SHADER (Kenneth Graunke, 2019-03-21; 1 file, -1/+1)

  Fixes assertions when disabling bucket allocators.

* iris: Use streaming loads to read from tiled surfaces (Chris Wilson, 2019-03-13; 1 file, -1/+4)

  Always use the streaming load (since we know we have Broadwell+, all of
  our target CPUs support SSE4.1) for reading back from the tiled surface
  when mapping the resource. This means we hit the fast WC handling paths
  on Atoms (without LLC), and for big Core (with LLC) using the streaming
  load is no less efficient, as we do not require the tiled buffer to be
  pulled into the CPU cache.

  Reviewed-by: Kenneth Graunke <[email protected]>

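  A minimal sketch of a streaming copy out of write-combined memory using the
  SSE4.1 MOVNTDQA intrinsic the commit relies on (Mesa keeps a fuller version
  in util/streaming-load-memcpy); it assumes 16-byte-aligned, 16-byte-multiple
  copies and is not the verbatim driver code.

```c
#include <assert.h>
#include <smmintrin.h>   /* SSE4.1: _mm_stream_load_si128 (MOVNTDQA) */
#include <stddef.h>
#include <stdint.h>

static void
streaming_load_memcpy(void *restrict dst, void *restrict src, size_t bytes)
{
   __m128i *d = dst;
   __m128i *s = src;   /* the intrinsic takes a non-const pointer */

   assert(bytes % 16 == 0);
   assert(((uintptr_t)src & 15) == 0);

   for (size_t i = 0; i < bytes / 16; i++) {
      /* MOVNTDQA reads through WC memory without pulling the tiled
       * source into the CPU cache: the win on both LLC and non-LLC. */
      d[i] = _mm_stream_load_si128(&s[i]);
   }
}
```
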
* iris: Use coherent allocation for PIPE_RESOURCE_STAGING (Chris Wilson, 2019-03-13; 1 file, -0/+18)

  On !llc machines (Atoms), reading from linear buffers is slow, and so
  copying from one resource into the linear staging buffer is still slow.
  However, we can tell the GPU to snoop the CPU cache when reading from
  and writing to the staging buffer, eliminating the slow uncached reads.

  Reviewed-by: Kenneth Graunke <[email protected]>

* iris: Do binder address allocations per-context, not globally. (Kenneth Graunke, 2019-02-21; 1 file, -8/+11)

  iris_bufmgr allocates addresses across the entire screen, since buffers
  may be shared between multiple contexts. There used to be a single
  special address, IRIS_BINDER_ADDRESS, that was per-context - and all
  contexts used the same address. When I moved to the multi-binder
  system, I made a separate memory zone for them. I wanted there to be
  2-3 binders per context, so we could cycle them to avoid the stalls
  inherent in pinning two buffers to the same address in back-to-back
  batches. But I figured I'd allow 100 binders just to be wildly
  excessive/cautious.

  What I didn't realize was that we need 2-3 binders per *context*, and
  what I did was allocate 100 binders per *screen*. Web browsers, for
  example, might have 1-2 contexts per tab, leading to hundreds of
  contexts, and thus binders.

  To fix this, we stop allocating VMA for binders in bufmgr, and let the
  binder handle it itself. Binders are per-context, and they can assign
  context-local addresses for the buffers by simply doing a ringbuffer
  style approach. We only hold on to one binder BO at a time, so we won't
  ever have a conflicting address.

  This fixes dEQP-EGL.functional.multicontext.non_shared_clear.

  Huge thanks to Tapani Pälli for debugging this whole mess and figuring
  out what was going wrong.

  Reviewed-by: Tapani Pälli <[email protected]>

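  A sketch of the ringbuffer-style, context-local offset assignment the
  message describes; the names (binder_insert, binder_realloc, insert_point)
  and the 64-byte alignment are hypothetical.

```c
static uint32_t
binder_insert(struct iris_binder *binder, unsigned size)
{
   uint32_t offset = ALIGN(binder->insert_point, 64);

   if (offset + size > binder->size) {
      /* Binder full: grab a fresh binder BO (the old one is released
       * once idle) and wrap back to the start. Since the context only
       * holds one binder BO at a time, addresses never conflict. */
      binder_realloc(binder);
      offset = 0;
   }

   binder->insert_point = offset + size;
   return offset;   /* surface-state offset within this context's binder */
}
```
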
* iris: Fix memzone_for_address for the surface and binder zones (Kenneth Graunke, 2019-02-21; 1 file, -2/+2)

  We use > for IRIS_MEMZONE_DYNAMIC because IRIS_BORDER_COLOR_POOL_ADDRESS
  lives at the very start of that zone. However, IRIS_MEMZONE_SURFACE and
  IRIS_MEMZONE_BINDER are normal zones. They used to be a single zone
  (surface) with a single binder BO at the beginning, similar to the
  border color pool. But when I moved us to multiple binders, I made them
  have a real zone (if a small one). So both zones should use >=.

  Reviewed-by: Tapani Pälli <[email protected]>

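  A sketch of the corrected classification, assuming the zone-start constants
  in iris_bufmgr.h; the exact function body may differ. Note > versus >=: the
  dynamic zone's first address is the pinned border color pool, while surface
  and binder are ordinary zones that own their first address.

```c
static enum iris_memory_zone
memzone_for_address(uint64_t address)
{
   if (address >= IRIS_MEMZONE_OTHER_START)
      return IRIS_MEMZONE_OTHER;

   if (address == IRIS_BORDER_COLOR_POOL_ADDRESS)
      return IRIS_MEMZONE_BORDER_COLOR_POOL;

   /* > here: the zone's first address is the border color pool above. */
   if (address > IRIS_MEMZONE_DYNAMIC_START)
      return IRIS_MEMZONE_DYNAMIC;

   /* >= here: surface and binder are normal zones. */
   if (address >= IRIS_MEMZONE_SURFACE_START)
      return IRIS_MEMZONE_SURFACE;

   if (address >= IRIS_MEMZONE_BINDER_START)
      return IRIS_MEMZONE_BINDER;

   return IRIS_MEMZONE_SHADER;
}
```
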
* iris: Tidy exporting the flink handle (Chris Wilson, 2019-02-21; 1 file, -9/+16)

* iris: vma_free bo->size, not bo_size (Kenneth Graunke, 2019-02-21; 1 file, -1/+1)

  This is more obviously correct. I think the two end up being the same
  in practice, since this is in the alloc_from_cache case, and presumably
  a bo from the bucket has bo->size == bucket->size, and bo_size also is
  bucket->size... still, better to do the obvious thing. brw_bufmgr
  already does it this way.

* iris: fix memzone_for_address since multibinder changes (Chris Wilson, 2019-02-21; 1 file, -3/+3)

* iris: Support multiple binder BOs, update Surface State Base Address (Kenneth Graunke, 2019-02-21; 1 file, -15/+23)

* iris: set EXEC_OBJECT_CAPTURE on all driver internal buffers (Kenneth Graunke, 2019-02-21; 1 file, -0/+6)

* iris: Record reusability of bo on construction (Chris Wilson, 2019-02-21; 1 file, -4/+5)

  We know that if bufmgr->reuse is set to false, or if the bo is too
  large for a bucket, the same will be true when we come to free the bo.

* iris: precompute hashes for cache tracking (Kenneth Graunke, 2019-02-21; 1 file, -4/+14)

  Saves a touch of CPU overhead in the new resolve tracking.