path: root/src/intel/vulkan/genX_query.c
Commit message | Author | Age | Files | Lines
* anv: implement gen9 post sync pipe control workaround | Lionel Landwerlin | 2020-02-05 | 1 | -0/+9
    We've been missing this workaround for a while and since it's required for Gen12, let's implement it for Gen9 first.
    v2: Update comment for Gen9.
    v3: Fix clearing of bits... (Lionel)
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Cc: <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3405>
* anv: Handle unavailable queries in vkCmdCopyQueryPoolResults | Brian Ho | 2020-01-28 | 1 | -0/+54
    If VK_QUERY_RESULT_WAIT_BIT is not set, there is currently no special handling of unavailable queries in vkCmdCopyQueryPoolResults, and anv will write an invalid value for the query result.
    This commit updates vkCmdCopyQueryPoolResults for unavailable queries to return 0 if the VK_QUERY_RESULT_PARTIAL_BIT flag is set and if not, skip writing altogether.
    Cc: <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>
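The decision described in this commit can be modelled on the CPU as a small sketch. The driver makes this choice on the GPU, so the function below is only an illustration; `copy_action_for_query`, the `copy_action` enum, and the `SKETCH_` flag names are hypothetical, with the flag values mirroring the corresponding `VK_QUERY_RESULT_*` bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical local names; the values mirror VK_QUERY_RESULT_WAIT_BIT
 * and VK_QUERY_RESULT_PARTIAL_BIT from the Vulkan headers. */
#define SKETCH_QUERY_RESULT_WAIT_BIT    0x2u
#define SKETCH_QUERY_RESULT_PARTIAL_BIT 0x8u

enum copy_action {
   COPY_WRITE_RESULT, /* query is (or will be) available: write its value */
   COPY_WRITE_ZERO,   /* unavailable, partial results requested: write 0 */
   COPY_SKIP,         /* unavailable, no partial results: leave dst alone */
};

/* CPU-side model of what the copy should do for a single query. */
static enum copy_action
copy_action_for_query(bool available, uint32_t flags)
{
   /* With the wait bit set, the copy waits until the query is available. */
   if (available || (flags & SKETCH_QUERY_RESULT_WAIT_BIT))
      return COPY_WRITE_RESULT;

   if (flags & SKETCH_QUERY_RESULT_PARTIAL_BIT)
      return COPY_WRITE_ZERO;

   return COPY_SKIP;
}
```

Only the last two branches are new behaviour according to the commit message; the first branch is the pre-existing path.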
* anv: Properly fetch partial results in vkGetQueryPoolResults | Brian Ho | 2020-01-28 | 1 | -2/+11
    Currently, fetching the partial results (VK_QUERY_RESULT_PARTIAL_BIT) of an unavailable occlusion query via vkGetQueryPoolResults can return invalid values. anv returns slot.end - slot.begin, but in the case of unavailable queries, slot.end is still at the initial value of 0. If slot.begin is non-zero, the occlusion count underflows to a value that is likely outside the acceptable range of the partial result.
    This commit fixes vkGetQueryPoolResults by always returning 0 if the query is unavailable and the VK_QUERY_RESULT_PARTIAL_BIT is set.
    Cc: <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3586>
* anv: Add an anv_physical_device field to anv_device | Jason Ekstrand | 2020-01-20 | 1 | -1/+1
    Having to always pull the physical device from the instance has been annoying for almost as long as the driver has existed. It also won't work in a world where we ever have more than one physical device. This commit adds a new field called "physical" to anv_device and switches every location where we use device->instance->physicalDevice to use the new field instead.
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3461>
* anv: Enable Vulkan 1.2 support | Iván Briano | 2020-01-15 | 1 | -1/+1
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: fix intel perf queries availability writes | Lionel Landwerlin | 2020-01-09 | 1 | -14/+5
    The availability is not written at the location changed in ee6fbb95a74d...
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Fixes: ee6fbb95a74d ("anv: Properly handle host query reset of performance queries")
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Add an explicit_address parameter to anv_device_alloc_bo | Jason Ekstrand | 2019-12-05 | 1 | -0/+1
    We already have a mechanism for specifying that we want a fixed address provided by the driver internals. We're about to let the client start specifying addresses in some very special scenarios as well so we want to pass this through to the allocation function.
    Reviewed-by: Ivan Briano <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: prepare the driver for delayed submissions | Lionel Landwerlin | 2019-11-11 | 1 | -29/+8
    Timeline semaphores introduce support for wait-before-signal behavior, which means that it is now allowed to call vkQueueSubmit() with wait semaphores not yet submitted for execution. Our kernel driver requires all of the wait primitives to be created before calling the execbuf ioctl. As a result, we must delay submissions in the userspace driver.
    This change stores the necessary information to be able to delay a VkSubmitInfo submission to the kernel driver.
    v2: Fold count++ into array access (Jason)
        Move queue list to another patch (Jason)
    v3: Document cleanup of temporary semaphores (Jason)
    v4: Track semaphores of SYNC_FD type that need updating after delayed submission
    v5: Don't forget to update sync_fd in signaled semaphores after submission (Jason)
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Properly handle host query reset of performance queries | Lionel Landwerlin | 2019-11-04 | 1 | -32/+20
    The host query reset entry point didn't use the availability offset for performance queries. To fix this, reorder the availability of performance queries to match other queries.
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Fixes: 2b5f30b1d9 ("anv: implement VK_INTEL_performance_query")
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Allocate query pool BOs from the cache | Jason Ekstrand | 2019-10-31 | 1 | -25/+15
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Use the query_slot helper in vkResetQueryPoolEXT | Jason Ekstrand | 2019-10-31 | 1 | -1/+1
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: implement VK_INTEL_performance_query | Lionel Landwerlin | 2019-10-23 | 1 | -17/+235
    v2: Introduce the appropriate pipe controls
        Properly deal with changes in metric sets (using execbuf parameter)
        Record marker at query end
    v3: Fill out PerfCntr1&2
    v4: Introduce vkUninitializePerformanceApiINTEL
    v5: Use new execbuf extension mechanism
    v6: Fix comments in genX_query.c (Rafael)
        Use PIPE_CONTROL workarounds (Rafael)
        Refactor on the last kernel series update (Lionel)
    v7: Only I915_PERF_IOCTL_CONFIG when perf stream is already opened (Lionel)
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Rafael Antognolli <[email protected]>
* anv: rework queries writes to ensure ordering memory writes | Lionel Landwerlin | 2019-05-08 | 1 | -17/+84
    We use a mix of MI & PIPE_CONTROL commands to write our queries' data (results & availability). Those commands' memory write order is not guaranteed with regard to their order in the command stream, unless CS stalls are inserted between them.
    This is problematic for 2 reasons:
    1. We copy results from the device using MI commands even though the values are generated from PIPE_CONTROL, meaning we could copy unlanded values into the results and then copy an availability that is inconsistent with the values.
    2. We allow the user to poll on the availability values of the query pool from the CPU. If the availability lands in memory before the values then we could return invalid values.
    This change does 2 things to address this problem:
    - We use either PIPE_CONTROL or MI commands to write both the queries' values and their availability, so that the ordering of the memory writes guarantees that if availability is visible, results are also visible.
    - For the occlusion & timestamp queries we apply a CS stall before copying the results on the device, to ensure that copies done with MI commands see the correct values of previous PIPE_CONTROL writes of availability (required by the Vulkan spec).
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reported-by: Iago Toral Quiroga <[email protected]>
    Cc: [email protected]
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: fix argument name for vkCmdEndQuery | Lionel Landwerlin | 2019-04-24 | 1 | -2/+2
    Doesn't fix anything but it's not the right function prototype.
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Fixes: 673f33c77dd765 ("anv: Implement CmdBegin/EndQueryIndexed")
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Sagar Ghuge <[email protected]>
* anv: Move mi_memcpy and mi_memset to gen_mi_builder | Jason Ekstrand | 2019-04-11 | 1 | -5/+4
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Use gen_mi_builder for queries | Jason Ekstrand | 2019-04-11 | 1 | -214/+58
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Implement VK_EXT_host_query_reset | Jason Ekstrand | 2019-03-18 | 1 | -0/+14
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Implement transform feedback queries | Jason Ekstrand | 2019-01-22 | 1 | -1/+71
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Implement CmdBegin/EndQueryIndexed | Jason Ekstrand | 2019-01-22 | 1 | -1/+20
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: narrow flushing of the render target to buffer writes | Lionel Landwerlin | 2019-01-19 | 1 | -1/+1
    In commit 9a7b3199037ac4 ("anv/query: flush render target before copying results") we tracked all the render target writes to apply a flush in vkCmdCopyQueryPoolResults(). But we can narrow this down to only when we write a buffer (which is the only input of vkCmdCopyQueryPoolResults()).
    v2: Drop newer render target write flags introduced by 1952fd8d2ce905 ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+")
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]> (v1)
* anv/query: flush render target before copying results | Lionel Landwerlin | 2018-12-05 | 1 | -0/+9
    This change tracks render target writes in the pipeline and applies a render target flush before copying the query results to make sure the preceding operations have landed in memory before the command streamer initiates the copy.
    v2: Simplify logic in CopyQueryResults (Jason)
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108909
    Fixes: 37f9788e9a8e44 ("anv: flush pipeline before query result copies")
    Cc: [email protected]
* anv: flush pipeline before query result copies | Lionel Landwerlin | 2018-11-29 | 1 | -5/+4
    Pipeline state pending bits should be taken into account when copying results.
    In the particular bug below, the results of the vkCmdCopyQueryPoolResults() command were being overwritten by the preceding vkCmdCopyBuffer() with the same destination buffer. This is because we copy the buffers using the 3D pipeline whereas we copy the query results using the command streamer. Those pieces of HW work in parallel and the results are somewhat undefined.
    v2: Unconditionally flush the pipeline before copying the results (Jason)
    v3: Wrap & expressions (Jason)
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Suggested-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108894
    Cc: [email protected]
* anv: Return VK_ERROR_DEVICE_LOST from anv_device_set_lost | Jason Ekstrand | 2018-10-26 | 1 | -3/+1
    This lets us get rid of a bunch of duplicated error messages.
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* anv: Add helpers for setting/checking device lost | Jason Ekstrand | 2018-10-26 | 1 | -2/+2
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* anv/query: Add an emit_srm helper | Jason Ekstrand | 2018-09-17 | 1 | -32/+21
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Add a mi_memset and use it for zeroing queries | Jason Ekstrand | 2018-09-17 | 1 | -12/+2
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/query: Use anv_address everywhere | Jason Ekstrand | 2018-09-17 | 1 | -57/+64
    Instead of passing around BOs and offsets, use addresses which are anv's GPU equivalent of pointers.
    Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/query: Write both dwords in emit_zero_queries | Jason Ekstrand | 2018-09-17 | 1 | -0/+5
    Each query slot is a uint64_t and we were only zeroing half of it.
    Fixes: 7ec6e4e68980 "anv/query: implement multiview interactions"
    Reviewed-by: Lionel Landwerlin <[email protected]>
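The bug class named here can be shown with a small CPU-side sketch, assuming a 64-bit slot that is cleared with 32-bit stores. `zero_one_dword` and `zero_slot` are hypothetical helpers for illustration; the driver emits GPU store commands rather than calling `memcpy`.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Buggy variant: a single 32-bit store only clears half of the slot. */
static void
zero_one_dword(uint64_t *slot)
{
   const uint32_t zero = 0;

   assert(slot != NULL);
   memcpy(slot, &zero, sizeof(zero));
}

/* Fixed variant: two 32-bit stores cover the whole uint64_t slot. */
static void
zero_slot(uint64_t *slot)
{
   const uint32_t zero = 0;

   assert(slot != NULL);
   memcpy((char *)slot, &zero, sizeof(zero));
   memcpy((char *)slot + sizeof(zero), &zero, sizeof(zero));
}
```

Starting from a slot holding `UINT64_MAX`, the first helper leaves 32 stale bits set, so a later read of the slot as a 64-bit value is non-zero.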
* anv/query: Increment an index while writing results | Jason Ekstrand | 2018-09-17 | 1 | -36/+31
    Instead of computing an index at the end which we hope maps to the number of things written, just count the number of things as we go.
    Reviewed-by: Lionel Landwerlin <[email protected]>
* Replace uses of _mesa_bitcount with util_bitcount | Dylan Baker | 2018-09-07 | 1 | -7/+7
    and _mesa_bitcount_64 with util_bitcount_64. This fixes a build problem in nir for platforms that don't have popcount or popcountll, such as 32bit msvc.
    v2: - Fix additional uses of _mesa_bitcount added after this was originally written
    Acked-by: Eric Engestrom <[email protected]> (v1)
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* anv: Soft-pin everything else | Scott D Phillips | 2018-06-01 | 1 | -0/+6
    v2 (Jason Ekstrand):
     - Break up Scott's mega-patch
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Scott D Phillips <[email protected]>
* anv: Use an anv_address in anv_buffer | Jason Ekstrand | 2018-05-31 | 1 | -8/+3
    Reviewed-by: Scott D Phillips <[email protected]>
* anv/cmd_buffer: Get rid of the meta query workaround | Jason Ekstrand | 2018-01-23 | 1 | -14/+0
    Meta has been gone for a long time.
    Tested-by: Józef Kucia <[email protected]>
    Reviewed-by: Topi Pohjolainen <[email protected]>
    Cc: "18.0" <[email protected]>
* anv/query: implement multiview interactions | Iago Toral Quiroga | 2018-01-18 | 1 | -0/+54
    From the Vulkan spec with KHX extensions:
    "If queries are used while executing a render pass instance that has multiview enabled, the query uses N consecutive query indices in the query pool (starting at query) where N is the number of bits set in the view mask in the subpass the query is used in. How the numerical results of the query are distributed among the queries is implementation-dependent. For example, some implementations may write each view's results to a distinct query, while other implementations may write the total result to the first query and write zero to the other queries. However, the sum of the results in all the queries must accurately reflect the total result of the query summed over all views. Applications can sum the results from all the queries to compute the total result."
    In our case we only really emit a single query (in the first query index) that stores the aggregated result for all views, but we still need to manage availability for all the other query indices involved, even if we don't actually use them.
    This is relevant when clients call vkGetQueryPoolResults and pass all N queries to retrieve the results. In that scenario, without this patch, we will never see queries other than the first being available since we never emit them.
    v2: we need the same treatment for timestamp queries.
    v3 (Jason):
     - Better an if instead of an early return.
     - We can't write to this memory in the CPU, we should use MI_STORE_DATA_IMM and emit_query_availability (Jason).
    v4 (Jason):
     - No need to take the value to write as parameter, just hard code it to 0.
    Fixes test failures in some work-in-progress CTS multiview+query tests.
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: wire up vk_errorf macro to do debug reporting | Tapani Pälli | 2017-09-12 | 1 | -1/+2
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Remove 'inline' keywords | Matt Turner | 2017-08-29 | 1 | -1/+1
    Unless you have data, the compiler knows better than you whether a function should be inlined.
    No difference in the resulting binary with gcc-6.3.0 or clang-4.0.
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* anv: Stop setting BO flags in bo_init_new | Jason Ekstrand | 2017-05-23 | 1 | -0/+7
    The idea behind doing this was to make it easier to set various flags. However, we have enough custom flag settings floating around the driver that this is more of a nuisance than a help. This commit has the following functional changes:
    1) The workaround_bo created in anv_CreateDevice loses both flags. This shouldn't matter because it's very small and entirely internal to the driver.
    2) The bo created in anv_CreateDmaBufImageINTEL loses the EXEC_OBJECT_ASYNC flag. In retrospect, it never should have gotten EXEC_OBJECT_ASYNC in the first place.
    Reviewed-by: Nanley Chery <[email protected]>
    Cc: "17.1" <[email protected]>
* anv/query: handle more cases of 'out of host memory' | Iago Toral Quiroga | 2017-05-05 | 1 | -0/+10
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/query: Use genxml for MI_MATH | Jason Ekstrand | 2017-04-20 | 1 | -43/+28
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv/query: Use snooping on !LLC platforms | Jason Ekstrand | 2017-04-07 | 1 | -13/+11
    Commit b2c97bc789198427043cd902bc76e194e7e81c7d which made us start using a busy-wait for individual query results also messed up cache flushing on !LLC platforms. For one thing, I forgot the mfence after the clflush so memory access wasn't properly getting fenced. More importantly, however, was that we were clflushing the whole query range and then waiting for individual queries and then trying to read the results without clflushing again. Getting the clflushing both correct and efficient is very subtle and painful. Instead, let's side-step the problem by just snooping.
    Reviewed-by: Chris Wilson <[email protected]>
* anv/query: Busy-wait for available query entries | Jason Ekstrand | 2017-04-05 | 1 | -6/+56
    Before, we were just looking at whether or not the user wanted us to wait and waiting on the BO. Some clients, such as the Serious engine, use a single query pool for hundreds of individual query results where the writes for those queries may be split across several command buffers. In this scenario, the individual query we're looking for may become available long before the BO is idle so waiting on the query pool BO to be finished is wasteful. This commit makes us instead busy-loop on each query until it's available.
    This significantly reduces pipeline bubbles and improves performance of The Talos Principle on medium settings (where the GPU isn't overloaded with drawing) by around 20% on my SkyLake gt4.
    Reviewed-by: Chris Wilson <[email protected]>
    Tested-by: Eero Tamminen <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
* anv: Query the kernel for reset status | Jason Ekstrand | 2017-04-04 | 1 | -8/+3
    When a client causes a GPU hang (or experiences issues due to a hang in another client) we want to let it know as soon as possible. In particular, if it submits work with a fence and calls vkWaitForFences or vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be able to trust the results of that rendering. In order to provide this guarantee, we have to ask the kernel for context status in a few key locations.
    Reviewed-by: Kenneth Graunke <[email protected]>
* anv/query: handle out of host memory without crashing in compute_query_result() | Iago Toral Quiroga | 2017-03-24 | 1 | -0/+5
    We don't need to make the caller (CmdCopyQueryPoolResults) aware of the problem since compute_query_result() only emits state. The caller is also expected to hit OOM in this scenario right after calling this function, but it is already handling it safely.
    Fixes: dEQP-VK.api.out_of_host_memory.cmd_copy_query_pool_results
    Reviewed-by: Topi Pohjolainen <[email protected]>
* anv: return VK_ERROR_DEVICE_LOST immediately when device is known to be lost | Iago Toral Quiroga | 2017-03-24 | 1 | -0/+3
    If we know the device has been lost we should return this error code for any command that can report it before we attempt to do anything with the device.
    Reviewed-by: Jason Ekstrand <[email protected]>
* genxml: Make MI_STORE_DATA_IMM have a single 64-bit data field | Jason Ekstrand | 2017-03-17 | 1 | -2/+1
    This is way more convenient than having two separate dword fields.
    Reviewed-By: Lionel Landwerlin <[email protected]>
* anv: Implement pipeline statistics queries | Ilia Mirkin | 2017-03-17 | 1 | -10/+222
    In the end, pipeline statistics queries look a lot like occlusion queries only with between 1 and 11 begin/end pairs being generated instead of just the one.
    Reviewed-By: Lionel Landwerlin <[email protected]>
* anv/query: Rework store_query_result | Jason Ekstrand | 2017-03-17 | 1 | -15/+24
    The new version is a nice GPU parallel to cpu_write_query_result and it nicely handles things like dealing with 32 vs. 64-bit offsets in the destination buffer.
    Reviewed-By: Lionel Landwerlin <[email protected]>
* anv/query: Break GPU query calculation into a helper | Jason Ekstrand | 2017-03-17 | 1 | -12/+18
    Reviewed-By: Lionel Landwerlin <[email protected]>
* anv/query: Add a helper for writing a query pool result | Jason Ekstrand | 2017-03-17 | 1 | -16/+17
    Reviewed-By: Lionel Landwerlin <[email protected]>
* anv/query: Use a variable-length slot size | Jason Ekstrand | 2017-03-17 | 1 | -22/+30
    Not all queries are the same. Even the two queries we support today require a different amount of data per slot. Once we introduce pipeline statistics queries, the size will vary wildly.
    Reviewed-By: Lionel Landwerlin <[email protected]>
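A sketch of per-type slot sizing, assuming an occlusion slot holds availability plus begin and end counters while a timestamp slot holds availability plus one value. These layouts and the names `sketch_query_type`, `slot_size`, and `slot_offset` are illustrative assumptions, not taken from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_query_type {
   SKETCH_QUERY_OCCLUSION,
   SKETCH_QUERY_TIMESTAMP,
};

/* Bytes per slot, depending on how many 64-bit words the type needs. */
static size_t
slot_size(enum sketch_query_type type)
{
   switch (type) {
   case SKETCH_QUERY_OCCLUSION:
      return 3 * sizeof(uint64_t); /* availability, begin, end */
   case SKETCH_QUERY_TIMESTAMP:
      return 2 * sizeof(uint64_t); /* availability, timestamp */
   }

   assert(!"unknown query type");
   return 0;
}

/* Byte offset of a query within a pool whose slots all share one type. */
static size_t
slot_offset(enum sketch_query_type type, uint32_t query)
{
   return (size_t)query * slot_size(type);
}
```

Storing the stride once per pool means every lookup is a multiply by that stride, whatever the query type.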