path: root/src/intel/vulkan/anv_blorp.c
Commit message (Author, Age, Files, Lines)
* anv: Allow HiZ in TRANSFER_SRC_OPTIMAL on Gen8-9 (Jason Ekstrand, 2020-01-04, 1 file, -10/+17)
  Reviewed-by: Kenneth Graunke <[email protected]>
* anv: Use mocs settings from isl_dev. (Rafael Antognolli, 2019-11-12, 1 file, -1/+1)
  v2: Remove device->default_mocs and external_mocs (Jason).

  Reviewed-by: Jordan Justen <[email protected]>
  Acked-by: Lionel Landwerlin <[email protected]>
* anv: Allocate misc BOs from the cache (Jason Ekstrand, 2019-10-31, 1 file, -1/+1)
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/blorp: Use BLORP_BATCH_NO_UPDATE_CLEAR_COLOR (Nanley Chery, 2019-10-28, 1 file, -22/+10)
  Avoid failing the `info->use_clear_address` assertion in ISL on Gen12+.

  Fixes: 6c9f9a82d78 ("intel/genxml,isl: Add gen12 render surface state changes")
  Reported-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Use wide formats for nicely aligned stencil clears (Jason Ekstrand, 2019-09-06, 1 file, -0/+14)
  In the case where the stencil clear is nicely aligned, we can clear stencil much more efficiently by mapping it as a wide format (say RGBA32_UINT) and blasting out the stencil clear value with a repclear. On Unigine Heaven, this makes one stencil clear go from non-trivial to unnoticeable when looking at per-draw timings.

  In order for this change to work properly, ANV needs to do a bit more flushing around depth and stencil clears. i965 and iris already have the cache tracking logic to handle this so no changes are required there.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
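  The idea behind the wide-format stencil clear can be sketched as follows. This is only an illustration under simplifying assumptions: it treats the stencil data as a linear byte range, whereas the real blorp code has to account for the tiled stencil layout, and the names `replicate_stencil`, `range_is_wide_aligned`, and `WIDE_TEXEL_BYTES` are hypothetical and do not come from Mesa.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constant: one RGBA32_UINT texel covers 16 bytes,
 * i.e. 16 one-byte stencil values. */
#define WIDE_TEXEL_BYTES 16u

/* Replicate an 8-bit stencil clear value into all four bytes of one
 * 32-bit channel, so a wide texel written with this value in every
 * channel sets 16 stencil bytes at once. */
static uint32_t replicate_stencil(uint8_t value)
{
    return (uint32_t)value * 0x01010101u;
}

/* Simplified alignment test (an assumption, not blorp's real rule):
 * the byte range [start_B, end_B) qualifies for the wide path only if
 * it is non-empty and both ends fall on a wide-texel boundary. */
static bool range_is_wide_aligned(uint32_t start_B, uint32_t end_B)
{
    return end_B > start_B &&
           start_B % WIDE_TEXEL_BYTES == 0 &&
           end_B % WIDE_TEXEL_BYTES == 0;
}
```

  A clear that fails the alignment test would simply take the ordinary per-pixel stencil clear path.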
* anv: Build for gen12 (Jordan Justen, 2019-08-28, 1 file, -0/+3)
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Record shader compile stats in the pipeline cache (Jason Ekstrand, 2019-08-12, 1 file, -1/+1)
  We're going to want these to be available regardless of caching.

  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: fix crash in vkCmdClearAttachments with unused attachment (Lionel Landwerlin, 2019-07-15, 1 file, -3/+3)
  anv_render_pass_compile() turns an unused attachment into a NULL depth_stencil_attachment pointer so check that pointer before accessing it.

  Found with updates to existing CTS tests.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 208be8eafa30be ("anv: Make subpass::depth_stencil_attachment a pointer")
  Reviewed-by: Eric Engestrom <[email protected]>
  Reviewed-by: Juan A. Suarez <[email protected]>
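  The kind of guard this fix describes might look like the sketch below. The struct and function names are hypothetical stand-ins, not the actual anv definitions; only the `depth_stencil_attachment` pointer and the notion of an unused attachment come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VK_ATTACHMENT_UNUSED, which Vulkan defines as ~0U. */
#define SKETCH_ATTACHMENT_UNUSED UINT32_MAX

/* Hypothetical, reduced versions of the driver structures. */
struct sketch_subpass_attachment {
    uint32_t attachment;
};

struct sketch_subpass {
    /* NULL when the subpass has no depth/stencil attachment. */
    const struct sketch_subpass_attachment *depth_stencil_attachment;
};

/* Check the pointer before dereferencing it, so a subpass whose
 * depth/stencil attachment was compiled down to NULL is skipped
 * instead of crashing. */
static bool subpass_has_depth_stencil(const struct sketch_subpass *subpass)
{
    return subpass->depth_stencil_attachment != NULL &&
           subpass->depth_stencil_attachment->attachment !=
               SKETCH_ATTACHMENT_UNUSED;
}
```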
* anv: Flush caches in anv_image_copy_to_shadow (Jason Ekstrand, 2019-06-19, 1 file, -0/+13)
  Copies to a shadow image happen during a vkCmdPipelineBarrier or at subpass transitions. We could potentially be a bit more conservative but these transitions shouldn't happen often and it's better to have our bases covered.

  Fixes: f3ea0cf828 "anv: Add stencil texturing support for gen7"
* anv/blorp: Update shadow images when clearing or uploading (Jason Ekstrand, 2019-06-17, 1 file, -11/+104)
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/blorp: Take an aspect in anv_image_copy_to_shadow (Jason Ekstrand, 2019-06-17, 1 file, -3/+2)
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/blorp: Delete a pointless assert (Jason Ekstrand, 2019-02-14, 1 file, -5/+0)
  Just a little higher up in the function we assert that the aspect masks are actually equal so there's no reason for the weaker check. Also, the temporary variables were causing compiler warnings in release builds.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: assert that color attachment are valid (Lionel Landwerlin, 2019-02-08, 1 file, -4/+1)
  This reverts commit d76e7779884775bcebf235adb0e8367816b9b95d.

  Let's make it obvious that there is an application issue if it tries to access an attachment that doesn't exist in the current pass.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: d76e7779884775 ("anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment")
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Handle VK_ATTACHMENT_UNUSED in colorAttachment (Danylo Piliaiev, 2019-02-04, 1 file, -0/+4)
  From the Vulkan 1.0.98 spec for vkCmdClearAttachments:

    "If the aspectMask member of any element of pAttachments contains VK_IMAGE_ASPECT_COLOR_BIT, then the colorAttachment member of that element must either refer to a color attachment which is VK_ATTACHMENT_UNUSED, or must be a valid color attachment."

  Signed-off-by: Danylo Piliaiev <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Add pipeline cache support for xfb_info (Jason Ekstrand, 2019-01-22, 1 file, -1/+2)
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: narrow flushing of the render target to buffer writes (Lionel Landwerlin, 2019-01-19, 1 file, -0/+8)
  In commit 9a7b3199037ac4 ("anv/query: flush render target before copying results") we tracked all the render target writes to apply a flush in vkCmdCopyQueryPoolResults(). But we can narrow this down to only when we write a buffer (which is the only input of vkCmdCopyQueryPoolResults).

  v2: Drop newer render target write flags introduced by 1952fd8d2ce905 ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+")

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]> (v1)
* anv: Implement VK_EXT_conditional_rendering for gen 7.5+ (Danylo Piliaiev, 2019-01-18, 1 file, -2/+6)
  Conditional rendering affects the following functions:
   - vkCmdDraw, vkCmdDrawIndexed, vkCmdDrawIndirect, vkCmdDrawIndexedIndirect
   - vkCmdDrawIndirectCountKHR, vkCmdDrawIndexedIndirectCountKHR
   - vkCmdDispatch, vkCmdDispatchIndirect, vkCmdDispatchBase
   - vkCmdClearAttachments

  The value from the conditional buffer is cached in a designated register; MI_PREDICATE is emitted every time conditional rendering is enabled and the command requires it.

  v2: by Jason Ekstrand
   - Use vk_find_struct_const instead of manually looping
   - Move draw count loading to prepare function
   - Zero the top 32-bits of MI_ALU_REG15

  v3: Apply pipeline flush before accessing conditional buffer (The issue was found by Samuel Iglesias)

  v4:
   - Remove support of Haswell due to possible hardware bug
   - Made TMP_REG_PREDICATE and TMP_REG_DRAW_COUNT defines to define registers in one place.

  v5: thanks to Jason Ekstrand and Lionel Landwerlin
   - Work around the fact that MI_PREDICATE_RESULT is not accessible on Haswell by manually calculating MI_PREDICATE_RESULT and re-emitting MI_PREDICATE when necessary.

  v6: suggested by Lionel Landwerlin
   - Instead of calculating the result of the predicate once, re-emit MI_PREDICATE to make it easier to investigate error states.

  v7: suggested by Jason
   - Make anv_pipe_invalidate_bits_for_access_flag add CS_STALL if VK_ACCESS_CONDITIONAL_RENDERING_READ_BIT is set.

  v8: suggested by Lionel
   - Precompute conditional predicate's result to support secondary command buffers.
   - Make prepare_for_draw_count_predicate more readable.

  Signed-off-by: Danylo Piliaiev <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Remove state flush. (Rafael Antognolli, 2019-01-17, 1 file, -2/+0)
  We have all the state buffers snooped, so we don't need to clflush everything anymore.

  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Update usage of block_pool->bo. (Rafael Antognolli, 2019-01-17, 1 file, -1/+1)
  Change block_pool->bo to be a pointer, and update its usage everywhere. This makes it simpler to switch it later to a list of BOs.

  v3:
   - Use a static "bos" field in the struct, instead of malloc'ing it. This will be later changed to a fixed length array of BOs.

  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Move resolve_subpass to genX_cmd_buffer.c (Jason Ekstrand, 2019-01-14, 1 file, -66/+0)
  We may have to do transitions around certain kinds of resolves so it helps to have it in genX code.
* anv/blorp: Refactor MSAA resolves into an exportable helper function (Jason Ekstrand, 2019-01-14, 1 file, -132/+93)
  This function is modeled after the aux_op functions except that it has a lot more parameters because it deals with two images as well as source and destination regions.
* anv: Rename has_resolve to has_color_resolve (Jason Ekstrand, 2019-01-14, 1 file, -1/+1)
* blorp: Pass the batch to lookup/upload_shader instead of context (Kenneth Graunke, 2019-01-10, 1 file, -2/+4)
  This will allow drivers to pin shader buffers if necessary.

  i965 and anv do not need to do this today, but iris will.

  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: explictly specify format for blorp ccs/mcs op (Lionel Landwerlin, 2019-01-08, 1 file, -4/+6)
  Resolve operations can happen when dealing with views (begin/end subpasses), in which case the view's format needs to apply, not the image's format.

  v2: Relayout arguments of a ccs_op() call (Jason)

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Suggested-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108911
  Cc: [email protected]
* anv: Use separate MOCS settings for external BOs (Jason Ekstrand, 2018-10-03, 1 file, -7/+8)
  On Broadwell and above, we have to use different MOCS settings to allow the kernel to take over and disable caching when needed for external buffers. On Broadwell, this is especially important because the kernel can't disable eLLC so we have to do it in userspace. We very badly don't want to do that on everything so we need separate MOCS for external and internal BOs.

  In order to do this, we add an anv-specific BO flag for "external" and use that to distinguish between buffers which may be shared with other processes and/or display and those which are entirely internal. That, together with an anv_mocs_for_bo helper lets us choose the right MOCS settings for each BO use.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99507
  Cc: [email protected]
  Reviewed-by: Lionel Landwerlin <[email protected]>
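  A helper in the spirit of the anv_mocs_for_bo described above could be sketched like this. The struct layouts, the `is_external` flag name, and the `sketch_` prefix are assumptions made for illustration; the actual anv types and MOCS values are not reproduced here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced BO: only the "external" flag matters here. */
struct sketch_bo {
    bool is_external; /* may be shared with other processes or display */
};

/* Hypothetical device holding one MOCS value per BO class. */
struct sketch_device {
    uint32_t default_mocs;  /* for purely internal BOs */
    uint32_t external_mocs; /* lets the kernel control caching */
};

/* Pick the MOCS setting based on whether the BO is external. */
static uint32_t sketch_mocs_for_bo(const struct sketch_device *device,
                                   const struct sketch_bo *bo)
{
    return bo->is_external ? device->external_mocs : device->default_mocs;
}
```

  Every place that programs a surface or buffer address would call the helper rather than reading a single device-wide MOCS value.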
* intel/isl: Add a unit suffixes to some struct fields and variables (Jason Ekstrand, 2018-09-26, 1 file, -1/+1)
  I was about to make the claim to someone that every field in isl_surf is either an enum or has explicit units. Then I looked at isl_surf and discovered this claim was wrong. We should fix that.

  This commit does a few refactors:
   * Add _B suffixes to some struct fields
   * Add _B to some variables and parameters
   * Rename row_pitch_tiles -> row_pitch_tl

  Reviewed-by: Nanley Chery <[email protected]>
* Replace uses of _mesa_bitcount with util_bitcount (Dylan Baker, 2018-09-07, 1 file, -1/+1)
  and _mesa_bitcount_64 with util_bitcount_64. This fixes a build problem in nir for platforms that don't have popcount or popcountll, such as 32bit msvc.

  v2:
   - Fix additional uses of _mesa_bitcount added after this was originally written

  Acked-by: Eric Engestrom <[email protected]> (v1)
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* anv/blorp: Fix a comment as per Nanley's review feedback (Jason Ekstrand, 2018-09-01, 1 file, -2/+2)
  This accidentally didn't make it into 62378c5e9e5
* anv/blorp: Do more flushing around HiZ clears (Jason Ekstrand, 2018-09-01, 1 file, -11/+33)
  We make the flush after a HiZ clear unconditional and add a flush/stall before the clear as well.

  Cc: [email protected]
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760
  Reviewed-by: Chad Versace <[email protected]>
  Reviewed-by: Nanley Chery <[email protected]>
* anv: blorp: support multiple aspect blits (Lionel Landwerlin, 2018-08-29, 1 file, -70/+75)
  Newer blit tests are enabling depth and stencil blits. We currently don't support them, but we can by iterating over the aspect mask (copying some logic from the CopyImage function).

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 9f44745eca0e41 ("anv: Use blorp to implement VkBlitImage")
  Reviewed-by: Jason Ekstrand <[email protected]>
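  Iterating over the bits of an aspect mask, one blit per aspect, can be sketched as below. The enum values mirror the Vulkan aspect bits (color 0x1, depth 0x2, stencil 0x4), but every name here is a hypothetical stand-in and the callback replaces the real per-aspect blorp call.

```c
#include <stdint.h>

/* Local stand-ins mirroring VK_IMAGE_ASPECT_{COLOR,DEPTH,STENCIL}_BIT. */
enum {
    SKETCH_ASPECT_COLOR   = 0x1,
    SKETCH_ASPECT_DEPTH   = 0x2,
    SKETCH_ASPECT_STENCIL = 0x4,
};

/* Hypothetical per-aspect callback standing in for one blit. */
typedef void (*sketch_blit_fn)(uint32_t aspect_bit, void *data);

/* Call `blit` once for each bit set in aspect_mask, lowest bit first,
 * and return how many aspects were processed. */
static unsigned sketch_blit_each_aspect(uint32_t aspect_mask,
                                        sketch_blit_fn blit, void *data)
{
    unsigned count = 0;
    while (aspect_mask != 0) {
        /* Isolate the lowest set bit of the remaining mask. */
        uint32_t bit = aspect_mask & (~aspect_mask + 1u);
        blit(bit, data);
        aspect_mask &= ~bit;
        count++;
    }
    return count;
}

/* Example callback: records which aspects were visited. */
static void sketch_record_aspect(uint32_t aspect_bit, void *data)
{
    *(uint32_t *)data |= aspect_bit;
}
```

  With a combined depth and stencil mask, the loop issues two separate blits, which is what lets a single-aspect blit path serve multi-aspect requests.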
* intel/blorp: Take an explicit filter parameter in blorp_blit (Jason Ekstrand, 2018-07-18, 1 file, -8/+26)
  This lets us move the glBlitFramebuffer nonsense into the GL driver and make the usage of BLORP much more explicit and obvious as to what it's doing.

  Reviewed-by: Chad Versace <[email protected]>
* anv: Make subpass::depth_stencil_attachment a pointer (Jason Ekstrand, 2018-07-09, 1 file, -1/+1)
  This makes certain checks a bit easier and means that we don't have the attachment information duplicated in the attachment list and in depth_stencil_attachment.

  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Use a default pipeline cache if none is specified (Jason Ekstrand, 2018-07-02, 1 file, -7/+5)
  If a client is dumb enough to not specify a pipeline cache, give it a default. We have to create one anyway for blorp so we may as well let the client cache shaders in it.

  Reviewed-by: Timothy Arceri <[email protected]>
* anv: Add support for shader constant data to the pipeline cache (Jason Ekstrand, 2018-07-02, 1 file, -0/+1)
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* anv: Use an address for each anv_image plane (Jason Ekstrand, 2018-05-31, 1 file, -6/+6)
  This is better than having BO and offset fields.

  Reviewed-by: Scott D Phillips <[email protected]>
* anv: Use an anv_address in anv_buffer (Jason Ekstrand, 2018-05-31, 1 file, -8/+8)
  Reviewed-by: Scott D Phillips <[email protected]>
* anv: Allow blitting to/from any supported format (Jason Ekstrand, 2018-05-09, 1 file, -2/+1)
  Now that blorp handles all the cases, why not? The only real change we have to make is to stop using anv_swizzle_for_render() in blorp_blit because it doesn't work for B4G4R4A4 and blorp now natively handles that.

  Reviewed-by: Topi Pohjolainen <[email protected]>
* anv: Make blorp update the clear color. (Rafael Antognolli, 2018-04-05, 1 file, -19/+50)
  Instead of updating the clear color in anv before a resolve, just let blorp handle that for us during fast clears.

  v5: Update comment about HiZ clear color (Jordan).

  Signed-off-by: Rafael Antognolli <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
* anv/blorp: Pass the clear address to blorp for subpass MSAA resolves (Jason Ekstrand, 2018-03-01, 1 file, -0/+6)
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/blorp: Add partial clear support to anv_image_mcs_op (Jason Ekstrand, 2018-03-01, 1 file, -1/+14)
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/blorp: multisample resolve all attachment layers (Iago Toral Quiroga, 2018-02-22, 1 file, -11/+20)
  We were only resolving the first.

  v2:
   - Do not require that the number of layers on dst and src are an exact match, it is okay if the dst has more layers so long as it has at least the same that we are going to resolve.
   - Do not always resolve array_len layers, we should resolve only from base_array_layer to array_len.

  v3:
   - v2 was assuming that array_len represented the total number of layers in the image, but it represents the number of layers starting at the base array layer.

  v4:
   - The number of layers to resolve should be taken from the framebuffer (Nanley).

  Fixes new CTS tests for multisampled layered rendering: dEQP-VK.renderpass.multisample_resolve.layers_*

  Reviewed-by: Nanley Chery <[email protected]>
* anv/blorp: Use layout_to_aux_usage when a layout is provided (Jason Ekstrand, 2018-02-20, 1 file, -25/+46)
  Instead of having aux usage and ANV_AUX_USAGE_DEFAULT to mean "give me something reasonable" we now use anv_layout_to_aux_usage whenever a layout is available. If a layout is available, we ignore the aux_usage parameter. For the cases where we have an explicit aux usage such as clears and aux ops, we have a new ANV_IMAGE_LAYOUT_EXPLICIT_AUX layout.

  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* anv/cmd_buffer: Move the rest of clear_subpass into begin_subpass (Jason Ekstrand, 2018-02-20, 1 file, -137/+103)
  Reviewed-by: Nanley Chery <[email protected]>
* anv/cmd_buffer: Move the color portion of clear_subpass into begin_subpass (Jason Ekstrand, 2018-02-20, 1 file, -91/+33)
  This doesn't really change much now but it will give us more/better control over clears in the future. The one interesting functional change here is that we are now re-emitting 3DSTATE_DEPTH_BUFFERS and friends for each clear. However, this only happens at begin_subpass time so it shouldn't be substantially more expensive.

  Reviewed-by: Nanley Chery <[email protected]>
* anv/icl: Use gen11 functions (Anuj Phogat, 2018-02-16, 1 file, -0/+3)
  Signed-off-by: Anuj Phogat <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp: Use isl_aux_op instead of blorp_hiz_op (Jason Ekstrand, 2018-02-08, 1 file, -14/+1)
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Use isl_aux_op instead of blorp_fast_clear_op (Jason Ekstrand, 2018-02-08, 1 file, -13/+1)
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Nanley Chery <[email protected]>
* anv: Allow fast-clearing the first slice of a multi-slice image (Jason Ekstrand, 2018-02-08, 1 file, -8/+15)
  Now that we're tracking aux properly per-slice, we can enable this for applications which actually care.

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Nanley Chery <[email protected]>
* anv/cmd_buffer: Rework aux tracking (Jason Ekstrand, 2018-02-08, 1 file, -2/+1)
  This commit completely reworks aux tracking. This includes a number of somewhat distinct changes:

   1) Since we are no longer fast-clearing multiple slices, we only need to track one fast clear color and one fast clear type.

   2) We store two bits for fast clear instead of one to let us distinguish between zero and non-zero fast clear colors. This is needed so that we can do full resolves when transitioning to PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear values in all sorts of places we wouldn't normally.

   3) We now track compression state as a boolean separate from fast clear type and this is tracked on a per-slice granularity.

  The previous scheme had some issues when it came to individual slices of a multi-LOD image. In particular, we only tracked "needs resolve" per-LOD but you could do a vkCmdPipelineBarrier that would only resolve a portion of the image and would set "needs resolve" to false anyway. Also, any transition from an undefined layout would reset the clear color for the entire LOD regardless of whether or not there was some clear color on some other slice.

  As far as full/partial resolves go, the assumptions of the previous scheme held because the one case where we do need a full resolve when CCS_E is enabled is for window-system images. Since we only ever allowed X-tiled window-system images, CCS was entirely disabled on gen9+ and we never got CCS_E. With the advent of Y-tiled window-system buffers, we now need to properly support doing a full resolve of images marked CCS_E.

  v2 (Jason Ekstrand):
   - Fix a bug in the compressed flag offset calculation
   - Treat 3D images as multi-slice for the purposes of resolve tracking

  v3 (Jason Ekstrand):
   - Set the compressed flag whenever we fast-clear
   - Simplify the resolve predicate computation logic

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Nanley Chery <[email protected]>
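  The tracking granularity described above (one fast clear type per image, one compression flag per slice) can be pictured with the sketch below. It is a CPU-side model under stated assumptions: the driver keeps this state in GPU-visible memory, and all names, the fixed array bounds, and the exact resolve bookkeeping here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Three states need two bits: no fast clear, fast clear to the default
 * (zero) value, or fast clear to an arbitrary value. */
enum sketch_fast_clear_type {
    SKETCH_FAST_CLEAR_NONE = 0,
    SKETCH_FAST_CLEAR_DEFAULT_VALUE = 1,
    SKETCH_FAST_CLEAR_ANY = 2,
};

/* Hypothetical fixed bounds to keep the sketch self-contained. */
#define SKETCH_MAX_LEVELS 4
#define SKETCH_MAX_LAYERS 4

struct sketch_aux_state {
    /* Only the first slice is ever fast-cleared, so one value suffices. */
    enum sketch_fast_clear_type fast_clear_type;
    /* Compression is tracked per slice, i.e. per (level, layer). */
    bool compressed[SKETCH_MAX_LEVELS][SKETCH_MAX_LAYERS];
};

/* A fast clear of the first slice records its type and, per the v3
 * note, also marks that slice as compressed. */
static void sketch_mark_fast_clear(struct sketch_aux_state *state,
                                   enum sketch_fast_clear_type type)
{
    state->fast_clear_type = type;
    state->compressed[0][0] = true;
}

/* Assumed bookkeeping after a full resolve of one slice: only that
 * slice's flag is cleared, and the fast clear type is reset only when
 * the resolved slice is the first one. */
static void sketch_mark_slice_resolved(struct sketch_aux_state *state,
                                       uint32_t level, uint32_t layer)
{
    state->compressed[level][layer] = false;
    if (level == 0 && layer == 0)
        state->fast_clear_type = SKETCH_FAST_CLEAR_NONE;
}
```

  Because each slice has its own flag, resolving one layer of one LOD no longer clears the "needs resolve" state of its neighbors.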
* anv: Use blorp_ccs_ambiguate instead of fast-clears (Jason Ekstrand, 2018-02-08, 1 file, -0/+5)
  Even though the blorp pass looks a bit on the sketchy side, the end result in the Vulkan driver is very nice. Instead of having this weird case where you do a fast clear and then maybe have to resolve, we just do the ambiguate and are done with it. The ambiguate does exactly what we want of setting all the CCS values to 0 which puts it into the pass-through state.

  This should also improve performance a bit in certain cases. For instance, if we did a transition from UNDEFINED to GENERAL for a surface that doesn't have CCS enabled all the time, we would end up doing a fast-clear and then a full resolve which ends up touching every byte in the main surface as well as the CCS. With the ambiguate pass, that transition only touches the CCS.

  Reviewed-by: Nanley Chery <[email protected]>