path: root/src/intel/vulkan
Commit message | Author | Age | Files | Lines
* anv: fix alphaToCoverage when there is no color attachment | Samuel Iglesias Gonsálvez | 2019-05-07 | 1 | -10/+33
  There are tests in CTS for alpha to coverage without a color attachment that are failing. This happens because we remove the shader color outputs when we don't have a valid color attachment for them, but when alpha to coverage is enabled we still want to preserve the output at location 0 since we need the alpha component. In that case we will also need to create a null render target for RT 0.

  v2:
  - We already create a null rt when we don't have any, so reuse that for this case (Jason)
  - Simplify the code a bit (Iago)

  v3:
  - Take alpha to coverage from the key and don't tie this to depth-only rendering only, we want the same behavior if we have multiple render targets but the one at location 0 is not used. (Jason).
  - Rewrite commit message (Iago)

  v4:
  - Make sure we take into account the array length of the shader outputs, which we were not handling correctly either, and make sure we also create null render targets for any invalid array entries too.

  v5:
  - Simplify removal of unused outputs by using rt_used[] so we don't have to special case alpha to coverage there too.

  Fixes the following CTS tests: dEQP-VK.pipeline.multisample.alpha_to_coverage_no_color_attachment.*

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Signed-off-by: Iago Toral Quiroga <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv,i965: Stop warning about incomplete gen11 support | Jason Ekstrand | 2019-05-03 | 1 | -4/+2
  Both drivers are feature-complete and should be running more-or-less at perf at this point. Drop the warning.

  Acked-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* anv: fix crash when application does not provide push constants | Lionel Landwerlin | 2019-05-03 | 1 | -1/+1
  Found while running Talos Principle.

  As far as I can tell, running a draw call with a pipeline having push constants without the application having called vkCmdPushConstants gives undefined push constant values.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Cc: [email protected]
* anv: Stop including POS in FS input limits | Jason Ekstrand | 2019-05-02 | 1 | -1/+1
  It is an input but it comes in as part of the shader payload and doesn't count towards the limits.

  Reviewed-by: Kenneth Graunke <[email protected]>
* anv: add support for VK_EXT_memory_budget | Eric Engestrom | 2019-04-30 | 3 | -0/+92
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: enable descriptor indexing capabilities | Juan A. Suarez Romero | 2019-04-30 | 1 | -0/+2
  This enables the remaining capabilities in SPV_EXT_descriptor_indexing.

  Fixes: 6e230d7607f "anv: Implement VK_EXT_descriptor_indexing"
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* delete autotools .gitignore files | Eric Engestrom | 2019-04-29 | 2 | -13/+0
  One special case, `src/util/xmlpool/.gitignore` is not entirely deleted, as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`).

  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Dylan Baker <[email protected]>
* Revert "anv: limit URB reconfigurations when using blorp" | Lionel Landwerlin | 2019-04-29 | 3 | -11/+3
  In commit 0d46e404 ("anv: limit URB reconfigurations when using blorp") we tried to limit the number of URB reconfigurations by checking if the last allocation is large enough to fit the blorp dispatch.

  We used the last bound pipeline to compare the allocation. The problem with this is that the pipeline is bound but its commands might not have been emitted into the command buffer yet.

  Let's just revert commit 0d46e404677264bfb12ada15290e39c10a5eb455 since it didn't seem to yield any performance improvement.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 0d46e404 ("anv: limit URB reconfigurations when using blorp")
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110535
  Acked-by: Jason Ekstrand <[email protected]>
* anv: expose VK_EXT_queue_family_foreign on Android | Tapani Pälli | 2019-04-29 | 1 | -0/+1
  VK_ANDROID_external_memory_android_hardware_buffer requires this extension. It is safe to enable it since currently aux usage is disabled for ahw buffers.

  Fixes the following dEQP extension dependency test on Android: dEQP-VK.api.info.device#extensions

  Cc: <[email protected]>
  Signed-off-by: Tapani Pälli <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/descriptor_set: Don't fully destroy sets in pool destroy/reset | Jason Ekstrand | 2019-04-26 | 1 | -2/+3
  In 105002bd2d617, we fixed a memory leak bug where we weren't properly destroying descriptor sets when destroying/resetting a descriptor pool. However, the only real leak that happened was that we take a reference to the descriptor set layout in the descriptor set and we weren't dropping our reference. Everything else in the descriptor set is tied to the pool itself and doesn't need to be freed on a per-set basis. This commit changes the destroy/reset functions to only bother walking the list of sets to unref the layouts and otherwise we just assume that the whole-pool destroy/reset takes care of the rest.

  Now that we're doing more non-trivial things with descriptor sets such as allocating things with util_vma_heap, per-set destruction is starting to show up on perf traces. This takes reset back to where it's supposed to be as a cheap whole-pool operation.

  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Better handle 32-byte alignment of descriptor set buffers | Jason Ekstrand | 2019-04-26 | 1 | -3/+3
  In c520f4dec9c, we chose to align the sizes of descriptor set buffers to 32 bytes. We have to align the descriptor set buffer to 32B so that it's valid for using with push constants. We align the size as well so we don't leave lots of holes with util_vma_heap_alloc. Unfortunately, we were only aligning it for alloc and not for free so we were still creating piles of holes when we delete descriptor sets. This causes terrible perf for the allocator once we've deleted piles of descriptor sets.

  This commit reworks the code so that we align the descriptor set buffer size to 32B for both alloc and free. The result is that it takes the new crucible vkResetDescriptorPool from 104.567719 to 2.898354 seconds.

  Fixes: c520f4dec9c "anv: Add a concept of a descriptor buffer"
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110497
  Reviewed-by: Lionel Landwerlin <[email protected]>
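The alloc/free symmetry described in this message can be sketched in a few lines of C. This is only an illustration under stated assumptions: `toy_heap`, `desc_set_buffer_size`, `set_alloc` and `set_free` are hypothetical names, and the toy heap merely counts bytes instead of managing address ranges the way util_vma_heap does.

```c
#include <assert.h>
#include <stdint.h>

/* Alignment named in the commit message; the macro name is hypothetical. */
#define DESC_BUFFER_ALIGN 32u

/* Round v up to a multiple of a; a must be a power of two. */
static inline uint32_t align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Toy stand-in for a heap: it only tracks how many bytes are handed out. */
struct toy_heap {
   uint32_t bytes_in_use;
};

static void toy_heap_alloc(struct toy_heap *heap, uint32_t size)
{
   heap->bytes_in_use += size;
}

static void toy_heap_free(struct toy_heap *heap, uint32_t size)
{
   assert(size <= heap->bytes_in_use);
   heap->bytes_in_use -= size;
}

/* Single source of truth for the buffer size, used by both paths. */
static uint32_t desc_set_buffer_size(uint32_t layout_size)
{
   return align_u32(layout_size, DESC_BUFFER_ALIGN);
}

static void set_alloc(struct toy_heap *heap, uint32_t layout_size)
{
   toy_heap_alloc(heap, desc_set_buffer_size(layout_size));
}

static void set_free(struct toy_heap *heap, uint32_t layout_size)
{
   /* Freeing the unaligned size here would strand the padding bytes. */
   toy_heap_free(heap, desc_set_buffer_size(layout_size));
}
```

Routing both paths through one size helper is the design point: a 40-byte layout occupies 64 bytes on alloc and returns the same 64 bytes on free, so no padding is left behind as a hole.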
* anv/descriptor_set: Properly align descriptor buffer to a page | Jason Ekstrand | 2019-04-24 | 1 | -1/+1
  Instead of aligning and then taking inline uniforms into account, we need to take inline uniforms into account and then align to a page. Otherwise, we may not be aligned to a page and allocation may fail.

  Fixes: 43f40dc7cb2 "anv: Implement VK_EXT_inline_uniform_block"
  Reviewed-by: Lionel Landwerlin <[email protected]>
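The ordering issue is easy to see with numbers. The sketch below assumes a 4096-byte page and uses hypothetical function names; it is not the driver's code, only the arithmetic the message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch. */
#define PAGE_SIZE_BYTES 4096u

/* Round v up to a multiple of a; a must be a power of two. */
static inline uint32_t align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Order the message calls wrong: align first, then add inline uniforms.
 * The result is page-aligned only if inline_uniform_bytes happens to be. */
static uint32_t pool_size_align_then_add(uint32_t descriptor_bytes,
                                         uint32_t inline_uniform_bytes)
{
   return align_u32(descriptor_bytes, PAGE_SIZE_BYTES) + inline_uniform_bytes;
}

/* Order the message asks for: add inline uniforms, then align the total. */
static uint32_t pool_size_add_then_align(uint32_t descriptor_bytes,
                                         uint32_t inline_uniform_bytes)
{
   return align_u32(descriptor_bytes + inline_uniform_bytes, PAGE_SIZE_BYTES);
}
```

With 5000 descriptor bytes and 256 inline uniform bytes, the first function yields 8448, which is not a multiple of the page size, while the second yields 8192.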
* anv/descriptor_set: Only vma_heap_finish if we have a descriptor buffer | Jason Ekstrand | 2019-04-24 | 1 | -2/+1
  Fixes: 7bb34ecff98 "anv: release memory allocated by bo_heap when..."
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/descriptor_set: Destroy sets before pool finalization | Jason Ekstrand | 2019-04-24 | 1 | -5/+5
  Fixes: 105002bd2d "anv: destroy descriptor sets when pool gets..."
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/descriptor_set: Unlink sets from the pool in set_destroy | Jason Ekstrand | 2019-04-24 | 1 | -4/+4
  anv_descriptor_pool_free_set is called on the clean-up path of anv_descriptor_set_create and the set may not have been added to the pool's list of sets yet. While we're here, we move adding it to that list into set_create for symmetry.

  Fixes: 105002bd2d "anv: destroy descriptor sets when pool gets..."
  Reviewed-by: Lionel Landwerlin <[email protected]>
* vulkan/wsi: Add X11 adaptive sync support based on dri options. | Bas Nieuwenhuizen | 2019-04-23 | 1 | -1/+2
  The dri options are optional. When the dri options are not provided the WSI will not use adaptive sync.

  FWIW I think for xf86-video-amdgpu this still requires an X11 config option, so only people who opt in can get possible regressions from this.

  So then the remaining question is: why do this in the WSI? It has been suggested in another MR that the application sets this. However, I disagree with that as I don't think we'll ever get a reasonable set of applications setting it.

  The next question is whether this can be a layer. It definitely can be as implemented now. However, I think this generally fits well with the function of the WSI. Furthermore, for e.g. the DISPLAY WSI this is much harder to do in a layer. Of course, most of the WSI could almost be a layer, but I think this still fits best in the WSI.

  Acked-by: Jason Ekstrand <[email protected]>
* anv: fix argument name for vkCmdEndQuery | Lionel Landwerlin | 2019-04-24 | 1 | -2/+2
  Doesn't fix anything but it's not the right function prototype.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 673f33c77dd765 ("anv: Implement CmdBegin/EndQueryIndexed")
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Sagar Ghuge <[email protected]>
* anv: Rework the descriptor set layout create loop | Jason Ekstrand | 2019-04-19 | 1 | -14/+13
  Previously, we were storing the per-binding create info pointer in the immutable_samplers field temporarily so that we can switch the order in which we walk the loop. However, now that we have multiple arrays of structs to walk, it makes more sense to store an index of some sort. Because we want to leave immutable_samplers as NULL for undefined bindings, we store index + 1 and then subtract one later.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
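The "index + 1" trick reserves zero as the "undefined" marker so that a zero-initialized slot needs no separate flag. A minimal sketch, assuming hypothetical structures (`binding_slot`, `binding_info`) rather than the driver's real layout types:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limit for this sketch. */
#define MAX_BINDINGS 8u

/* Hypothetical per-binding slot: 0 means "no create info for this binding",
 * any other value is the create-info array index plus one. */
struct binding_slot {
   uint32_t info_index_plus_one;
};

/* Hypothetical stand-in for a per-binding create info. */
struct binding_info {
   uint32_t binding;
   uint32_t descriptor_count;
};

/* First pass: walk the create infos and remember where each binding's
 * info lives. */
static void record_infos(struct binding_slot slots[MAX_BINDINGS],
                         const struct binding_info *infos,
                         uint32_t info_count)
{
   for (uint32_t b = 0; b < MAX_BINDINGS; b++)
      slots[b].info_index_plus_one = 0;

   for (uint32_t i = 0; i < info_count; i++) {
      assert(infos[i].binding < MAX_BINDINGS);
      slots[infos[i].binding].info_index_plus_one = i + 1;
   }
}

/* Second pass: walk by binding number and subtract one to recover the
 * index. Returns NULL for undefined bindings. */
static const struct binding_info *
lookup_info(const struct binding_slot slots[MAX_BINDINGS],
            const struct binding_info *infos, uint32_t binding)
{
   assert(binding < MAX_BINDINGS);
   uint32_t stored = slots[binding].info_index_plus_one;
   return stored == 0 ? NULL : &infos[stored - 1];
}
```

Storing an index instead of a pointer means the same slot can address several parallel arrays, which is the motivation the message gives.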
* anv: Ignore descriptor binding flags if bindingCount == 0 | Jason Ekstrand | 2019-04-19 | 1 | -3/+2
  I missed this on the first go round. The bindingCount field of VkDescriptorSetLayoutBindingFlagsCreateInfoEXT is allowed to be zero which means the flags array is ignored.

  Fixes: d6c9bd6e01b4d "anv: Put binding flags in descriptor set layouts"
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv/nir: Add a central helper for figuring out SSBO address formats | Jason Ekstrand | 2019-04-19 | 3 | -57/+98
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Implement VK_EXT_descriptor_indexing | Jason Ekstrand | 2019-04-19 | 5 | -2/+93
  Now that everything is in place to do bindless for all resource types except input attachments and UBOs, VK_EXT_descriptor_indexing is "trivial".

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Put binding flags in descriptor set layouts | Jason Ekstrand | 2019-04-19 | 2 | -0/+19
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Use bindless handles for images | Jason Ekstrand | 2019-04-19 | 4 | -4/+61
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Use bindless textures and samplers | Jason Ekstrand | 2019-04-19 | 6 | -31/+228
  This commit changes anv to put bindless handles and sampler pointers into the descriptor buffer and use those instead of bindful when we run out of binding table space. This "spilling" of descriptors allows us to advertise an almost unbounded number of images and samplers.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Pass the plane into lower_tex_deref | Jason Ekstrand | 2019-04-19 | 1 | -7/+5
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Use write_image_view to initialize immutable samplers | Jason Ekstrand | 2019-04-19 | 1 | -5/+13
  Instead of setting it manually, call the helper. When setting descriptor sets becomes more complicated than just setting some struct values, this will keep immutable sampler handling correct.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Count the number of planes in each descriptor binding | Jason Ekstrand | 2019-04-19 | 2 | -3/+19
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Implement VK_KHR_shader_atomic_int64 | Jason Ekstrand | 2019-04-19 | 4 | -2/+21
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Implement SSBOs bindings with GPU addresses in the descriptor BO | Jason Ekstrand | 2019-04-19 | 6 | -35/+347
  This commit adds a new way for ANV to do SSBO bindings by just passing a GPU address in through the descriptor buffer and using the A64 messages to access the GPU address directly. This means that our variable pointers are now "real" pointers instead of a vec2(BTI, offset) pair. This carries a few advantages:

  1. It lets us support a virtually unbounded number of SSBO bindings.
  2. It lets us implement VK_KHR_shader_atomic_int64 which we couldn't implement before because those atomic messages are only available in the bindless A64 form.
  3. It's way better than messing around with bindless handles for SSBOs which is the only other option for VK_EXT_descriptor_indexing.
  4. It's more future looking, maybe? At the least, this is what NVIDIA does (they don't have binding based SSBOs at all). This doesn't a priori mean it's better, it just means it's probably not terrible.

  The big disadvantage, of course, is that we have to start doing our own bounds checking for robustBufferAccess again and have to push in dynamic offsets.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
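The disadvantage named at the end, doing bounds checks by hand once a buffer is just an address, can be illustrated on the CPU. This is a sketch under stated assumptions, not the shader code the driver generates: `ssbo_descriptor` and `robust_load_u32` are hypothetical, a host pointer stands in for the GPU address, and returning zero for an out-of-bounds read is simply the policy chosen for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical descriptor: a raw address plus the size of the bound range.
 * A host pointer stands in for the 64-bit GPU address. */
struct ssbo_descriptor {
   const uint8_t *address;
   uint32_t range;
};

/* Load a 32-bit value at `offset`, or return 0 if any byte of the access
 * would fall outside [0, range). The comparison is written so that it
 * cannot overflow for large offsets. */
static uint32_t robust_load_u32(const struct ssbo_descriptor *desc,
                                uint32_t offset)
{
   assert(desc != NULL);

   if (offset > desc->range || desc->range - offset < sizeof(uint32_t))
      return 0;

   uint32_t value;
   memcpy(&value, desc->address + offset, sizeof(value));
   return value;
}
```

With a binding-table surface the hardware can perform this check; with a plain address, every access has to carry the range along and test it explicitly, which is the cost the message refers to.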
* anv: Lower some SSBO operations in apply_pipeline_layout | Jason Ekstrand | 2019-04-19 | 1 | -2/+212
  In order to avoid the potential overhead of A64 operations on all SSBO ops, we look for those SSBO ops where we can get to the descriptor set from the SSBO access operation and lower those to a binding-table approach. When robustBufferAccess is enabled, this lets the hardware do the bounds checking for us. It also avoids some potentially expensive 64-bit integer calculations.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Add a has_a64_buffer_access to anv_physical_device | Jason Ekstrand | 2019-04-19 | 4 | -6/+11
  This is more descriptive and a bit nicer than checking for gen >= 8 && use_softpin everywhere.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv/pipeline: Add skeleton support for spilling to bindless | Jason Ekstrand | 2019-04-19 | 4 | -27/+122
  If the number of surfaces or samplers exceeds what we can put in a table, we will want to spill out to bindless. There is no bindless support yet but this gets us the basic framework that will be used by later commits.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv/pipeline: Sort bindings by most used first | Jason Ekstrand | 2019-04-19 | 1 | -40/+95
  This commit just sorts the bindings by how often they're used vs the array size of the binding. This will let us make more nuanced decisions about what goes in the binding table vs. what to make bindless.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
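One way to read "how often they're used vs the array size" is as a uses-per-element ratio, so that a small, heavily used binding ranks ahead of a large, rarely used array. The sketch below sorts on that ratio with `qsort`. The ratio, the tie-break, and the `binding_use` structure are assumptions made for illustration; the message does not spell out the exact scoring the driver uses.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-binding usage record; array_size must be at least 1. */
struct binding_use {
   uint32_t binding;
   uint32_t use_count;
   uint32_t array_size;
};

/* Order by use_count / array_size, highest first. The ratios are compared
 * by cross-multiplication in 64 bits to avoid division and rounding. */
static int compare_binding_use(const void *pa, const void *pb)
{
   const struct binding_use *a = pa;
   const struct binding_use *b = pb;

   uint64_t lhs = (uint64_t)a->use_count * b->array_size;
   uint64_t rhs = (uint64_t)b->use_count * a->array_size;
   if (lhs != rhs)
      return lhs > rhs ? -1 : 1;

   /* Tie-break on the binding number so the order is deterministic. */
   if (a->binding != b->binding)
      return a->binding < b->binding ? -1 : 1;
   return 0;
}

static void sort_bindings(struct binding_use *bindings, size_t count)
{
   qsort(bindings, count, sizeof(*bindings), compare_binding_use);
}
```

After sorting, a later pass could hand out binding table slots from the front of the array and treat whatever does not fit as a candidate for bindless access.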
* anv: Add a #define for the max binding table size | Jason Ekstrand | 2019-04-19 | 3 | -4/+16
  This also fixes a bug where we mis-calculate maximum binding table sizes and may return true in vkGetDescriptorSetLayoutSupport even for sets too large to fit in a binding table.

  Fixes: ddc40691221 "anv: Implement VK_KHR_maintenance3"
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Put image params in the descriptor set buffer on gen8 and earlier | Jason Ekstrand | 2019-04-19 | 6 | -124/+109
  This is really where they belong; not push constants. The one downside here is that we can't push them anymore for compute shaders. However, that's a general problem and we should figure out how to push descriptor sets for compute shaders. This lets us bump MAX_IMAGES to 64 on BDW and earlier platforms because we no longer have to worry about push constant overhead limits.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Make all VkDeviceMemory BOs resident permanently | Jason Ekstrand | 2019-04-19 | 4 | -46/+48
  We spend a lot of time in the driver adding things to hash sets to track residency. The reality is that a properly built Vulkan app uses large memory objects and sub-allocates from them. In a typical frame, most if not all of those allocations are going to be resident for the entire frame so we're really not saving ourselves much by tracking fine-grained residency. Just throwing everything in the validation list does make it a little bit more expensive inside the kernel to walk the list and ensure that all our VA is in order. However, without relocations, the overhead of that is pretty small.

  If we ever do run into a memory pressure situation where the fine-grained residency could even potentially help, we would likely be swapping one page out to make room for another within the draw call and performance is totally lost at that point. We're better off swapping out other apps and just letting ours run a whole frame.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: limit URB reconfigurations when using blorp | Lionel Landwerlin | 2019-04-19 | 3 | -3/+11
  If the last graphics pipeline bound to the command buffer has enough space in its VS URB entries for Blorp then avoid reconfiguring the URB partitions.

  v2: s/0/MESA_SHADER_VERTEX/ (Caio)

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: fix uninitialized pthread cond clock domain | Lionel Landwerlin | 2019-04-18 | 1 | -1/+1
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 843775bab78a6b ("anv: Rework fences")
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Drop some unneeded ANV_FROM_HANDLE for physical devices | Jason Ekstrand | 2019-04-18 | 1 | -6/+0
  Ever since 48ed2a7bb009618ed, we've had one at the top of the function.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Re-sort the GetPhysicalDeviceFeatures2 switch statement | Jason Ekstrand | 2019-04-18 | 1 | -17/+17
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: implement WaEnableStateCacheRedirectToCS | Lionel Landwerlin | 2019-04-18 | 1 | -0/+11
  This 3d performance workaround was initially put in the kernel but the media driver requires different settings so the register has been whitelisted in i915 [1] and userspace drivers are left initializing it as they wish.

  [1] : https://patchwork.freedesktop.org/series/59494/

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* anv/device: expose VK_KHR_shader_float16_int8 in gen8+ | Iago Toral Quiroga | 2019-04-18 | 2 | -0/+10
  v2 (Jason):
  - Merge shaderFloat16 and shaderInt8 enablement into a single patch.
  - Merge extension enable.

  Reviewed-by: Jason Ekstrand <[email protected]> (v1)
* anv/pipeline: support Float16 and Int8 SPIR-V capabilities in gen8+ | Iago Toral Quiroga | 2019-04-18 | 1 | -0/+2
  v2:
  - Merge Float16 and Int8 capabilities into a single patch (Jason)
  - Merged patch that enabled SPIR-V front-end checks for these caps (except for Int8, which was already merged)

  v3:
  - Keep capabilities sorted (Jason)

  v4:
  - SpvCapabilityFloat16 support already added in master (Juan)

  Reviewed-by: Jason Ekstrand <[email protected]> (v1)
* meson: Add dependency on genxml to anvil genfiles | Juan A. Suarez Romero | 2019-04-17 | 1 | -1/+1
  This fixes a race condition where anv_gen_files are executed before genxml files, which causes a build failure.

  v2: add dependency on idep_genxml (Lionel)

  Fixes: d1992255bb29054fa51763376d125183a9f602f ("meson: Add build Intel "anv" vulkan driver")
  Reviewed-by: Lionel Landwerlin <[email protected]>
* compiler/glsl: handle case where we have multiple users for types | Tapani Pälli | 2019-04-16 | 1 | -0/+3
  Both Vulkan and OpenGL might be using glsl_types simultaneously or we can also have multiple concurrent Vulkan instances using glsl_types. The patch adds a one-time init to track the number of users and will release types only when the last user calls _glsl_type_singleton_decref().

  This change fixes glsl_type memory leaks we have with the anv driver.

  v2: reuse hash_mutex, cleanup, apply fix also to radv driver and rename helper functions (Jason)

  v3: move init, destroy to happen on GL context init and destroy

  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
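The user-counting scheme described here is a mutex-protected reference count around shared state: the first user creates it and the last user releases it. A minimal sketch, assuming hypothetical names (`type_cache_incref`, `type_cache_decref`, `type_cache_is_alive`) and a plain heap allocation standing in for the shared type tables; it is not the glsl_types implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* One mutex guards both the user count and the shared state. */
static pthread_mutex_t cache_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned cache_users = 0;
static int *type_cache = NULL; /* stand-in for the shared type tables */

/* Called once by every user (driver instance, context, ...) at init. */
static void type_cache_incref(void)
{
   pthread_mutex_lock(&cache_mutex);
   if (cache_users++ == 0) {
      /* First user creates the shared state. */
      type_cache = calloc(16, sizeof(*type_cache));
      assert(type_cache != NULL);
   }
   pthread_mutex_unlock(&cache_mutex);
}

/* Called once by every user at teardown. */
static void type_cache_decref(void)
{
   pthread_mutex_lock(&cache_mutex);
   assert(cache_users > 0);
   if (--cache_users == 0) {
      /* Last user releases the shared state. */
      free(type_cache);
      type_cache = NULL;
   }
   pthread_mutex_unlock(&cache_mutex);
}

/* Helper for this sketch only: report whether the state currently exists. */
static int type_cache_is_alive(void)
{
   pthread_mutex_lock(&cache_mutex);
   int alive = type_cache != NULL;
   pthread_mutex_unlock(&cache_mutex);
   return alive;
}
```

With two users, the state survives the first `type_cache_decref()` and is freed by the second, which is the behavior needed when OpenGL and Vulkan, or several Vulkan instances, share the types in one process.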
* anv: Update to use the new features struct names | Jason Ekstrand | 2019-04-15 | 1 | -6/+6
  These were updated in version 1.1.106 of vulkan.h to make more sense with the extension names. We may as well keep with the times.

  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: Emit 3DSTATE_VF_STATISTICS dynamically | Kenneth Graunke | 2019-04-14 | 1 | -5/+0
  Pipeline statistics queries should not count BLORP's rectangles.

    (23) How do operations like Clear, TexSubImage, etc. affect the results of the newly introduced queries?

    DISCUSSION: Implementations might require "helper" rendering commands be issued to implement certain operations like Clear, TexSubImage, etc.

    RESOLVED: They don't. Only application submitted rendering commands should have an effect on the results of the queries.

  Piglit's arb_pipeline_statistics_query-vert_adj exposes this bug when the driver is hacked to always perform glBufferData via a GPU staging copy (for debugging purposes).

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: make nir_const_value scalar | Karol Herbst | 2019-04-14 | 1 | -23/+27
  v2: remove & operator in a couple of memsets
      add some memsets
  v3: fixup lima

  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]> (v2)
* anv: leave the top 4Gb of the high heap VMA unused | Lionel Landwerlin | 2019-04-13 | 1 | -5/+5
  In 628c9ca9089789 I forgot to apply the same -4Gb of the high address of the high heap VMA. This was previously computed in the HIGH_HEAP_MAX_ADDRESS.

  Many thanks to James for pointing this out.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reported-by: Xiong, James <[email protected]>
  Fixes: 628c9ca9089789 ("anv: store heap address bounds when initializing physical device")
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/pipeline: Fix MEDIA_VFE_STATE::PerThreadScratchSpace on gen7 | Jason Ekstrand | 2019-04-12 | 1 | -3/+23
  We were always programming it with the Broadwell convention which is too large by a factor of two on Haswell and just plain wrong on IVB and BYT.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Cc: [email protected]