path: root/src/amd/vulkan
Commit message | Author | Age | Files | Lines
* radv: implement a workaround for VK_EXT_conditional_rendering | Samuel Pitoiset | 2019-05-02 | 1 | -3/+44
    Per the Vulkan spec 1.1.107, the predicate is a 32-bit value. The AMD hardware, however, treats it as a 64-bit value, which means it might fail to discard. I don't know why this extension has been drafted like that, but it definitely does not fit AMD hardware. The hardware doesn't seem to support a 32-bit value for the predicate, so we need to implement a workaround.

    This fixes an issue when DXVK enables conditional rendering with RADV; it also fixes the Sascha conditionalrender demo.

    Fixes: e45ba51ea45 ("radv: add support for VK_EXT_conditional_rendering")
    Reported-by: Philip Rebohle <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
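The mismatch described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not the code from the commit: `hw_predicate_is_nonzero` and `widen_predicate` are hypothetical names, and the idea of copying the 32-bit value into a zero-extended 64-bit slot is one plausible shape for such a workaround.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a predicate test that always reads 64 bits,
 * as the commit message says the AMD hardware does. */
static int hw_predicate_is_nonzero(const void *addr)
{
    uint64_t value;
    memcpy(&value, addr, sizeof(value));
    return value != 0;
}

/* Hypothetical workaround: read only the 32 bits the application owns
 * and zero-extend them, so unrelated neighbouring bytes are ignored. */
static uint64_t widen_predicate(const void *app_predicate)
{
    uint32_t pred32;
    memcpy(&pred32, app_predicate, sizeof(pred32));
    return (uint64_t)pred32;
}
```

With a zero 32-bit predicate followed by unrelated non-zero data, the direct 64-bit read reports "non-zero" (so nothing would be discarded), while the widened copy reports zero as the spec intends.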
* radv: fix color conversions for normalized uint/sint formats | Samuel Pitoiset | 2019-05-02 | 1 | -4/+16
    The hardware actually rounds before conversion. This now matches what values are used when performing fast clears vs slow clears.

    This fixes a rendering issue with Far Cry 3 and 4. This also fixes a bunch of CTS tests that use an 8-bit UNORM format (only when the 512*512 image size hint is manually disabled).

    Cc: "19.0" "19.1" <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
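To make "rounds before conversion" concrete, here is a minimal sketch comparing a truncating float-to-UNORM conversion with a round-to-nearest one. Both function names are hypothetical, and round-to-nearest is an assumption about what the hardware does; the commit message only says that it rounds.

```c
#include <stdint.h>

/* Clamp to [0, 1]; NaN is mapped to 0. */
static float clamp01(float f)
{
    if (!(f > 0.0f))
        return 0.0f;
    return f > 1.0f ? 1.0f : f;
}

/* Truncating conversion: scale, then drop the fraction.
 * 'bits' is assumed to be in the range 1..16. */
static uint32_t float_to_unorm_trunc(float f, unsigned bits)
{
    float max = (float)((1u << bits) - 1u);
    return (uint32_t)(clamp01(f) * max);
}

/* Rounding conversion: scale, add one half, then drop the fraction. */
static uint32_t float_to_unorm_round(float f, unsigned bits)
{
    float max = (float)((1u << bits) - 1u);
    return (uint32_t)(clamp01(f) * max + 0.5f);
}
```

For an 8-bit UNORM channel, 0.5 scales to 127.5, so the two variants produce 127 and 128 respectively. A one-step difference like this between two clear paths is the kind of mismatch the message describes.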
* radv: do not need to force emit the TCS regs on Vega20 | Samuel Pitoiset | 2019-05-02 | 1 | -0/+1
    This chip doesn't need the fixup.

    This fixes a bunch of dEQP-VK.tessellation tests and avoids random GPU hangs.

    Cc: "19.0" "19.1" <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Restrict YUVY formats to 1 layer. | Bas Nieuwenhuizen | 2019-05-02 | 1 | -0/+7
    Fixes: 8bb3cec7c9b "radv: Expose VK_EXT_ycbcr_image_arrays."
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Set is_array in lowered ycbcr tex instructions. | Bas Nieuwenhuizen | 2019-05-02 | 1 | -0/+1
    Fixes array tests.

    Fixes: 91702374d5d "radv: Add ycbcr lowering pass."
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Fix hang with YCBCR array textures. | Bas Nieuwenhuizen | 2019-05-02 | 1 | -2/+6
    Forgot to apply the width/height divisor for CB writes, resulting in the CB using larger than expected slice sizes.

    Fixes: 42d159f2766 "radv: Add multiple planes to images."
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110530
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110526
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: update to use the new features struct names | Eric Engestrom | 2019-04-30 | 1 | -8/+8
    These were updated in version 1.1.106 of vulkan.h to make more sense with the extension names. We may as well keep with the times.

    See also: 90108deb277d33d19233 "anv: Update to use the new features struct names"
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: enable descriptor indexing capabilities | Juan A. Suarez Romero | 2019-04-30 | 1 | -0/+2
    This enables the remaining capabilities in SPV_EXT_descriptor_indexing.

    Fixes: 0e10790558b "radv: Enable VK_EXT_descriptor_indexing."
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* delete autotools .gitignore files | Eric Engestrom | 2019-04-29 | 1 | -9/+0
    One special case: `src/util/xmlpool/.gitignore` is not entirely deleted, as `xmlpool.pot` still gets generated (e.g. by `ninja xmlpool-pot`).

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* radv: consider MESA_VK_VERSION_OVERRIDE when setting the api version | Eleni Maria Stea | 2019-04-29 | 1 | -2/+5
    Before setting the physical device API version, we should check if the MESA_VK_VERSION_OVERRIDE environment variable is set and take it into account.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
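The check described above could look roughly like the following. This is a sketch, not the RADV code: `pick_api_version` and `MAKE_VERSION` are hypothetical names (the macro mirrors the bit layout of `VK_MAKE_VERSION`), and the accepted `major.minor[.patch]` string format is an assumption.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Same bit layout as VK_MAKE_VERSION: major << 22 | minor << 12 | patch. */
#define MAKE_VERSION(major, minor, patch) \
    ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))

/* Return the overridden version if the environment variable is set and
 * parses as "major.minor" or "major.minor.patch"; otherwise fall back
 * to the driver's default. */
static uint32_t pick_api_version(uint32_t default_version)
{
    const char *str = getenv("MESA_VK_VERSION_OVERRIDE");
    unsigned major = 0, minor = 0, patch = 0;

    if (str == NULL)
        return default_version;
    if (sscanf(str, "%u.%u.%u", &major, &minor, &patch) < 2)
        return default_version;
    return MAKE_VERSION(major, minor, patch);
}
```

Falling back to the default on a malformed string is a design choice of this sketch; a driver might prefer to report the problem instead.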
* radv: add missing VEGA20 chip in radv_get_device_name() | Samuel Pitoiset | 2019-04-27 | 1 | -0/+1
    Otherwise it returns "AMD RADV unknown".

    Cc: 19.0 <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Expose Vulkan 1.1 for Android. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -1/+1
    We have the YCBCR feature now.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Expose VK_EXT_ycbcr_image_arrays. | Bas Nieuwenhuizen | 2019-04-25 | 2 | -0/+7
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Enable YCBCR conversion feature. | Bas Nieuwenhuizen | 2019-04-25 | 2 | -1/+2
    This enables the basic YCBCR features. We support basic multiplane formats using 8-bit and 16-bit unorms, as well as YUV2 formats.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr subsampled & multiplane formats to csv. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -0/+12
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr format features. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -0/+27
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add hashing for the ycbcr samplers. | Bas Nieuwenhuizen | 2019-04-25 | 2 | -7/+7
    Otherwise caching gets very confused.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Run the new ycbcr lowering pass. | Bas Nieuwenhuizen | 2019-04-25 | 3 | -3/+6
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr lowering pass. | Bas Nieuwenhuizen | 2019-04-25 | 4 | -0/+376
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Update descriptor sets for multiple planes. | Bas Nieuwenhuizen | 2019-04-25 | 3 | -18/+35
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr samplers in descriptor set layouts. | Bas Nieuwenhuizen | 2019-04-25 | 2 | -2/+77
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Allow mixed src/dst aspects in copies. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -104/+116
    e.g. COLOR + PLANE_2, as well as COLOR + COLOR for multiplane images.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add support for image views with multiple planes. | Bas Nieuwenhuizen | 2019-04-25 | 3 | -21/+41
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr conversion structs. | Bas Nieuwenhuizen | 2019-04-25 | 3 | -4/+42
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Support different source & dest aspects for planar images in blit2d. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -2/+9
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add single plane image views & meta operations. | Bas Nieuwenhuizen | 2019-04-25 | 4 | -10/+45
    Copies and clears of multiplane images are not allowed, so we do not have to handle that case.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add multiple planes to images. | Bas Nieuwenhuizen | 2019-04-25 | 7 | -135/+204
    No functional changes. This temporarily uses plane 0 for everything.

    Long term plan is that only single plane images get to use metadata like htile/dcc/cmask/fmask.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add logic for multisample format descriptions. | Bas Nieuwenhuizen | 2019-04-25 | 4 | -10/+86
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add logic for subsampled format descriptions. | Bas Nieuwenhuizen | 2019-04-25 | 3 | -0/+28
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add adaptive_sync driconfig option and enable it by default. | Bas Nieuwenhuizen | 2019-04-23 | 1 | -0/+3
    Reviewed-by: Samuel Pitoiset <[email protected]>
* vulkan/wsi: Add X11 adaptive sync support based on dri options. | Bas Nieuwenhuizen | 2019-04-23 | 1 | -1/+2
    The dri options are optional. When the dri options are not provided, the WSI will not use adaptive sync.

    FWIW I think for xf86-video-amdgpu this still requires an X11 config option, so only people who opt in can get possible regressions from this.

    So then the remaining question is: why do this in the WSI? It has been suggested in another MR that the application sets this. However, I disagree with that, as I don't think we'll ever get a reasonable set of applications setting it.

    The next question is whether this can be a layer. It definitely can be as implemented now. However, I think this generally fits well with the function of the WSI. Furthermore, for e.g. the DISPLAY WSI this is much harder to do in a layer. Of course, most of the WSI could almost be a layer, but I think this still fits best in the WSI.

    Acked-by: Jason Ekstrand <[email protected]>
* radv: Add support for driconf. | Bas Nieuwenhuizen | 2019-04-23 | 3 | -3/+23
    This includes 0 options. The cache parsing is located at a position where we can easily add config filtering by VkApplicationInfo.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: add VK_NV_compute_shader_derivatives support | Samuel Pitoiset | 2019-04-22 | 3 | -0/+9
    Only computeDerivativeGroupLinear is supported for now.

    All crucible tests pass.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Support VK_EXT_inline_uniform_block. | Bas Nieuwenhuizen | 2019-04-19 | 5 | -15/+124
    Basically just reserve the memory in the descriptor sets.

    On the shader side we construct a buffer descriptor, since AFAIU VGPR indexing on 32-bit pointers in LLVM is still broken.

    This fully supports update after bind and variable descriptor set sizes. However, the limits are somewhat arbitrary and are mostly about finding a reasonable division of a 2 GiB max memory size over the set.

    v2:
    - rebased on top of master (Samuel)
    - remove the loading resources rework (Samuel)
    - only load UBO descriptors if it's a pointer (Samuel)
    - use LLVMBuildPtrToInt to avoid IR failures (Samuel)

    Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v2)
* radv: add VK_KHR_shader_atomic_int64 but disable it for now | Samuel Pitoiset | 2019-04-17 | 3 | -0/+12
    No support for 64-bit compare&swap atomic operations.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* compiler/glsl: handle case where we have multiple users for types | Tapani Pälli | 2019-04-16 | 1 | -0/+3
    Both Vulkan and OpenGL might be using glsl_types simultaneously, or we can also have multiple concurrent Vulkan instances using glsl_types. This patch adds a one-time init to track the number of users and will release types only when the last user calls _glsl_type_singleton_decref().

    This change fixes the glsl_type memory leaks we have with the anv driver.

    v2: reuse hash_mutex, cleanup, apply fix also to radv driver and rename helper functions (Jason)
    v3: move init, destroy to happen on GL context init and destroy

    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
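The user-counting scheme described above is a reference-counted singleton. The following is a minimal sketch of that pattern, not the Mesa implementation: `type_singleton_init_or_ref`, `type_singleton_decref`, `users_mutex`, and `shared_cache` are hypothetical stand-ins for the real helpers, mutex, and type tables.

```c
#include <pthread.h>
#include <stdlib.h>

/* One mutex guards both the user count and the shared state. */
static pthread_mutex_t users_mutex = PTHREAD_MUTEX_INITIALIZER;
static unsigned users = 0;
static int *shared_cache = NULL; /* stand-in for the shared type tables */

/* Called by every user (GL context, Vulkan instance, ...) on creation.
 * Only the first caller allocates the shared state. */
static void type_singleton_init_or_ref(void)
{
    pthread_mutex_lock(&users_mutex);
    if (users++ == 0)
        shared_cache = calloc(16, sizeof(*shared_cache));
    pthread_mutex_unlock(&users_mutex);
}

/* Called by every user on destruction.
 * Only the last caller releases the shared state. */
static void type_singleton_decref(void)
{
    pthread_mutex_lock(&users_mutex);
    if (users > 0 && --users == 0) {
        free(shared_cache);
        shared_cache = NULL;
    }
    pthread_mutex_unlock(&users_mutex);
}
```

With two users, the first `type_singleton_decref()` leaves the shared state alive and the second one frees it, which is the behaviour the message relies on to avoid both leaks and premature frees.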
* radv: sort the shader capabilities alphabetically | Samuel Pitoiset | 2019-04-16 | 1 | -3/+3
    Signed-off-by: Samuel Pitoiset <[email protected]>
* radv: enable shaderInt8 on SI and CIK | Samuel Pitoiset | 2019-04-16 | 2 | -4/+3
    No CTS failures.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* Delete autotools | Dylan Baker | 2019-04-15 | 1 | -200/+0
    Acked-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Acked-by: Marek Olšák <[email protected]>
    Acked-by: Jason Ekstrand <[email protected]>
    Acked-by: Bas Nieuwenhuizen <[email protected]>
    Acked-by: Matt Turner <[email protected]>
* radv: set ACCESS_NON_READABLE on stores for copy/fill/clear meta shaders | Samuel Pitoiset | 2019-04-15 | 2 | -0/+3
    The compiler will emit GLC=1.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use local buffers for the global bo list. | Bas Nieuwenhuizen | 2019-04-15 | 3 | -2/+8
    Even if we don't use local buffers in general. Turns out that even though the performance is not the best, the kernel still does it better than our own list.

    We still have to keep the radv bo list for buffers that are shared externally.

    This improves Talos on the lowest quality setting (so as CPU bound as possible) by ~10% if the global bo list is enabled.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add bolist RADV_PERFTEST flag. | Bas Nieuwenhuizen | 2019-04-15 | 2 | -0/+3
    To test global_bo_list performance.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: enable VK_KHR_shader_float16_int8 | Samuel Pitoiset | 2019-04-15 | 2 | -1/+2
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use nir constant helpers | Karol Herbst | 2019-04-14 | 2 | -20/+10
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* radv: enable VK_AMD_gpu_shader_half_float | Samuel Pitoiset | 2019-04-10 | 1 | -0/+1
    Should be safe to enable as all instructions seem to support 16-bit. Unfortunately, there is no CTS test.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Acked-by: Bas Nieuwenhuizen <[email protected]>
* radv: Add non-uniform indexing lowering. | Bas Nieuwenhuizen | 2019-04-10 | 2 | -7/+12
    This patch does it as late as possible so the potential extra basic blocks don't inhibit other optimizations.

    Big thanks to Jason for writing the lowering pass.

    Reviewed-by: Samuel Pitoiset <[email protected]>
* nir/radv: remove restrictions on opt_if_loop_last_continue() | Timothy Arceri | 2019-04-09 | 1 | -1/+1
    When I implemented opt_if_loop_last_continue() I had restricted this pass from moving other if-statements inside the branch opposite the continue. At the time it was causing a bunch of spilling in shader-db for i965.

    However, Samuel Pitoiset noticed that making this pass more aggressive significantly improved the performance of Doom on RADV. Below are the statistics he gathered.

    28717 shaders in 14931 tests
    Totals:
    SGPRS: 1267317 -> 1267549 (0.02 %)
    VGPRS: 896876 -> 895920 (-0.11 %)
    Spilled SGPRs: 24701 -> 26367 (6.74 %)
    Code Size: 48379452 -> 48507880 (0.27 %) bytes
    Max Waves: 241159 -> 241190 (0.01 %)

    Totals from affected shaders:
    SGPRS: 23584 -> 23816 (0.98 %)
    VGPRS: 25908 -> 24952 (-3.69 %)
    Spilled SGPRs: 503 -> 2169 (331.21 %)
    Code Size: 2471392 -> 2599820 (5.20 %) bytes
    Max Waves: 586 -> 617 (5.29 %)

    The code size increase is related to Wolfenstein II; it seems largely due to an increase in phis rather than the existing jumps.

    This gives +10% FPS with Doom on my Vega56.

    Rhys Perry also benchmarked Doom on his VEGA64:

    Before: 72.53 FPS
    After: 80.77 FPS

    v2: disable pass on non-AMD drivers

    Reviewed-by: Ian Romanick <[email protected]> (v1)
    Acked-by: Samuel Pitoiset <[email protected]>
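As a rough source-level picture of the control-flow change, the two hypothetical C functions below compute the same result. The second shows the shape after code following an if that ends in `continue` has been moved into the opposite branch, including a nested if-statement, which is the case the lifted restriction concerns. The real pass operates on NIR, not C, so this is only an analogy.

```c
/* Before: the work after the if sits at loop level. */
static int sum_before(const int *v, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] < 0) {
            acc -= 1;
            continue;
        }
        /* A nested if following the continue branch. */
        if (v[i] & 1)
            acc += v[i];
        else
            acc += 2 * v[i];
    }
    return acc;
}

/* After: the same work moved into the branch opposite the continue. */
static int sum_after(const int *v, int n)
{
    int acc = 0;
    for (int i = 0; i < n; i++) {
        if (v[i] < 0) {
            acc -= 1;
            continue;
        } else {
            if (v[i] & 1)
                acc += v[i];
            else
                acc += 2 * v[i];
        }
    }
    return acc;
}
```

The behaviour is unchanged; what differs is the block structure the later compiler stages see, which is where the register and code-size deltas in the statistics above come from.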
* radv: fix getting the vertex strides if the bindings aren't contiguous | Samuel Pitoiset | 2019-04-08 | 1 | -1/+15
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110349
    Fixes: a66b186bebf ("radv: use typed buffer loads for vertex input fetches")
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* simplify LLVM version string printing | Eric Engestrom | 2019-04-04 | 2 | -15/+6
    Figure it out once in the build system, then just use that all over the place.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: enable displayable DCC on Ravens | Marek Olšák | 2019-04-04 | 1 | -0/+4