path: root/src/amd/vulkan
Commit message | Author | Age | Files | Lines
* nir: Initialize lower_flrp_progress everywhere | Ian Romanick | 2019-05-09 | 1 | -1/+1
    I don't know why I thought NIR_PASS always set the progress variable. Derp.
    Fixes: d41cdef2a59 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic")
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Coverity CID: 1444996
    Coverity CID: 1444995
    Coverity CID: 1444994
    Coverity CID: 1444993
    Coverity CID: 1444991
    Coverity CID: 1444989
* radv: fix setting the number of rectangles when it's dynamic | Samuel Pitoiset | 2019-05-09 | 1 | -4/+6
    We need to know the number of rectangles.
    This fixes new CTS dEQP-VK.draw.discard_rectangles.dynamic_*.
    Fixes: 5db0bf99944 ("radv: Implement VK_EXT_discard_rectangles.")
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: call constant folding before opt algebraic | Timothy Arceri | 2019-05-08 | 1 | -1/+1
    The pattern of calling opt algebraic first seems to have originated in i965. The order in OpenGL drivers generally doesn't matter because the GLSL IR optimisations do constant folding before opt algebraic. However in Vulkan drivers calling opt algebraic first can result in missed constant folding opportunities.
    vkpipeline-db results (VEGA64):
    Totals from affected shaders:
    SGPRS: 3160 -> 3176 (0.51 %)
    VGPRS: 3588 -> 3580 (-0.22 %)
    Spilled SGPRs: 52 -> 44 (-15.38 %)
    Spilled VGPRs: 0 -> 0 (0.00 %)
    Private memory VGPRs: 0 -> 0 (0.00 %)
    Scratch size: 12 -> 12 (0.00 %) dwords per thread
    Code Size: 261812 -> 261036 (-0.30 %) bytes
    LDS: 7 -> 7 (0.00 %) blocks
    Max Waves: 346 -> 348 (0.58 %)
    Wait states: 0 -> 0 (0.00 %)
    Reviewed-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Use the flrp lowering pass instead of nir_opt_algebraic | Ian Romanick | 2019-05-06 | 1 | -0/+25
    I tried to be very careful while updating all the various drivers, but I don't have any of that hardware for testing. :(
    i965 is the only platform that sets always_precise = true, and it is only set true for fragment shaders. Gen4 and Gen5 both set lower_flrp32 only for vertex shaders. For fragment shaders, nir_op_flrp is lowered during code generation as a(1-c)+bc. On all other platforms 64-bit nir_op_flrp and on Gen11 32-bit nir_op_flrp are lowered using the old nir_opt_algebraic method.
    No changes on any other Intel platforms.
    v2: Add panfrost changes.
    Iron Lake and GM45 had similar results. (Iron Lake shown)
    total cycles in shared programs: 188647754 -> 188647748 (<.01%)
    cycles in affected programs: 5096 -> 5090 (-0.12%)
    helped: 3
    HURT: 0
    helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
    helped stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12%
    Reviewed-by: Matt Turner <[email protected]>
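The flrp commit message above quotes the lowering a(1-c)+bc. As a rough illustration of that arithmetic only (not of the NIR pass itself), here is a minimal Python sketch. The function names are hypothetical, and the second form, a + c(b - a), is included purely for comparison; it is not taken from the commit.

```python
def flrp_lowered(a: float, b: float, c: float) -> float:
    """Linear interpolation written as a(1-c) + bc, the form quoted in the log."""
    return a * (1.0 - c) + b * c


def flrp_alternative(a: float, b: float, c: float) -> float:
    """Comparison form a + c(b - a); hypothetical, may round differently."""
    return a + c * (b - a)
```

Both forms agree on exactly representable inputs such as `flrp_lowered(2.0, 4.0, 0.5)`, but they can differ in the last bits for other values, which is presumably why a precision setting such as always_precise matters when choosing a lowering.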
* radv: fix rowPitch for R32G32B32 formats on GFX9 | Samuel Pitoiset | 2019-05-06 | 1 | -1/+13
    The pitch is actually the number of components per row.
    We found the problem when we implemented some meta operations for these formats, and the wrong pitch was confirmed with a small test case.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108325
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use given stride for images imported from Android. | Bas Nieuwenhuizen | 2019-05-06 | 3 | -0/+35
    Handled similarly to radeonsi. I checked the offsets are actually used.
    Acked-by: Samuel Pitoiset <[email protected]>
* radv: Implement cosited_even sampling. | Bas Nieuwenhuizen | 2019-05-06 | 2 | -2/+83
    Apparently cosited_even was the required one instead of midpoint. This adds a slight offset of 0.5 pixels to the coordinates (and we need the image size to convert to normalized coords).
    Fixes: 91702374d5d "radv: Add ycbcr lowering pass."
    Acked-by: Samuel Pitoiset <[email protected]>
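The cosited_even entry above says a 0.5 pixel offset is added and that the image size is needed to express it in normalized coordinates. A minimal Python sketch of that conversion follows; the function names are hypothetical, and the sign of the offset and which plane's size is used are assumptions, not details taken from the commit.

```python
def half_pixel_offset(size_in_pixels: int) -> float:
    """Express a 0.5 pixel offset in normalized [0, 1] texture coordinates."""
    if size_in_pixels <= 0:
        raise ValueError("image size must be positive")
    return 0.5 / size_in_pixels


def shift_coord(u: float, size_in_pixels: int) -> float:
    """Apply the offset to one normalized coordinate (sign is an assumption)."""
    return u + half_pixel_offset(size_in_pixels)
```

For a 4 pixel wide image, `shift_coord(0.25, 4)` moves the coordinate by 0.125 in normalized units, which is why the lowering cannot be done without knowing the image size.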
* radv: Disable subsampled formats. | Bas Nieuwenhuizen | 2019-05-06 | 1 | -1/+2
    Broken on Polaris and since I discovered NV12 is not subsampled, but a 2-plane format I decided I don't really care.
    Work to do to re-enable:
    1) Figure out which devices support it natively.
    2) Write some software emulation for the others.
    Fixes: 52c1adda21b "radv: Add ycbcr format features."
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: apply the indexing workaround for atomic buffer operations on GFX9 | Samuel Pitoiset | 2019-05-03 | 1 | -0/+6
    Because the new raw/struct intrinsics are buggy with LLVM 8 (they weren't marked as source of divergence), we fall back to the old intrinsics for atomic buffer operations only. This means we need to apply the indexing workaround for GFX9.
    The load/store operations still use the new LLVM 8 intrinsics.
    The fact that we need another workaround is painful but we should be able to clean that up a bit once LLVM 7 support is dropped.
    This fixes a GPU hang with AC Odyssey and some rendering problems with Nioh.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573
    Fixes: 31164cf5f70 ("ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+")
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix radv_get_aspect_format() for D+S formats | Samuel Pitoiset | 2019-05-03 | 1 | -0/+2
    This restores the previous behaviour before YCBCR landed. For D+S formats, it returns the depth format.
    This fixes an assertion with Thrones of Britannia.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110540
    Fixes: 66507cc6563 ("radv: Add single plane image views & meta operations")
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only need to force emit the TCS regs on Vega10 and Raven1 | Samuel Pitoiset | 2019-05-02 | 1 | -2/+2
    Other GFX9 chips aren't affected.
    Cc: "19.0" "19.1" <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set WD_SWITCH_ON_EOP=1 when drawing primitives from a stream output buffer | Samuel Pitoiset | 2019-05-02 | 3 | -0/+9
    According to RadeonSI, this seems to be required by the hardware to avoid GPU hangs. I think I just forgot to set that bit when I implemented VK_EXT_transform_feedback.
    This fixes a GPU hang with Space Engineers and DXVK.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110291
    Fixes: b4eb029062a ("radv: implement VK_EXT_transform_feedback")
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix set_output_usage_mask() with composite and 64-bit types | Rhys Perry | 2019-05-02 | 1 | -4/+17
    It previously used var->type instead of deref_instr->type and didn't handle 64-bit outputs.
    This fixes lots of CTS tests involving transform feedback and geometry shaders (mostly dEQP-VK.transform_feedback.fuzz.random_geometry.*).
    v2: fix writemask widening when comp != 0
    v3: fix 64-bit variables when comp != 0, again
    Signed-off-by: Rhys Perry <[email protected]>
    Cc: 19.0 19.1 <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
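The set_output_usage_mask() entry above mentions writemask widening for 64-bit outputs. The following Python sketch shows one way such widening could work, assuming each 64-bit component occupies two 32-bit channels and each output slot holds four channels. The helper names are hypothetical and this is not the driver's code; the handling of a non-zero component offset is deliberately left out.

```python
def widen_writemask(mask: int) -> int:
    """Turn a per-component mask into a per-32-bit-channel mask.

    Assumption: every 64-bit component covers two adjacent 32-bit channels.
    """
    widened = 0
    bit = 0
    while mask >> bit:
        if (mask >> bit) & 1:
            widened |= 0b11 << (2 * bit)
        bit += 1
    return widened


def split_into_slots(channel_mask: int, slot_count: int) -> list:
    """Split a channel mask into 4-bit masks, one per output slot."""
    return [(channel_mask >> (4 * i)) & 0xF for i in range(slot_count)]
```

Under these assumptions a three-component 64-bit output (mask `0b0111`) widens to six channels, which no longer fit in one slot and spill into a second one.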
* radv: implement a workaround for VK_EXT_conditional_rendering | Samuel Pitoiset | 2019-05-02 | 1 | -3/+44
    Per the Vulkan spec 1.1.107, the predicate is a 32-bit value. However, AMD hardware treats it as a 64-bit value, which means it might fail to discard.
    I don't know why this extension has been drafted like that, but it definitely does not fit AMD hardware. The hardware doesn't seem to support a 32-bit value for the predicate, so we need to implement a workaround.
    This fixes an issue when DXVK enables conditional rendering with RADV; it also fixes the Sasha conditionalrender demo.
    Fixes: e45ba51ea45 ("radv: add support for VK_EXT_conditional_rendering")
    Reported-by: Philip Rebohle <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
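To make the 32-bit versus 64-bit predicate mismatch in the entry above concrete, here is a small Python model. It only simulates reading the same bytes at two widths; the function names are hypothetical, and the "copy into a zeroed 8-byte slot" step is an assumed way to work around the mismatch, since the log does not describe the actual workaround.

```python
import struct


def api_predicate_is_set(memory: bytes, offset: int) -> bool:
    """Read the predicate as the 32-bit value the API describes."""
    (value,) = struct.unpack_from("<I", memory, offset)
    return value != 0


def hw_predicate_is_set(memory: bytes, offset: int) -> bool:
    """Model hardware that reads 8 bytes at the predicate address."""
    (value,) = struct.unpack_from("<Q", memory, offset)
    return value != 0


def widen_predicate(memory: bytes, offset: int) -> bytes:
    """Assumed workaround: copy the 32-bit value into a zeroed 64-bit slot."""
    (value,) = struct.unpack_from("<I", memory, offset)
    return struct.pack("<Q", value)
```

If an application writes a zero predicate followed by unrelated non-zero data, the 32-bit read says "discard" while the 64-bit read does not; after widening, both reads agree.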
* radv: fix color conversions for normalized uint/sint formats | Samuel Pitoiset | 2019-05-02 | 1 | -4/+16
    The hardware actually rounds before conversion. This now matches what values are used when performing fast clears vs slow clears.
    This fixes a rendering issue with Far Cry 3&4. This also fixes a bunch of CTS tests that use a 8-bit UNORM format (only when the 512*512 image size hint is manually disabled).
    Cc: "19.0" "19.1" <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
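The entry above notes that the hardware rounds before converting to a normalized integer. The Python sketch below contrasts truncation with rounding for an 8-bit UNORM value. The function names are hypothetical, and round-half-up is an assumption made for illustration; the log does not state the exact rounding mode.

```python
import math


def _clamp01(x: float) -> float:
    """Clamp a float to the [0, 1] range expected by UNORM formats."""
    return min(max(x, 0.0), 1.0)


def float_to_unorm8_truncate(x: float) -> int:
    """Convert by scaling and dropping the fractional part."""
    return int(_clamp01(x) * 255.0)


def float_to_unorm8_round(x: float) -> int:
    """Convert by scaling and rounding to nearest (half rounds up here)."""
    return int(math.floor(_clamp01(x) * 255.0 + 0.5))
```

For an input of 0.5 the two conversions disagree by one step (127 versus 128), which is the kind of mismatch that would make a fast clear and a slow clear produce visibly different values.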
* radv: do not need to force emit the TCS regs on Vega20 | Samuel Pitoiset | 2019-05-02 | 1 | -0/+1
    This chip doesn't need the fixup.
    This fixes a bunch of dEQP-VK.tessellation tests and avoids random GPU hangs.
    Cc: "19.0" "19.1" <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Restrict YUVY formats to 1 layer. | Bas Nieuwenhuizen | 2019-05-02 | 1 | -0/+7
    Fixes: 8bb3cec7c9b "radv: Expose VK_EXT_ycbcr_image_arrays."
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Set is_array in lowered ycbcr tex instructions. | Bas Nieuwenhuizen | 2019-05-02 | 1 | -0/+1
    Fixes array tests.
    Fixes: 91702374d5d "radv: Add ycbcr lowering pass."
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Fix hang with YCBCR array textures. | Bas Nieuwenhuizen | 2019-05-02 | 1 | -2/+6
    Forgot to apply the width/height divisor for CB writes, resulting in the CB using larger than expected slice sizes.
    Fixes: 42d159f2766 "radv: Add multiple planes to images."
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110530
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110526
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: update to use the new features struct names | Eric Engestrom | 2019-04-30 | 1 | -8/+8
    These were updated in version 1.1.106 of vulkan.h to make more sense with the extension names. We may as well keep with the times.
    See also: 90108deb277d33d19233 "anv: Update to use the new features struct names"
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: enable descriptor indexing capabilities | Juan A. Suarez Romero | 2019-04-30 | 1 | -0/+2
    This enables the remaining capabilities in SPV_EXT_descriptor_indexing.
    Fixes: 0e10790558b "radv: Enable VK_EXT_descriptor_indexing."
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* delete autotools .gitignore files | Eric Engestrom | 2019-04-29 | 1 | -9/+0
    One special case, `src/util/xmlpool/.gitignore` is not entirely deleted, as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`).
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* radv: consider MESA_VK_VERSION_OVERRIDE when setting the api version | Eleni Maria Stea | 2019-04-29 | 1 | -2/+5
    Before setting the physical device API version, we should check if the MESA_VK_VERSION_OVERRIDE environment variable is set and take it into account.
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
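As a rough picture of what "taking MESA_VK_VERSION_OVERRIDE into account" could look like, the Python sketch below checks the environment before falling back to a default version. The `major.minor[.patch]` string format and the function names are assumptions for illustration; only the variable name and the Vulkan version packing (major << 22, minor << 12, patch) are taken as given.

```python
import os


def vk_make_version(major: int, minor: int, patch: int) -> int:
    """Pack a version the way Vulkan's VK_MAKE_VERSION macro does."""
    return (major << 22) | (minor << 12) | patch


def api_version(default: int, env=os.environ) -> int:
    """Return the override if the variable is set, else the default.

    Assumes the override looks like "1.1" or "1.1.90"; malformed values
    raise ValueError in this sketch.
    """
    override = env.get("MESA_VK_VERSION_OVERRIDE")
    if not override:
        return default
    parts = [int(p) for p in override.split(".")]
    parts += [0] * (3 - len(parts))
    return vk_make_version(parts[0], parts[1], parts[2])
```

Passing the environment as a parameter keeps the sketch easy to exercise without touching the real process environment, for example `api_version(default, {"MESA_VK_VERSION_OVERRIDE": "1.0"})`.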
* radv: add missing VEGA20 chip in radv_get_device_name() | Samuel Pitoiset | 2019-04-27 | 1 | -0/+1
    Otherwise it returns "AMD RADV unknown".
    Cc: 19.0 <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Expose Vulkan 1.1 for Android. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -1/+1
    We have the YCBCR feature now.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Expose VK_EXT_ycbcr_image_arrays. | Bas Nieuwenhuizen | 2019-04-25 | 2 | -0/+7
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Enable YCBCR conversion feature. | Bas Nieuwenhuizen | 2019-04-25 | 2 | -1/+2
    This enables the basic YCBCR features. We support basic multiplane formats using 8-bit and 16-bit unorms, as well as YUV2 formats.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr subsampled & multiplane formats to csv. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -0/+12
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr format features. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -0/+27
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add hashing for the ycbcr samplers. | Bas Nieuwenhuizen | 2019-04-25 | 2 | -7/+7
    Otherwise caching gets very confused.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Run the new ycbcr lowering pass. | Bas Nieuwenhuizen | 2019-04-25 | 3 | -3/+6
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr lowering pass. | Bas Nieuwenhuizen | 2019-04-25 | 4 | -0/+376
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Update descriptor sets for multiple planes. | Bas Nieuwenhuizen | 2019-04-25 | 3 | -18/+35
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr samplers in descriptor set layouts. | Bas Nieuwenhuizen | 2019-04-25 | 2 | -2/+77
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Allow mixed src/dst aspects in copies. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -104/+116
    e.g. COLOR + PLANE_2, as well as COLOR + COLOR for multiplane images.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add support for image views with multiple planes. | Bas Nieuwenhuizen | 2019-04-25 | 3 | -21/+41
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr conversion structs. | Bas Nieuwenhuizen | 2019-04-25 | 3 | -4/+42
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Support different source & dest aspects for planar images in blit2d. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -2/+9
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add single plane image views & meta operations. | Bas Nieuwenhuizen | 2019-04-25 | 4 | -10/+45
    Copies & clears of multiplane images are not allowed, so we do not have to handle that case.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add multiple planes to images. | Bas Nieuwenhuizen | 2019-04-25 | 7 | -135/+204
    No functional changes. This temporarily uses plane 0 for everything.
    Long term plan is that only single plane images get to use metadata like htile/dcc/cmask/fmask.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add logic for multisample format descriptions. | Bas Nieuwenhuizen | 2019-04-25 | 4 | -10/+86
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add logic for subsampled format descriptions. | Bas Nieuwenhuizen | 2019-04-25 | 3 | -0/+28
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add adaptive_sync driconfig option and enable it by default. | Bas Nieuwenhuizen | 2019-04-23 | 1 | -0/+3
    Reviewed-by: Samuel Pitoiset <[email protected]>
* vulkan/wsi: Add X11 adaptive sync support based on dri options. | Bas Nieuwenhuizen | 2019-04-23 | 1 | -1/+2
    The dri options are optional. When the dri options are not provided the WSI will not use adaptive sync.
    FWIW I think for xf86-video-amdgpu this still requires an X11 config option, so only people who opt in can get possible regressions from this.
    So then the remaining question is: why do this in the WSI? It has been suggested in another MR that the application sets this. However, I disagree with that as I don't think we'll ever get a reasonable set of applications setting it.
    The next question is whether this can be a layer. It definitely can be as implemented now. However, I think this generally fits well with the function of the WSI. Furthermore, for e.g. the DISPLAY WSI this is much harder to do in a layer. Of course, most of the WSI could almost be a layer, but I think this still fits best in the WSI.
    Acked-by: Jason Ekstrand <[email protected]>
* radv: Add support for driconf. | Bas Nieuwenhuizen | 2019-04-23 | 3 | -3/+23
    This includes 0 options. The cache parsing is located at a position where we can easily add config filtering by VkApplicationInfo.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: add VK_NV_compute_shader_derivatives support | Samuel Pitoiset | 2019-04-22 | 3 | -0/+9
    Only computeDerivativeGroupLinear is supported for now.
    All crucible tests pass.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Support VK_EXT_inline_uniform_block. | Bas Nieuwenhuizen | 2019-04-19 | 5 | -15/+124
    Basically just reserve the memory in the descriptor sets.
    On the shader side we construct a buffer descriptor, since AFAIU VGPR indexing on 32-bit pointers in LLVM is still broken.
    This fully supports update after bind and variable descriptor set sizes. However, the limits are somewhat arbitrary and are mostly about finding a reasonable division of a 2 GiB max memory size over the set.
    v2:
    - rebased on top of master (Samuel)
    - remove the loading resources rework (Samuel)
    - only load UBO descriptors if it's a pointer (Samuel)
    - use LLVMBuildPtrToInt to avoid IR failures (Samuel)
    Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v2)
* radv: add VK_KHR_shader_atomic_int64 but disable it for now | Samuel Pitoiset | 2019-04-17 | 3 | -0/+12
    No support for 64-bit compare&swap atomic operations.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* compiler/glsl: handle case where we have multiple users for types | Tapani Pälli | 2019-04-16 | 1 | -0/+3
    Both Vulkan and OpenGL might be using glsl_types simultaneously, or we can also have multiple concurrent Vulkan instances using glsl_types. This patch adds a one-time init to track the number of users and will release types only when the last user calls _glsl_type_singleton_decref().
    This change fixes glsl_type memory leaks we have with the anv driver.
    v2: reuse hash_mutex, cleanup, apply fix also to radv driver and rename helper functions (Jason)
    v3: move init, destroy to happen on GL context init and destroy
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
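The entry above describes tracking the number of users of a shared type table and releasing it only when the last user drops its reference. A minimal Python sketch of that pattern follows; the class and method names are hypothetical stand-ins and do not mirror the actual C helpers beyond the general incref/decref idea.

```python
import threading


class TypeSingleton:
    """Shared table that lives while at least one user holds a reference."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._users = 0
        self._types = None  # created by the first user, freed by the last

    def incref(self) -> None:
        with self._lock:
            if self._users == 0:
                self._types = {}
            self._users += 1

    def decref(self) -> None:
        with self._lock:
            if self._users == 0:
                raise RuntimeError("decref without matching incref")
            self._users -= 1
            if self._users == 0:
                self._types = None

    @property
    def alive(self) -> bool:
        return self._types is not None
```

With two users (say, one per API or per instance), the table survives the first `decref()` and is released only by the second, which is the behaviour needed to avoid both leaks and premature frees.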
* radv: sort the shader capabilities alphabetically | Samuel Pitoiset | 2019-04-16 | 1 | -3/+3
    Signed-off-by: Samuel Pitoiset <[email protected]>