summaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* radv: Implement cosited_even sampling.Bas Nieuwenhuizen2019-05-062-2/+83
| | | | | | | | | | Apparently cosited_even was the required one instead of midpoint. This adds slight offset of 0.5 pixels to the coordinates (+ we need the image size to convert to normalized coords) Fixes: 91702374d5d "radv: Add ycbcr lowering pass." Acked-by: Samuel Pitoiset <[email protected]>
* radv: Disable subsampled formats.Bas Nieuwenhuizen2019-05-061-1/+2
| | | | | | | | | | | | | Broken on Polaris and since I discovered NV12 is not subsampled, but a 2-plane format I decided I don't really care. Work to do to re-enable: 1) Figure out which devices support it natively. 2) Write some software emulation for the others. Fixes: 52c1adda21b "radv: Add ycbcr format features." Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: apply the indexing workaround for atomic buffer operations on GFX9Samuel Pitoiset2019-05-033-5/+14
| | | | | | | | | | | | | | | | | | | Because the new raw/struct intrinsics are buggy with LLVM 8 (they weren't marked as source of divergence), we fallback to the old instrinsics for atomic buffer operations only. This means we need to apply the indexing workaround for GFX9. The load/store operations still use the new LLVM 8 intrinsics. The fact that we need another workaround is painful but we should be able to clean up that a bit once LLVM 7 support will be dropped. This fixes a GPU hang with AC Odyssey and some rendering problems with Nioh. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573 Fixes: 31164cf5f70 ("ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix radv_get_aspect_format() for D+S formatsSamuel Pitoiset2019-05-031-0/+2
| | | | | | | | | | | | This restores the previous behaviour before YCBCR landed. For D+S formats, it returns the depth format. This fixes an assertion with Thrones of Britannia. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110540 Fixes: 66507cc6563 ("radv: Add single plane image views & meta operations") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only need to force emit the TCS regs on Vega10 and Raven1Samuel Pitoiset2019-05-021-2/+2
| | | | | | | | Other GFX9 chips aren't affected. Cc: "19.0" "19.1" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set WD_SWITCH_ON_EOP=1 when drawing primitives from a stream output bufferSamuel Pitoiset2019-05-023-0/+9
| | | | | | | | | | | | | According to RadeonSI, this seems to be required by the hardware to avoid GPU hangs. I think I just forgot to set that bit when I implemented VK_EXT_transform_feedback. This fixes a GPU hang with Space Engineers and DXVK. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110291 Fixes: b4eb029062a ("radv: implement VK_EXT_transform_feedback") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix set_output_usage_mask() with composite and 64-bit typesRhys Perry2019-05-021-4/+17
| | | | | | | | | | | | | | | | It previously used var->type instead of deref_instr->type and didn't handle 64-bit outputs. This fixes lots of transform feedback CTS tests involving transform feedback and geometry shaders (mostly dEQP-VK.transform_feedback.fuzz.random_geometry.*) v2: fix writemask widening when comp != 0 v3: fix 64-bit variables when comp != 0, again Signed-off-by: Rhys Perry <[email protected]> Cc: 19.0 19.1 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: tidy up ac_build_llvm8_tbuffer_{load,store}Samuel Pitoiset2019-05-021-13/+13
| | | | | | | | For consistency with ac_build_llvm8_buffer_{load,store}_common helpers and that will help a bit for removing the vec3 restriction. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement a workaround for VK_EXT_conditional_renderingSamuel Pitoiset2019-05-021-3/+44
| | | | | | | | | | | | | | | | | | Per the Vulkan spec 1.1.107, the predicate is a 32-bit value. Though the AMD hardware treats it as a 64-bit value which means it might fail to discard. I don't know why this extension has been drafted like that but this definitely not fit with AMD. The hardware doesn't seem to support a 32-bit value for the predicate, so we need to implement a workaround. This fixes an issue when DXVK enables conditional rendering with RADV, this also fixes the Sasha conditionalrender demo. Fixes: e45ba51ea45 ("radv: add support for VK_EXT_conditional_rendering") Reported-by: Philip Rebohle <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix color conversions for normalized uint/sint formatsSamuel Pitoiset2019-05-021-4/+16
| | | | | | | | | | | | | The hardware actually rounds before conversion. This now matches what values are used when performing fast clears vs slow clears. This fixes a rendering issue with Far Cry 3&4. This also fixes a bunch of CTS tests that use a 8-bit UNORM format (only when the 512*512 image size hint is manually disabled). Cc: "19.0" "19.1" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not need to force emit the TCS regs on Vega20Samuel Pitoiset2019-05-021-0/+1
| | | | | | | | | This chip doesn't need the fixup. This fixes a bunch of dEQP-VK.tessellation tests and avoid random GPU hangs. Cc: "19.0" "19.1" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Restrict YUVY formats to 1 layer.Bas Nieuwenhuizen2019-05-021-0/+7
| | | | | Fixes: 8bb3cec7c9b "radv: Expose VK_EXT_ycbcr_image_arrays." Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Set is_array in lowered ycbcr tex instructions.Bas Nieuwenhuizen2019-05-021-0/+1
| | | | | | | Fixes array tests. Fixes: 91702374d5d "radv: Add ycbcr lowering pass." Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Fix hang width YCBCR array textures.Bas Nieuwenhuizen2019-05-021-2/+6
| | | | | | | | | | Forgot to apply the width/height divisor for CB writes resulting in the CB using larger than expected slice sizes. Fixes: 42d159f2766 "radv: Add multiple planes to images." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110530 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110526 Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: update to use the new features struct namesEric Engestrom2019-04-301-8/+8
| | | | | | | | | These were updated in version 1.1.106 of vulkan.h to make more sense with the extension names. We may as well keep with the times. See also: 90108deb277d33d19233 "anv: Update to use the new features struct names" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: enable descriptor indexing capabilitiesJuan A. Suarez Romero2019-04-301-0/+2
| | | | | | | | | This enables the remaining capabilities in SPV_EXT_descriptor_indexing. Fixes: 0e10790558b "radv: Enable VK_EXT_descriptor_indexing." Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* delete autotools .gitignore filesEric Engestrom2019-04-292-10/+0
| | | | | | | | One special case, `src/util/xmlpool/.gitignore` is not entirely deleted, as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`). Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* ac,ac/nir: use a better sync scope for shared atomicsRhys Perry2019-04-293-9/+72
| | | | | | | | | | | | | | | | | | | | | | | | | https://reviews.llvm.org/rL356946 (present in LLVM 9 and later) changed the meaning of the "system" sync scope, making it no longer restricted to the memory operation's address space. So a single address space sync scope is needed for shared atomic operations (such as "system-one-as" or "workgroup-one-as") otherwise buffer_wbinvl1 and s_waitcnt instructions can be created at each shared atomic operation. This mostly reimplements LLVMBuildAtomicRMW and LLVMBuildAtomicCmpXchg to allow for more sync scopes and uses the new functions in ac->nir with the "workgroup-one-as" or "workgroup" sync scopes. F1 2017 (4K, Ultra High settings, TAA), avg FPS : 59 -> 59.67 (+1.14%) Strange Brigade (4K, ~highest settings), avg FPS : 51.5 -> 51.6 (+0.19%) RotTR/mountain (4K, VeryHigh settings, FXAA), avg FPS : 57.2 -> 57.2 (+0.0%) RotTR/tomb (4K, VeryHigh settings, FXAA), avg FPS : 42.5 -> 43.0 (+1.17%) RotTR/valley (4K, VeryHigh settings, FXAA), avg FPS : 40.7 -> 41.6 (+2.21%) Warhammer II/fallen, avg FPS : 31.63 -> 31.83 (+0.63%) Warhammer II/skaven, avg FPS : 37.77 -> 38.07 (+0.79%) Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: consider MESA_VK_VERSION_OVERRIDE when setting the api versionEleni Maria Stea2019-04-291-2/+5
| | | | | | | | | Before setting the physical device API version, we should check if the MESA_VK_VERSION_OVERRIDE environment variable is set and take it into account. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: add missing VEGA20 chip in radv_get_device_name()Samuel Pitoiset2019-04-271-0/+1
| | | | | | | | Otherwise it returns "AMD RADV unknown". Cc: 19.0 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Expose Vulkan 1.1 for Android.Bas Nieuwenhuizen2019-04-251-1/+1
| | | | | | We have the YCBCR feature now. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Expose VK_EXT_ycbcr_image_arrays.Bas Nieuwenhuizen2019-04-252-0/+7
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Enable YCBCR conversion feature.Bas Nieuwenhuizen2019-04-252-1/+2
| | | | | | | | | This enabled the basic YCBCR features. We support basic multiplane formats using 8-bit and 16-bit unorms, as well as YUV2 formats. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr subsampled & multiplane formats to csv.Bas Nieuwenhuizen2019-04-251-0/+12
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr format features.Bas Nieuwenhuizen2019-04-251-0/+27
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add hashing for the ycbcr samplers.Bas Nieuwenhuizen2019-04-252-7/+7
| | | | | | Otherwise caching gets very confused. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Run the new ycbcr lowering pass.Bas Nieuwenhuizen2019-04-253-3/+6
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr lowering pass.Bas Nieuwenhuizen2019-04-254-0/+376
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Update descriptor sets for multiple planes.Bas Nieuwenhuizen2019-04-253-18/+35
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr samplers in descriptor set layouts.Bas Nieuwenhuizen2019-04-252-2/+77
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Add support for planes.Bas Nieuwenhuizen2019-04-252-4/+19
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Allow mixed src/dst aspects in copies.Bas Nieuwenhuizen2019-04-251-104/+116
| | | | | | e.g. COLOR + PLANE_2, as well COLOR + COLOR for multiplane images. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add support for image views with multiple planes.Bas Nieuwenhuizen2019-04-253-21/+41
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr conversion structs.Bas Nieuwenhuizen2019-04-253-4/+42
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Support different source & dest aspects for planar images in blit2d.Bas Nieuwenhuizen2019-04-251-2/+9
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add single plane image views & meta operations.Bas Nieuwenhuizen2019-04-254-10/+45
| | | | | | | Copies & clear of multiplane images is not allowed so we do not have to handle that case. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add multiple planes to images.Bas Nieuwenhuizen2019-04-257-135/+204
| | | | | | | | | | No functional changes. This temporarily uses plane 0 for everything. Long term plan is that only single plane images get to use metadata like htile/dcc/cmask/fmask. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add logic for multisample format descriptions.Bas Nieuwenhuizen2019-04-254-10/+86
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add logic for subsampled format descriptions.Bas Nieuwenhuizen2019-04-253-0/+28
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add adaptive_sync driconfig option and enable it by default.Bas Nieuwenhuizen2019-04-231-0/+3
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* vulkan/wsi: Add X11 adaptive sync support based on dri options.Bas Nieuwenhuizen2019-04-231-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | The dri options are optional. When the dri options are not provided the WSI will not use adaptive sync. FWIW I think for xf86-video-amdgpu this still requires an X11 config option, so only people who opt in can get possible regressions from this. So then the remaining question is: why do this in the WSI? It has been suggested in another MR that the application sets this. However, I disagree with that as I don't think we'll ever get a reasonable set of applications setting it. The next questions is whether this can be a layer. It definitely can be as implemented now. However, I think this generally fits well with the function of the WSI. Furthemore, for e.g. the DISPLAY WSI this is much harder to do in a layer. Of course, most of the WSI could almost be a layer, but I think this still fits best in the WSI. Acked-by: Jason Ekstrand <[email protected]>
* radv: Add support for driconf.Bas Nieuwenhuizen2019-04-233-3/+23
| | | | | | | | | This includes 0 options. The cache parsing is located at a position where we can easily add config filtering by VkApplicationInfo. Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: add REWIND and GDS registers to register headersMarek Olšák2019-04-231-0/+16
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* ac: add ac_get_i1_sgpr_maskMarek Olšák2019-04-232-0/+18
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* ac: add radeon_info::is_pro_graphicsMarek Olšák2019-04-232-0/+5
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* ac: add radeon_info::marketing_name, replacing the winsys callbackMarek Olšák2019-04-232-0/+3
| | | | | Tested-by: Dieter Nützel <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radv: add VK_NV_compute_shader_derivates supportSamuel Pitoiset2019-04-223-0/+9
| | | | | | | | | Only computeDerivativeGroupLinear is supported for now. All crucible tests pass. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Support VK_EXT_inline_uniform_block.Bas Nieuwenhuizen2019-04-195-15/+124
| | | | | | | | | | | | | | | | | | | Basically just reserve the memory in the descriptor sets. On the shader side we construct a buffer descriptor, since AFAIU VGPR indexing on 32-bit pointers in LLVM is still broken. This fully supports update after bind and variable descriptor set sizes. However, the limits are somewhat arbitrary and are mostly about finding a reasonable division of a 2 GiB max memory size over the set. v2: - rebased on top of master (Samuel) - remove the loading resources rework (Samuel) - only load UBO descriptors if it's a pointer (Samuel) - use LLVMBuildPtrToInt to avoid IR failures (Samuel) Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v2)
* ac/nir: use the new raw/struct SSBO atomic intrisics for comp_swapSamuel Pitoiset2019-04-191-2/+1
| | | | | | | | | | | | This is actually fixed now. This change requires LLVM r358579. Make sure to have it in your tree, otherwise the following piglit will hang: tests/spec/arb_shader_storage_buffer_object/execution/ssbo-atomicCompSwap-int.shader_test Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac/nir: only use the new raw/struct SSBO atomic intrinsics with LLVM 9+Samuel Pitoiset2019-04-191-1/+4
| | | | | | | | They are buggy with older LLVM version, see r358579. Fixes: 78c551aca1c ("ac/nir: use new LLVM 8 intrinsics for SSBO atomics except cmpswap") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>