summaryrefslogtreecommitdiffstats
path: root/src/amd/vulkan
Commit message (Collapse)AuthorAgeFilesLines
* radv: force cs/ps/l2 flush at end of command stream. (v2)Dave Airlie2017-08-091-1/+4
| | | | | | | | | | | | | | | | | | | | This seems like a workaround, but we don't see the bug on CIK/VI. On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.* tests, when one tests complete, the first flush at the start of the next test causes a VM fault as we've destroyed the VM, but we end up flushing the compute shader then, and it must still be in the process of doing something. Could also be a kernel difference between SI and CIK. v2: hit this with a bigger hammer. This fixes a bunch of hangs in the vk cts with the robustness tests. Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334 Acked-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: remove semicolon in if(...);Bas Nieuwenhuizen2017-08-081-1/+1
| | | | | | Trivial. Fixes: a6a6146aa91 "radv: Don't allow fmask swizzling for shareable images."
* radv: Fix decompression on multisampled depth buffersAlex Smith2017-08-072-35/+69
| | | | | | | | | | Need to take the sample count into account in the depth decompress and resummarize pipelines and render pass. Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: "17.2" <[email protected]>
* radv: Don't allow fmask swizzling for shareable images.Bas Nieuwenhuizen2017-08-071-1/+4
| | | | | | | | Also adds an assert because you never know how the winsys changes, and multiprocess format differences are annoying. Fixes: 1e696b962b7 "radv: add separate fmask tile swizzle counter." Reviewed-by: Dave Airlie <[email protected]>
* radv: fix MSAA on SI gpus.Dave Airlie2017-08-071-3/+7
| | | | | | | | | | This ports the workaround from radeonsi, that was missing in radv. This fixes Talos rendering when MSAA is enabled on my Tahiti card. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver) Signed-off-by: Dave Airlie <[email protected]>
* radv: add separate fmask tile swizzle counter.Dave Airlie2017-08-073-3/+11
| | | | | | | | | This mirrors what Marek has done for radeonsi, and uses a separate counter to handle the fmask surface for MSAA MRTs. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Use the correct channel for alpha in resolve srgb conversion.Bas Nieuwenhuizen2017-08-061-1/+1
| | | | | | | | | The argument here is a bitmask, so the old code selected .xy, which got silently truncated to .x when constructing the vec4 from components, instead of using .w. Fixes: 588185eb6b7 "radv/meta: add srgb conversion to end of resolve shader." Reviewed-by: Dave Airlie <[email protected]>
* radv: Only convert linear->srgb in compute resolves.Bas Nieuwenhuizen2017-08-065-79/+54
| | | | | | | | | It justs works with the fragment shader resolve, so no need to do a custom conversion. In fact with SRGB dest, it actually gives wrong results. Fixes: 69136f4e633 "radv/meta: add resolve pass using fragment/vertex shaders" Reviewed-by: Dave Airlie <[email protected]>
* radv: Don't use SRGB format for image stores during resolve.Bas Nieuwenhuizen2017-08-062-1/+24
| | | | | | | | | These seem to store very bogus results. Luckily there is some code that converts srgb->linear already, so just making the descriptor format UNORM should work. Fixes: 588185eb6b7 "radv/meta: add srgb conversion to end of resolve shader." Reviewed-by: Dave Airlie <[email protected]>
* radv: generate the same driver UUID as radeonsiAndres Rodriguez2017-08-062-1/+9
| | | | | | | These need to match for interop compatibility queries. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radv: generate same device UUID as radeonsiAndres Rodriguez2017-08-061-7/+4
| | | | | | | | | | | This is required for interop use cases. The same device must report identical UUIDs through the GL and Vulkan APIs so that users can identify when it is safe to perform a memory object import. v2: use ac helpers to calculate the uuid Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2)Dave Airlie2017-08-051-0/+5
| | | | | | | | | | | | This is a bug in the app, but I'd rather avoid hanging the GPU, esp if someone is running in validation and it takes out their development environment. v2: get it right, reverse the polarity. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: also fix texture image descriptors for mipmap tile swizzleDave Airlie2017-08-041-1/+2
| | | | | | | This fixes the image descriptors for mipmapped tile swizzle Fixes: 2b7e8556 (ac/surface: enable tile swizzle for mipmapped textures) Signed-off-by: Dave Airlie <[email protected]>
* radv: fix tile swizzle regression on mipmaps.Dave Airlie2017-08-041-5/+6
| | | | | | | | | | | When Marek enabled mipmapped swizzle, radv didn't have the code in place to handle it. This fixes the regression. I'll look more into GFX9 once I have a vega card (soon). Fixes: 2b7e8556 (ac/surface: enable tile swizzle for mipmapped textures) Signed-off-by: Dave Airlie <[email protected]>
* ac/surface: increment surf_index only when tile swizzle is allowedMarek Olšák2017-08-041-1/+1
| | | | | Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: remove RADEON_SURF_HAS_TILE_MODE_INDEXMarek Olšák2017-08-042-4/+0
| | | | | | | it's useless Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: move tile_swizzle to ac_surface and document itMarek Olšák2017-08-042-6/+6
| | | | | | | Gfx9 will use it too. Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radv: Add suballocation for shaders.Bas Nieuwenhuizen2017-08-035-21/+93
| | | | | | | | | | | | | This reduces the number of BOs that we need for the BO lists during a submission. Currently uses a fairly simple linear search for finding free space, that could eventually be improved to a binary tree, which with some per-node info could make a check for space O(1) and finding it O(log n), in the number of buffers in that slab. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: handle 10-bit format clamping workaround.Dave Airlie2017-08-016-10/+37
| | | | | | | | | | | | | | | This fixes: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.* for a2r10g10b10 formats as destination on SI/CIK hardware. This adds support to the meta program for emitting 10-bit outputs, and adds 10-bit support to the fragment shader key. It also only does the int8/10 on SI/CIK. Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Don't underflow non-visible VRAM size.Bas Nieuwenhuizen2017-07-311-2/+4
| | | | | | | | | | | | | | In some APU situations the reported visible size can be larger than VRAM size. This properly clamps the value. Surprisingly both CTS and spec seem to allow a heap type with size 0, so this seemed like the easiest option to me. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Fixes: 4ae84efbc5c "radv: Use enum for memory heaps." Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Tested-by: Michel Dänzer <[email protected]>
* ac/nir,radv: move force_persample to ac_shader_info::force_persampleNicolai Hähnle2017-07-312-2/+2
| | | | | | | Avoid accessing radv-specific structures during the meat of NIR-to-LLVM translation. Reviewed-by: Marek Olšák <[email protected]>
* radv: for stencil only set Z tile mode index to same valueDave Airlie2017-07-281-0/+2
| | | | | | | | | | | | On SI this was causing a hang in dEQP-VK.pipeline.render_to_image.core.2d_array.mipmap.r16g16_sint_s8_uint This was due to not handling the tile mode index for depth like I fixed previously for new GPUs. Fixes: 01d0c5a9 (radv: fix stencil regression since new addrlib import) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/winsys: fix padding command stream for SIDave Airlie2017-07-261-4/+6
| | | | | | | | We were adding pad to size after creating the object, so we could submit a CS bigger than the bo created for it. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: rename physical_device->uuid[] to cache_uuid[]Andres Rodriguez2017-07-263-5/+5
| | | | | | | We have a few UUIDs, so lets be more specific. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radv: only report external semaphore info for opaque fd.Dave Airlie2017-07-251-5/+10
| | | | | | | | | | Until we support sync fd, don't report the info. Fixes CTS dEQP-VK.api.external.semaphore.sync_fd.* from crashing. Fixes: eaa56eab6 (radv: initial support for shared semaphores (v2)) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: fix buffer views on SI/CIK.Dave Airlie2017-07-241-0/+5
| | | | | | | | | Fixes CTS dEQP-VK.memory.pipeline_barrier.host_write_uniform_texel_buffer.1024 on SI/CIK with radv. Fixes: f4e499ec (radv: add initial non-conformant radv vulkan driver) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: enable sample shadingDave Airlie2017-07-242-2/+4
| | | | | | | This calculates ps_iter_samples from the minSampleShading input Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: don't set dedicated bit for buffer external memory.Dave Airlie2017-07-241-2/+1
| | | | | | | | | | This is an alternate fix for the buffer export dedicated interaction. Fixes CTS dEQP-VK.api.external.memory.opaque_fd.dedicated.buffer.info Fixes: b70829708a (radv: Implement VK_KHR_external_memory) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: fix non-0 based layer clears.Dave Airlie2017-07-241-4/+9
| | | | | | | | | | | If the layer base was > 0, it wasn't getting passed as the start instance or getting added in the shaders. Fixes CTS dEQP-VK.api.image_clearing.core.clear_color_attachment.2d_r8_uint_multiple_layers Fixes: 7e0382fb (radv: add support for layered clears (v2)) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: check enabled device features.Dave Airlie2017-07-241-0/+13
| | | | | | | | | | | The spec says we should return VK_ERROR_FEATURE_NOT_PRESENT. Ported from anv. Fixes CTS test dEQP-VK.api.device_init.create_device_unsupported_features Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: for external memory imports close the fd on import successDave Airlie2017-07-241-1/+3
| | | | | | | | | | If we get an fd, we need to close it before returning. Fixes CTS test dEQP-VK.api.external.memory.opaque_fd.dedicated.device_only.import_multiple_times Fixes: b70829708a (radv: Implement VK_KHR_external_memory) Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Don't segfault when exporting an image which hasn't been bound yet.Bas Nieuwenhuizen2017-07-241-1/+1
| | | | | | | | | | | | | The image is set on Memory allocation already, but the image doesn't have to have the BindImageMemory called yet. Luckily, we know offset within a BO has to be 0 for dedicated allocations, so we can just use the dummy 0 in the address calaculations. Fixes CTS test dEQP-VK.api.external.memory.opaque_fd.dedicated.image.export_bind_import_bind Signed-off-by: Bas Nieuwenhuizen <[email protected]> Fixes: b70829708ac "radv: Implement VK_KHR_external_memory" Reviewed-by: Dave Airlie <[email protected]>
* radv: Handle VK_ATTACHMENT_UNUSED in color attachments.Bas Nieuwenhuizen2017-07-246-20/+45
| | | | | | | | | | | | This just sets them to INVALID COLOR, instead of shifting the attachments together. This also fixes a number of cases where we use it first and only then check if it is VK_ATTACHMENT_UNUSED. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
* radv: reset non-syncobj semaphore context after wait.Dave Airlie2017-07-221-0/+2
| | | | | | | | | | | When I ported from libdrm, I forgot to add the line to reset the sem, we just need to reset the context. This fixes a regression in DOOM. Fixes: 9ac1432a571 ("radv: port to new libdrm API.") Reported-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: rebase radv_entrypoints_gen.py on anv_entrypoints_gen.pyDylan Baker2017-07-212-275/+287
| | | | | | | | | | | | | The two generators forked from each other, and they remain basically the same. This rebases the radv version on the anv version, but with the radv changes ported over. The result is that we get rid of the "cat |" madness and gain mako, correct "generated by" attributions, and write files out directly. The only differences between the output is whitespace and comments. Signed-off-by: Dylan Baker <[email protected]> Acked-by: Dave Airlie <[email protected]>
* radv: Generate storage image descriptors unconditionallyAlex Smith2017-07-221-4/+1
| | | | | | | | | | | | | We can also use storage images internally for resolves, which don't require TRANSFER_DST usage on the image, so currently we may not create the needed descriptors. Just create these descriptors unconditionally. Fixes: 0e1886efb9e ("radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT") Reported-by: Grazvydas Ignotas <[email protected]> Signed-off-by: Alex Smith <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: initial support for shared semaphores (v2)Dave Airlie2017-07-216-71/+359
| | | | | | | | | | | | | | | | | | | This adds support for sharing semaphores using kernel syncobjects. Syncobj backed semaphores are used for any semaphore which is created with external flags, and when a semaphore is imported, otherwise we use the current non-kernel semaphores. Temporary imports from syncobj fd are also available, these just override the current user until the next wait, when the temp syncobj is dropped. v2: allocate more chunks upfront, fix off by one after previous refactor of syncobj setup, remove unnecessary null check. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/winsys: add syncobj hooksDave Airlie2017-07-212-0/+44
| | | | | | | | This just adds syncobj create/destroy/export/import paths into the winsys interface. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Add support for VK_KHR_variable_pointers.Bas Nieuwenhuizen2017-07-203-0/+18
| | | | | | | | Just a trivial enable. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Acked-by: Dave Airlie <[email protected]>
* radv: Add VK_KHR_storage_buffer_storage_class support.Bas Nieuwenhuizen2017-07-202-0/+5
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Acked-by: Dave Airlie <[email protected]>
* radv: port to new libdrm API.Dave Airlie2017-07-201-29/+92
| | | | | | | This bumps the libdrm requirement for amdgpu to the 2.4.82. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: introduce some wrapper in cs code to make porting off libdrm_amdgpu ↵Dave Airlie2017-07-201-18/+70
| | | | | | | | | | easier. This just introduces a central semaphore info struct, and passes it around, and introduces some wrappers that will make porting off libdrm_amdgpu easier. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Set the RADEON_SURF_OPTIMIZE_FOR_SPACE flag for imagesAlex Smith2017-07-181-0/+1
| | | | | | | | | | | | | This looks like a regression from df301237940 ("radv: use ac_compute_surface"). Before that, the opt4Space addrlib flag was set to true unless the image has FMASK (ac_compute_surface will similarly only set that flag for images without FMASK). This saves multiple gigabytes of VRAM on one of our games, and brings its VRAM utilisation on RADV in line with AMDGPU-PRO and NVIDIA. Signed-off-by: Alex Smith <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: don't shadow meta_va.Dave Airlie2017-07-181-1/+1
| | | | | | Coverity warned about dead code below, as meta_va was being shadowed. Signed-off-by: Dave Airlie <[email protected]>
* radv: advertise v6 of the wayland surface extensionEmil Velikov2017-07-171-1/+1
| | | | | | | | | | | | | Jason updated the Khronos spec to explicitly state that Wayland surfaces must support VK_PRESENT_MODE_MAILBOX_KHR. ANV did so since day one (back in 2015) Cc: [email protected] Cc: Bas Nieuwenhuizen <[email protected]> Cc: Dave Airlie <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radv: predicate cmask eliminate when using DCC.Dave Airlie2017-07-176-7/+153
| | | | | | | | | | | | | | | | | | When using DCC some clear values don't require a cmask eliminate step. This patch adds support for black and black with alpha 1, there are other values, but I don't have access to a comprehensive list. This works by setting the cmask eliminate predicate when doing the fast clear, and later when doing the cmask elimination making sure the draws are predicated. This increases the fps on Sascha Willems deferred. Tonga: 580fps->670fps on a Tonga PRO card. Polaris 730->850fps Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/clear: add r32g32b32a32 fast clear support (v2)Dave Airlie2017-07-172-2/+26
| | | | | | | | | | | | | We can only fast clear 128-bit images if the r/g/b channels are the same, and we are using DCC. For DCC we'll bail out on translate if this isn't true, and we catch cmask clears explicitly. v2: remove 64-bit block (Bas), add uint32 as well. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: set cb base tile swizzles for MRT speedups (v4)Dave Airlie2017-07-173-2/+18
| | | | | | | | | | | | | | | | | | | | This patch uses addrlib to workout the tile swizzles according to the surface index. It seems to produce the same values as amdgpu-pro for the deferred test. v2: don't apply swizzle to CMASK. the eg docs don't mention it, and we clearly don't align cmask for that. v3: disable surf index for dedicated images, as these will most likely be shared, and I don't think the metadata has space for this info in it yet. v4: update for shareable images, rename combined_swizzle to tile_swizzle This gets the deferred demo from 730->950fps on my rx480. (dcc cmask elim predication patches get it further) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: allow clear merging for depth/stencil with no care stencilDave Airlie2017-07-171-0/+3
| | | | | | | | | | | | Some of the Sascha Willems demos pick a D32/S8 format for the depth buffer, then do a LOAD_OP_CLEAR/LOAD_OP_DONT_CARE on it, which means we don't get to merge the undefined->depth and clear htile transitions. This add the stencil aspect to the pending clears if there is a depth clear pending and the stencil aspect is don't care. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Remove NV dedicated alloc extension.Bas Nieuwenhuizen2017-07-151-4/+0
| | | | | | | To not confuse apps in thinking it might be faster. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Andres Rodriguez <[email protected]>