path: root/src/amd
Commit message | Author | Age | Files | Lines
* radv/gfx9: handle GFX9 opaque metadata | David Airlie | 2017-08-16 | 1 | -4/+5
  port the opaque metadata changes from radeonsi for gfx9.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: "17.2" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: emit db_htile_surface reg on gfx9 as well | David Airlie | 2017-08-16 | 1 | -1/+2
  This is also a GFX9 register.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: "17.2" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: remove some leftover gfx6 descriptor setup. | Dave Airlie | 2017-08-16 | 1 | -4/+0
  We set this later in the non-gfx9 path, just remove these bits from here.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: "17.2" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/gfx9: fix set predication packet. | Dave Airlie | 2017-08-16 | 1 | -9/+12
  The predication packet changed format on GFX9, update the driver.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: "17.2" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* ac: fail shader compilation if libelf is replaced by an incompatible version | Marek Olšák | 2017-08-10 | 2 | -3/+11
  UE4Editor has this issue. This commit prevents hangs (release build) or assertion failures (debug build). It doesn't fix the editor, but catastrophic scenarios are prevented.
  Cc: 17.1 17.2 <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: force cs/ps/l2 flush at end of command stream. (v2) | Dave Airlie | 2017-08-09 | 1 | -1/+4
  This seems like a workaround, but we don't see the bug on CIK/VI. On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.* tests, when one test completes, the first flush at the start of the next test causes a VM fault as we've destroyed the VM, but we end up flushing the compute shader then, and it must still be in the process of doing something. Could also be a kernel difference between SI and CIK.
  v2: hit this with a bigger hammer. This fixes a bunch of hangs in the vk cts with the robustness tests.
  Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver")
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: fix saturate emission | Connor Abbott | 2017-08-08 | 1 | -2/+2
  The .f32 was already getting added by emit_intrin_2f_param(). Noticed when enabling LLVM module verification.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove semicolon in if(...); | Bas Nieuwenhuizen | 2017-08-08 | 1 | -1/+1
  Trivial.
  Fixes: a6a6146aa91 "radv: Don't allow fmask swizzling for shareable images."
* radv: Fix decompression on multisampled depth buffers | Alex Smith | 2017-08-07 | 2 | -35/+69
  Need to take the sample count into account in the depth decompress and resummarize pipelines and render pass.
  Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver")
  Signed-off-by: Alex Smith <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: "17.2" <[email protected]>
* radv: Don't allow fmask swizzling for shareable images. | Bas Nieuwenhuizen | 2017-08-07 | 1 | -1/+4
  Also adds an assert because you never know how the winsys changes, and multiprocess format differences are annoying.
  Fixes: 1e696b962b7 "radv: add separate fmask tile swizzle counter."
  Reviewed-by: Dave Airlie <[email protected]>
* radv: fix MSAA on SI gpus. | Dave Airlie | 2017-08-07 | 1 | -3/+7
  This ports the workaround from radeonsi, that was missing in radv. This fixes Talos rendering when MSAA is enabled on my Tahiti card.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
  Signed-off-by: Dave Airlie <[email protected]>
* radv: add separate fmask tile swizzle counter. | Dave Airlie | 2017-08-07 | 3 | -3/+11
  This mirrors what Marek has done for radeonsi, and uses a separate counter to handle the fmask surface for MSAA MRTs.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: fix f16->f32 denorm handling for SI/CIK. (v2) | Dave Airlie | 2017-08-07 | 1 | -2/+16
  This just copies the code from the -pro shaders, and fixes the tests on CIK. With this CIK passes the same set of conformance tests as VI.
  Fixes: 83e58b03 (radv: flush f32->f16 conversion denormals to zero. (v2))
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: Use the correct channel for alpha in resolve srgb conversion. | Bas Nieuwenhuizen | 2017-08-06 | 1 | -1/+1
  The argument here is a bitmask, so the old code selected .xy, which got silently truncated to .x when constructing the vec4 from components, instead of using .w.
  Fixes: 588185eb6b7 "radv/meta: add srgb conversion to end of resolve shader."
  Reviewed-by: Dave Airlie <[email protected]>
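  The index-versus-bitmask mix-up described in this message can be illustrated with a small standalone sketch. The helper below is hypothetical and is not radv or NIR code; it only shows why passing the component index 3 where a 4-bit component mask is expected selects .xy, while the mask for .w is 1 << 3.

```c
#include <stddef.h>

/*
 * Hypothetical helper: expand a 4-bit component mask into channel letters.
 * Bit 0 selects x, bit 1 selects y, bit 2 selects z, bit 3 selects w.
 * `out` must have room for at least 5 bytes (4 letters plus the terminator).
 */
static void mask_to_swizzle(unsigned mask, char out[5])
{
    static const char names[4] = {'x', 'y', 'z', 'w'};
    size_t n = 0;

    for (unsigned i = 0; i < 4; i++) {
        if (mask & (1u << i))
            out[n++] = names[i];
    }
    out[n] = '\0';
}
```

  Under these assumptions, calling the helper with 3 (the index of w mistakenly used as a mask) yields "xy", and calling it with 1u << 3 yields "w".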
* radv: Only convert linear->srgb in compute resolves. | Bas Nieuwenhuizen | 2017-08-06 | 5 | -79/+54
  It just works with the fragment shader resolve, so no need to do a custom conversion. In fact with SRGB dest, it actually gives wrong results.
  Fixes: 69136f4e633 "radv/meta: add resolve pass using fragment/vertex shaders"
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Don't use SRGB format for image stores during resolve. | Bas Nieuwenhuizen | 2017-08-06 | 2 | -1/+24
  These seem to store very bogus results. Luckily there is some code that converts srgb->linear already, so just making the descriptor format UNORM should work.
  Fixes: 588185eb6b7 "radv/meta: add srgb conversion to end of resolve shader."
  Reviewed-by: Dave Airlie <[email protected]>
* radv: generate the same driver UUID as radeonsi | Andres Rodriguez | 2017-08-06 | 2 | -1/+9
  These need to match for interop compatibility queries.
  Signed-off-by: Andres Rodriguez <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* radv: generate same device UUID as radeonsi | Andres Rodriguez | 2017-08-06 | 1 | -7/+4
  This is required for interop use cases. The same device must report identical UUIDs through the GL and Vulkan APIs so that users can identify when it is safe to perform a memory object import.
  v2: use ac helpers to calculate the uuid
  Signed-off-by: Andres Rodriguez <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* ac/gpu: add driver/device UUID query helpers | Andres Rodriguez | 2017-08-06 | 2 | -0/+32
  We need vulkan and gl to produce the same UUIDs. Therefore we should keep the mechanism to compute these in a common location to guarantee they are updated in lockstep.
  Signed-off-by: Andres Rodriguez <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radv: avoid GPU hangs if someone does a resolve with non-multisample src (v2) | Dave Airlie | 2017-08-05 | 1 | -0/+5
  This is a bug in the app, but I'd rather avoid hanging the GPU, esp if someone is running in validation and it takes out their development environment.
  v2: get it right, reverse the polarity.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: also fix texture image descriptors for mipmap tile swizzle | Dave Airlie | 2017-08-04 | 1 | -1/+2
  This fixes the image descriptors for mipmapped tile swizzle.
  Fixes: 2b7e8556 (ac/surface: enable tile swizzle for mipmapped textures)
  Signed-off-by: Dave Airlie <[email protected]>
* radv: fix tile swizzle regression on mipmaps. | Dave Airlie | 2017-08-04 | 1 | -5/+6
  When Marek enabled mipmapped swizzle, radv didn't have the code in place to handle it. This fixes the regression. I'll look more into GFX9 once I have a vega card (soon).
  Fixes: 2b7e8556 (ac/surface: enable tile swizzle for mipmapped textures)
  Signed-off-by: Dave Airlie <[email protected]>
* ac/surface: align DCC size for surfaces that use tile swizzle | Marek Olšák | 2017-08-04 | 1 | -2/+9
  Note that dcc_alignment = pipe_interleave_bytes * num_pipes * num_banks, which is greater than the previous open-coded alignment.
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
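  As a rough illustration of the formula in this message, the sketch below computes the alignment and rounds a size up to it. The struct and function names are hypothetical and are not taken from ac_surface.c; only the formula itself comes from the commit message, and the example values are made up.

```c
#include <stdint.h>

/* Hypothetical container for the three factors named in the commit message. */
struct dcc_align_params {
    uint32_t pipe_interleave_bytes;
    uint32_t num_pipes;
    uint32_t num_banks;
};

/* dcc_alignment = pipe_interleave_bytes * num_pipes * num_banks */
static uint64_t dcc_alignment(const struct dcc_align_params *p)
{
    return (uint64_t)p->pipe_interleave_bytes * p->num_pipes * p->num_banks;
}

/* Round size up to the next multiple of alignment (alignment must be non-zero). */
static uint64_t align_up_u64(uint64_t size, uint64_t alignment)
{
    return (size + alignment - 1) / alignment * alignment;
}
```

  With illustrative values of 256 interleave bytes, 4 pipes and 8 banks, the alignment is 8192 bytes, so a DCC size of 8193 bytes would be padded to 16384.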
* ac/surface: limit tile swizzle to non-mipmaps on SI | Marek Olšák | 2017-08-04 | 1 | -1/+3
  Mipmapping with tile swizzle doesn't work.
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: enable tile swizzle for mipmapped textures | Marek Olšák | 2017-08-04 | 1 | -34/+46
  The tile swizzle computation was done after the whole miptree was computed, but that was too late, because at that point AddrSurfInfoOut contained information about the smallest miplevel, which is never 2D-tiled. The correct way is to do the computation before the second level is computed.
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: set structure size and handle errors for AddrComputeBaseSwizzle | Marek Olšák | 2017-08-04 | 1 | -1/+8
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: increment surf_index only when tile swizzle is allowed | Marek Olšák | 2017-08-04 | 3 | -4/+6
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: compute tile swizzle only when it's allowed | Marek Olšák | 2017-08-04 | 1 | -2/+4
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: add RADEON_SURF_SHAREABLE | Marek Olšák | 2017-08-04 | 1 | -0/+1
  Shareable textures won't use tile swizzle.
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: remove RADEON_SURF_HAS_TILE_MODE_INDEX | Marek Olšák | 2017-08-04 | 3 | -5/+0
  it's useless
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/surface: move tile_swizzle to ac_surface and document it | Marek Olšák | 2017-08-04 | 4 | -8/+25
  Gfx9 will use it too.
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* android: ac/common: always build NIR translation | Mauro Rossi | 2017-08-03 | 1 | -1/+2
  Android build changes to avoid the following build error:
    external/mesa/src/gallium/drivers/radeonsi/si_shader_nir.c:505: error: undefined reference to 'ac_nir_translate'
  Fixes: 86d4b46d66 "ac/common: always build NIR translation"
  Reviewed-by: Emil Velikov <[email protected]>
* ac: add ac_shader_abi.h in distcheck | Juan A. Suarez Romero | 2017-08-03 | 1 | -0/+1
  Fixes:
    CXXLD addrlib/libamdgpu_addrlib.la
    ar: `u' modifier ignored since `D' is the default (see `U')
    ../../../../src/amd/common/ac_nir_to_llvm.c:33:27: fatal error: ac_shader_abi.h: No such file or directory
     #include "ac_shader_abi.h"
                               ^
    compilation terminated.
    Makefile:985: recipe for target 'common/common_libamd_common_la-ac_nir_to_llvm.lo' failed
  When running `make distcheck`
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Signed-off-by: Juan A. Suarez Romero <[email protected]>
* radv: Add suballocation for shaders. | Bas Nieuwenhuizen | 2017-08-03 | 5 | -21/+93
  This reduces the number of BOs that we need for the BO lists during a submission. Currently uses a fairly simple linear search for finding free space, which could eventually be improved to a binary tree that, with some per-node info, could make a check for space O(1) and finding it O(log n) in the number of buffers in that slab.
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
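  The linear search mentioned in this message could look roughly like the first-fit sketch below. This is not the radv implementation: the struct, the function name, and the assumption that allocations are kept sorted by offset are all hypothetical, and alignment requirements are left out for brevity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one shader already placed in a slab. */
struct shader_range {
    uint64_t offset;
    uint64_t size;
};

/*
 * First-fit linear search over one slab.
 * `used` must be sorted by offset and must not contain overlapping ranges.
 * Returns the offset of the first gap of at least `size` bytes,
 * or UINT64_MAX if the slab has no such gap.
 */
static uint64_t slab_find_free(const struct shader_range *used, size_t count,
                               uint64_t slab_size, uint64_t size)
{
    uint64_t cursor = 0; /* end of the previous allocation */

    for (size_t i = 0; i < count; i++) {
        /* Gap between the previous allocation and this one. */
        if (used[i].offset - cursor >= size)
            return cursor;
        cursor = used[i].offset + used[i].size;
    }

    /* Space left after the last allocation. */
    if (slab_size >= cursor && slab_size - cursor >= size)
        return cursor;

    return UINT64_MAX;
}
```

  The cost is O(n) in the number of shaders in the slab, which is the trade-off the message accepts in exchange for simplicity.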
* radeonsi: fix streamout overflow predication on VI+ | Nicolai Hähnle | 2017-08-02 | 1 | -0/+1
  There is a firmware regression that causes failures. Work around it by using the compute shader for query_buffer_objects to summarize the query results.
  v2: rename to PREDICATION_OP_BOOL64 (consistent with sid.h)
  Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Add float cast before shadow comparator clamp. | Bas Nieuwenhuizen | 2017-08-02 | 1 | -1/+2
  LLVM complained about passing an i32 to a float clamp.
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Fixes: 0f9e32519bb "ac/nir: clamp shadow texture comparison value on VI"
  Reviewed-by: Marek Olšák <[email protected]>
* radeon/ac: use ds_swizzle for derivs on si/cik. | Dave Airlie | 2017-08-02 | 3 | -24/+43
  This looks like it's supported since llvm 3.9 at least, so switch over radeonsi and radv to using it, -pro also uses this. We can now drop creating lds for these operations as the ds_swizzle operation doesn't actually write to lds at all.
  Acked-by: Marek Olšák <[email protected]>
  (stable requested due to fixing radv CIK conformance tests)
  Cc: [email protected]
  Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: fix nir_op_unpack_64_2x32_split_y emission | Connor Abbott | 2017-08-01 | 1 | -1/+1
  This was broken thanks to a typo in b2367cf.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/nir: fix lsb emission | Connor Abbott | 2017-08-01 | 1 | -1/+11
  This makes it match radeonsi. The LLVM backend itself will emit the correct instruction, but LLVM might do incorrect optimizations since it thinks the output is undefined when the input is 0, even though it's not supposed to be. We really need a new intrinsic, or for the backend to become smarter and recognize this pattern.
  Cc: [email protected]
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
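  The semantics this message is protecting can be sketched in plain C. The function below is a hypothetical stand-in, not the LLVM IR the driver emits; it assumes the intended result for a zero input is -1, as in GLSL's findLSB, and makes that case explicit instead of relying on a count-trailing-zeros primitive whose result is undefined for zero.

```c
#include <stdint.h>

/*
 * Hypothetical reference for "find least significant set bit".
 * Returns the index of the lowest set bit, or -1 when value is 0
 * (assumed to follow the GLSL findLSB convention).
 */
static int32_t find_lsb(uint32_t value)
{
    int32_t index = 0;

    /* Explicit zero check: the loop below would never terminate for 0. */
    if (value == 0)
        return -1;

    while ((value & 1u) == 0) {
        value >>= 1;
        index++;
    }
    return index;
}
```

  The explicit check plays the same role as guarding an "undefined on zero" primitive with a select on the input, which is the pattern the message hopes the backend will eventually recognize.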
* radv: handle 10-bit format clamping workaround. | Dave Airlie | 2017-08-01 | 8 | -16/+51
  This fixes: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.* for a2r10g10b10 formats as destination on SI/CIK hardware.
  This adds support to the meta program for emitting 10-bit outputs, and adds 10-bit support to the fragment shader key. It also only does the int8/10 on SI/CIK.
  Fixes: f4e499ec7 (radv: add initial non-conformant radv vulkan driver)
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: Don't underflow non-visible VRAM size. | Bas Nieuwenhuizen | 2017-07-31 | 1 | -2/+4
  In some APU situations the reported visible size can be larger than VRAM size. This properly clamps the value.
  Surprisingly both CTS and spec seem to allow a heap type with size 0, so this seemed like the easiest option to me.
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Fixes: 4ae84efbc5c "radv: Use enum for memory heaps."
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
  Tested-by: Michel Dänzer <[email protected]>
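  A minimal sketch of the clamping described here, assuming both sizes are unsigned 64-bit byte counts. The function and parameter names are hypothetical and do not come from the radv source; the point is only that an unsigned subtraction must be guarded when the visible size can exceed the total.

```c
#include <stdint.h>

/*
 * Hypothetical helper: size of the non-CPU-visible VRAM heap.
 * Clamps the visible size to the total first, so the unsigned
 * subtraction cannot wrap around; the result may legitimately be 0.
 */
static uint64_t non_visible_vram_size(uint64_t vram_size, uint64_t visible_vram_size)
{
    uint64_t visible = visible_vram_size < vram_size ? visible_vram_size : vram_size;

    return vram_size - visible;
}
```

  Without the clamp, a visible size larger than the VRAM size would wrap to a value close to 2^64 and be reported as an enormous heap.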
* ac/common: always build NIR translation | Nicolai Hähnle | 2017-07-31 | 1 | -7/+2
  radeonsi needs it now, and we require LLVM 3.9 anyway.
  Fixes a build with radeonsi but not radv.
* ac/nir: implement load_frag_coord intrinsic | Nicolai Hähnle | 2017-07-31 | 1 | -0/+10
  Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: pass ac_llvm_context to unpack_param | Nicolai Hähnle | 2017-07-31 | 1 | -18/+18
  Reviewed-by: Marek Olšák <[email protected]>
* ac/nir,radeonsi: add and use ac_shader_abi::frag_pos | Nicolai Hähnle | 2017-07-31 | 2 | -13/+18
  v2: update for LLVMValueRefs in ac_shader_abi
  Reviewed-by: Marek Olšák <[email protected]>
* ac/nir,radeonsi: add and use ac_shader_abi::{ancillary,sample_coverage} | Nicolai Hähnle | 2017-07-31 | 2 | -6/+6
  v2: update for LLVMValueRefs in ac_shader_abi
  Reviewed-by: Marek Olšák <[email protected]>
* ac/nir,radv: move force_persample to ac_shader_info::force_persample | Nicolai Hähnle | 2017-07-31 | 6 | -6/+10
  Avoid accessing radv-specific structures during the meat of NIR-to-LLVM translation.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use new function ac_build_umin for edgeflag clamping | Nicolai Hähnle | 2017-07-31 | 2 | -0/+8
  Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: clamp shadow texture comparison value on VI | Nicolai Hähnle | 2017-07-31 | 1 | -1/+13
  Needed for TC-compatible HTILE in radeonsi for test cases like piglit spec/arb_texture_rg/execution/fs-shadow2d-red-01.shader_test
  Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: add always_vector argument to ac_build_gather_values_extended | Nicolai Hähnle | 2017-07-31 | 3 | -19/+13
  This simplifies a bunch of places that no longer need special treatment of value_count == 1. We rely on LLVM to optimize away the 1-element vector types.
  This fixes a bunch of bugs where 1-element arrays are indexed indirectly.
  Reviewed-by: Marek Olšák <[email protected]>