path: root/src/amd
Commit message | Author | Age | Files | Lines
* radv: Do not decompress on LAYOUT_GENERAL. | Bas Nieuwenhuizen | 2019-08-07 | 1 | -3/+3
  We handle render loops properly now and STORAGE still disables DCC/TC-compat HTILE in general.
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Pass through render loop detection to internal layout decisions. | Bas Nieuwenhuizen | 2019-08-07 | 6 | -46/+100
  And do nothing with it yet. Everything outside a renderpass has no render loop.
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Add render loop detection in renderpass. | Bas Nieuwenhuizen | 2019-08-07 | 2 | -0/+19
  VK spec 7.3: "Applications must ensure that all accesses to memory that backs image subresources used as attachments in a given renderpass instance either happen-before the load operations for those attachments, or happen-after the store operations for those attachments."
  So the only render loops we can have are with input attachments. Detect these.
  Reviewed-by: Dave Airlie <[email protected]>
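The detection described above can be sketched roughly as follows. This is a minimal illustration, not radv's actual code: `subpass_sketch`, `subpass_reads_as_input` and `subpass_has_render_loop` are hypothetical names, and `ATTACHMENT_UNUSED` merely stands in for `VK_ATTACHMENT_UNUSED`. The idea is that a render loop exists when an attachment written by a subpass (color or depth/stencil) is also read by the same subpass as an input attachment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VK_ATTACHMENT_UNUSED (assumption: same "all ones" value). */
#define ATTACHMENT_UNUSED UINT32_MAX

/* Hypothetical, simplified subpass description; not radv's real struct. */
struct subpass_sketch {
    const uint32_t *input_attachments;
    size_t input_count;
    const uint32_t *color_attachments;
    size_t color_count;
    uint32_t depth_stencil_attachment;
};

/* True if `attachment` is read as an input attachment by this subpass. */
static bool subpass_reads_as_input(const struct subpass_sketch *sp,
                                   uint32_t attachment)
{
    if (attachment == ATTACHMENT_UNUSED)
        return false;
    for (size_t i = 0; i < sp->input_count; i++) {
        if (sp->input_attachments[i] == attachment)
            return true;
    }
    return false;
}

/* A render loop: some color or depth/stencil attachment of the subpass
 * is also bound as an input attachment of the same subpass. */
static bool subpass_has_render_loop(const struct subpass_sketch *sp)
{
    assert(sp != NULL);
    for (size_t i = 0; i < sp->color_count; i++) {
        if (subpass_reads_as_input(sp, sp->color_attachments[i]))
            return true;
    }
    return subpass_reads_as_input(sp, sp->depth_stencil_attachment);
}
```

A driver could compute such a flag once per subpass at renderpass creation and consult it later when choosing image layouts, which is what the follow-up commits in this series appear to do.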
* radv: Fix config reg assert. | Bas Nieuwenhuizen | 2019-08-07 | 1 | -1/+1
  Using the wrong bounds
  Fixes: "219d6939df8 radv: add more assertions to make sure packets are correctly emitted"
  Reviewed-by: Andres Rodriguez <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: add support for nir atomic_inc_wrap/atomic_dec_wrap | Pierre-Eric Pelloux-Prayer | 2019-08-06 | 1 | -0/+25
  Reviewed-by: Marek Olšák <[email protected]>
* ac: add ac_atomic_inc_wrap / ac_atomic_dec_wrap support | Pierre-Eric Pelloux-Prayer | 2019-08-06 | 2 | -0/+4
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: handle key.mono.u.ps.interpolate_at_sample_force_center | Marek Olšák | 2019-08-06 | 2 | -0/+4
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac/nir: Use correct cast for readfirstlane and ptrs. | Bas Nieuwenhuizen | 2019-08-06 | 1 | -0/+2
  Fixes: 028ce527 "radv: Add non-uniform indexing lowering."
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Do non-uniform lowering before bool lowering. | Bas Nieuwenhuizen | 2019-08-06 | 1 | -1/+1
  Since it can introduce comparisons.
  Fixes: 028ce527395 "radv: Add non-uniform indexing lowering."
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Lower large indirect variables to scratch | Connor Abbott | 2019-08-05 | 1 | -0/+54
  results from radeonsi NIR:
  Totals from affected shaders:
  SGPRS: 704 -> 464 (-34.09 %)
  VGPRS: 2056 -> 672 (-67.32 %)
  Spilled SGPRs: 24 -> 0 (-100.00 %)
  Spilled VGPRs: 28406 -> 0 (-100.00 %)
  Private memory VGPRs: 0 -> 3182 (0.00 %)
  Scratch size: 1064 -> 3228 (203.38 %) dwords per thread
  Code Size: 935260 -> 40180 (-95.70 %) bytes
  LDS: 0 -> 0 (0.00 %) blocks
  Max Waves: 28 -> 70 (150.00 %)
  Wait states: 0 -> 0 (0.00 %)

  results from radv:
  Totals from affected shaders:
  SGPRS: 80 -> 48 (-40.00 %)
  VGPRS: 204 -> 108 (-47.06 %)
  Spilled SGPRs: 0 -> 0 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Private memory VGPRs: 0 -> 0 (0.00 %)
  Scratch size: 0 -> 256 (0.00 %) dwords per thread
  Code Size: 15792 -> 9504 (-39.82 %) bytes
  LDS: 0 -> 0 (0.00 %) blocks
  Max Waves: 1 -> 2 (100.00 %)
  Wait states: 0 -> 0 (0.00 %)

  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* meson: replace last uses of libxmlconfig with idep_xmlconfig | Eric Engestrom | 2019-08-03 | 1 | -2/+1
  Signed-off-by: Eric Engestrom <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Tested-by: Vinson Lee <[email protected]>
* meson: replace libmesa_util with idep_mesautil | Eric Engestrom | 2019-08-03 | 1 | -2/+2
  This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually.
  Next commit will remove the ones that were only added for that reason.
  Signed-off-by: Eric Engestrom <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Tested-by: Vinson Lee <[email protected]>
* radv: Expose VK_KHR_imageless_framebuffer. | Bas Nieuwenhuizen | 2019-08-02 | 2 | -0/+7
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Implement VK_KHR_imageless_framebuffer. | Bas Nieuwenhuizen | 2019-08-02 | 2 | -10/+38
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Store image view also outside framebuffer. | Bas Nieuwenhuizen | 2019-08-02 | 6 | -33/+31
  So we can use it with imageless framebuffers.
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Store color/depth surface info in attachment info instead of framebuffer. | Bas Nieuwenhuizen | 2019-08-02 | 7 | -104/+102
  That way we can use it for imageless framebuffers.
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir,radv: Optimize bounds check for 64 bit CAS. | Bas Nieuwenhuizen | 2019-08-02 | 7 | -17/+36
  When the application does not ask for robust buffer access.
  Only implemented the check in radv.
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: fix image_has_{cmask,fmask}() helpers | Samuel Pitoiset | 2019-08-02 | 1 | -2/+2
  The driver should now rely on cmask_offset because CMASK can be disabled by the driver for some reason (e.g. mipmaps).
  Apply the same change for FMASK, although it should be useless.
  Fixes: ad1bc8621df ("radv: remove radv_get_image_fmask_info()")
  Fixes: 10d08da52c6 ("radv/gfx10: add missing dcc_tile_swizzle tweak")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove radv_get_image_fmask_info() | Samuel Pitoiset | 2019-08-02 | 4 | -59/+25
  It's unnecessary to duplicate fields in another struct.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: add missing dcc_tile_swizzle tweak | Samuel Pitoiset | 2019-08-02 | 1 | -1/+3
  Fixes: c90f46700dd ("radv/gfx10: mask DCC tile swizzle by alignment")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove radv_get_image_cmask_info() | Samuel Pitoiset | 2019-08-02 | 4 | -45/+21
  It's unnecessary to duplicate fields in another struct.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only account for tile_swizzle for color surfaces with DCC | Samuel Pitoiset | 2019-08-02 | 1 | -3/+3
  It's 0 for depth surfaces with TC compat HTILE enabled.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Enable VK_KHR_shader_atomic_int64 | Bas Nieuwenhuizen | 2019-08-02 | 2 | -6/+3
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Implement LLVM9 64-bit buffer compare & exchange. | Bas Nieuwenhuizen | 2019-08-02 | 1 | -4/+64
  LLVM 9 does not have a 64-bit buffer compswap intrinsic, so this extracts the ptr, does a bound check and then uses a cmpxchg LLVM instruction.
  Not ideal, but the earliest release we're going to get a proper intrinsic is LLVM 10.
  Reviewed-by: Samuel Pitoiset <[email protected]>
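The sequence the message describes (extract the pointer, bounds check, then a plain compare-exchange) can be illustrated on the CPU with C11 atomics. This is only an analogy under stated assumptions: `buffer_sketch` and `buffer_cmpxchg64` are hypothetical names, the real commit emits LLVM IR rather than C, and returning 0 for an out-of-bounds access is an assumption made for this sketch, not something the commit message states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer descriptor: base pointer plus size in bytes. */
struct buffer_sketch {
    _Atomic uint64_t *base;
    size_t size_bytes;
};

/* Compare-and-swap a 64-bit value at `offset` bytes into the buffer.
 * Returns the value that was in memory before the operation.
 * Assumption for this sketch: an out-of-bounds access is skipped
 * entirely and reports 0. */
static uint64_t buffer_cmpxchg64(const struct buffer_sketch *buf, size_t offset,
                                 uint64_t compare, uint64_t value)
{
    assert(offset % sizeof(uint64_t) == 0);

    /* Bounds check, written so that it cannot overflow. */
    if (offset > buf->size_bytes ||
        buf->size_bytes - offset < sizeof(uint64_t))
        return 0;

    /* "Extract the pointer": base plus the element index. */
    _Atomic uint64_t *ptr = buf->base + offset / sizeof(uint64_t);

    /* On failure, `expected` is overwritten with the current value;
     * on success it still equals the old value. Either way it ends up
     * holding the pre-operation contents. */
    uint64_t expected = compare;
    atomic_compare_exchange_strong(ptr, &expected, value);
    return expected;
}
```

The related commit "Optimize bounds check for 64 bit CAS" suggests the check can be dropped when the application does not request robust buffer access; in this sketch that would correspond to skipping the `if` entirely.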
* Revert "ac/nir: handle negate modifier" | Connor Abbott | 2019-08-02 | 1 | -12/+1
  This reverts commit bfea7e4d2965269bff8f1f6449cb99c312fd7384.
* Revert "ac/nir: handle abs modifier" | Connor Abbott | 2019-08-02 | 1 | -30/+11
  This reverts commit d3c80733cdfe8552b2f447ec8ed62465d0f2af1a.
  These were only appearing due to memory corruption.
* radv: re-apply "Optimize rebinding the same descriptor set." | Samuel Pitoiset | 2019-08-02 | 1 | -1/+7
  This makes it cheaper to just change the dynamic offsets with the same descriptor sets.
  This optimization was reverted a while back because of random GPU hangs on GFX9. Now it looks fine: at least CTS no longer hangs on GFX9, and it doesn't hang on GFX10 either.
  It fixes a performance problem with Wolfenstein Youngblood.
  Suggested-by: Philip Rebohle <[email protected]>
* radv/gfx10: use the correct target machine for Wave32 | Samuel Pitoiset | 2019-08-02 | 3 | -10/+26
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders | Samuel Pitoiset | 2019-08-02 | 7 | -8/+26
  It can be enabled with RADV_PERFTEST=gewave32.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: add Wave32 support for fragment shaders | Samuel Pitoiset | 2019-08-02 | 7 | -2/+16
  It can be enabled with RADV_PERFTEST=pswave32.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: implement a GE bug workaround | Samuel Pitoiset | 2019-07-31 | 1 | -4/+23
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: remove an obsolete VGT_REUSE_OFF workaround | Samuel Pitoiset | 2019-07-31 | 1 | -6/+0
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: disable LATE_ALLOC_GS on Navi14 | Samuel Pitoiset | 2019-07-31 | 1 | -1/+8
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: implement a bug workaround for GE_PC_ALLOC | Samuel Pitoiset | 2019-07-31 | 2 | -17/+13
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: implement a bug workaround for NGG -> legacy transitions | Samuel Pitoiset | 2019-07-31 | 2 | -2/+21
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: skip draw calls with 0-sized index buffers | Samuel Pitoiset | 2019-07-31 | 1 | -0/+6
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* tree-wide: replace MAYBE_UNUSED with ASSERTED | Eric Engestrom | 2019-07-31 | 7 | -27/+27
  Suggested-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* radv: drop incorrect MAYBE_UNUSED | Eric Engestrom | 2019-07-31 | 1 | -2/+2
  `compressed` is clearly always used on the line right after.
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* radv/gfx10: add Wave32 support for compute shaders | Samuel Pitoiset | 2019-07-31 | 7 | -6/+53
  It can be enabled with RADV_PERFTEST=cswave32.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: fix incorrect Phis if callbacks use control flow inside control flow | Marek Olšák | 2019-07-30 | 1 | -2/+2
* ac/nir: handle abs modifier | Marek Olšák | 2019-07-30 | 1 | -11/+30
* ac: fix a memory leak in the error path of ac_build_type_name_for_intr | Marek Olšák | 2019-07-30 | 1 | -0/+1
* ac: allow control flow statements in NIR callbacks | Marek Olšák | 2019-07-30 | 2 | -20/+29
  This fixes a crash when compiling geometry shaders on radeonsi.
* ac/nir: handle negate modifier | Marek Olšák | 2019-07-30 | 1 | -1/+12
* radeonsi/nir: implement FBFETCH for KHR_blend_equation_advanced | Marek Olšák | 2019-07-30 | 2 | -0/+7
* radeonsi: adjust RB+ blend optimization settings | Marek Olšák | 2019-07-30 | 1 | -1/+1
  based on PAL
* ac/surface: allow linear swizzle mode automatic selection on gfx9 & 10 | Marek Olšák | 2019-07-30 | 1 | -1/+0
  let addrlib make the decision to get the same result as PAL.
* radv: Fix descriptor set allocation failure. | Bas Nieuwenhuizen | 2019-07-30 | 1 | -1/+5
  Set all the handles to VK_NULL_HANDLE:
  "If the creation of any of those descriptor sets fails, then the implementation must destroy all successfully created descriptor set objects from this command, set all entries of the pDescriptorSets array to VK_NULL_HANDLE and return the error." (Vulkan 1.1.117 Spec, section 13.2)
  CC: <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
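The failure handling required by the quoted spec text can be sketched as below. This is a minimal illustration with hypothetical names (`set_sketch`, `create_set`, `allocate_sets`); the real driver works with `VkDescriptorSet` handles and its own pool allocator. The `fail_at` parameter exists only to simulate an allocation failure at a chosen index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a descriptor set object. */
typedef struct set_sketch {
    int id;
} set_sketch;

/* Hypothetical allocator that fails on purpose when index == fail_at. */
static set_sketch *create_set(size_t index, size_t fail_at)
{
    if (index == fail_at)
        return NULL;
    set_sketch *s = malloc(sizeof(*s));
    if (s)
        s->id = (int)index;
    return s;
}

/* Allocate `count` sets into `out`. On any failure, destroy the sets
 * that were already created and set ALL entries of `out` to NULL,
 * including the ones that were never reached. */
static bool allocate_sets(set_sketch **out, size_t count, size_t fail_at)
{
    assert(out != NULL || count == 0);

    size_t created;
    for (created = 0; created < count; created++) {
        out[created] = create_set(created, fail_at);
        if (!out[created])
            break;
    }
    if (created == count)
        return true;

    for (size_t i = 0; i < created; i++)
        free(out[i]);
    for (size_t i = 0; i < count; i++)
        out[i] = NULL;
    return false;
}
```

The second cleanup loop is the important part: entries past the failing index may still hold whatever the caller left in the array, so they have to be cleared explicitly.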
* radv: fix queries with WAIT_BIT returning VK_NOT_READY | Andres Rodriguez | 2019-07-27 | 1 | -1/+1
  When vkGetQueryPoolResults() is called with VK_QUERY_RESULT_WAIT_BIT set, the driver is supposed to wait for the query to become available before returning.
  Currently, radv returns once the query is indeed ready, but it returns VK_NOT_READY. It also fails to populate the results.
  The problem is a missing volatile in the secondary check for query availability. This patch removes the secondary check altogether since it is redundant with the preceding loop.
  This bug was found with an unreleased version of SteamVR.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: only compile the GS copy shader on-demand | Samuel Pitoiset | 2019-07-30 | 1 | -1/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>