path: root/src/amd
Commit message | Author | Age | Files | Lines
* radv: merge radv_dcc_clear_level() into radv_clear_dcc() | Samuel Pitoiset | 2019-07-02 | 1 | -30/+22
  This will help for clearing DCC arrays because we need to know the subresource range.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for decompressing DCC layers with compute | Samuel Pitoiset | 2019-07-02 | 1 | -51/+53
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: compute the DCC fast clear size per slice on GFX8 | Samuel Pitoiset | 2019-07-02 | 2 | -0/+28
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: compute the size of one DCC slice on GFX8 | Samuel Pitoiset | 2019-07-02 | 2 | -0/+7
  Addrlib doesn't provide this info. Because DCC is linear, at least on GFX8, it's easy to compute the size of one slice.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Only allocate supplied number of descriptors when variable. | Bas Nieuwenhuizen | 2019-07-01 | 1 | -1/+7
  Fixes: b5e04e9217b "radv: Support allocating variable size descriptor sets."
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nir: Add lower_rotate flag and set to true in all drivers | Sagar Ghuge | 2019-07-01 | 1 | -0/+1
  Signed-off-by: Sagar Ghuge <[email protected]>
  Suggested-by: Matt Turner <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* radv: rework how the number of VGPRs is computed | Samuel Pitoiset | 2019-07-01 | 3 | -26/+31
  Just a cleanup, it shouldn't change anything.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: gather if a vertex shader needs the instance ID | Samuel Pitoiset | 2019-07-01 | 1 | -4/+14
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix decompressing DCC levels with compute | Samuel Pitoiset | 2019-07-01 | 1 | -1/+7
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: the number of VGPR_COMP_CNT for GS is expected to be 0 on GFX8 | Samuel Pitoiset | 2019-07-01 | 1 | -1/+1
  Just move around the switch case. GFX9+ is handled below.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: reduce number of VGPRs for TESS_EVAL if primitive ID is not used | Samuel Pitoiset | 2019-07-01 | 1 | -3/+10
  We only need 2.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make sure to mark the image as compressed when clearing DCC levels | Samuel Pitoiset | 2019-07-01 | 3 | -27/+8
  Found while working on DCC for arrays.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: change ac_query_gpu_info() signature | Emil Velikov | 2019-06-28 | 2 | -4/+3
  Currently libdrm_amdgpu provides a typedef of the various handles. While the goal was to make those opaque, it effectively became part of the API.

  To the best of my knowledge there are two ways to have opaque handles:
  - "typedef void *foo;" - rather messy IMHO
  - "struct foo;" and use "struct foo *" through the API

  In our case amdgpu_device_handle is used only internally, plus respective code is not used or applicable for r300 and r600. Hence we copied the typedef.

  Seemingly this will be a problem since libdrm_amdgpu wants to change the API, while not updating the code(?).

  Either way, we can safely s/amdgpu_device_handle/void */ and carry on.

  Cc: Michel Dänzer <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Marek Olšák <marek.olsak at amd.com>
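The second option named in the message above, a forward-declared struct that callers only touch through pointers, can be sketched as follows. This is a minimal illustration and not libdrm_amdgpu or Mesa code: `struct foo` is taken from the message, while `foo_create`, `foo_value`, and `foo_destroy` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

/* Public view: only a forward declaration, so callers cannot see the fields. */
struct foo;

struct foo *foo_create(int value);
int foo_value(const struct foo *f);
void foo_destroy(struct foo *f);

/* Private view: in a real project this definition would live in a separate
 * .c file that API users never include. */
struct foo {
    int value;
};

struct foo *foo_create(int value)
{
    struct foo *f = malloc(sizeof(*f));
    if (f)
        f->value = value;
    return f;
}

int foo_value(const struct foo *f)
{
    assert(f != NULL);
    return f->value;
}

void foo_destroy(struct foo *f)
{
    free(f);
}
```

Unlike `typedef void *foo;`, the compiler still rejects passing an unrelated pointer type where a `struct foo *` is expected, which is likely why the message calls the typedef variant messy.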
* radv: only enable VK_AMD_gpu_shader_{half_float,int16} on GFX9+ | Samuel Pitoiset | 2019-06-28 | 1 | -2/+2
  These two extensions are supported on GFX8 but the throughput of 16-bit floats/integers is the same as 32-bit. Also, shaderInt16 is only enabled on GFX9+ for the same reason, so be more consistent.

  This fixes a crash with Wolfenstein II because it expects shaderInt16 to be enabled when VK_AMD_gpu_shader_half_float is exposed.

  Note that AMDVLK only enables these extensions on GFX9+.

  Cc: 19.1 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add si_emit_ia_multi_vgt_param() helper | Samuel Pitoiset | 2019-06-28 | 1 | -9/+25
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only export clip/cull distances if PS reads them | Samuel Pitoiset | 2019-06-27 | 3 | -4/+15
  The only exception is the GS copy shader which emits them unconditionally.

  Totals from affected shaders:
  SGPRS: 71320 -> 71008 (-0.44 %)
  VGPRS: 54372 -> 54240 (-0.24 %)
  Code Size: 2952628 -> 2941368 (-0.38 %) bytes
  Max Waves: 9689 -> 9723 (0.35 %)

  This helps Dota2, Doom, GTAV and Hitman 2.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix FMASK expand if layerCount is VK_REMAINING_ARRAY_LAYERS | Samuel Pitoiset | 2019-06-27 | 1 | -1/+1
  This doesn't fix anything known, but it's likely going to break if layerCount is ~0U.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
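To illustrate the hazard: VK_REMAINING_ARRAY_LAYERS is ~0U, so using layerCount directly as a loop bound or size multiplier overshoots the image. A minimal sketch of resolving the sentinel first is shown below. The helper name `resolve_layer_count` and the local macro are hypothetical (the macro is redefined so the sketch builds without vulkan.h); this is not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/* Same value as Vulkan's VK_REMAINING_ARRAY_LAYERS (~0U), redefined locally
 * so this sketch does not depend on vulkan.h. */
#define REMAINING_ARRAY_LAYERS (~0U)

/* Hypothetical helper: turn the "all remaining layers" sentinel into a real
 * layer count before it is used for any size or loop computation. */
uint32_t resolve_layer_count(uint32_t array_size, uint32_t base_layer,
                             uint32_t layer_count)
{
    assert(base_layer < array_size);

    if (layer_count == REMAINING_ARRAY_LAYERS)
        return array_size - base_layer;

    return layer_count;
}
```

For a 6-layer image with baseArrayLayer 2, the sentinel resolves to 4 layers rather than roughly four billion.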
* radv: rename and re-document cache flush flags | Samuel Pitoiset | 2019-06-25 | 10 | -58/+62
  SMEM and VMEM caches are L0 on gfx10. Ported from RadeonSI.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set DISABLE_CONSTANT_ENCODE_REG to 1 for Raven2 | Samuel Pitoiset | 2019-06-25 | 3 | -1/+9
  Ported from RadeonSI, will be emitted for GFX10 too.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clear CMASK layers instead of the whole buffer on GFX8 | Samuel Pitoiset | 2019-06-25 | 7 | -18/+35
  This reduces the size of fill operations needed to clear CMASK for layered color textures. GFX9 unsupported for now.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clear FMASK layers instead of the whole buffer on GFX8 | Samuel Pitoiset | 2019-06-25 | 8 | -10/+31
  This reduces the size of fill operations needed to clear FMASK for layered color textures. GFX9 unsupported for now.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
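As a rough illustration of why clearing per layer shrinks the fill for CMASK or FMASK, the sketch below computes the byte range covering only the requested layers. It assumes each layer's metadata occupies a fixed-size, contiguous slice; `struct fill_range`, `layer_fill_range`, and the `slice_size` parameter are hypothetical names, not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one buffer fill: where it starts and how many
 * bytes it covers inside the metadata buffer. */
struct fill_range {
    uint64_t offset;
    uint64_t size;
};

/* Assumes the metadata of every layer is a contiguous slice of slice_size
 * bytes, so a layer range maps to one contiguous byte range. */
struct fill_range layer_fill_range(uint64_t slice_size, uint32_t array_size,
                                   uint32_t base_layer, uint32_t layer_count)
{
    assert(base_layer < array_size);
    assert(layer_count <= array_size - base_layer);

    struct fill_range range = {
        .offset = slice_size * base_layer,
        .size = slice_size * layer_count,
    };
    return range;
}
```

With 8 layers of 4096 bytes each, clearing layers 2 to 4 fills 12288 bytes starting at offset 8192 instead of the full 32768 bytes.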
* radv: always initialize levels without DCC as fully expanded | Samuel Pitoiset | 2019-06-25 | 1 | -17/+15
  This fixes a rendering issue with RoTR/DXVK.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: set the calling convention for inlined function calls | Marek Olšák | 2019-06-24 | 2 | -0/+11
  otherwise the behavior is undefined
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Dieter Nützel <[email protected]>
* amd/rtld: update the ELF representation of LDS symbols | Nicolai Hähnle | 2019-06-24 | 1 | -7/+27
  The initial prototype used a processor-specific symbol type, but feedback suggests that an approach using a processor-specific section name that encodes the alignment, analogous to SHN_COMMON symbols, is preferred.

  This patch keeps both variants around for now to reduce problems with LLVM compatibility as we switch branches around.

  This also cleans up the error reporting in this function.

  Tested-by: Dieter Nützel <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/surface: remove addrlib_family_rev_id | Marek Olšák | 2019-06-24 | 3 | -108/+7
  Tested-by: Dieter Nützel <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: lower bitfield_extract to ubfe/ibfe. | Daniel Schürmann | 2019-06-24 | 3 | -35/+21
  Reviewed-by: Connor Abbott <[email protected]>
* amd/common: lower bitfield_insert to bfm & bitfield_select | Daniel Schürmann | 2019-06-24 | 2 | -26/+26
  Reviewed-by: Connor Abbott <[email protected]>
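The subject above names the target operations but not their semantics. The sketch below shows what such a lowering amounts to, under the assumption that bfm builds a bit mask from a width and an offset and that bitfield_select is a bitwise mux. The C functions are illustrative stand-ins with hypothetical names, not Mesa code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bfm semantics: a mask of `bits` ones starting at bit `offset`,
 * with both operands reduced to their low 5 bits. */
uint32_t bfm(uint32_t bits, uint32_t offset)
{
    bits &= 0x1F;
    offset &= 0x1F;
    return ((1u << bits) - 1u) << offset;
}

/* Assumed bitfield_select semantics: take bits from `a` where the mask is
 * set and from `b` where it is clear. */
uint32_t bitfield_select(uint32_t mask, uint32_t a, uint32_t b)
{
    return (mask & a) | (~mask & b);
}

/* bitfield_insert(base, insert, offset, bits) expressed with the two
 * operations above. A full 32-bit insert cannot be encoded in the 5-bit
 * width, so it is special-cased to return `insert` unchanged. */
uint32_t bitfield_insert_lowered(uint32_t base, uint32_t insert,
                                 uint32_t offset, uint32_t bits)
{
    if (bits > 31)
        return insert;

    return bitfield_select(bfm(bits, offset), insert << (offset & 0x1F), base);
}
```

For example, inserting the 8-bit value 0xAB at offset 8 into 0xFFFF0000 builds the mask 0x0000FF00 and yields 0xFFFFAB00.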
* radv: add support for VK_AMD_buffer_marker | Samuel Pitoiset | 2019-06-24 | 2 | -0/+36
  This simple extension might be useful for debugging purposes. GAPID has support for it.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* android: winsys/amdgpu,radv: fix generated amdgfxregs.h header dependencies | Mauro Rossi | 2019-06-21 | 2 | -2/+3
  Fix Android build errors in winsys/amdgpu and radv due to 'amdgfxregs.h' not found.

  Changelog:
  amd/common - generated $(intermediated)/common path is added to exports
  winsys/amdgpu - libmesa_amd_common static dependency is added
  radv - correct generated $(intermediated)/common path is added to includes

  Fixes: f480b8a ("amd/common: use generated register header")
  Signed-off-by: Mauro Rossi <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radv: add support for VK_KHR_depth_stencil_resolve | Samuel Pitoiset | 2019-06-21 | 2 | -0/+22
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: pass sample locations for transitions before depth/stencil resolves | Samuel Pitoiset | 2019-06-21 | 3 | -1/+34
  HTILE decompressions need the user sample locations if specified in the current subpass.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clear the depth/stencil resolve attachment if necessary | Samuel Pitoiset | 2019-06-21 | 1 | -18/+55
  The driver might need to clear one aspect of the depth/stencil resolve attachment before performing the resolve itself.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: decompress HTILE if the resolve src image is compressed | Samuel Pitoiset | 2019-06-21 | 1 | -1/+17
  It's required to decompress HTILE before resolving with the compute path.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: select the depth/stencil resolve method based on some conditions | Samuel Pitoiset | 2019-06-21 | 1 | -13/+65
  Only fall back to the compute path for layers.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
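The selection rule stated above (graphics by default, compute only when layers are involved) can be sketched as a small decision function. The enum, the function name, and the idea of checking both images' layer counts are assumptions made for illustration; the actual conditions in the driver may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the two resolve paths mentioned in this log. */
enum resolve_method {
    RESOLVE_METHOD_GRAPHICS,
    RESOLVE_METHOD_COMPUTE,
};

/* Prefer the graphics path and fall back to compute only when either image
 * is layered, since the graphics path is described as not supporting
 * layers. */
enum resolve_method pick_depth_stencil_resolve_method(uint32_t src_layers,
                                                      uint32_t dst_layers)
{
    assert(src_layers >= 1 && dst_layers >= 1);

    if (src_layers > 1 || dst_layers > 1)
        return RESOLVE_METHOD_COMPUTE;

    return RESOLVE_METHOD_GRAPHICS;
}
```

A single-layer resolve stays on the graphics path, which avoids the HTILE decompression that the compute path requires.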
* radv: implement all depth/stencil resolve modes using compute | Samuel Pitoiset | 2019-06-21 | 2 | -0/+522
  This path supports layers but it requires decompressing HTILE before resolving. The driver also needs to fix up HTILE after the resolve. This path is probably slower than the graphics one.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement all depth/stencil resolve modes using graphics | Samuel Pitoiset | 2019-06-21 | 2 | -0/+614
  When using graphics, the driver doesn't need to decompress HTILE before resolving. This path currently doesn't support layers so we have to fall back to the compute path.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: record if a render pass has depth/stencil resolve attachments | Samuel Pitoiset | 2019-06-21 | 2 | -1/+29
  Only supported with vkCreateRenderPass2().
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: rename has_resolve to has_color_resolve | Samuel Pitoiset | 2019-06-21 | 3 | -5/+5
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: emit framebuffer state from primary if secondary doesn't inherit it | Samuel Pitoiset | 2019-06-21 | 1 | -0/+9
  Otherwise fast color/depth clears can't work because they depend on the framebuffer.

  This fixes the following CTS tests (when the small hint is disabled):
  - dEQP-VK.geometry.layered.1d_array.secondary_cmd_buffer
  - dEQP-VK.geometry.layered.2d_array.secondary_cmd_buffer
  - dEQP-VK.geometry.layered.cube.secondary_cmd_buffer
  - dEQP-VK.geometry.layered.cube_array.secondary_cmd_buffer

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110810
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107986
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable DCC for mipmapped color textures on GFX8 | Samuel Pitoiset | 2019-06-20 | 1 | -2/+7
  It's tricky on GFX9, so only GFX8 for now.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not fast clear if one level can't be fast cleared | Samuel Pitoiset | 2019-06-20 | 1 | -0/+15
  And fall back to slow color clears.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add fast clears support for mipmapped color images with DCC | Samuel Pitoiset | 2019-06-20 | 1 | -1/+11
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_dcc_clear_level() helper | Samuel Pitoiset | 2019-06-20 | 2 | -3/+30
  For clearing only one level.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: re-initialize DCC metadata after decompressing using compute | Samuel Pitoiset | 2019-06-20 | 1 | -4/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: initialize levels without DCC during layout transitions | Samuel Pitoiset | 2019-06-20 | 1 | -1/+48
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/rtld: report better error messages for LDS overallocation | Nicolai Hähnle | 2019-06-19 | 1 | -2/+11
  Tested-by: Dieter Nützel <[email protected]>
* ac/rtld: check correct LDS max size | Marek Olšák | 2019-06-19 | 2 | -1/+9
  Tested-by: Dieter Nützel <[email protected]>
* radeonsi: add s_sethalt to shaders for debugging | Nicolai Hähnle | 2019-06-19 | 2 | -0/+18
  Tested-by: Dieter Nützel <[email protected]>
* ac/rtld: fix sorting of LDS symbols by alignment | Nicolai Hähnle | 2019-06-19 | 1 | -2/+2
  Tested-by: Dieter Nützel <[email protected]>
* radv: Fix vulkan build in meson. | Bas Nieuwenhuizen | 2019-06-19 | 1 | -0/+7
  Apparently the android part was never ported to meson.
  CC: <[email protected]>
  Acked-by: Samuel Pitoiset <[email protected]>