path: root/src/amd/vulkan
Commit message | Author | Age | Files | Lines
* radv: do not support blitting surfaces for R32G32B32 formats | Samuel Pitoiset | 2018-10-12 | 1 | -0/+7
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108113
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement clear operations for R32G32B32 | Samuel Pitoiset | 2018-10-11 | 3 | -1/+284
    This fixes crashes in some CTS tests:
    dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.*.linear_*_*
    dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.color.*.*_linear_*
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108113
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: disallow 3D images and mipmaps/layers for R32G32B32 linear formats | Samuel Pitoiset | 2018-10-11 | 1 | -0/+14
    The R32G32B32 formats are weird and we are only going to support some basic operations for now.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add a workaround for a VGT hang with prim restart and strips | Samuel Pitoiset | 2018-10-11 | 1 | -0/+11
    Otherwise, Yakuza and The Evil Within hang the GPU with DXVK. This apparently only works on Polaris. Suggested by Marek.
    Cc: [email protected]
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unsigned comparison against 0 | Dave Airlie | 2018-10-11 | 1 | -1/+1
    The value is always >= 0 here. Found by Coverity.
    Reviewed-by: Samuel Pitoiset <[email protected]>
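The class of warning behind this cleanup can be illustrated with a minimal sketch. The helper and its parameter names are hypothetical and are not taken from the radv sources; the point is only that a lower-bound check on an unsigned value is always true, so only the upper bound carries information.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only. */
static bool example_index_in_range(uint32_t index, uint32_t count)
{
    /* Writing `index >= 0 && index < count` would add nothing:
     * an unsigned value cannot be negative, so the first test is
     * always true and static analyzers flag it as dead logic. */
    return index < count;
}
```

A value such as `(uint32_t)-1` wraps to the largest representable value, so it is rejected by the upper-bound check alone.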
* radv: remove dead code for master_fd close | Dave Airlie | 2018-10-11 | 1 | -2/+0
    We have never opened master_fd at this point, so remove the code that closes it. Found by Coverity.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: don't pass shader key by copy | Dave Airlie | 2018-10-11 | 1 | -7/+6
    Coverity pointed out we were copying 168 bytes here unnecessarily.
    Reviewed-by: Samuel Pitoiset <[email protected]>
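A minimal sketch of the difference between the two calling conventions follows. The struct is a hypothetical stand-in sized to the 168 bytes mentioned above; it is not the real radv shader key layout, and the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a shader key: 42 x 4 bytes = 168 bytes. */
struct example_shader_key {
    uint32_t words[42];
};

/* By value: every call copies all 168 bytes into the callee's parameter. */
static uint32_t example_first_word_by_value(struct example_shader_key key)
{
    return key.words[0];
}

/* By pointer to const: only an address is passed and the key is read in place. */
static uint32_t example_first_word_by_pointer(const struct example_shader_key *key)
{
    return key->words[0];
}
```

Both functions return the same result; the pointer version simply avoids the copy, which is the kind of change the commit describes.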
* radv: add missing meson c++ visibility arguments | Eric Engestrom | 2018-10-09 | 1 | -0/+1
    Fixes: 6f3aee40f90d725653b6 "radv: using tls to store llvm related info and speed up compiles (v10)"
    Cc: Dave Airlie <[email protected]>
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* radv: tidy up radv_pipeline_init_multisample_state() | Samuel Pitoiset | 2018-10-08 | 1 | -19/+16
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always set PA_SC_MODE_CNTL_1.OUT_OF_ORDER_WATER_MARK | Samuel Pitoiset | 2018-10-08 | 1 | -2/+2
    It probably has no effect without out-of-order rasterization anyway.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set DB_EQAA.INCOHERENT_EQAA_READS | Samuel Pitoiset | 2018-10-08 | 1 | -1/+1
    My intent was to set this field rather than to duplicate another one.
    Fixes: 6cfa321c39 ("radv: add potential missing fields for DB_EQAA")
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: correct PKT3_COPY_DATA definitions | Marek Olšák | 2018-10-06 | 2 | -7/+7
* ac: define all address spaces properly | Marek Olšák | 2018-10-06 | 1 | -1/+1
* radv: fix resetting the pool for timestamp queries | Samuel Pitoiset | 2018-10-04 | 1 | -7/+5
    Since the driver no longer uses the availability bit for timestamp queries, it shouldn't reset it. Instead, it should reset the query values to UINT32_MAX. This fixes VM faults.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108164
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Tested-by: Józef Kucia <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* util: disable cache if we have no build-id and timestamp is zero | Timothy Arceri | 2018-10-02 | 1 | -4/+0
    The timestamp can be zero, for example when Flatpak is used. In this case just disable the cache rather than segfaulting when incompatible cache items are loaded.
    V2: actually return false when mtime is 0.
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
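The decision described above can be sketched as a small predicate. This is only an illustration of the stated rule; the function name and parameters are hypothetical and do not correspond to the actual Mesa disk-cache code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: can the on-disk cache be keyed safely?
 * `have_build_id` and `mtime` are illustrative inputs. */
static bool example_cache_key_usable(bool have_build_id, uint32_t mtime)
{
    /* A build-id identifies the binary on its own. */
    if (have_build_id)
        return true;

    /* Without a build-id, a zero timestamp cannot tell two different
     * builds apart, so the cache has to be disabled. */
    return mtime != 0;
}
```

Returning false for the "no build-id, zero mtime" case means stale entries from another build are never loaded.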
* radv: do not try to set DCC_CONTROL when image doesn't use DCC | Samuel Pitoiset | 2018-10-01 | 1 | -1/+1
    Unnecessary. While we are at it, remove the check for pre-VI because it's already checked earlier.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add a sanity check for mutable formats and TC-compat HTILE | Samuel Pitoiset | 2018-10-01 | 1 | -5/+22
    If apps use the MUTABLE bit and every format in the list is the same as the image format, we can still enable TC-compat HTILE. I don't think this happens often, but given that TC-compat HTILE gives a nice boost in some situations, it's worth checking.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: disable HTILE for very small depth surfaces | Samuel Pitoiset | 2018-10-01 | 1 | -1/+3
    We already disable DCC/CMASK for small color surfaces in the same way.
    Serious Sam 2017 creates a 1x1 depth surface, and I think it should be faster to do slow clears on the graphics queue instead of fast clears on compute followed by a depth expand if the surface isn't TC-compatible HTILE.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add potential missing fields for DB_EQAA | Samuel Pitoiset | 2018-10-01 | 1 | -1/+3
    Other drivers set these two as well, just apply the same rule.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: disable complicated point clipping against user clip planes | Samuel Pitoiset | 2018-10-01 | 1 | -1/+0
    I don't think this is required by Vulkan either.
    Ported from RadeonSI (AMDVLK doesn't set it either).
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not sync CP DMA when copying buffers | Samuel Pitoiset | 2018-09-28 | 1 | -0/+2
    We already track if the DMA engine is busy/idle with a flag, and we emit a packet that waits for all CP DMA operations to be complete. This is done at end of command buffer because the kernel doesn't wait for them, and also when emitting barriers, so it should be safe.
    This improves small copies for both aligned and unaligned sizes.

    Aligned sizes:
    BEFORE:
      1 KB: 59.840000 ms
      2 KB: 71.200000 ms
    AFTER:
      1 KB: 31.200000 ms
      2 KB: 31.040000 ms

    Unaligned sizes:
    BEFORE:
      2 KB: 68.3200 ms
      3 KB: 79.3600 ms
      5 KB: 76.6400 ms
      9 KB: 90.8800 ms
      17 KB: 116.0000 ms
    AFTER:
      2 KB: 31.0400 ms
      3 KB: 32.0000 ms
      5 KB: 30.8800 ms
      9 KB: 30.5600 ms
      17 KB: 29.6000 ms

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: adjust the CmdUpdateBuffer threshold for optimal performance | Samuel Pitoiset | 2018-09-28 | 2 | -1/+3
    According to my benchmark results, it appears that we should reduce the threshold to 1024.

    BEFORE:
      1 KB: 68.656000 ms
      2 KB: 118.368000 ms
    AFTER:
      1 KB: 31.760000 ms
      2 KB: 29.840000 ms

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
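A size threshold of this kind typically selects between two update strategies. The sketch below assumes that the threshold is expressed in bytes, that updates below it are embedded directly in the command stream, and that larger updates go through a staged copy; none of that is stated in the commit message, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed unit: bytes. The value 1024 comes from the commit message. */
#define EXAMPLE_UPDATE_THRESHOLD 1024

/* Hypothetical names for the two strategies. */
enum example_update_path {
    EXAMPLE_PATH_INLINE, /* data written directly from the command stream */
    EXAMPLE_PATH_COPY    /* data staged in memory, then copied */
};

static enum example_update_path example_choose_update_path(size_t data_size)
{
    /* The strict comparison is an assumption made for this sketch. */
    return data_size < EXAMPLE_UPDATE_THRESHOLD ? EXAMPLE_PATH_INLINE
                                                : EXAMPLE_PATH_COPY;
}
```

Under these assumptions, lowering the threshold moves mid-sized updates to the copy path, which would be consistent with the timings reported above.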
* radv: do not use the availability bit for timestamp queries | Samuel Pitoiset | 2018-09-28 | 1 | -30/+27
    It's unnecessary because we can just check whether the timestamp differs from the default value set when a pool is created or reset.
    Instead of waiting for the availability bit to be 1, we have to emit a "not equal" WAIT_REG_MEM to check whether the timestamp is ready.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
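The sentinel approach can be sketched on the CPU side as follows. This is a simplified model, assuming 64-bit timestamp slots whose reset state has every byte set to 0xff; the macro and function names are hypothetical, and the real driver performs the wait on the GPU with WAIT_REG_MEM rather than in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed sentinel: a slot in which all bits are set means "not written yet". */
#define EXAMPLE_TIMESTAMP_NOT_READY UINT64_MAX

/* Resetting the pool fills every slot with the sentinel pattern. */
static void example_reset_timestamp_pool(uint64_t *timestamps, size_t count)
{
    assert(timestamps != NULL || count == 0);
    if (count > 0)
        memset(timestamps, 0xff, count * sizeof(*timestamps));
}

/* A timestamp is available once it differs from the sentinel,
 * so no separate availability bit is needed. */
static bool example_timestamp_available(uint64_t value)
{
    return value != EXAMPLE_TIMESTAMP_NOT_READY;
}
```

The "not equal" comparison in `example_timestamp_available` plays the role that the availability bit played before this change.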
* radv: Remove garbage comment. | Bas Nieuwenhuizen | 2018-09-27 | 1 | -1/+0
    Trivial.
* radv: Do not use multiple draws for multisample copies. | Bas Nieuwenhuizen | 2018-09-27 | 1 | -57/+5
    Use sample rate shading instead, which should give better locality. Makes Nier with 8x MSAA on a Raven go from 5 fps to 7 fps in the menu.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: only emit ZPASS_DONE for timestamp queries on gfx queues | Andres Rodriguez | 2018-09-25 | 1 | -1/+1
    A ZPASS_DONE packet doesn't make sense for the compute queue. It will result in a GPU hang.
    This change resolves a GPU hang for SteamVR+Vega.
    Cc: [email protected]
    Fixes: 1f616a840eac02241c585d28e9dac8f19a297f39 "radv: emit a dummy ..."
    Signed-off-by: Andres Rodriguez <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: make use of nir_lower_load_const_to_scalar() | Timothy Arceri | 2018-09-25 | 1 | -0/+2
    This allows NIR to CSE more operations. LLVM does this also so the impact is limited, however doing this in NIR allows other opts to make progress. For example in radeonsi more loops are unrolled in Civilization Beyond Earth.
    The actual pipeline-db stats are not overwhelming but even in the negatively affected shaders the NIR is clearly better. It just happens that the code shuffling and in some cases calls to max rather than a flt result in the final output from LLVM not giving as good numbers. However this is an incremental opt that further passes build off so the change should be made IMO.

    Totals from affected shaders:
      SGPRS: 20192 -> 20184 (-0.04 %)
      VGPRS: 19516 -> 19524 (0.04 %)
      Spilled SGPRs: 437 -> 444 (1.60 %)
      Spilled VGPRs: 0 -> 0 (0.00 %)
      Private memory VGPRs: 0 -> 0 (0.00 %)
      Scratch size: 0 -> 0 (0.00 %) dwords per thread
      Code Size: 1527444 -> 1522276 (-0.34 %) bytes
      LDS: 6 -> 6 (0.00 %) blocks
      Max Waves: 1018 -> 1016 (-0.20 %)
      Wait states: 0 -> 0 (0.00 %)

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use the resolve compute path if dest uses multiple layers | Samuel Pitoiset | 2018-09-21 | 1 | -1/+2
    The hardware path doesn't support resolving layers, for both source and destination images.
    This fixes a reflection issue when MSAA is enabled which affects GTA V and probably DIRT3.
    CC: <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107786
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Tested-by: Gregor Münch <gr.muench_at_gmail.com>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv,radv: Implement vkAcquireNextImage2 | Jason Ekstrand | 2018-09-21 | 1 | -10/+25
    This was added as part of 1.1 but it's very hard to track exactly what extension added it. In any case, we should implement it.
    Cc: [email protected]
    Reviewed-by: Dave Airlie <[email protected]>
* radv: only enable shaderInt16 on GFX9+ and LLVM7+ | Samuel Pitoiset | 2018-09-21 | 1 | -1/+1
    The throughput is similar to 32-bit integers on GFX8, and AMDVLK does not expose 16-bit integers on pre-Vega hardware either. On GFX9+, only LLVM 7+ has support.
    This fixes a bunch of CTS crashes on GFX9/LLVM 6.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
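The gating rule stated above amounts to a conjunction of a hardware check and a compiler-version check. The sketch below is illustrative only: the enum, its values, and the function name are hypothetical and do not reflect how radv actually represents chip generations or the LLVM version.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical ordering of GPU generations; only the relative order matters. */
enum example_chip_class {
    EXAMPLE_GFX8 = 8, /* e.g. Polaris */
    EXAMPLE_GFX9 = 9  /* e.g. Vega */
};

/* shaderInt16 is advertised only when both conditions hold. */
static bool example_supports_shader_int16(enum example_chip_class chip,
                                          unsigned llvm_major)
{
    return chip >= EXAMPLE_GFX9 && llvm_major >= 7;
}
```

With this rule, the GFX9/LLVM 6 combination that crashed in CTS no longer advertises the feature.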
* radv: Fix driver UUID SHA1 init. | Bas Nieuwenhuizen | 2018-09-20 | 1 | -0/+2
    The init was missing; found by Emil.
    Fixes: d17443a4593 "radv: Use build ID if available for cache UUID."
    CC: <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: use a 64-bit unsigned integer when allocating a descriptor pool | Samuel Pitoiset | 2018-09-19 | 1 | -1/+1
    pool->size is a 64-bit unsigned integer too.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
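Why the width of the accumulator matters can be shown with a small sketch. The multiplication below is an assumption chosen for illustration; the commit message does not say how the size is computed, and the function and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit accumulator: the product wraps modulo 2^32 for large inputs. */
static uint32_t example_pool_size_32(uint32_t set_count, uint32_t bytes_per_set)
{
    return set_count * bytes_per_set;
}

/* 64-bit accumulator: widening one operand first keeps the full product,
 * matching a 64-bit size field. */
static uint64_t example_pool_size_64(uint32_t set_count, uint32_t bytes_per_set)
{
    return (uint64_t)set_count * bytes_per_set;
}
```

For 65536 sets of 65536 bytes each, the 32-bit version yields 0 while the 64-bit version yields 4294967296.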
* radv: enable VK_SUBGROUP_FEATURE_ARITHMETIC_BIT | Samuel Pitoiset | 2018-09-19 | 2 | -0/+2
    All CTS tests pass on Polaris/Vega with LLVM 6, 7 and master, so I think it's safe to enable the feature.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not support blitting surfaces with depth and stencil | Samuel Pitoiset | 2018-09-19 | 1 | -0/+4
    Fixes: dEQP-VK.api.copy_and_blit.core.blit_image.all_formats.depth_stencil.d32_sfloat_s8_uint_d32_sfloat_s8_uint.optimal_optimal_nearest
    Also fixes all related tests that try to blit a surface with different depth and stencil formats.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* Revert "radv: fix descriptor pool allocation size" | Bas Nieuwenhuizen | 2018-09-18 | 1 | -2/+1
    This reverts commit 90819abb56f6b1a0cd4946b13b6caf24fb46e500.
    This logic was wrong, the original code is correct. The direct impact is that we allocate up to approximately a squared amount of memory compared to what we should allocate.
    Acked-by: Samuel Pitoiset <[email protected]>
* radv: implement VK_EXT_conservative_rasterization | Samuel Pitoiset | 2018-09-18 | 3 | -1/+63
    Only supported by GFX9+.
    The conservativeraster Sascha demo seems to work as expected.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not re-create the sampler for every blits in CmdBlitImage() | Samuel Pitoiset | 2018-09-18 | 1 | -15/+17
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allow to force anisotropy via RADV_TEX_ANISO | Samuel Pitoiset | 2018-09-18 | 2 | -2/+47
    Ported from RadeonSI.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use build ID if available for cache UUID. | Bas Nieuwenhuizen | 2018-09-17 | 1 | -8/+35
    To get a useful UUID for systems that have a non-useful mtime for the binaries.
    I started using SHA1 to ensure we get reasonable mixing in the various possibilities and the various build ID lengths.
    CC: <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* radv: enable shaderInt16 capability | Samuel Pitoiset | 2018-09-17 | 2 | -1/+2
    Not sure if this is all wired up. CTS does pass and the Tangrams demo works fine on Vega. There are corruption issues on Polaris, but I'm not sure if that is related to 16-bit support.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix use of unreachable() in the meta blit path | Samuel Pitoiset | 2018-09-17 | 1 | -4/+4
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* Revert "radv: Optimize rebinding the same descriptor set." | Samuel Pitoiset | 2018-09-17 | 1 | -7/+1
    This introduces random GPU hangs on Vega, at least.
    This reverts commit 02a43edf186cb9998741ba765cb948bb238a122d.
* radv: fix descriptor pool allocation size | Samuel Pitoiset | 2018-09-17 | 1 | -1/+2
    The size has to be multiplied by the number of sets.
    This gets rid of the OUT_OF_POOL_KHR error and fixes a crash with the Tangrams demo.
    CC: 18.1 18.2 <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Only allow 16 user SGPRs for compute on GFX9+. | Bas Nieuwenhuizen | 2018-09-16 | 1 | -1/+1
    Apparently compute has only 16 user SGPRs instead of the 32 available on the graphics path.
    Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.0
    CC: <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Set the user SGPR MSB for Vega. | Bas Nieuwenhuizen | 2018-09-16 | 1 | -0/+1
    Otherwise using 32 user SGPRs would be broken.
    CC: <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Optimize rebinding the same descriptor set. | Bas Nieuwenhuizen | 2018-09-16 | 1 | -1/+7
    This makes it cheaper to just change the dynamic offsets with the same descriptor sets.
    Suggested-by: Philip Rebohle <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: emit the initial config only once in the preambles | Samuel Pitoiset | 2018-09-14 | 4 | -50/+48
    It shouldn't be necessary to emit the initial graphics or compute state when beginning a new command buffer. Emitting them in the preamble should be enough, and this will reduce IB sizes.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix setting global locations for indirect descriptors | Samuel Pitoiset | 2018-09-14 | 1 | -1/+0
    Indirect descriptors only need one entry; we don't have to emit a location for every descriptor.
    Fixes GPU hangs with new CTS: dEQP-VK.binding_model.descriptorset_random.*
    CC: 18.2 <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix flushing indirect descriptors | Samuel Pitoiset | 2018-09-14 | 1 | -3/+9
    Let's say we first bind a graphics pipeline that needs indirect descriptor sets. The userdata pointers will be emitted at draw time. Then, if we bind a compute pipeline that doesn't need any indirect descriptors, the driver will re-emit them for all graphics stages.
    To avoid this, just check the bind point type.
    CC: 18.2 <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix GPU hangs with 32-bit indirect descriptors | Samuel Pitoiset | 2018-09-14 | 1 | -3/+5
    LLVM 6 isn't affected.
    Fixes GPU hangs with new CTS: dEQP-VK.binding_model.descriptorset_random.*
    CC: 18.2 <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>