summaryrefslogtreecommitdiffstats
path: root/src/amd/vulkan
Commit message (Collapse)AuthorAgeFilesLines
* radv: Add entrypoints generation with the new vk.xmlBas Nieuwenhuizen2018-03-071-107/+164
| | | | | | A lot of it is based on intel again. Reviewed-by: Dave Airlie <[email protected]>
* radv: report the scratch private memory size with shader statsSamuel Pitoiset2018-03-061-1/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* vulkan: do not expose surface/swapchain extensions on AndroidTapani Pälli2018-03-061-2/+2
| | | | | | | | | On Android surface/swapchain extensions are implemented by the loader. Patch modifies both anv and radv extension scripts disabling currently exposed ones. See also earlier commit 9f763c1f9b. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.cTimothy Arceri2018-03-053-48/+5
| | | | | | | Until llvm handles indirects better we will need to use these workarounds in the radeonsi backend also. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Fix copying from 3D images starting at non-zero depth.Bas Nieuwenhuizen2018-03-051-0/+3
| | | | | Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
* radv: do not set pending_reset_query in BeginCommandBuffer()Samuel Pitoiset2018-03-021-7/+0
| | | | | | | | | | | | | This is just useless for two reasons: 1) flush_bits is not set accordingly, so nothing will be flushed in BeginQuery(). 2) we always flush caches in EndCommandBuffer(), so if a reset is done in a previous command buffer we are safe. Cc: "18.0" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only emit cache flushes when the pool size is large enoughSamuel Pitoiset2018-03-013-11/+15
| | | | | | | | This is an optimization which reduces the number of flushes for small pool buffers. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: keep track of the query pool sizeSamuel Pitoiset2018-03-012-5/+5
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make sure to emit cache flushes before starting a querySamuel Pitoiset2018-03-013-7/+33
| | | | | | | | | | | If the query pool has been previously resetted using the compute shader path. Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query pool") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292 Cc: "18.0" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use the syncobj wait ioctl to wait on fences if possible.Bas Nieuwenhuizen2018-03-013-9/+26
| | | | | | Handles the !waitAll and signal after the start of the wait cases correctly. Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement more efficient !waitAll fence waiting.Bas Nieuwenhuizen2018-03-013-0/+75
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement waiting on non-submitted fences.Bas Nieuwenhuizen2018-03-011-2/+11
| | | | | Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement WaitForFences with !waitAll.Bas Nieuwenhuizen2018-03-011-5/+15
| | | | | | | | | | Nothing to do except using a busy wait loop. At least for old kernels. A better implementation for newer kernels to come later. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255 Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
* ac/shader: move scanning some info about input PS declarationsSamuel Pitoiset2018-02-281-6/+8
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: remove device pointer from buffer.Dave Airlie2018-02-281-1/+0
| | | | | | | | This is never used. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: expose async compute on SIDave Airlie2018-02-271-2/+0
| | | | | | | | | | | It looks like we had all the pieces in place for this, just never tested it and turned it on. I don't see any CTS regressions and the computeshader demo runs. Acked-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: merge tess rings into a single boDave Airlie2018-02-272-56/+39
| | | | | | | Inspired by a passing commit to radeonsi. Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Really use correct HTILE expanded words.James Legg2018-02-241-3/+3
| | | | | | | | | | | | | | | | | | | | When transitioning to an htile compressed depth format, Set the full depth range, so later rasterization can pass HiZ. Previously, for depth only formats, the depth range was set to 0 to 0. This caused unwanted HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer (VK_FORMAT_D32_SFLOAT was not affected somehow). These values are derived from PAL [0], since I can't find the specification describing the htile values. [0] https://github.com/GPUOpen-Drivers/pal/blob/5cba4ecbda9452773f59692f5915301e7db4a183/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp#L1500 CC: Dave Airlie <[email protected]> CC: Bas Nieuwenhuizen <[email protected]> CC: [email protected] Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Fixes: 5158603182fe7435 "radv: Use correct HTILE expanded words."
* radv/extensions: fix c_vk_version for patch == NoneMauro Rossi2018-02-241-1/+2
| | | | | | | | | | | | | | | | | | | | Similar to cb0d1ba156 ("anv/extensions: Fix VkVersion::c_vk_version for patch == None") fixes the following building errors: out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48: error: use of undeclared identifier 'None'; did you mean 'long'? return instance && VK_MAKE_VERSION(1, 0, None) <= core_version; ^~~~ long external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION' (((major) << 22) | ((minor) << 12) | (patch)) ^ ... fatal error: too many errors emitted, stopping now [-ferror-limit=] 20 errors generated. Fixes: e72ad05c1d ("radv: Return NULL for entrypoints when not supported.") Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Fix autotools build.Bas Nieuwenhuizen2018-02-231-1/+3
| | | | | | | Somewhere along the way the Makefile changes got lost ... Fixes: 4db78f3a6b "radv: Put supported extensions in a struct." Acked-by: Dave Airlie <[email protected]>
* radv: Return NULL for entrypoints when not supported.Bas Nieuwenhuizen2018-02-234-9/+83
| | | | | | | | | | | | | | | | This implements strict checking for the entrypoint ProcAddr functions. - InstanceProcAddr with instance = NULL, only returns the 3 allowed entrypoints. - DeviceProcAddr does not return any instance entrypoints. - InstanceProcAddr does not return non-supported or disabled instance entrypoints. - DeviceProcAddr does not return non-supported or disabled device entrypoints. - InstanceProcAddr still returns non-supported device entrypoints. Reviewed-by: Dave Airlie <[email protected]>
* radv: Reword radv_entrypoints_gen.pyBas Nieuwenhuizen2018-02-231-56/+106
| | | | | | With a big inspiration from anv as always ... Reviewed-by: Dave Airlie <[email protected]>
* radv: Track enabled extensions.Bas Nieuwenhuizen2018-02-232-36/+48
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: Put supported extensions in a struct.Bas Nieuwenhuizen2018-02-234-63/+133
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: enable lowering of fpow to fexp2 and flog2Samuel Pitoiset2018-02-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | There is no fpow in hardware, so it's always lowered somewhere, but it appears that lowering at NIR level is better. Figured while comparing compute shaders between RadeonSI and RADV. Polaris10: Totals from affected shaders: SGPRS: 18936 -> 18904 (-0.17 %) VGPRS: 12240 -> 12220 (-0.16 %) Spilled SGPRs: 2809 -> 2809 (0.00 %) Code Size: 718116 -> 719848 (0.24 %) bytes Max Waves: 1409 -> 1410 (0.07 %) Vega10: Totals from affected shaders: SGPRS: 18392 -> 18392 (0.00 %) VGPRS: 12008 -> 11920 (-0.73 %) Spilled SGPRs: 3001 -> 2981 (-0.67 %) Code Size: 777444 -> 778788 (0.17 %) bytes Max Waves: 1503 -> 1504 (0.07 %) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't send num_tcs_input_cp to sgprs.Dave Airlie2018-02-211-4/+1
| | | | | | | We never use it in the shaders. Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/tess: don't need to look in constant for vertices_per_patchDave Airlie2018-02-211-1/+1
| | | | | | | This just avoids passing this value via user sgprs. Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: allow to force family using RADV_FORCE_FAMILYSamuel Pitoiset2018-02-201-0/+33
| | | | | | | Useful for pipeline-db. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: compact varyings after removing unused onesSamuel Pitoiset2018-02-191-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | It makes no sense to compact before, and the description of nir_compact_varyings() confirms that. Polaris10: Totals from affected shaders: SGPRS: 108528 -> 108128 (-0.37 %) VGPRS: 74548 -> 74500 (-0.06 %) Spilled SGPRs: 844 -> 814 (-3.55 %) Code Size: 3007328 -> 2992932 (-0.48 %) bytes Max Waves: 16019 -> 16009 (-0.06 %) Vega10: Totals from affected shaders: SGPRS: 106088 -> 106232 (0.14 %) VGPRS: 74652 -> 74700 (0.06 %) Spilled SGPRs: 692 -> 658 (-4.91 %) Code Size: 2967708 -> 2953028 (-0.49 %) bytes Max Waves: 18178 -> 18162 (-0.09 %) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radv: Always lower indirect derefs after nir_lower_global_vars_to_local.Bas Nieuwenhuizen2018-02-153-36/+53
| | | | | | | | Otherwise new local variables can cause hangs on vega. CC: <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105098 Reviewed-by: Timothy Arceri <[email protected]>
* radv: Fix compiler warning about uninitialized 'set'Eric Anholt2018-02-121-1/+1
| | | | | | | The compiler doesn't figure out that we only get result == VK_SUCCESS if set got initialized. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/shader: scan info about output PS declarationsSamuel Pitoiset2018-02-081-9/+9
| | | | | | | | NIR->LLVM should only be a translation pass, and all scan stuff should be done before. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement VK_EXT_external_memory_hostFredrik Höglund2018-02-086-8/+137
| | | | | | | Ported from the radeonsi GL_AMD_pinned_memory implementation. Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: run nir_opt_shrink_loadSamuel Pitoiset2018-02-061-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | LLVM can't shrink loads. Polaris10: Totals from affected shaders: SGPRS: 62528 -> 59955 (-4.11 %) VGPRS: 44708 -> 44616 (-0.21 %) Spilled SGPRs: 16 -> 8 (-50.00 %) Code Size: 1355504 -> 1355172 (-0.02 %) bytes Max Waves: 11710 -> 11670 (-0.34 %) Vega10: Totals from affected shaders: SGPRS: 51448 -> 50371 (-2.09 %) VGPRS: 39140 -> 39048 (-0.24 %) Spilled SGPRs: 16 -> 16 (0.00 %) Code Size: 1307188 -> 1304296 (-0.22 %) bytes Max Waves: 11312 -> 11292 (-0.18 %) This reduces SGPRs spilling in MadMax, and it also reduces number of SGPRs in DOW3 and F12017. The number of waves slightly decreases in F1 but I don't see any performance changes after benchmarking it. Talos and Serious Sam are not affected because they don't use any push constants. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't support tc-compat on multisample d32s8 at all.Dave Airlie2018-02-061-2/+2
| | | | | | | | | | | | RX550 fails dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_2 So increase the range of the workaround. Fixes: f4c534ef6 (radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* amd: remove support for LLVM 3.9Marek Olšák2018-02-021-4/+0
| | | | | | | | | | | Only these are supported: - LLVM 4.0 - LLVM 5.0 - LLVM 6.0 - master (7.0) Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Don't expose VK_KHX_multiview on android.Bas Nieuwenhuizen2018-02-011-1/+1
| | | | | | | | | | deqp does not allow any KHX extensions, and since deqp is included in android-cts, android does not allow any khx extensions. So disable VK_KHX_multiview on android. Reviewed-by: Samuel Pitoiset <[email protected]> CC: 18.0 <[email protected]>
* radv: do not insert shaders in cache when it's disabledSamuel Pitoiset2018-02-011-5/+24
| | | | | | | | | | | | | When the application doesn't provide its own pipeline cache, the driver uses a in-memory cache but it shouldn't insert any entries when the cache is explicitely disabled by the user. Found while running my experimental pipeline-db tool with a ton of shaders, the memory footprint was just huge, and sometimes the process was even killed... Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use separate bindings for graphics and compute descriptorsSamuel Pitoiset2018-02-013-53/+125
| | | | | | | | | | | | | The Vulkan spec says: "pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104732 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: store the bind point when creating descriptors with templatesSamuel Pitoiset2018-02-012-0/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not dump meta shader statsSamuel Pitoiset2018-01-312-21/+18
| | | | | | | That's quite useless and that pollutes the output. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove predication on cache flushesMatthew Nicholls2018-01-314-18/+13
| | | | | | | | | This can lead to a situation where cache flushes could get conditionally disabled while still clearing the flush_bits, and thus flushes due to application pipeline barriers may never get executed. Fixes: a6c2001ace (radv: add support for cmd predication.) Signed-off-by: Dave Airlie <[email protected]>
* radv: Merge raster state with PM4 generation.Bas Nieuwenhuizen2018-01-302-75/+50
| | | | | Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Move gs state out of pipeline.Bas Nieuwenhuizen2018-01-302-43/+43
| | | | | Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Split out cliprect rule generation.Bas Nieuwenhuizen2018-01-302-25/+33
| | | | | Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Merge VGT_GS_MODE computation with PM4 generation.Bas Nieuwenhuizen2018-01-302-28/+25
| | | | | Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Split out processing the vertex input state.Bas Nieuwenhuizen2018-01-301-35/+43
| | | | | Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Move tessellation state out of pipeline.Bas Nieuwenhuizen2018-01-302-50/+58
| | | | | Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Move blend state out of pipeline.Bas Nieuwenhuizen2018-01-302-67/+72
| | | | | Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Split out generating VGT_SHADER_STAGES_EN.Bas Nieuwenhuizen2018-01-302-24/+27
| | | | | Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>