summaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* ac/shader: scan output usage mask for VS and TESSamuel Pitoiset2018-03-062-0/+22
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* vulkan: do not expose surface/swapchain extensions on AndroidTapani Pälli2018-03-061-2/+2
| | | | | | | | | On Android surface/swapchain extensions are implemented by the loader. Patch modifies both anv and radv extension scripts disabling currently exposed ones. See also earlier commit 9f763c1f9b. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* ac: pass the unmodified number of components to load gs inputsTimothy Arceri2018-03-061-2/+2
| | | | | | | | | | | Currently both users of this would overflow an array when the input was a dual slot double as they expected the number of components to be a max of 4. Since we pass the type we can just let the functions handle doubles in a way they choose. Reviewed-by: Dave Airlie <[email protected]>
* ac: add ac_build_fsign()Samuel Pitoiset2018-03-053-25/+28
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac: add ac_build_isign()Samuel Pitoiset2018-03-053-24/+28
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac: add ac_build_fract()Samuel Pitoiset2018-03-053-26/+34
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.cTimothy Arceri2018-03-055-48/+44
| | | | | | | Until llvm handles indirects better we will need to use these workarounds in the radeonsi backend also. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Fix copying from 3D images starting at non-zero depth.Bas Nieuwenhuizen2018-03-051-0/+3
| | | | | Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
* radv: do not set pending_reset_query in BeginCommandBuffer()Samuel Pitoiset2018-03-021-7/+0
| | | | | | | | | | | | | This is just useless for two reasons: 1) flush_bits is not set accordingly, so nothing will be flushed in BeginQuery(). 2) we always flush caches in EndCommandBuffer(), so if a reset is done in a previous command buffer we are safe. Cc: "18.0" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: fix nir_intrinsic_shared_atomic_comp_swap handlingTimothy Arceri2018-03-021-1/+1
| | | | | | | | | | Following on from 49879f377870 this makes sure we use the correct src index. Fixes cts test: KHR-GL46.compute_shader.atomic-case3 Reviewed-by: Dave Airlie <[email protected]>
* radv: only emit cache flushes when the pool size is large enoughSamuel Pitoiset2018-03-013-11/+15
| | | | | | | | This is an optimization which reduces the number of flushes for small pool buffers. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: keep track of the query pool sizeSamuel Pitoiset2018-03-012-5/+5
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make sure to emit cache flushes before starting a querySamuel Pitoiset2018-03-013-7/+33
| | | | | | | | | | | If the query pool has been previously resetted using the compute shader path. Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query pool") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292 Cc: "18.0" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use the syncobj wait ioctl to wait on fences if possible.Bas Nieuwenhuizen2018-03-013-9/+26
| | | | | | Handles the !waitAll and signal after the start of the wait cases correctly. Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement more efficient !waitAll fence waiting.Bas Nieuwenhuizen2018-03-013-0/+75
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement waiting on non-submitted fences.Bas Nieuwenhuizen2018-03-011-2/+11
| | | | | Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement WaitForFences with !waitAll.Bas Nieuwenhuizen2018-03-011-5/+15
| | | | | | | | | | Nothing to do except using a busy wait loop. At least for old kernels. A better implementation for newer kernels to come later. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255 Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: fix shared atomic operations.Dave Airlie2018-03-011-5/+5
| | | | | | | | | | | The nir->llvm conversion was using the wrong srcs. Fixes: tests/spec/arb_compute_shader/execution/shared-atomics.shader_test Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: don't apply slice rounding on txf_msDave Airlie2018-03-011-1/+1
| | | | | | | | | | | This matches the tgsi code. Fixes arb_texture_multisample texelFetch piglit tests. Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Fixes: f4e499ec7914 (radv: add initial non-conformant radv vulkan driver) Signed-off-by: Dave Airlie <[email protected]>
* ac/shader: move scanning some info about input PS declarationsSamuel Pitoiset2018-02-285-15/+26
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/radv: move load base vertex abi setup to vertex shader.Dave Airlie2018-02-281-1/+1
| | | | | | | | | This was segfaulting: dEQP-VK.memory.pipeline_barrier.host_write_index_buffer.1024 Fixes: 8de6f797070 (ac/radeonsi: add load_base_vertex() to the abi) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/shader: fix vertex input with components.Dave Airlie2018-02-281-1/+1
| | | | | | | | | | This fixes: dEQP-VK.glsl.440.linkage.varying.component.* Fixes: 1c57a6da5e3 (ac/shader: scan vertex inputs usage mask) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: remove device pointer from buffer.Dave Airlie2018-02-281-1/+0
| | | | | | | | This is never used. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac: implement nir_op_ldexpTimothy Arceri2018-02-281-0/+7
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac: fix nir_op_fdd{x,y} handlingTimothy Arceri2018-02-281-2/+2
| | | | | | | | | | | | | | | radeonsi, i965 and anv all treat fdd{x,y} opcodes the same as fdd{x,y}_coarse by default. The SPIR-V spec lets the implementation decide how it should be handled and radv was previously going for the higher quality option. Here we change the shared amd code to match how nir_op_fdd{x,y} is expected to be handled by the other NIR drivers. Fixes piglit test: ./bin/arb_shader_texture_lod-texgrad -auto Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: add load_base_vertex() to the abiTimothy Arceri2018-02-282-1/+9
| | | | | | | | | | Fixes the following piglit tests: ./bin/arb_shader_draw_parameters-basevertex basevertex -auto -fbo ./bin/arb_shader_draw_parameters-basevertex basevertex-baseinstance -auto -fbo Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add support for handling nir_intrinsic_load_vertex_idTimothy Arceri2018-02-281-0/+4
| | | | | | | This will be used by radeonsi. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: fix f2b and i2b for doublesTimothy Arceri2018-02-281-2/+4
| | | | | | | | Without this llvm was asserting in debug builds. V2: use LLVMConstNull() Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: clean up a hack about rounding 2nd coord componentSamuel Pitoiset2018-02-271-3/+5
| | | | | | | | It's basically just the opposite, and it only makes sense to round the layer for 2D texture arrays. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: expose async compute on SIDave Airlie2018-02-271-2/+0
| | | | | | | | | | | It looks like we had all the pieces in place for this, just never tested it and turned it on. I don't see any CTS regressions and the computeshader demo runs. Acked-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: merge tess rings into a single boDave Airlie2018-02-272-56/+39
| | | | | | | Inspired by a passing commit to radeonsi. Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: use ordered float comparisons except for not equalSamuel Pitoiset2018-02-261-3/+3
| | | | | | | | | | | | | | | | | | Original patch from Timothy Arceri, I have just fixed the not equal case locally. This fixes one important rendering issue in Wolfenstein 2 (the cutscene transition issue). RadeonSI uses the same ordered comparisons, so I guess that what we should do as well. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104302 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104905 Cc: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* ac: make use of ac_get_llvm_num_components() helperTimothy Arceri2018-02-261-5/+1
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Really use correct HTILE expanded words.James Legg2018-02-241-3/+3
| | | | | | | | | | | | | | | | | | | | When transitioning to an htile compressed depth format, Set the full depth range, so later rasterization can pass HiZ. Previously, for depth only formats, the depth range was set to 0 to 0. This caused unwanted HiZ rejections with a VK_FORMAT_D16_UNORM depth buffer (VK_FORMAT_D32_SFLOAT was not affected somehow). These values are derived from PAL [0], since I can't find the specification describing the htile values. [0] https://github.com/GPUOpen-Drivers/pal/blob/5cba4ecbda9452773f59692f5915301e7db4a183/src/core/hw/gfxip/gfx9/gfx9MaskRam.cpp#L1500 CC: Dave Airlie <[email protected]> CC: Bas Nieuwenhuizen <[email protected]> CC: [email protected] Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Fixes: 5158603182fe7435 "radv: Use correct HTILE expanded words."
* radv/extensions: fix c_vk_version for patch == NoneMauro Rossi2018-02-241-1/+2
| | | | | | | | | | | | | | | | | | | | Similar to cb0d1ba156 ("anv/extensions: Fix VkVersion::c_vk_version for patch == None") fixes the following building errors: out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_radv_common_intermediates/radv_entrypoints.c:1161:48: error: use of undeclared identifier 'None'; did you mean 'long'? return instance && VK_MAKE_VERSION(1, 0, None) <= core_version; ^~~~ long external/mesa/include/vulkan/vulkan.h:34:43: note: expanded from macro 'VK_MAKE_VERSION' (((major) << 22) | ((minor) << 12) | (patch)) ^ ... fatal error: too many errors emitted, stopping now [-ferror-limit=] 20 errors generated. Fixes: e72ad05c1d ("radv: Return NULL for entrypoints when not supported.") Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Fix autotools build.Bas Nieuwenhuizen2018-02-231-1/+3
| | | | | | | Somewhere along the way the Makefile changes got lost ... Fixes: 4db78f3a6b "radv: Put supported extensions in a struct." Acked-by: Dave Airlie <[email protected]>
* radv: Return NULL for entrypoints when not supported.Bas Nieuwenhuizen2018-02-234-9/+83
| | | | | | | | | | | | | | | | This implements strict checking for the entrypoint ProcAddr functions. - InstanceProcAddr with instance = NULL, only returns the 3 allowed entrypoints. - DeviceProcAddr does not return any instance entrypoints. - InstanceProcAddr does not return non-supported or disabled instance entrypoints. - DeviceProcAddr does not return non-supported or disabled device entrypoints. - InstanceProcAddr still returns non-supported device entrypoints. Reviewed-by: Dave Airlie <[email protected]>
* radv: Reword radv_entrypoints_gen.pyBas Nieuwenhuizen2018-02-231-56/+106
| | | | | | With a big inspiration from anv as always ... Reviewed-by: Dave Airlie <[email protected]>
* radv: Track enabled extensions.Bas Nieuwenhuizen2018-02-232-36/+48
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: Put supported extensions in a struct.Bas Nieuwenhuizen2018-02-234-63/+133
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: remove emission of nir_op_fpowSamuel Pitoiset2018-02-221-4/+0
| | | | | | | fpow is now lowered at NIR level. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable lowering of fpow to fexp2 and flog2Samuel Pitoiset2018-02-221-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | There is no fpow in hardware, so it's always lowered somewhere, but it appears that lowering at NIR level is better. Figured while comparing compute shaders between RadeonSI and RADV. Polaris10: Totals from affected shaders: SGPRS: 18936 -> 18904 (-0.17 %) VGPRS: 12240 -> 12220 (-0.16 %) Spilled SGPRs: 2809 -> 2809 (0.00 %) Code Size: 718116 -> 719848 (0.24 %) bytes Max Waves: 1409 -> 1410 (0.07 %) Vega10: Totals from affected shaders: SGPRS: 18392 -> 18392 (0.00 %) VGPRS: 12008 -> 11920 (-0.73 %) Spilled SGPRs: 3001 -> 2981 (-0.67 %) Code Size: 777444 -> 778788 (0.17 %) bytes Max Waves: 1503 -> 1504 (0.07 %) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: set GLC=1 for load/store of coherent/volatile imagesSamuel Pitoiset2018-02-221-3/+4
| | | | | | | | | | This disables persistence accross wavefronts. F1 2017 and Wolfenstein 2 appear to use some coherent images but this patch doesn't seem to change anything. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/radeonsi: pass type to load_tess_varyings()Timothy Arceri2018-02-222-2/+14
| | | | | | We need this to be able to load 64bit varyings. Reviewed-by: Marek Olšák <[email protected]>
* amd/common:add uvd hevc enc support check in hw queryJames Zhu2018-02-212-1/+12
| | | | | | | Based on amdgpu hardware query information to check if UVD hevc enc support Signed-off-by: James Zhu <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: add glsl_is_array_image() helperSamuel Pitoiset2018-02-211-23/+18
| | | | | | | For consistency. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: set the DA field when performing atomics on 3D imagesSamuel Pitoiset2018-02-211-1/+2
| | | | | | | This doesn't fix anything known but it should definitely be set. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't send num_tcs_input_cp to sgprs.Dave Airlie2018-02-211-4/+1
| | | | | | | We never use it in the shaders. Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/tess: don't need to look in constant for vertices_per_patchDave Airlie2018-02-212-2/+5
| | | | | | | This just avoids passing this value via user sgprs. Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/radv: cleanup some tcs output values accessDave Airlie2018-02-211-2/+8
| | | | | | | Just consolidates some code to make it easier to change. Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>