path: root/src/amd
Commit message | Author | Age | Files | Lines
* radeonsi: align command buffer starting address to fix some Raven hangs | Marek Olšák | 2018-03-08 | 2 | -1/+21
    Cc: 17.3 18.0 <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* ac/nir: do not emit unnecessary null exports in fragment shaders | Samuel Pitoiset | 2018-03-08 | 1 | -13/+16
    Null exports should only be needed when no other exports are emitted. This removes a bunch of 'exp null off, off, off, off done vm'.
    Affected games are Dota 2 and Wolfenstein 2. I'm not sure this really helps, but code size decreases there.
    Polaris10:
    Totals from affected shaders:
    SGPRS: 8216 -> 8216 (0.00 %)
    VGPRS: 7072 -> 7072 (0.00 %)
    Spilled SGPRs: 0 -> 0 (0.00 %)
    Spilled VGPRs: 0 -> 0 (0.00 %)
    Code Size: 454968 -> 453896 (-0.24 %) bytes
    Max Waves: 772 -> 772 (0.00 %)
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/radeonsi: add emit_kill to the abi | Timothy Arceri | 2018-03-08 | 2 | -1/+10
    This should fix a regression with Rocket League grass rendering on the NIR backend.
    Reviewed-by: Marek Olšák <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104717
* ac: make use of if/loop build helpers | Timothy Arceri | 2018-03-08 | 1 | -42/+18
    These helpers insert the basic blocks in the same order as they appear in NIR, making it easier to follow LLVM IR dumps.
    The helpers also insert more useful labels onto the blocks. TGSI uses the line number of the corresponding opcode in the TGSI dump as the label id; here we use the corresponding block index from NIR.
    Reviewed-by: Marek Olšák <[email protected]>
* ac: add if/loop build helpers | Timothy Arceri | 2018-03-08 | 3 | -0/+211
    These have been ported over from radeonsi.
    Reviewed-by: Marek Olšák <[email protected]>
* radv: enable AMD_gcn_shader extension | Daniel Schürmann | 2018-03-07 | 2 | -0/+4
    Signed-off-by: Daniel Schürmann <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: implement AMD_gcn_shader extended instructions | Daniel Schürmann | 2018-03-07 | 1 | -0/+28
    Co-authored-by: Dave Airlie <[email protected]>
    Signed-off-by: Daniel Schürmann <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Don't emit a warning on VI-GFX9. | Bas Nieuwenhuizen | 2018-03-07 | 1 | -1/+3
    We are conformant: https://www.khronos.org/conformance/adopters/conformant-products#submission_308
    v2: Actually don't emit it on gfx9.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Enable Vulkan 1.1.0 for configurations that can support it. | Bas Nieuwenhuizen | 2018-03-07 | 1 | -0/+2
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Disable sampler ycbcr conversion. | Bas Nieuwenhuizen | 2018-03-07 | 2 | -0/+24
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Expose that we don't support any VK_KHR_16_bit_storage parts. | Bas Nieuwenhuizen | 2018-03-07 | 1 | -0/+9
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement vkEnumerateInstanceVersion. | Bas Nieuwenhuizen | 2018-03-07 | 1 | -0/+7
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Add trivial device group implementation. | Bas Nieuwenhuizen | 2018-03-07 | 5 | -0/+79
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement vkCmdDispatchBase. | Bas Nieuwenhuizen | 2018-03-07 | 2 | -3/+41
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement vkGetDeviceQueue2. | Bas Nieuwenhuizen | 2018-03-07 | 1 | -2/+16
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Support VkPhysicalDeviceProtectedMemoryFeatures. | Bas Nieuwenhuizen | 2018-03-07 | 1 | -0/+6
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Support VkPhysicalDeviceShaderDrawParameterFeatures. | Bas Nieuwenhuizen | 2018-03-07 | 1 | -0/+6
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement VK_KHR_maintenance3. | Bas Nieuwenhuizen | 2018-03-07 | 3 | -4/+90
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Add minimal subgroup support. | Bas Nieuwenhuizen | 2018-03-07 | 4 | -0/+70
    Deliberately not implementing workgroup scopes, as that is not needed for core Vulkan.
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Change client version check. | Bas Nieuwenhuizen | 2018-03-07 | 1 | -1/+1
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Update MAX_API_VERSION to 1.1.0 | Bas Nieuwenhuizen | 2018-03-07 | 7 | -37/+37
    v2: Don't bump supported version.
    v3: Update json files.
    Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add vote_ieq/vote_feq lowering pass. | Bas Nieuwenhuizen | 2018-03-07 | 5 | -5/+99
    The old vote_eq implementation supported only booleans, but now we have to support arbitrary values, so use the read_first_invocation intrinsic + ballot.
    I took this as an opportunity to figure out how easy it was to do this in nir instead of in the nir_to_llvm pass, and it actually turned out pretty okay IMO. The only extra code is what is needed to create the pass.
    Reviewed-by: Dave Airlie <[email protected]>
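The lowering described above can be sketched on the CPU as follows. This is only an illustrative model, not the NIR pass itself: the 64-lane subgroup size, the `active` lane mask, and the helper names `read_first_invocation`, `ballot_equal` and `vote_ieq` as C functions are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SUBGROUP_SIZE 64 /* assumed subgroup width for this model */

/* Value held by the lowest-numbered active lane (0 if no lane is active). */
static uint32_t read_first_invocation(const uint32_t *values, uint64_t active)
{
    for (unsigned lane = 0; lane < SUBGROUP_SIZE; lane++) {
        if (active & (UINT64_C(1) << lane))
            return values[lane];
    }
    return 0;
}

/* Ballot: one bit per active lane whose value equals `first`. */
static uint64_t ballot_equal(const uint32_t *values, uint64_t active,
                             uint32_t first)
{
    uint64_t mask = 0;
    for (unsigned lane = 0; lane < SUBGROUP_SIZE; lane++) {
        uint64_t bit = UINT64_C(1) << lane;
        if ((active & bit) && values[lane] == first)
            mask |= bit;
    }
    return mask;
}

/* All active lanes agree iff every active lane voted "equal to the first". */
static bool vote_ieq(const uint32_t *values, uint64_t active)
{
    assert(values != NULL);
    uint32_t first = read_first_invocation(values, active);
    return ballot_equal(values, active, first) == active;
}
```

Comparing the ballot against the active mask, rather than against all ones, keeps inactive lanes from influencing the vote.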
* nir: Generalize nir_intrinsic_vote_eq | Jason Ekstrand | 2018-03-07 | 1 | -1/+1
    The SPIR-V extension wants us to be able to do an AllEqual on any vector or scalar type. This has two implications:
    1) We need to be able to handle vectors so we switch the vote_eq intrinsics to be vectorized intrinsics.
    2) We need to handle floats which have different behavior with respect to +-0, NaN, etc. than the integer variant so we need two variants.
    Reviewed-by: Lionel Landwerlin <[email protected]>
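The difference between the integer and float variants can be seen in a small standalone sketch. The function names `equal_as_int` and `equal_as_float` are hypothetical and only model the two comparison semantics; they are not Mesa functions.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Integer-style equality: compares the raw bit patterns. */
static bool equal_as_int(float a, float b)
{
    uint32_t ua, ub;
    memcpy(&ua, &a, sizeof ua);
    memcpy(&ub, &b, sizeof ub);
    return ua == ub;
}

/* Float-style equality: IEEE 754 comparison, where +0 == -0 and NaN != NaN. */
static bool equal_as_float(float a, float b)
{
    return a == b;
}
```

With these definitions, `0.0f` and `-0.0f` are equal as floats but not as integers, while a NaN compared with itself is equal as an integer bit pattern but not as a float.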
* vulkan: Rename multiview from KHX to KHR | Jason Ekstrand | 2018-03-07 | 3 | -8/+8
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: fix passing address32_hi to LLVM for high values | Marek Olšák | 2018-03-07 | 2 | -3/+3
    The old function treats high values as negative, which LLVM interprets as 0.
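One way this kind of bug can arise is sketched below, under the assumption that the 32-bit value is handed over as formatted text. The helpers `format_signed` and `format_hex` are hypothetical and are not the functions touched by the commit; they only show how a value with the top bit set turns negative when treated as a signed integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical "old" behavior: the value is treated as a signed int.
 * The unsigned-to-int conversion is implementation-defined for values above
 * INT_MAX; on common two's-complement targets it yields a negative number. */
static void format_signed(char *buf, size_t len, uint32_t value)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "%i", (int)value);
}

/* Hypothetical "fixed" behavior: the value stays unsigned and is written
 * in hexadecimal, so the high bit is preserved. */
static void format_hex(char *buf, size_t len, uint32_t value)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "0x%x", (unsigned)value);
}
```

For a value such as `0xffff8000`, the signed form is a negative decimal string, whereas the hexadecimal form round-trips the original bits.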
* radv: Add entrypoints generation with the new vk.xml | Bas Nieuwenhuizen | 2018-03-07 | 1 | -107/+164
    A lot of it is based on intel again.
    Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: don't put lod into args if it's zero. | Dave Airlie | 2018-03-07 | 1 | -2/+1
    If the lod is zero but we still put it in args, we end up consuming a register for it.
    This fixes some spilling in the NIR paths in Dirt Rally that isn't seen with TGSI.
    Reviewed-by: Timothy Arceri <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* radv: report the scratch private memory size with shader stats | Samuel Pitoiset | 2018-03-06 | 1 | -1/+3
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: count the scratch private memory size | Samuel Pitoiset | 2018-03-06 | 2 | -2/+9
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* ac: add ac_count_scratch_private_memory() | Samuel Pitoiset | 2018-03-06 | 2 | -0/+34
    Imported from RadeonSI.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: only enable used channels when exporting parameters | Samuel Pitoiset | 2018-03-06 | 1 | -4/+20
    This allows us to generate, for example, "exp param0 v0, off, off, off" if only the first channel is needed. I'm not sure whether this improves performance, but it's worth trying.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
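How a channel mask maps onto the export operands quoted above can be illustrated with a small formatter. This is only a sketch: `format_param_export` is a hypothetical helper, and naming the registers `v0` to `v3` after the channel index is an assumption for readability, not how the compiler allocates registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Writes an "exp paramN ..." line into buf. Channels whose bit is clear in
 * `mask` (bit 0 = first channel) are written as "off". */
static void format_param_export(char *buf, size_t len, unsigned param,
                                unsigned mask)
{
    assert(buf != NULL && len >= 64); /* 64 bytes is enough for any input */
    int n = snprintf(buf, len, "exp param%u", param);
    for (unsigned chan = 0; chan < 4; chan++) {
        const char *sep = chan == 0 ? " " : ", ";
        if (mask & (1u << chan))
            n += snprintf(buf + n, len - (size_t)n, "%sv%u", sep, chan);
        else
            n += snprintf(buf + n, len - (size_t)n, "%soff", sep);
    }
}
```

A mask of `0x1` reproduces the line from the commit message, while `0xf` enables all four channels.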
* ac: update enabled channels mask when optimizing PARAM exports | Samuel Pitoiset | 2018-03-06 | 1 | -2/+16
    When the mask is not 0xf we need to update the number of enabled channels, otherwise the hardware won't emit the components that are combined.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: pass the number of enabled channels to si_llvm_init_export_args() | Samuel Pitoiset | 2018-03-06 | 1 | -8/+13
    Currently, it's always 0xf, but an upcoming patch will reduce the number of channels for parameter exports.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* ac/shader: scan output usage mask for VS and TES | Samuel Pitoiset | 2018-03-06 | 2 | -0/+22
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* vulkan: do not expose surface/swapchain extensions on Android | Tapani Pälli | 2018-03-06 | 1 | -2/+2
    On Android, surface/swapchain extensions are implemented by the loader. This patch modifies both the anv and radv extension scripts to disable the ones currently exposed. See also earlier commit 9f763c1f9b.
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* ac: pass the unmodified number of components to load gs inputs | Timothy Arceri | 2018-03-06 | 1 | -2/+2
    Currently both users of this would overflow an array when the input was a dual-slot double, as they expected the number of components to be at most 4.
    Since we pass the type, we can just let the functions handle doubles in whatever way they choose.
    Reviewed-by: Dave Airlie <[email protected]>
* ac: add ac_build_fsign() | Samuel Pitoiset | 2018-03-05 | 3 | -25/+28
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* ac: add ac_build_isign() | Samuel Pitoiset | 2018-03-05 | 3 | -24/+28
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* ac: add ac_build_fract() | Samuel Pitoiset | 2018-03-05 | 3 | -26/+34
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c | Timothy Arceri | 2018-03-05 | 5 | -48/+44
    Until LLVM handles indirects better, we will also need to use these workarounds in the radeonsi backend.
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Fix copying from 3D images starting at non-zero depth. | Bas Nieuwenhuizen | 2018-03-05 | 1 | -0/+3
    Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
    Reviewed-by: Dave Airlie <[email protected]>
* radv: do not set pending_reset_query in BeginCommandBuffer() | Samuel Pitoiset | 2018-03-02 | 1 | -7/+0
    This is just useless for two reasons:
    1) flush_bits is not set accordingly, so nothing will be flushed in BeginQuery().
    2) we always flush caches in EndCommandBuffer(), so if a reset is done in a previous command buffer we are safe.
    Cc: "18.0" <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Alex Smith <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: fix nir_intrinsic_shared_atomic_comp_swap handling | Timothy Arceri | 2018-03-02 | 1 | -1/+1
    Following on from 49879f377870, this makes sure we use the correct src index.
    Fixes CTS test: KHR-GL46.compute_shader.atomic-case3
    Reviewed-by: Dave Airlie <[email protected]>
* radv: only emit cache flushes when the pool size is large enough | Samuel Pitoiset | 2018-03-01 | 3 | -11/+15
    This is an optimization which reduces the number of flushes for small pool buffers.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: keep track of the query pool size | Samuel Pitoiset | 2018-03-01 | 2 | -5/+5
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make sure to emit cache flushes before starting a query | Samuel Pitoiset | 2018-03-01 | 3 | -7/+33
    This is needed if the query pool has previously been reset using the compute shader path.
    Fixes: a41e2e9cf5 ("radv: allow to use a compute shader for resetting the query pool")
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292
    Cc: "18.0" <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use the syncobj wait ioctl to wait on fences if possible. | Bas Nieuwenhuizen | 2018-03-01 | 3 | -9/+26
    Correctly handles both the !waitAll case and the case where a fence is signaled after the wait has started.
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement more efficient !waitAll fence waiting. | Bas Nieuwenhuizen | 2018-03-01 | 3 | -0/+75
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement waiting on non-submitted fences. | Bas Nieuwenhuizen | 2018-03-01 | 1 | -2/+11
    Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Implement WaitForFences with !waitAll. | Bas Nieuwenhuizen | 2018-03-01 | 1 | -5/+15
    There is nothing to do except use a busy-wait loop, at least for old kernels. A better implementation for newer kernels will come later.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105255
    Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
    Reviewed-by: Dave Airlie <[email protected]>
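A busy-wait loop for the !waitAll case might look roughly like the sketch below. It is a standalone model, not radv code: `struct fence`, its `signaled` flag, `now_ns` and `wait_any_fence` are all hypothetical names, and a real driver would query the kernel for fence status and would normally prefer a monotonic clock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical fence: another thread is assumed to set `signaled`. */
struct fence {
    atomic_bool signaled;
};

/* Current time in nanoseconds (C11 timespec_get, wall clock). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (uint64_t)ts.tv_sec * UINT64_C(1000000000) + (uint64_t)ts.tv_nsec;
}

/* Returns true as soon as any fence is signaled, false on timeout.
 * Every fence is polled at least once, even with a zero timeout. */
static bool wait_any_fence(struct fence *fences, unsigned count,
                           uint64_t timeout_ns)
{
    assert(fences != NULL || count == 0);
    uint64_t start = now_ns();
    do {
        for (unsigned i = 0; i < count; i++) {
            if (atomic_load(&fences[i].signaled))
                return true;
        }
    } while (now_ns() - start < timeout_ns);
    return false;
}
```

Polling burns CPU time for the whole wait, which is presumably why the message describes it as a fallback for old kernels only.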