path: root/src/amd
Commit message | Author | Age | Files | Lines
* radv: do not expose GTT as device local memory mostly for APUs | Samuel Pitoiset | 2020-04-27 | 1 | -29/+30

  On APUs, the memory is unified (all heaps are equally fast) and apps should count all memory heaps together. But some games like Id Tech games (Youngblood and such) don't manage memory correctly on APUs and they spill everything when one VRAM heap is full.

  Instead of spilling buffers, they should just allocate new buffers in the second heap but it seems like these games are confused if two memory heaps have the DEVICE_LOCAL_BIT set.

  This is probably a first step towards better memory management on APUs but there is still some work to do if we want to run most apps with a small dedicated VRAM (256MB or so).

  This gives a huge boost for Id Tech games on APUs, and doesn't seem to reduce Feral games performance.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4771>
* radv: Add WSI buffers to BO list only if they can be used. | Bas Nieuwenhuizen | 2020-04-27 | 3 | -14/+42

  Also reverse the BO list removal loop. This way typical WSI usage should find the entry in O(active swapchains) iterations, which should not be a performance issue.

  Tested with Doom (2016), which found the entry in 1 iteration every time.

  Acked-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4306>
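The reversed removal loop described above can be illustrated with a minimal sketch. It assumes that the entries removed most often are the ones added most recently, so scanning from the end finds them quickly. The `bo_list` type and `bo_list_remove()` are hypothetical names for illustration, not radv's actual data structures.

```c
#include <stddef.h>

/* Hypothetical stand-in for a buffer-object list (not radv's real type). */
struct bo_list {
    int *bos;      /* BO handles, oldest first */
    size_t count;  /* number of valid entries */
};

/* Remove `bo`, scanning from the newest entry backwards.
 * Returns how many entries were inspected, or 0 if `bo` was not found. */
static size_t bo_list_remove(struct bo_list *list, int bo)
{
    for (size_t i = list->count; i-- > 0;) {
        if (list->bos[i] == bo) {
            size_t inspected = list->count - i;
            /* Fill the hole with the last entry; order is not preserved. */
            list->bos[i] = list->bos[list->count - 1];
            list->count--;
            return inspected;
        }
    }
    return 0;
}
```

With a forward scan the same lookup would cost as many iterations as there are entries in front of the target, which is the case the commit wants to avoid.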
* ac,radeonsi: fix compilation issues with LLVM 11 | Samuel Pitoiset | 2020-04-27 | 4 | -14/+18

  Latest LLVM replaced LLVMVectorTypeKind.

  Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2826
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4755>
* drm-uapi,radv,radeonsi: Add amdgpu_drm.h header. | Bas Nieuwenhuizen | 2020-04-27 | 8 | -8/+8

  Use it instead of the libdrm provided amdgpu_drm.h header.

  I used the kernel revision from the README to get the header so the header versions should be consistent.

  Tested by removing /usr/include/libdrm/amdgpu_drm.h from my dev-machine.

  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4749>
* ac: reassociate FP expressions for inexact instructions for radeonsi | Marek Olšák | 2020-04-27 | 1 | -0/+9

  Totals:
  SGPRS: 2591784 -> 2590696 (-0.04 %)
  VGPRS: 1666888 -> 1666736 (-0.01 %)
  Spilled SGPRs: 4131 -> 4107 (-0.58 %)
  Spilled VGPRs: 38 -> 38 (0.00 %)
  Private memory VGPRs: 2176 -> 2176 (0.00 %)
  Scratch size: 2228 -> 2228 (0.00 %) dwords per thread
  Code Size: 52715468 -> 52693584 (-0.04 %) bytes
  LDS: 92 -> 92 (0.00 %) blocks
  Max Waves: 479897 -> 479892 (-0.00 %)
  Wait states: 0 -> 0 (0.00 %)

  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
* ac: generate FMA for inexact instructions for radeonsi | Marek Olšák | 2020-04-27 | 3 | -0/+40

  NIR mostly does this already.

  Totals:
  SGPRS: 2588520 -> 2591784 (0.13 %)
  VGPRS: 1666984 -> 1666888 (-0.01 %)
  Spilled SGPRs: 4074 -> 4131 (1.40 %)
  Spilled VGPRs: 38 -> 38 (0.00 %)
  Private memory VGPRs: 2176 -> 2176 (0.00 %)
  Scratch size: 2228 -> 2228 (0.00 %) dwords per thread
  Code Size: 52726872 -> 52715468 (-0.02 %) bytes
  LDS: 92 -> 92 (0.00 %) blocks
  Max Waves: 479872 -> 479897 (0.01 %)
  Wait states: 0 -> 0 (0.00 %)

  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
* ac: update and document fast math flags used by radeonsi | Marek Olšák | 2020-04-27 | 2 | -3/+13

  This should have no effect, because we never use FP division, but it's safer for the future.

  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
* ac: force enable -structurizecfg-skip-uniform-regions for LLVM 11 | Marek Olšák | 2020-04-27 | 1 | -0/+4

  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4696>
* radv: fix robust_buffer_access if enabled via VkPhysicalDeviceFeatures2 | Samuel Pitoiset | 2020-04-27 | 1 | -10/+44

  It can be enabled via pEnabledFeatures or via VkPhysicalDeviceFeatures2.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4706>
* radv: Pass logical device to si_emit_graphics | Joshua Ashton | 2020-04-25 | 3 | -5/+6

  We'll need this in order to retrieve the va of a bo for a future ext.

  Reviewed-by: Samuel Pitoiset <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4728>
* radv/aco: enable 8/16-bit storage and int8/int16 on GFX8+ | Rhys Perry | 2020-04-24 | 3 | -22/+29

  With this, Doom Eternal should now run with ACO on GFX8+.

  The generated 8/16-bit storage code is okay. The generated int8/int16 code is currently pretty bad, but it works, and apparently Doom Eternal doesn't actually use it (even though it requires it).

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4707>
* aco: lower 8/16-bit integer arithmetic | Rhys Perry | 2020-04-24 | 1 | -0/+31

  dEQP-VK.spirv_assembly.type.* passes with the features and extensions enabled.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4707>
* aco: improve sub-dword emit_split_vector() with sgprs | Rhys Perry | 2020-04-24 | 1 | -6/+7

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: clobber scc in s_bfe_u32 in get_alu_src() | Rhys Perry | 2020-04-24 | 1 | -1/+2

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: handle undef p_create_vector operands in the optimizer | Rhys Perry | 2020-04-24 | 1 | -0/+4

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: vectorize global loads/stores | Rhys Perry | 2020-04-24 | 1 | -2/+10

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: allow 8/16-bit shared loads | Rhys Perry | 2020-04-24 | 1 | -2/+0

  These should work now

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: add and use get_buffer_store_op() helper | Rhys Perry | 2020-04-24 | 1 | -78/+33

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: refactor visit_store_scratch() to use new helpers | Rhys Perry | 2020-04-24 | 1 | -32/+15

  Should support 8/16-bit stores now

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: refactor visit_store_global() to use new helpers | Rhys Perry | 2020-04-24 | 1 | -30/+29

  Should support 8/16-bit stores now

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: refactor visit_store_ssbo() to use new helpers | Rhys Perry | 2020-04-24 | 1 | -64/+13

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: refactor store_vmem_mubuf() to use new helpers | Rhys Perry | 2020-04-24 | 1 | -58/+9

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: refactor store_lds() to use new helpers | Rhys Perry | 2020-04-24 | 1 | -91/+90

  It should also work correctly for 8/16-bit stores

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: add helpers for splitting stores | Rhys Perry | 2020-04-24 | 1 | -0/+155

  split_store_data() splits a vector and p_as_uniforms it if needed.

  scan_write_mask()/advance_write_mask() are similar to u_bit_scan_consecutive_range(), but make it easier to only clear part of the range and also give ranges for zero'd bits.

  split_buffer_store() is a helper for splitting VMEM/SMEM stores.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
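For readers unfamiliar with the range-scanning idea that the write-mask helpers above are compared to, here is a minimal portable sketch. It is not the Mesa implementation: `scan_consecutive_range()` is a hypothetical stand-in that extracts the lowest run of consecutive set bits from a mask and clears it, which is the behaviour the commit message takes as its baseline. The ACO helpers differ (partial clearing, ranges of zero bits) and are not reproduced here.

```c
#include <stdint.h>

/* Illustrative only: find the lowest run of consecutive set bits in *mask,
 * report it as (start, count), and clear that run from *mask.
 * For an empty mask this yields start == 32 and count == 0. */
static void scan_consecutive_range(uint32_t *mask, int *start, int *count)
{
    int s = 0, c = 0;

    /* Skip zero bits below the first set bit. */
    while (s < 32 && !((*mask >> s) & 1u))
        s++;
    /* Measure the run of set bits starting at s. */
    while (s + c < 32 && ((*mask >> (s + c)) & 1u))
        c++;
    /* Clear the run so the next call returns the following one. */
    for (int i = 0; i < c; i++)
        *mask &= ~(1u << (s + i));

    *start = s;
    *count = c;
}
```

Calling it repeatedly on a write mask such as 0x6D yields one (start, count) pair per contiguous group of written components, which is how a store can be split into contiguous pieces.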
* aco: use emit_load helper for VMEM/SMEM loads | Rhys Perry | 2020-04-24 | 1 | -494/+226

  Also implements 8/16-bit loads for scratch/global.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: refactor load_lds to use new helpers | Rhys Perry | 2020-04-24 | 1 | -98/+75

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: add emit_load helper | Rhys Perry | 2020-04-24 | 1 | -0/+285

  This helper is used for recombining split loads, passing the result to p_as_uniform, aligning the offset down and shifting it right if needed and handling large constant offsets.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: add and use RegClass::get() helper | Rhys Perry | 2020-04-24 | 2 | -14/+12

  Eventually, we'll probably want to replace the current RegClass(type, size) constructor with this.

  This has a functional change in that get_reg_class() now creates v1/v2 instead of v4b/v8b.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* aco: be more careful about using SMEM for load_global | Rhys Perry | 2020-04-24 | 1 | -3/+4

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* radv: allocate larger shader memory slabs if needed | Rhys Perry | 2020-04-24 | 1 | -2/+2

  Fixes dEQP-VK.ssbo.phys.layout.random.16bit.scalar.13 hang with ACO (features needed for the test are implemented in a later commit)

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
* radv: align buffer descriptor sizes to dword | Rhys Perry | 2020-04-24 | 2 | -2/+16

  This is needed to prevent bounds checking issues when loading 8/16-bit values with dword loads.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4639>
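A small sketch may help show why the descriptor size matters here. It assumes a simple bounds-check model in which a dword load is accepted only if all four bytes lie inside the descriptor's range; the real hardware rules are more involved, and `align_to_dword()` and `dword_load_in_bounds()` are hypothetical helpers, not radv functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Round a byte size up to the next multiple of 4 (one dword).
 * Assumes size <= UINT32_MAX - 3 so the addition cannot wrap. */
static uint32_t align_to_dword(uint32_t size)
{
    return (size + 3u) & ~3u;
}

/* Simplified model: a dword load covering the byte at `offset` reads the
 * four bytes starting at the dword-aligned address below it, and passes the
 * bounds check only if that whole dword fits in `range` bytes. */
static bool dword_load_in_bounds(uint32_t offset, uint32_t range)
{
    uint32_t aligned_offset = offset & ~3u;
    return aligned_offset + 4u <= range;
}
```

For a 6-byte buffer holding three 16-bit values, reading the last value at byte offset 4 with a dword load touches bytes 4 to 7. Under this model the load fails against a range of 6 but passes against the aligned range of 8.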
* aco: Move s_setprio to correct place after the gs_alloc_req. | Timur Kristóf | 2020-04-24 | 1 | -2/+3

  Previously the setprio was inside the branch, so it would only reset the priority on the first wave, but not the others.

  Signed-off-by: Timur Kristóf <[email protected]>
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
* aco: Use 24-bit multiplication for NGG wave id and thread id. | Timur Kristóf | 2020-04-24 | 1 | -2/+2

  Both of them should always fit in 24 bits anyway.

  Signed-off-by: Timur Kristóf <[email protected]>
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
* aco: Use 24-bit multiplication in TCS I/O | Timur Kristóf | 2020-04-24 | 1 | -5/+5

  The TCS inputs and outputs must always fit into the LDS, which implies that their addresses also always fit in 24 bits.

  On AMD GPUs, 24-bit multiplication is much faster than 32-bit multiplication, so we can take the opportunity to use that for TCS I/O instead.

  Signed-off-by: Timur Kristóf <[email protected]>
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
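The substitution above is only valid when both operands fit in 24 bits. A minimal C model of an unsigned 24-bit multiply (only the low 24 bits of each operand are used, and the low 32 bits of the product are returned) shows the condition. `mul_u32_u24()` is a hypothetical name for this model, written for illustration; it is not ACO code.

```c
#include <stdint.h>

/* Model of an unsigned 24-bit multiply: the upper 8 bits of each operand
 * are ignored and the product is truncated to 32 bits. */
static uint32_t mul_u32_u24(uint32_t a, uint32_t b)
{
    return (a & 0xffffffu) * (b & 0xffffffu);
}

/* Reference 32-bit multiply, truncated to 32 bits. */
static uint32_t mul_u32(uint32_t a, uint32_t b)
{
    return a * b;
}
```

The two functions agree whenever both inputs are below 2^24, which is the property the commit relies on for LDS addresses. They diverge once an operand has bits set above bit 23, so the replacement is not safe for arbitrary 32-bit values.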
* aco: Const correctness for aco_print_ir. | Timur Kristóf | 2020-04-24 | 2 | -26/+26

  Signed-off-by: Timur Kristóf <[email protected]>
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
* aco: Const correctness for get_barrier_interaction. | Timur Kristóf | 2020-04-24 | 2 | -11/+11

  Signed-off-by: Timur Kristóf <[email protected]>
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
* aco: Abort when RA can't find a register. | Timur Kristóf | 2020-04-24 | 1 | -1/+2

  Previously, it was just unreachable, which means it would generate invalid shaders when it encountered a situation in which it couldn't allocate registers for e.g. a large load.

  This commit makes it slightly easier to notice such problems without triggering a GPU hang.

  Signed-off-by: Timur Kristóf <[email protected]>
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
* aco: Increase barrier_count to 7 to include barrier_barrier. | Timur Kristóf | 2020-04-24 | 1 | -1/+1

  Signed-off-by: Timur Kristóf <[email protected]>
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
* aco: Only store TCS outputs to VMEM when they are read by TES. | Timur Kristóf | 2020-04-24 | 1 | -12/+26

  Totals from affected shaders (GFX10):
  Code Size: 10832 -> 10736 (-0.89 %) bytes

  Signed-off-by: Timur Kristóf <[email protected]>
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
* radv: Add inputs read by TES to radv_shader_info. | Timur Kristóf | 2020-04-24 | 2 | -0/+9

  Signed-off-by: Timur Kristóf <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Rhys Perry <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4536>
* aco: fix outdated label_vec from p_create_vector labelling | Rhys Perry | 2020-04-24 | 1 | -3/+5

  Fixes random dEQP-VK.transform_feedback.fuzz.* crashes.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
  Fixes: 2dc550202e82c5da198ad0a416a5d24dd89addd8 ('aco: copy-propagate p_create_vector copies of vectors')
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4730>
* ac,radeonsi: simplify checking for Navi1x chips | Marek Olšák | 2020-04-24 | 1 | -4/+1

  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4698>
* ac: out-of-order rasterization is not supported on gfx10 | Marek Olšák | 2020-04-24 | 1 | -0/+1

  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4698>
* spirv: Use nir_const_value for spec constants | Jason Ekstrand | 2020-04-24 | 1 | -5/+5

  When we originally wrote spirv_to_nir we didn't have a good scalar value union to handily use so we rolled our own thing for spec constants. Now that we have nir_const_value, we can use that and simplify a bunch of the spec constant logic.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
* radv: Properly handle all sizes of specialization constants | Jason Ekstrand | 2020-04-24 | 1 | -2/+15

  cc: [email protected]
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4675>
* anv/radv: Resolving 'GetInstanceProcAddr' should not require a valid instance | Eduardo Lima Mitev | 2020-04-24 | 1 | -0/+5

  Since vk_icdGetInstanceProcAddr() is wired through vkGetInstanceProcAddr() in both drivers, we lost the ability for 'GetInstanceProcAddr' to resolve itself prior to having a valid instance.

  An upcoming spec change will fix that and allow vkGetInstanceProcAddr() to resolve itself passing NULL as instance. See https://gitlab.khronos.org/vulkan/vulkan/issues/2057 for details.

  This patch implements the change in both radv and anvil.

  CTS changes have already landed: https://gitlab.khronos.org/Tracker/vk-gl-cts/issues/2278

  vulkan-loader changes have also landed: https://gitlab.khronos.org/Tracker/vk-gl-cts/issues/2278

  Reviewed-by: Jason Ekstrand <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4273>
* aco: fix v_or(s_lshl) and v_add(s_lshl) optimizations | Rhys Perry | 2020-04-24 | 1 | -2/+2

  Signed-off-by: Rhys Perry <[email protected]>
  Fixes: d1621834f367d41500b7c1a819c046eb429fb8a6 ('aco: combine VALU and SALU into various VOP3 instructions')
  Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2822
  Reviewed-by: Timur Kristóf <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4717>
* radv: adjust the supported subgroup stages | Samuel Pitoiset | 2020-04-23 | 1 | -1/+2

  VK_SHADER_STAGE_ALL now includes all ray-tracing related stages. Noticed while comparing vulkaninfo with some other drivers.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4679>
* radv: simplify checking for Navi1x chips | Samuel Pitoiset | 2020-04-23 | 4 | -13/+5

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4702>
* aco: improve code for 32-bit isign | Rhys Perry | 2020-04-23 | 1 | -6/+3

  No shader-db changes on Navi.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Timur Kristóf <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4667>