summaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* nir: replace nir_load_system_value calls with appropiate builder functionsKarol Herbst2018-11-146-24/+24
| | | | | | | | | this helps reduce the overall code changes when a bit_size parameter is added to nir_load_system_value Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* radv: make use of nir_move_out_const_to_consumer()Timothy Arceri2018-11-141-0/+4
| | | | | | | | | | | | | | | | | | vkpipeline-db results: Totals from affected shaders: SGPRS: 28400 -> 28576 (0.62 %) VGPRS: 27916 -> 27692 (-0.80 %) Spilled SGPRs: 140 -> 138 (-1.43 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1534456 -> 1520560 (-0.91 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 3541 -> 3582 (1.16 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: set optimal OVERWRITE_COMBINER_WATERMARK on GFX9Samuel Pitoiset2018-11-132-3/+21
| | | | | | | Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set PA.SC_CONSERVATIVE_RASTERIZATION.NULL_SQUAD_AA_MASK_ENABLESamuel Pitoiset2018-11-131-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: binding streamout buffers doesn't change context regsSamuel Pitoiset2018-11-131-2/+7
| | | | | | Cc: 18.3 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make use of num_good_cu_per_sh in si_emit_graphics() tooSamuel Pitoiset2018-11-121-2/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clean up setting partial_es_wave for distributed tess on VISamuel Pitoiset2018-11-121-7/+4
| | | | | | | | Only needed when the pipeline actually uses tessellation. I don't think that changes anything, except improving readability. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: cleanup and document a Hawaii bug with offchip buffersSamuel Pitoiset2018-11-121-9/+8
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/surface: remove the overallocation workaround for Vega12Marek Olšák2018-11-091-4/+0
| | | | | | not needed anymore (probably since the tile_swizzle fix) Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: include LLVM IR in the VK_AMD_shader_info "disassembly"Nicolai Hähnle2018-11-091-0/+1
| | | | | | | Helpful for debugging compiler backend problems: this allows us to easily retrieve the LLVM IR from RenderDoc. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: fix GPU hangs when loading depth/stencil clear values on SI/CIKSamuel Pitoiset2018-11-081-5/+19
| | | | | | | | HTILE is supported on these chips, not sure how I missed that. This restores using PFP_SYNC_ME when LOAD_CONTEXT_REG is not used. Fixes: f425d9ee74 ("radv: use LOAD_CONTEXT_REG when loading fast clear values") Signed-off-by: Samuel Pitoiset <[email protected]>
* radv: use LOAD_CONTEXT_REG when loading fast clear valuesSamuel Pitoiset2018-11-082-19/+27
| | | | | | | | | This avoids syncing the Micro Engine. This is only supported for VI+ currently. There is probably a way for using LOAD_CONTEXT_REG on previous chips but that could be done later. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: only expose VK_SUBGROUP_FEATURE_ARITHMETIC_BIT for VI+Samuel Pitoiset2018-11-081-1/+1
| | | | | | | | | | | Inclusive and exclusives scan are missing because older chips don't have llvm.amdgcn.update.dpp. This fixes crashes with dEQP-VK.subgroups.arithmetic.*. CC: [email protected] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: disable conditional rendering for vkCmdCopyQueryPoolResults()Samuel Pitoiset2018-11-071-0/+10
| | | | | | | | | VK_EXT_conditional_rendering says that copy commands should not be affected by conditional rendering. Cc: 18.2 18.3 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: allocate enough space in CS when copying query results with computeSamuel Pitoiset2018-11-071-0/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/nir_to_llvm: fix b2f for f64Timothy Arceri2018-11-071-3/+12
| | | | | | Fixes: d7e0d47b9de3 ("nir: Add a bunch of b2[if] optimizations") Reviewed-by: Dave Airlie <[email protected]>
* radv: more use of radv_cp_wait_mem()Samuel Pitoiset2018-11-051-22/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: replace si_emit_wait_fence() with radv_cp_wait_mem()Samuel Pitoiset2018-11-054-10/+14
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: add missing TFB queries support to CmdCopyQueryPoolsResults()Samuel Pitoiset2018-11-052-0/+278
| | | | | | | Cc: 18.3 <[email protected]> Fixes: b4eb029062a ("radv: implement VK_EXT_transform_feedback") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: remove useless sync after copying query results with computeSamuel Pitoiset2018-11-051-4/+0
| | | | | | | | | | | | | | | The spec says: "vkCmdCopyQueryPoolResults is considered to be a transfer operation, and its writes to buffer memory must be synchronized using VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT before using the results." VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle, while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector caches and L2. So, it's useless to set those flags internally. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* android: radv: add libmesa_git_sha1 static dependencyMauro Rossi2018-11-031-1/+2
| | | | | | | | | | | | | | libmesa_git_sha1 whole static dependency is added to get git_sha1.h header and avoid following building error: external/mesa/src/amd/vulkan/radv_device.c:46:10: fatal error: 'git_sha1.h' file not found ^ 1 error generated. Fixes: 9d40ec2cf6 ("radv: Add support for VK_KHR_driver_properties.") Signed-off-by: Mauro Rossi <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* amd: Make vgpr-spilling depend on llvm versionJan Vesely2018-11-021-1/+2
| | | | | | | | | The option was removed in LLVM r345763 Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radv: fix begin/end transform feedback with 0 counter buffers.Dave Airlie2018-11-021-12/+16
| | | | | | | | | | If the user gives 0 counterBuffers then the driver should still enable transform feedback on all targets. This changes the driver to always enable xfb, and use counter buffers where one is defined for the target in question. Fixes: b4eb029062 (radv: implement VK_EXT_transform_feedback) Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: apply xfb buffer offset at buffer binding time not later. (v2)Dave Airlie2018-11-021-2/+4
| | | | | | | | | | | In order to handle pause/resume properly, the offset should be added to the buffer binding not to the begin/end paths. v2: don't add offset to size Fixes ext_transform_feedback-alignment* under zink Fixes: b4eb029062 (radv: implement VK_EXT_transform_feedback) Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: set PA_SU_PRIM_FILTER_CNTL optimallySamuel Pitoiset2018-11-011-0/+9
| | | | | | | | Ported from RadeonSI. It's always TRUE for CIK+ because RADV doesn't support 16 samples. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: only enable gl_SampleMask if MSAA is enabled tooSamuel Pitoiset2018-11-011-2/+11
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: use radeon_info::num_good_cu_per_shSamuel Pitoiset2018-11-011-3/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: make use of i1false in few more placesSamuel Pitoiset2018-11-011-3/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: add support for Raven2Samuel Pitoiset2018-11-013-3/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv/xfb: don't increase offset by component mask start.Dave Airlie2018-10-311-3/+0
| | | | | | | | | | | | | | This is incorrect, the offset is into the buffer, and it's legal to write loc 0,0 -> buffer0, offset 0 loc 0,1 -> buffer1, offset 0 This fixes a bunch of piglits running on my zink xfb code on radv. Fixes: 6c21645046 (radv: emit stream outputs for vertex and tessellation stages) Reviewed-by: Samuel Pitoiset <[email protected]>
* configure: allow building with python3Emil Velikov2018-10-312-6/+6
| | | | | | | | | | | | | | | Pretty much all of the scripts are python2+3 compatible. Check and allow using python3, while adjusting the PYTHON2 refs. Note: - python3.4 is used as it's the earliest supported version - python2 chosen prior to python3 v2: use python2 by default Cc: Ilia Mirkin <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* radv: use WAIT_REG_MEM_GREATER_OR_EQUAL instead of a magic valueSamuel Pitoiset2018-10-312-1/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: use pool->stride when calling radv_query_shader()Samuel Pitoiset2018-10-311-2/+2
| | | | | | | Not needed to recompute the stride. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: rename some parameters in Cmd{Begin,End}TransformFeedbackEXT()Samuel Pitoiset2018-10-311-8/+8
| | | | | | | To match latest spec. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv/winsys: do not assign last submission when chained path failedSamuel Pitoiset2018-10-311-1/+4
| | | | | | | | I don't think we want to wait for something that hasn't been correctly submitted. This is similar to the fallback path. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv/winsys: fix buffer deletion in the sysmem pathSamuel Pitoiset2018-10-311-2/+3
| | | | | | | In case we failed to submit the CS correctly. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv/winsys: cleanup the chained submission pathSamuel Pitoiset2018-10-311-11/+17
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv/winsys: remove unused surface_best()Samuel Pitoiset2018-10-312-10/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: add support for Raven2 (v2)Marek Olšák2018-10-307-1/+19
| | | | | | v2: fix enabling primitive binning Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: fix ac_build_fdiv for f64Marek Olšák2018-10-291-1/+2
| | | | | | trivial Fixes: a5f35aa742c
* radv: add missing meson build dependencyEric Engestrom2018-10-291-1/+1
| | | | | | Fixes: 9d40ec2cf6ec6d3d9d78 "radv: Add support for VK_KHR_driver_properties." Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* radv: implement VK_EXT_transform_feedbackSamuel Pitoiset2018-10-299-16/+568
| | | | | | | | This implementation should work and potential bugs can be fixed during the release candidates window anyway. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: add multiple streams support for the GS copy shaderSamuel Pitoiset2018-10-291-26/+76
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: emit stream outputs for vertex and tessellation stagesSamuel Pitoiset2018-10-291-0/+137
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: declare streamout SGPRsSamuel Pitoiset2018-10-292-2/+56
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: gather stream output infoSamuel Pitoiset2018-10-293-0/+60
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: allow to emit a vertex to a specified streamSamuel Pitoiset2018-10-291-8/+14
| | | | | | | This is required for GS multiple streams support. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: allow to use up to 4 GSVS ring buffersSamuel Pitoiset2018-10-291-21/+57
| | | | | | | | For all streams. We basically just need to update the base address and compute a stride for every stream. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: adjust the number of output components per streamSamuel Pitoiset2018-10-291-5/+4
| | | | | | | | Same as the previous patch, except that is only the number of components. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: adjust the GSVS ring sizes based on the number of componentsSamuel Pitoiset2018-10-291-6/+19
| | | | | | | | | For multiple streams support we have to set the different ring buffer sizes correctly. This relies on the number of output components per stream. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>