aboutsummaryrefslogtreecommitdiffstats
path: root/src/amd/vulkan/radv_nir_to_llvm.c
Commit message (Collapse)AuthorAgeFilesLines
* Revert "radv: add support for MRTs compaction to avoid holes"Bas Nieuwenhuizen2020-07-061-1/+2
| | | | | | | | | | | | | | | This reverts commit 7a5e6fd25f2e132ef4cacc3a5b714c4e153227b0. Since we have two different users bisecting issues to this commit, let's revert. Reviewed-by: Samuel Pitoiset <[email protected]> Fixes: 7a5e6fd25f2 "radv: add support for MRTs compaction to avoid holes" Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3202 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3228 (Other report in https://gitlab.freedesktop.org/mesa/mesa/-/issues/3151#note_558589) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5758>
* radv: add support for MRTs compaction to avoid holesSamuel Pitoiset2020-06-291-2/+1
| | | | | | | | | | | | | | | | | | | SPI_SHADER_COL_FORMAT allocates export memory and CB_SHADER_MASK map them to higher MRTs if necessary. The hardware allows to remap MRTs to avoid holes somehow. For example, if we have a scenario where MRT0 is unused and only MRT1 and MRT2 are used, SPI_SHADER_COL_FORMAT is 0x77 and CB_SHADER_MASK/CB_TARGET_MASK are 0x770 (this assumes SPI_SHADER_UINT16_ABGR is set). This allows us to remove one workaround that was added for fixing GPU hangs with DXVK. I think this is because SPI_SHADER_COL_FORMAT expects contiguous MRTs to be allocated. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5434>
* radv: remove the load/store workaround for Monster Hunter World with LLVMSamuel Pitoiset2020-06-261-2/+0
| | | | | | | | Now that ACO is default, this is pointless. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5658>
* radv: replace == GFX10 with >= GFX10 where it's neededSamuel Pitoiset2020-06-191-2/+2
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5389>
* radv/llvm: implement radv_enable_mrt_output_nan_fixup workaroundSamuel Pitoiset2020-06-121-0/+24
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5359>
* radv: remove useless assignment in build_streamout_vertex()Samuel Pitoiset2020-05-241-2/+1
| | | | | | | Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3025 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5158>
* radv: fix duplicated expression in ac_setup_rings()Samuel Pitoiset2020-05-211-1/+1
| | | | | | | | | | Probably a search&replace mistake when that common struct was introduced. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3006 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5130>
* Revert "ac,radeonsi: fix compilations issues with LLVM 11"Michel Dänzer2020-05-191-1/+1
| | | | | | | | | | This reverts commit 42b1696ef627a5bfee29911a780fa0a4dbf04610. The corresponding LLVM changes were reverted. Acked-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5087>
* radv: Refactor calculate_tess_lds_size and get_tcs_num_patches.Timur Kristóf2020-04-291-4/+8
| | | | | | | | | | | | | | Previously these functions needed the bit mask of the TCS outputs and patch outputs written, and concluded the number of outputs from that. Now, they take the number of outputs and patch outputs instead. This will allow the backend compiler to better optimize the LDS layout. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4388>
* ac,radeonsi: fix compilations issues with LLVM 11Samuel Pitoiset2020-04-271-1/+1
| | | | | | | | | | Latest LLVM replaced LLVMVectorTypeKind. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2826 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4755>
* radv: simplify checking for Navi1x chipsSamuel Pitoiset2020-04-231-4/+2
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4702>
* radeonsi: skip vs output optimizations for some outputsPierre-Eric Pelloux-Prayer2020-04-201-1/+1
| | | | | | | | | | If PT_SPRITE_TEX is enabled, PS inputs are overriden at runtime so we can't apply the vs output optim. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2747 Fixes: 3ec9975555d ("radeonsi: eliminate trivial constant VS outputs") Reviewed-by: Marek Olšák <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4559>
* radv/llvm: fix exporting the viewport index if the fragment shader needs itSamuel Pitoiset2020-04-171-0/+1
| | | | | | | | | | | | | It's like the layer, it has to be exported via the pos and also as a varying if the fragment shader reads it. Fixes dEQP-VK.draw.shader_viewport_index.fragment_shader_* Cc: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4564>
* radv: enable lowering of GS intrinsics for the LLVM backendSamuel Pitoiset2020-04-081-48/+14
| | | | | | | | | | | | | | | | | | | | | | | | This replaces emit_vertex with: if (vertex_count < max_vertices) { emit_vertex_with_counter vertex_count ... vertex_count += 1 } Which is exactly what NIR->LLVM was doing but at NIR level. This pass is already called by ACO. pipeline-db changes on GFX10: Totals from affected shaders: SGPRS: 1952 -> 1912 (-2.05 %) VGPRS: 2112 -> 2044 (-3.22 %) Code Size: 189368 -> 185620 (-1.98 %) bytes Max Waves: 494 -> 491 (-0.61 %) No pipeline-db changes on other generations. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4182>
* radv/gfx10: fix required ballot size with VK_EXT_subgroup_size_controlSamuel Pitoiset2020-03-171-1/+2
| | | | | | | | | | | | | If compute shaders require a specific subgroup size (ie. Wave32), we have to use the correct ballot size. Fixes dEQP-VK.subgroups.ballot_other.compute.*_requiredsubgroupSize. Fixes: fb07fd4e6cb ("radv: implement VK_EXT_subgroup_size_control") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4215>
* radv: add llvm_compiler_shader() helperSamuel Pitoiset2020-03-131-2/+36
| | | | | | | | | To match aco_compile_shader(). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
* radv: remove unnecessary LLVM includesSamuel Pitoiset2020-03-131-5/+0
| | | | | | | | They are already included from src/amd/llvm. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4163>
* radv: Move some helper functions to the radv_shader.h header file.Timur Kristóf2020-03-111-83/+18
| | | | | | | | | Move calculate_tess_lds_size and get_tcs_num_patches to radv_shader.h ACO will need to call these functions too. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3964>
* amd: join emit_kill() from radv and radeonsi in ac_nir_to_llvmDaniel Schürmann2020-03-091-8/+0
| | | | | | Reviewed-by: Marek Olšák <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4047>
* radv: Squelch possibly-undefined warningEric Anholt2020-02-181-1/+1
| | | | | | | | The same condition is used in the def as in the use, but gcc wasn't figuring it out. Reviewed-by: Samuel Pitoiset <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3867>
* radv/gfx10: implement NGG GS queriesSamuel Pitoiset2020-01-291-0/+27
| | | | | | | | | | | The number of generated primitives is only counted by the hardware if GS uses the legacy path. For NGG GS, we need to accumulate that value in the NGG GS itself. To achieve that, we use a plain GDS atomic operation. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
* amd/common,radv: move vertex_format_table to ac_shader_util.{h,c}Rhys Perry2020-01-281-27/+3
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3086>
* radv/gfx10: simplify some duplicated NGG GS codeSamuel Pitoiset2020-01-151-62/+41
| | | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3382>
* radv/gfx10: add support for NGG passthrough modeSamuel Pitoiset2020-01-131-9/+13
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: do not declare LDS for NGG if uselessSamuel Pitoiset2020-01-131-6/+9
| | | | | | | Only needed for NGG without passthrough mode or for NGG streamout. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: unify primitive export codeMarek Olšák2020-01-081-54/+6
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: unify build_sendmsg_gs_alloc_reqMarek Olšák2020-01-081-24/+4
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: declare an enum for the OOB select field on GFX10Samuel Pitoiset2019-12-191-1/+1
| | | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
* radv/gfx10: fix ngg_get_ordered_idSamuel Pitoiset2019-12-171-1/+1
| | | | | | | | | Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3133>
* radv: handle unaligned vertex fetches on GFX6/GFX10Samuel Pitoiset2019-12-131-47/+86
| | | | | | | | | | | | | | | | | | | | The Vulkan spec doesn't have any words for vertex attributes alignment. Fixes a test failure on GFX6 and a GPU hang on GFX10 with: dEQP-VK.spirv_assembly.instruction.spirv1p4.entrypoint.tess_con_pc_entry_point vkpipeline-db results on GFX10: Totals from affected shaders: SGPRS: 463772 -> 472972 (1.98 %) VGPRS: 343208 -> 343752 (0.16 %) Spilled SGPRs: 323 -> 336 (4.02 %) Spilled VGPRs: 0 -> 0 (0.00 %) Code Size: 13806200 -> 14164472 (2.60 %) bytes Max Waves: 84021 -> 83755 (-0.32 %) Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2161 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: fix the vertex order for triangle strips emitted by a GSSamuel Pitoiset2019-12-041-48/+47
| | | | | | | | | My fix wasn't totally correct as pointed out by Marek. Ported from RadeonSI. Fixes: deafe4cc587 ("radv/gfx10: fix primitive indices orientation for NGG GS") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: simplify a check in radv_fixup_vertex_input_fetches()Samuel Pitoiset2019-12-041-4/+2
| | | | | | | The number of loaded channels should always be > 0 now. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: set swizzled bit in cache policy as a hint not to merge loads/storesMarek Olšák2019-11-251-10/+10
| | | | | | LLVM now merges loads and stores for all opcodes, so this must be set. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Replace supports_spill with explict_scratch_argsConnor Abbott2019-11-251-9/+5
| | | | | | | | | | | The former was always true and hence dead code. We will want to explicitly declare the ring offset register with ACO, but we also want to declare the scratch offset too, and we can't try to disable it since ACO also supports spilling and the determination of whether spilling has to happen occurs well after setting up registers. So replace supports_spill with something that will actually be used for ACO. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Move argument declaration out of nir_to_llvmConnor Abbott2019-11-251-774/+51
| | | | | | Now it's executed for ACO too. Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir, radv, radeonsi: Switch to using ac_shader_argsConnor Abbott2019-11-251-706/+701
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Acked-by: Marek Olšák <[email protected]>
* radv: Rename ac_arg_regfileConnor Abbott2019-11-251-2/+2
| | | | | | | | We'll duplicate this in a header file in the next commit, and then remove the original enum. Just rename it temporarily so that things keep building. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: remove useless RADV_DEBUG=unsafemath debug optionSamuel Pitoiset2019-11-151-25/+1
| | | | | | | This option is useless and shouldn't be used at all. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix radv_nir_get_max_workgroup_size when nir=NULLRhys Perry2019-11-111-1/+4
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Fixes: 84a1a2578 ('compiler: pack shader_info from 160 bytes to 96 bytes') Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/gfx10: fix primitive indices orientation for NGG GSSamuel Pitoiset2019-11-071-7/+45
| | | | | | | | | | The primitive indices have to be swapped to follow the drawing order. This fixes corruption with Overwatch when NGG GS is force enabled. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: determine shaders wavesize at pipeline levelSamuel Pitoiset2019-11-061-2/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: hardcode the number of waves for the GFX6 LS-HS bugSamuel Pitoiset2019-11-061-1/+1
| | | | | | | It's always 64. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: declare NGG scratch for VS or TES and only on GFX10Samuel Pitoiset2019-10-311-5/+3
| | | | | | | | Do not need to declare it for other stages because this is for streamout. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement VK_KHR_shader_float_controlsSamuel Pitoiset2019-10-181-3/+7
| | | | | | | | | | | | | | | | This exposes what's required for DX and this is what we already configure. The driver flushes denorms for FP32 and preserves them for FP16/FP64. Note that we can't allow both preserving and flushing denorms because this won't work for merged shaders. This will require LLVM to update the float mode register to make it work. Only enabled on GFX8+ with the LLVM path because it's untested on previous chips and ACO doesn't support it. This extension is required for SPIRV 1.4. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: fix NGG streamout with triangle strips for VSSamuel Pitoiset2019-10-021-1/+5
| | | | | | | | | | The number of vertices has to be adjusted with the output primitive type. This fixes dEQP-VK.transform_feedback.simple.triangle_strip_*. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: fix storing/loading NGG stream outputs for GSSamuel Pitoiset2019-10-021-12/+77
| | | | | | | | | | | The GS outputs are stored differently in the LDS storage, they are indexed by out_idx which is incremented for each stored DWORD. Thus, we need a different path for exporting the stream outputs. This fixes a bunch of CTS failures when NGG GS is force enabled. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: use the component mask when storing/loading NGG stream outputsSamuel Pitoiset2019-10-021-0/+6
| | | | | | | It's unnecessary to store/load more components that needed. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: fix storing/loading NGG stream outputs for VS and TESSamuel Pitoiset2019-10-021-8/+10
| | | | | | | | | | | | | The LDS storage allocated for stream outputs is 4 * N, where N is the number of outputs. So, we have to store/load with N as index and not with the output location as index. This doesn't fix anything known but it should fix out-of-bounds access and it also reduces the number of outputs written to the LDS storage. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* aco,radv: rename record_llvm_ir/llvm_ir_string to record_ir/ir_stringRhys Perry2019-09-261-1/+1
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: never kill a NGG GS shaderRhys Perry2019-09-181-1/+3
| | | | | | | | Seems to fix a hang with excessive vertex emissions when NGG is used for GS. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>