path: root/src/amd/vulkan/radv_shader.h
Commit message | Author | Age | Files | Lines
* radv: merge radv_shader_variant_info into radv_shader_info | Samuel Pitoiset | 2019-09-06 | 1 | -82/+64
  Having two different structs is useless.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/radeonsi: Don't count read-only data when reporting code size | Connor Abbott | 2019-09-05 | 1 | -0/+1
  We usually use these counts as a simple way to figure out if a change reduces the number of instructions or shrinks an instruction. However, since .rodata sections aren't executed, we shouldn't be counting their size for this analysis. Make the linker return the total executable size, and use it to report the more useful size in both drivers.
  Reviewed-by: Marek Olšák <[email protected]>
* radv: move lowering PS inputs/outputs at the right place | Samuel Pitoiset | 2019-08-30 | 1 | -0/+3
  At shaders creation, just after NIR linking.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: gather info about PS inputs in the shader info pass | Samuel Pitoiset | 2019-08-30 | 1 | -4/+4
  It's the right place to do that.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: make use of has_ls_vgpr_init_bug | Samuel Pitoiset | 2019-08-27 | 1 | -0/+1
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Keep shader info when needed. | Bas Nieuwenhuizen | 2019-08-12 | 1 | -2/+4
  This allows enabling the shader info keeping on a per shader basis. Also disables the cache on a per shader basis.
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Use string for nir dumping. | Bas Nieuwenhuizen | 2019-08-12 | 1 | -1/+1
  Reviewed-by: Dave Airlie <[email protected]>
  Allows us to easily dump all nir shaders for combined variants in vega and simplifies ownership.
* radv: Get max workgroup size without nir. | Bas Nieuwenhuizen | 2019-08-12 | 1 | -0/+5
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Add utility function to calculate max waves. | Bas Nieuwenhuizen | 2019-08-12 | 1 | -0/+6
  Not AC because a lot of it is data extraction out of radv structs.
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Put wave size in shader options/info. | Bas Nieuwenhuizen | 2019-08-12 | 1 | -3/+2
  Instead of having the three values everywhere. This is also more future proof if we want the driver to make those decisions eventually.
  Reviewed-by: Dave Airlie <[email protected]>
* ac/nir,radv: Optimize bounds check for 64 bit CAS. | Bas Nieuwenhuizen | 2019-08-02 | 1 | -0/+1
  When the application does not ask for robust buffer access. Only implemented the check in radv.
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders | Samuel Pitoiset | 2019-08-02 | 1 | -0/+1
  It can be enabled with RADV_PERFTEST=gewave32.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: add Wave32 support for fragment shaders | Samuel Pitoiset | 2019-08-02 | 1 | -0/+1
  It can be enabled with RADV_PERFTEST=pswave32.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: add Wave32 support for compute shaders | Samuel Pitoiset | 2019-07-31 | 1 | -0/+1
  It can be enabled with RADV_PERFTEST=cswave32.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Don't include radv_private.h from radv_shader.h | Daniel Schürmann | 2019-07-30 | 1 | -56/+13
  This patch decouples radv_shader.h from any LLVM dependency.
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: implement VK_EXT_post_depth_coverage | Samuel Pitoiset | 2019-07-17 | 1 | -0/+1
  I did implement this extension a while ago but it didn't work on pre GFX10 for some reasons. Now all CTS pass.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: tidy up radv_get_shader_name() and add NGG stages | Samuel Pitoiset | 2019-07-12 | 1 | -1/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Add a common member in the union to make things more clear. | Bas Nieuwenhuizen | 2019-07-09 | 1 | -0/+3
  This clarifies that the struct can be used when the shader can be one of VS/TES.
  Reviewed-by: Samuel Pitoiset <[email protected]>
* Revert "radv: keep track of whether NGG is used for GS on GFX10" | Bas Nieuwenhuizen | 2019-07-09 | 1 | -6/+0
  This reverts commit 63e0675d986744a9ed2d9a15b7cba84ff4a24fc2.
  The GS is merged with the preceding shader and since the preceding shader will have as_ngg set the final binary will have is_ngg set. So we do not need the gs key here.
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: keep track of whether NGG is used for GS on GFX10 | Samuel Pitoiset | 2019-07-09 | 1 | -0/+6
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Lower input attachments in NIR. | Daniel Schürmann | 2019-07-08 | 1 | -1/+0
  v2 (Connor) - Fix warning in release mode using MAYBE_UNUSED
  Reviewed-by: Connor Abbott <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: implement NGG support (VS only) | Samuel Pitoiset | 2019-07-07 | 1 | -0/+2
  This needs to be cleaned up a bit, and it probably contains missing stuff and/or bugs.
  This doesn't fix the "half of the triangles" issue.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Combine vs and tes output keys parts. | Bas Nieuwenhuizen | 2019-07-07 | 1 | -10/+12
  That way the same deref is valid for both shader stages.
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add the concept of radv shader binaries. | Bas Nieuwenhuizen | 2019-07-04 | 1 | -8/+45
  This simplifies a bunch of stuff by
  (1) Keeping all the things in a single allocation, making things easier for the cache.
  (2) creating a shader_variant creation helper.
  This is immediately put to use by creating rtld shader binaries. This is the main reason for the binaries, as we need to do the linking at upload time, i.e. post caching.
  We do not enable rtld yet.
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add export_prim_id to the shader variant info. | Bas Nieuwenhuizen | 2019-07-04 | 1 | -0/+2
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Merge rsrc1/rsrc2 fields with the config fields. | Bas Nieuwenhuizen | 2019-07-04 | 1 | -2/+0
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: rework how the number of VGPRs is computed | Samuel Pitoiset | 2019-07-01 | 1 | -1/+0
  Just a cleanup, it shouldn't change anything.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only export clip/cull distances if PS reads them | Samuel Pitoiset | 2019-06-27 | 1 | -0/+2
  The only exception is the GS copy shader which emits them unconditionally.
  Totals from affected shaders:
    SGPRS: 71320 -> 71008 (-0.44 %)
    VGPRS: 54372 -> 54240 (-0.24 %)
    Code Size: 2952628 -> 2941368 (-0.38 %) bytes
    Max Waves: 9689 -> 9723 (0.35 %)
  This helps Dota2, Doom, GTAV and Hitman 2.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Run the new ycbcr lowering pass. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -1/+2
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add ycbcr lowering pass. | Bas Nieuwenhuizen | 2019-04-25 | 1 | -0/+3
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: store more vertex attribute infos as pipeline keys | Samuel Pitoiset | 2019-03-13 | 1 | -0/+6
  They are required for using typed buffer loads.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Fix float16 interpolation set up. | Bas Nieuwenhuizen | 2019-02-22 | 1 | -0/+1
  float16 types can have non-flat interpolation so set up the HW correctly for that.
  Fixes: 62024fa7750 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Handle clip+cull distances more generally as compact arrays. | Bas Nieuwenhuizen | 2019-02-20 | 1 | -0/+2
  Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 .
  That MR keeps the clip and cull arrays split. So we have to handle
    - compact arrays with location_frac != 0
    - VARYING_SLOT_CLIP_DIST1
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: store vertex attribute formats as pipeline keys | Samuel Pitoiset | 2019-02-14 | 1 | -0/+1
  The formats will be used for reducing the number of loaded channels.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for push constants inlining when possible | Samuel Pitoiset | 2019-02-12 | 1 | -4/+7
  This removes some scalar loads from shaders, but it increases the number of SET_SH_REG packets. This is currently basic but it could be improved if needed. Inlining dynamic offsets might also help.
  Original idea from Dave Airlie.
  29077 shaders in 15096 tests
  Totals:
    SGPRS: 1321325 -> 1357101 (2.71 %)
    VGPRS: 936000 -> 932576 (-0.37 %)
    Spilled SGPRs: 24804 -> 24791 (-0.05 %)
    Code Size: 49827960 -> 49642232 (-0.37 %) bytes
    Max Waves: 242007 -> 242700 (0.29 %)
  Totals from affected shaders:
    SGPRS: 290989 -> 326765 (12.29 %)
    VGPRS: 244680 -> 241256 (-1.40 %)
    Spilled SGPRs: 1442 -> 1429 (-0.90 %)
    Code Size: 8126688 -> 7940960 (-2.29 %) bytes
    Max Waves: 80952 -> 81645 (0.86 %)
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: gather if shaders load dynamic offsets separately | Samuel Pitoiset | 2019-02-12 | 1 | -0/+1
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: gather more info about push constants | Samuel Pitoiset | 2019-02-12 | 1 | -0/+4
  This is needed in order to inline some push constants when possible. This also adds a new helper for initializing the pass.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/radv/radeonsi: add ac_get_num_physical_sgprs() helper | Timothy Arceri | 2019-02-01 | 1 | -6/+0
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: remove radv_userdata_info::indirect field | Samuel Pitoiset | 2019-01-28 | 1 | -1/+0
  Always false.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: declare streamout SGPRs | Samuel Pitoiset | 2018-10-29 | 1 | -1/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: gather stream output info | Samuel Pitoiset | 2018-10-29 | 1 | -0/+18
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: gather which GS stream is used for every outputs | Samuel Pitoiset | 2018-10-29 | 1 | -0/+1
  To only emit outputs for the given stream.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: gather the number of output components per stream | Samuel Pitoiset | 2018-10-29 | 1 | -0/+1
  This will be also used for splitting the GS->VS ring buffer.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: gather the number of streams used by geometry shaders | Samuel Pitoiset | 2018-10-29 | 1 | -0/+1
  This will be used for splitting the GS->VS ring buffer. The stream ID is always 0 for now.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: use nir_opt_find_array_copies() | Timothy Arceri | 2018-10-18 | 1 | -1/+2
  Totals from affected shaders:
    SGPRS: 1112 -> 1112 (0.00 %)
    VGPRS: 1492 -> 1196 (-19.84 %)
    Spilled SGPRs: 0 -> 0 (0.00 %)
    Spilled VGPRs: 0 -> 0 (0.00 %)
    Private memory VGPRs: 0 -> 0 (0.00 %)
    Scratch size: 0 -> 0 (0.00 %) dwords per thread
    Code Size: 112172 -> 101316 (-9.68 %) bytes
    LDS: 0 -> 0 (0.00 %) blocks
    Max Waves: 93 -> 98 (5.38 %)
    Wait states: 0 -> 0 (0.00 %)
  All affected shaders are from "Batman: Arkham City" over DXVK. The pass detects that the temporary array created by DXVK for storing TCS inputs is a copy of the input arrays and allows us to avoid copying all of the input data and then indirecting on it with if-ladders, instead we just do indirect indexing.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: handle loc->indirect correctly for the first descriptor | Samuel Pitoiset | 2018-09-14 | 1 | -1/+0
  This was wrong for descriptor #0 when all of them are indirect. This is because indirect_offset was 0 and we emitted a "normal" descriptor pointer for nothing.
  While we are at it remove radv_userdata_info::indirect_offset which is useless.
  CC: 18.2 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix passing clip/cull distances from VS to PS | Samuel Pitoiset | 2018-08-31 | 1 | -0/+1
  CTS doesn't test input clip/cull distances for the fragment shader stage, which explains why this was totally broken. I wrote a simple test locally that works now.
  This fixes a crash with GTA V and DXVK.
  Note that we are exporting unused parameters from the vertex shader now, but this can't be optimized easily because we don't keep the fragment shader info...
  Cc: [email protected]
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107477
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: reduce CPU overhead in radv_flush_descriptors() | Samuel Pitoiset | 2018-07-09 | 1 | -0/+1
  The number of enabled descriptors for a given pipeline stage can be computed at compile time.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not use an user SGPR for the sample position offset | Samuel Pitoiset | 2018-06-20 | 1 | -1/+0
  We know the number of samples at compile time.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't store the number of samples as log2 | Samuel Pitoiset | 2018-06-20 | 1 | -1/+1
  Needed for the following patch.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>