path: root/src/amd/vulkan
Commit message | Author | Age | Files | Lines
* radv: Add startup debug option. | Bas Nieuwenhuizen | 2018-05-31 | 4 | -2/+50
    This adds a RADV_DEBUG=startup option to dump more info about instance creation and device enumeration.
    A common question end users have is why the driver is not loading for them, and this has two common reasons:
    1) They did not install the driver.
    2) AMDGPU is not used for the card in the kernel.
    This adds some info messages so we can easily get some useful output from end users.
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
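As a rough illustration of how a debug option such as `startup` could be looked up, here is a minimal sketch that assumes the variable holds a comma-separated list of option names. The helper name `debug_option_enabled` is hypothetical and this is not the driver's actual parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if `option` appears as a whole entry in the comma-separated
 * `list`. Hypothetical helper for illustration only. */
static bool debug_option_enabled(const char *list, const char *option)
{
    assert(option != NULL);
    if (list == NULL)
        return false;

    size_t option_len = strlen(option);
    const char *entry = list;
    while (*entry != '\0') {
        /* Length of the current entry, up to the next comma or the end. */
        size_t entry_len = strcspn(entry, ",");
        if (entry_len == option_len && strncmp(entry, option, entry_len) == 0)
            return true;
        entry += entry_len;
        if (*entry == ',')
            entry++;
    }
    return false;
}
```

A caller might pass `getenv("RADV_DEBUG")` as the list; an unset variable (NULL) is treated as no options enabled.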
* radv: Add option to print errors even in optimized builds. | Bas Nieuwenhuizen | 2018-05-31 | 14 | -96/+108
    Errors are not that common of a case so we can eat a slight perf hit in having to call a function and do a runtime check.
    In turn this makes debugging random errors happening for end users easier, because they don't have to have a debug build on hand.
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
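The trade-off described above (a function call plus a runtime check instead of a debug-only macro) can be sketched as follows. The names `debug_print_errors`, `report_error` and `REPORT_ERROR` are hypothetical and only illustrate the pattern, not the driver's real helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical runtime flag; a real driver would set it once from its
 * debug options at instance creation. */
static bool debug_print_errors = false;

/* Hypothetical error helper. It is compiled into every build, so the only
 * cost on the error path is the call and the flag check. The error code is
 * returned unchanged so the macro can wrap a return statement. */
static int report_error(int error, const char *file, int line)
{
    if (debug_print_errors)
        fprintf(stderr, "%s:%d: error %d\n", file, line, error);
    return error;
}

#define REPORT_ERROR(err) report_error((err), __FILE__, __LINE__)
```

Because the check happens at runtime, an end user can turn the messages on without rebuilding.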
* radv: Make the sem_info allocate/free functions static. | Bas Nieuwenhuizen | 2018-05-31 | 2 | -15/+9
    They are only used in 1 file.
    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Only expose subgroup shuffles on VI+. | Bas Nieuwenhuizen | 2018-05-30 | 1 | -2/+5
    The current implementation depends on bpermute, which is VI+.
    Fixes: f2c6a550611 "radv: enable subgroup capabilities"
    Reviewed-by: Daniel Schürmann <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: fix emitting descriptor pointers with LLVM < 7 | Samuel Pitoiset | 2018-05-30 | 1 | -2/+4
    This was terribly wrong: I forced the use of 32-bit pointers when emitting shader descriptor pointers.
    This fixes GPU hangs with LLVM 5 and 6, because 32-bit pointers are only supported with LLVM 7.
    Fixes: 88d1ed0f81 ("radv: emit shader descriptor pointers consecutively")
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: emit shader descriptor pointers consecutively | Samuel Pitoiset | 2018-05-29 | 1 | -47/+57
    This reduces the number of SET_SH_REG packets which are emitted for applications that use more than one descriptor set per stage.
    We should be able to emit more SET_SH_REG packets consecutively (like push constants and vertex buffers for the vertex stage), but this will be improved later.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
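To see why consecutive emission saves packets, here is a minimal sketch that counts how many packets a list of register writes would need, under the assumption that one packet can cover any run of consecutive register offsets. The function name and the register offsets are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count the packets needed to write the given register offsets (in dwords,
 * sorted ascending), assuming one packet can write a whole run of
 * consecutive registers. */
static unsigned count_set_reg_packets(const uint32_t *regs, size_t count)
{
    assert(regs != NULL || count == 0);

    unsigned packets = 0;
    for (size_t i = 0; i < count; i++) {
        /* A new packet starts at the first register and after every gap. */
        if (i == 0 || regs[i] != regs[i - 1] + 1)
            packets++;
    }
    return packets;
}
```

Four descriptor-set pointers placed in adjacent registers need one packet in this model, while the same four pointers scattered across the register file need four.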
* radv: allow radv_emit_shader_pointer_head() to emit more pointers | Samuel Pitoiset | 2018-05-29 | 1 | -3/+5
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: split radv_emit_shader_pointer() | Samuel Pitoiset | 2018-05-29 | 1 | -5/+20
    This will allow emitting consecutive shader pointers to reduce the number of emitted SET_SH_REG packets, which is recommended.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Implement VK_KHR_draw_indirect_count. | Bas Nieuwenhuizen | 2018-05-28 | 2 | -0/+50
    Literally the same as the AMD ext.
    Passes *indirect_draw_count* CTS tests.
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Implement alternate GFX9 scissor workaround. | Bas Nieuwenhuizen | 2018-05-28 | 1 | -33/+47
    This improves dota2 performance for me by 11% when I force the GPU DPM level to low (otherwise dota2 is CPU limited for 4k on my threadripper), which should be a large part of the radv-amdvlk gap. (For me, with that, radv went 60.3 -> 66.6, while AMDVLK does about 68 fps.)
    It looks like dota2 rendered the GUI with a bunch of draws with a SetScissors before almost each draw, causing a lot of pipeline stalls.
    I'm not really happy with the duplication of code, but overriding radeon_set_context_reg would also be messy since we have the pre-recorded pipelines and a bunch of si_cmd_buffer code, as well as some memory->context reg loads for which things would be more complicated.
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: run the EarlyCSEMemSSA LLVM pass | Samuel Pitoiset | 2018-05-25 | 1 | -0/+2
    It's recommended by the instruction combining pass, and RadeonSI also runs it.
    This pass used to segfault with one shader of F12017 in the past, but it no longer crashes. Maybe the LLVM IR generated by RADV has changed.
    Polaris10:
    Totals from affected shaders:
    SGPRS: 441352 -> 441648 (0.07 %)
    VGPRS: 310888 -> 300784 (-3.25 %)
    Spilled SGPRs: 13576 -> 12983 (-4.37 %)
    Code Size: 22560328 -> 22420544 (-0.62 %) bytes
    Max Waves: 40755 -> 41366 (1.50 %)
    Vega10:
    Totals from affected shaders:
    SGPRS: 442848 -> 442000 (-0.19 %)
    VGPRS: 310396 -> 300460 (-3.20 %)
    Spilled SGPRs: 13708 -> 12906 (-5.85 %)
    Code Size: 22479428 -> 22336216 (-0.64 %) bytes
    Max Waves: 45783 -> 46506 (1.58 %)
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix dumping compute shader on the graphics queue | Samuel Pitoiset | 2018-05-25 | 1 | -5/+8
    The graphics pipeline can be NULL.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_dump_pipeline_state() helper | Samuel Pitoiset | 2018-05-25 | 1 | -6/+11
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: rework how shaders are dumped when generating a hang report | Samuel Pitoiset | 2018-05-25 | 1 | -26/+15
    Use a flag for the active stages instead.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unused parameter in radv_dump_annotated_shader() | Samuel Pitoiset | 2018-05-25 | 1 | -8/+6
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS | Samuel Pitoiset | 2018-05-24 | 1 | -0/+10
    Do not lower FS inputs because this moves all load_var instructions to the beginning of shaders and because interp_var_at_sample (and friends) seem broken. That might eventually be enabled later on if we really want to preload all FS inputs at the beginning.
    Polaris10:
    Totals from affected shaders:
    SGPRS: 54072 -> 54264 (0.36 %)
    VGPRS: 38580 -> 38124 (-1.18 %)
    Spilled SGPRs: 652 -> 652 (0.00 %)
    Spilled VGPRs: 0 -> 0 (0.00 %)
    Code Size: 2128116 -> 2127380 (-0.03 %) bytes
    Max Waves: 8048 -> 8086 (0.47 %)
    Vega10:
    Totals from affected shaders:
    SGPRS: 52616 -> 52656 (0.08 %)
    VGPRS: 37536 -> 37116 (-1.12 %)
    Spilled SGPRs: 828 -> 828 (0.00 %)
    Code Size: 2043756 -> 2042672 (-0.05 %) bytes
    Max Waves: 9176 -> 9254 (0.85 %)
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: call nir_split_var_copies() before nir_lower_var_copies() | Samuel Pitoiset | 2018-05-24 | 1 | -0/+3
    This does nothing special currently because we don't create any copy_var instructions, but this is needed for the next patch.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix computation of user sgprs for 32-bit pointers | Samuel Pitoiset | 2018-05-22 | 1 | -1/+3
    With 32-bit pointers we only need one user SGPR per desc set.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
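The arithmetic behind this fix is small enough to sketch: an SGPR is 32 bits wide, so a 64-bit pointer occupies two of them and a 32-bit pointer occupies one. The function name below is hypothetical and the real allocation code accounts for more than descriptor sets.

```c
#include <assert.h>
#include <stdbool.h>

/* Number of user SGPRs needed to hold one pointer per descriptor set.
 * Each SGPR is 32 bits, so a 64-bit pointer takes two and a 32-bit
 * pointer takes one. */
static unsigned desc_set_user_sgprs(unsigned num_sets, bool use_32bit_pointers)
{
    unsigned sgprs_per_pointer = use_32bit_pointers ? 1 : 2;
    return num_sets * sgprs_per_pointer;
}
```

For four descriptor sets this gives eight SGPRs with 64-bit pointers and four with 32-bit pointers.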
* radv: drop user_sgpr_info::sgpr_count | Samuel Pitoiset | 2018-05-22 | 1 | -13/+11
    It's only used inside allocate_user_sgprs().
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for 32-bit pointers in user data SGPRs | Samuel Pitoiset | 2018-05-22 | 4 | -21/+40
    We still use 64-bit GPU pointers for all ring buffers because llvm.amdgcn.implicit.buffer.ptr doesn't seem to support 32-bit GPU pointers for now. This can be improved later anyways.
    Vega10:
    Totals from affected shaders:
    SGPRS: 1008722 -> 1026710 (1.78 %)
    VGPRS: 706580 -> 707136 (0.08 %)
    Spilled SGPRs: 22555 -> 22209 (-1.53 %)
    Spilled VGPRs: 75 -> 75 (0.00 %)
    Code Size: 34819208 -> 35202140 (1.10 %) bytes
    Max Waves: 175423 -> 175086 (-0.19 %)
    Polaris10:
    Totals from affected shaders:
    SGPRS: 1029849 -> 1036517 (0.65 %)
    VGPRS: 709984 -> 708872 (-0.16 %)
    Spilled SGPRs: 22672 -> 22309 (-1.60 %)
    Spilled VGPRs: 82 -> 66 (-19.51 %)
    Scratch size: 76 -> 60 (-21.05 %) dwords per thread
    Code Size: 34915336 -> 35309752 (1.13 %) bytes
    Max Waves: 151221 -> 151677 (0.30 %)
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add set_loc_shader_ptr() helper | Samuel Pitoiset | 2018-05-22 | 1 | -7/+13
    This helper will help for switching to 32-bit GPU pointers.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allocate descriptor BOs in the 32-bit addr space | Samuel Pitoiset | 2018-05-22 | 1 | -1/+2
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allocate the upload BO in the 32-bit addr space | Samuel Pitoiset | 2018-05-22 | 1 | -1/+2
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set amdgpu-32bit-address-high-bits LLVM attribute | Samuel Pitoiset | 2018-05-22 | 3 | -0/+8
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
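The attribute name suggests the general idea behind 32-bit GPU pointers: only the low 32 bits of an address are passed around, and a fixed set of high bits is supplied separately. The sketch below illustrates that idea under the assumption that every buffer reachable through a 32-bit pointer shares the same upper 32 address bits; the function names and the example value of the high bits are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Rebuild a full 64-bit address from a 32-bit pointer and the fixed upper
 * 32 bits shared by the whole 32-bit address range. */
static uint64_t expand_32bit_pointer(uint32_t low, uint32_t high_bits)
{
    return ((uint64_t)high_bits << 32) | low;
}

/* A buffer can be referenced by a 32-bit pointer only if its address has
 * exactly the expected upper 32 bits. */
static bool fits_32bit_space(uint64_t va, uint32_t high_bits)
{
    return (uint32_t)(va >> 32) == high_bits;
}
```

This is also why the buffers addressed this way have to be allocated from a dedicated range: an address with different high bits cannot be represented.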
* radv/winsys: allow to allocate BOs in the 32-bit addr space | Samuel Pitoiset | 2018-05-22 | 2 | -1/+3
    This introduces a new flag called RADEON_FLAG_32BIT.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/winsys: request high address | Samuel Pitoiset | 2018-05-22 | 1 | -4/+6
    This is needed for 32-bit GPU pointers.
    Ported from RadeonSI.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix centroid interpolation | Samuel Pitoiset | 2018-05-21 | 1 | -3/+0
    It's legal to set the centroid and sample interpolation modes when MSAA is disabled. So, we have to initialize the centroid inputs because the hardware doesn't.
    This fixes rendering issues with DXVK and The Witness, World of Warcraft, Trackmania and probably more games.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106315
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102390
    CC: 18.0 18.1 <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Cleanup unused prime blit path. | Bas Nieuwenhuizen | 2018-05-21 | 2 | -25/+0
    Since we have the common WSI code, we use vkCmdCopyImageToBuffer instead.
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* radv: Fix SRGB compute copies. | Bas Nieuwenhuizen | 2018-05-21 | 2 | -0/+42
    SRGB stores are broken. We had compensation code in the resolve path but none in the copy path. Since we don't want any conversion and it does not matter for DCC, just make everything UNORM instead.
    This happened to cause wrong colors for the PRIME path, as that uses image->buffer copies which always use the compute path.
    CC: 18.0 18.1 <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106587
    Reviewed-by: Dave Airlie <[email protected]>
* radv: fix VK_EXT_descriptor_indexing | Christoph Haag | 2018-05-20 | 1 | -1/+1
    GetPhysicalDeviceProperties2KHR() was crashing because features was NULL.
    Fixes: 0e10790558b "radv: Enable VK_EXT_descriptor_indexing."
    CC: 18.1 <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: pass radv_nir_compiler_options directly to create_llvm_function() | Samuel Pitoiset | 2018-05-18 | 1 | -4/+3
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* radv: add radv_emit_shader_pointer() helper | Samuel Pitoiset | 2018-05-17 | 3 | -13/+18
    For future work (support for 32-bit GPU pointers).
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add some helpers for cleaning up radv_get_preamble_cs() | Samuel Pitoiset | 2018-05-17 | 1 | -86/+128
    Because this function looks a bit ugly to me.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd: remove support for LLVM 4.0 | Marek Olšák | 2018-05-17 | 4 | -19/+7
    It doesn't support GFX9.
    Acked-by: Dave Airlie <[email protected]>
    Acked-by: Samuel Pitoiset <[email protected]>
* radv: only declare the ESGS rings for pre GFX9 chips | Samuel Pitoiset | 2018-05-17 | 1 | -4/+10
    GFX9 uses LDS instead.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allow to print GPU info with RADV_DEBUG=info | Samuel Pitoiset | 2018-05-17 | 2 | -0/+5
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not emit unnecessary ES output stores | Samuel Pitoiset | 2018-05-17 | 1 | -3/+23
    GFX9:
    Totals from affected shaders:
    SGPRS: 472 -> 464 (-1.69 %)
    VGPRS: 576 -> 584 (1.39 %)
    Code Size: 45432 -> 44324 (-2.44 %) bytes
    Max Waves: 40 -> 40 (0.00 %)
    VI:
    SGPRS: 720 -> 720 (0.00 %)
    VGPRS: 728 -> 728 (0.00 %)
    Code Size: 45348 -> 43992 (-2.99 %) bytes
    Max Waves: 120 -> 120 (0.00 %)
    This affects Rise of Tomb Raider and the three Vulkan demos that use a geometry shader (geometryshader, deferredshadows and viewportarray).
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not emit unnecessary GS output stores | Samuel Pitoiset | 2018-05-17 | 1 | -0/+7
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only pass the global BO list at submit time if enabled | Samuel Pitoiset | 2018-05-17 | 1 | -2/+6
    That way the winsys might use a faster path when the global BO list is NULL.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove the radv_finishme() when compiling shaders | Samuel Pitoiset | 2018-05-17 | 1 | -4/+0
    Having an entrypoint other than "main" doesn't mean we have multiple shaders per module.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove radv_device::llvm_supports_spill | Samuel Pitoiset | 2018-05-17 | 3 | -7/+1
    It's always true.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add generated files to .gitignore(s) | Dieter Nützel | 2018-05-15 | 1 | -0/+1
    Signed-off-by: Dieter Nützel <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: reduce the number of parameters export by the GS copy shader | Samuel Pitoiset | 2018-05-14 | 1 | -4/+3
    This is done by using the geometry shader output usage mask.
    This improves all Vulkan demos that use a geometry shader (i.e. geometryshader, deferredshadows, viewportarray).
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: scan the geometry shader output usage mask | Samuel Pitoiset | 2018-05-14 | 2 | -0/+9
    For reducing the number of parameters that are exported by the GS copy shader.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: run the shader info pass before emitting the GS copy shader | Samuel Pitoiset | 2018-05-14 | 1 | -0/+2
    For further optimizations.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: check that layout isn't NULL in radv_nir_shader_info_pass() | Samuel Pitoiset | 2018-05-14 | 1 | -1/+1
    An upcoming patch will run the shader info pass on the geometry shader just before emitting the GS copy shader.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Disable texel buffers with A2 SNORM/SSCALED/SINT for pre-vega. | Bas Nieuwenhuizen | 2018-05-14 | 1 | -0/+19
    The hardware always interprets the alpha as unsigned, and fixing it in the shader would add unacceptable overhead.
    CC: 18.0 18.1 <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Fix up 2_10_10_10 alpha sign. | Bas Nieuwenhuizen | 2018-05-14 | 4 | -13/+98
    Pre-Vega HW always interprets the alpha for this format as unsigned, so we have to implement a fixup to do the sign correctly for signed formats.
    v2: Improve indexing mess.
    CC: 18.0 18.1 <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106480
    Reviewed-by: Samuel Pitoiset <[email protected]>
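The sign problem can be shown on the CPU with plain integers. The alpha channel of a 2_10_10_10 format has two bits, so an unsigned read yields 0..3, while the signed interpretation of the same bits is 0, 1, -2, -1. The sketch below is only an illustration of that reinterpretation and of the usual SNORM clamp to -1.0; the function names are hypothetical and this is not the shader code the driver generates.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a 2-bit field that was read as unsigned (0..3) as a two's
 * complement signed value (-2..1). */
static int32_t sign_extend_alpha2(uint32_t raw)
{
    assert(raw < 4);
    return raw >= 2 ? (int32_t)raw - 4 : (int32_t)raw;
}

/* SNORM conversion for a 2-bit channel: divide by 2^(2-1) - 1 = 1 and
 * clamp the result to -1.0, so both -2 and -1 map to -1.0. */
static float alpha2_snorm(uint32_t raw)
{
    int32_t value = sign_extend_alpha2(raw);
    return value < -1 ? -1.0f : (float)value;
}
```

Without the fixup, the raw patterns 2 and 3 would be treated as positive values instead of -2 and -1.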
* radv: Add support for IMG_DATA_FORMAT_32_32_32. | Bas Nieuwenhuizen | 2018-05-14 | 2 | -4/+7
    Basic sampling support for linear tiling. No CTS regressions, but it seems the blitting coverage is not very extensive.
    https://bugs.freedesktop.org/show_bug.cgi?id=106331
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Translate logic ops. | Bas Nieuwenhuizen | 2018-05-14 | 1 | -2/+43
    radeonsi could pass them through but the enum changed between Gallium and Vulkan, so we have to translate.
    In the process I made the register defines a bit more readable.
    CC: 18.0 18.1 <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100430
    Reviewed-by: Samuel Pitoiset <[email protected]>
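A translation of this kind can be sketched as a switch from the API's logic-op ordering to an 8-bit truth-table code. The sketch assumes the hardware field uses ROP3-style codes in which the source pattern is 0xCC and the destination pattern is 0xAA, so each code is just the operation applied to those two constants. The `LOGIC_OP_*` enum is a local, hypothetical copy that follows the VkLogicOp ordering, and `translate_logic_op` is not the driver's function.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the logic-op ordering used by the Vulkan API (VkLogicOp);
 * the LOGIC_OP_* names are hypothetical. */
enum logic_op {
    LOGIC_OP_CLEAR, LOGIC_OP_AND, LOGIC_OP_AND_REVERSE, LOGIC_OP_COPY,
    LOGIC_OP_AND_INVERTED, LOGIC_OP_NO_OP, LOGIC_OP_XOR, LOGIC_OP_OR,
    LOGIC_OP_NOR, LOGIC_OP_EQUIVALENT, LOGIC_OP_INVERT, LOGIC_OP_OR_REVERSE,
    LOGIC_OP_COPY_INVERTED, LOGIC_OP_OR_INVERTED, LOGIC_OP_NAND, LOGIC_OP_SET
};

/* Compute the 8-bit truth-table code for a logic op, assuming the source
 * pattern is 0xCC and the destination pattern is 0xAA. */
static uint8_t translate_logic_op(enum logic_op op)
{
    const uint8_t s = 0xCC, d = 0xAA;

    switch (op) {
    case LOGIC_OP_CLEAR:         return 0x00;
    case LOGIC_OP_AND:           return (uint8_t)(s & d);
    case LOGIC_OP_AND_REVERSE:   return (uint8_t)(s & ~d);
    case LOGIC_OP_COPY:          return s;
    case LOGIC_OP_AND_INVERTED:  return (uint8_t)(~s & d);
    case LOGIC_OP_NO_OP:         return d;
    case LOGIC_OP_XOR:           return (uint8_t)(s ^ d);
    case LOGIC_OP_OR:            return (uint8_t)(s | d);
    case LOGIC_OP_NOR:           return (uint8_t)~(s | d);
    case LOGIC_OP_EQUIVALENT:    return (uint8_t)~(s ^ d);
    case LOGIC_OP_INVERT:        return (uint8_t)~d;
    case LOGIC_OP_OR_REVERSE:    return (uint8_t)(s | ~d);
    case LOGIC_OP_COPY_INVERTED: return (uint8_t)~s;
    case LOGIC_OP_OR_INVERTED:   return (uint8_t)(~s | d);
    case LOGIC_OP_NAND:          return (uint8_t)~(s & d);
    case LOGIC_OP_SET:           return 0xFF;
    }
    assert(!"unknown logic op");
    return 0;
}
```

The point of an explicit table is that the enum's numeric values are never used as register values, so a different enum ordering on the API side cannot silently select the wrong operation.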