path: root/src/amd/vulkan/radv_private.h
Commit message | Author | Age | Files | Lines
* radv: add some infrastructure for fresh forks for each secure compileTimothy Arceri2019-11-251-1/+14
| | | | | | | | | | | | | In the following commits we want to be able to fork an existing lightweight fork created at device creation time. In order for the user facing process to communicate with this new fresh fork we create some members here to hold FIFO file descriptors and a unique id. Here we also add a new fork enum that we use to tell the lightweight process to create a fresh fork. For more information on why we create a fresh fork see the following commits.
* radv: Do not change scratch settings while shaders are active.Bas Nieuwenhuizen2019-11-201-4/+8
| | | | | | | | | | | When the scratch ringbuffer settings are changed, the shader unit has to be idle or we will have shaders using old and new settings. That combination is not supported on the HW (likely the offset is ringbuffer idx * WAVESIZE * 1024). CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: implement VK_AMD_device_coherent_memorySamuel Pitoiset2019-11-181-0/+4
| | | | | | | | | | | This extension adds the device coherent and device uncached memory types. It's known to be slower than non-device coherent memory but it might be useful for debugging. This is only exposed for chips that support L2 uncached. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove useless RADV_DEBUG=unsafemath debug optionSamuel Pitoiset2019-11-151-6/+5
| | | | | | | This option is useless and shouldn't be used at all. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement VK_EXT_subgroup_size_controlSamuel Pitoiset2019-11-061-0/+5
| | | | | | | | | | | | | | | | This extension makes it possible to control the subgroup size by allowing a varying subgroup size and by specifying a required subgroup size. This implementation only allows specifying a required subgroup size for compute shaders because there are some caveats with other shader stages (e.g. NGG with geometry shaders). This basically allows apps to use Wave32 for compute shaders. This extension is enabled for all chips but only GFX10 supports Wave32. ACO doesn't support it. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Add wait-before-submit support for timelines.Bas Nieuwenhuizen2019-10-301-1/+15
| | | | | | | | | | | | | | This is actually a non-threaded implementation. I'd summarize this as event-based submission. When a submit happens we walk a tree of submissions that depend on the syncobj signal operations to be submitted, and if those submissions have no other dependencies we start executing them immediately. Or, well, I still use a list to avoid issues with long chains and the stack size when using recursion. Reviewed-by: Samuel Pitoiset <[email protected]>
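The event-based walk described above can be sketched with an explicit worklist. This is only an illustration under assumed names: `struct submission`, its fields, the fixed-size `dependents` array and `submit_ready()` are hypothetical and are not the radv structures. It shows how an explicit list replaces recursion, so a long dependency chain does not grow the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a deferred submission; not the radv data structures. */
struct submission {
    int pending_waits;                 /* dependencies that are not yet submitted */
    int executed;                      /* set once the submission has been started */
    struct submission *dependents[4];  /* submissions waiting on our signal */
    size_t num_dependents;
    struct submission *next;           /* link used only while on the worklist */
};

/* Start `first` and every submission that becomes ready as a result.
 * An explicit FIFO list is used instead of recursion.
 * Returns the number of submissions that were started. */
static int submit_ready(struct submission *first)
{
    assert(first->pending_waits == 0); /* caller only passes ready submissions */

    int started = 0;
    struct submission *head = first, *tail = first;
    first->next = NULL;

    while (head) {
        /* Pop the front of the worklist. */
        struct submission *s = head;
        head = s->next;
        if (!head)
            tail = NULL;

        s->executed = 1;
        started++;

        /* Our signal is now submitted: release dependents with no other waits. */
        for (size_t i = 0; i < s->num_dependents; i++) {
            struct submission *d = s->dependents[i];
            if (--d->pending_waits == 0) {
                d->next = NULL;
                if (tail)
                    tail->next = d;
                else
                    head = d;
                tail = d;
            }
        }
    }
    return started;
}
```

A submission that still has another outstanding wait stays off the list; in this sketch it would be picked up when its last dependency is processed in a later call.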
* radv: Add timelines with a VK_KHR_timeline_semaphore impl.Bas Nieuwenhuizen2019-10-301-0/+31
| | | | | | | | | | This does not fully do wait-before-submit, to be done in a follow up patch. For kernels without support for timeline syncobjs, this adds an implementation of non-shareable timelines using legacy syncobjs. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Split semaphore into two parts as enum+union.Bas Nieuwenhuizen2019-10-301-4/+17
| | | | | | This is in preparation for adding more types. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: add radv_sc_read() helperTimothy Arceri2019-10-301-0/+3
| | | | | | | | | | This is a function with timeout support for reading from the pipe between processes used for secure compile. Initially we hardcode the timeout to 5 seconds. We can adjust the timeout limit in future if needed. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
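A read with timeout of this kind is commonly built from poll() followed by read(). The following is a minimal sketch under that assumption, not the radv_sc_read() implementation: the name `read_with_timeout()` and its signature are hypothetical, and the timeout here applies to each wait rather than to the whole transfer.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read exactly `size` bytes from `fd`.
 * Returns false if no data arrives within `timeout_ms` during any single
 * wait, if the writer closes the pipe early, or on a read error. */
static bool read_with_timeout(int fd, void *buf, size_t size, int timeout_ms)
{
    assert(buf != NULL);
    char *p = buf;

    while (size > 0) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: wait again */
            return false;
        }
        if (ready == 0)
            return false;          /* timed out */

        ssize_t n = read(fd, p, size);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return false;
        }
        if (n == 0)
            return false;          /* end of file: writer closed the pipe */

        p += n;
        size -= (size_t)n;
    }
    return true;
}
```

With the 5 second limit mentioned above, a caller would pass 5000 as `timeout_ms`.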
* radv: fix empty-body instructionEric Engestrom2019-10-271-1/+1
| | | | | | | Fixes: 8d43e2b2ded0fe3c82d4 ("meson: add -Werror=empty-body to disallow `if(x);`") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radv: add radv_device_use_secure_compile() helperTimothy Arceri2019-10-261-0/+6
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add some new members to radv device and instance for secure compileTimothy Arceri2019-10-261-0/+21
| | | | | | | These will be used by the following commits to hold information about the forked secure compile processes. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_secure_compile_type enumTimothy Arceri2019-10-261-0/+11
| | | | | | | This will be used to identify information being passed between the parent and secure process during a secure compile. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix a performance regression with graphics depth/stencil clearsSamuel Pitoiset2019-10-231-0/+5
| | | | | | | | | | | | | | | | | | | | I recently changed the slow depth/stencil clear path to make sure depth values are explicitly exported by the fragment shader. This is actually only useful when VK_EXT_depth_range_unrestricted is enabled. While this path is correct, it introduced a performance regression with Heroes of the Storm, Shadow of Mordor (Vulkan beta) and probably more titles. This is because it prevents the hardware from doing some optimizations like discarding fragments. This commit re-introduces the previous (a bit faster) slow depth/stencil clear path and it selects the unrestricted path only if VK_EXT_depth_range_unrestricted is enabled. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/863 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not create meta pipelines with 16 samplesSamuel Pitoiset2019-10-231-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | The driver only supports up to 8 samples, so it's useless to create more pipelines than needed. This fixes a conditional jump reported by Valgrind on GFX10:
==194282== Conditional jump or move depends on uninitialised value(s)
==194282==    at 0xDBF925A: radv_gfx10_compute_bin_size (radv_pipeline.c:3242)
==194282==    by 0xDBF95A6: radv_pipeline_generate_binning_state (radv_pipeline.c:3334)
==194282==    by 0xDBFC1A0: radv_pipeline_generate_pm4 (radv_pipeline.c:4440)
==194282==    by 0xDBFD15E: radv_pipeline_init (radv_pipeline.c:4764)
==194282==    by 0xDBFD23E: radv_graphics_pipeline_create (radv_pipeline.c:4788)
==194282==    by 0xDBB95A3: create_pipeline (radv_meta_clear.c:114)
==194282==    by 0xDBB9AC5: create_color_pipeline (radv_meta_clear.c:297)
==194282==    by 0xDBBCF05: radv_device_init_meta_clear_state (radv_meta_clear.c:1277)
==194282==    by 0xDB9ACD9: radv_device_init_meta (radv_meta.c:363)
==194282==    by 0xDB7FE3A: radv_CreateDevice (radv_device.c:2080)
This is caused by an out-of-bounds access of 'fmask_array' (i.e. the index is 4 for 16 samples). Cc: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
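The index arithmetic behind this bug can be illustrated with a small sketch. It assumes a table with one entry per supported sample count (1, 2, 4 and 8), indexed by log2 of the sample count; the names `MAX_SAMPLES_LOG2`, `samples_log2()` and `lookup_by_samples()` are hypothetical and the table size is an assumption, not taken from the driver.

```c
#include <assert.h>

/* Assumption for this sketch: one entry each for 1, 2, 4 and 8 samples. */
#define MAX_SAMPLES_LOG2 4

/* log2 of a power-of-two sample count: 1 -> 0, 2 -> 1, 4 -> 2, 8 -> 3, 16 -> 4. */
static unsigned samples_log2(unsigned samples)
{
    unsigned idx = 0;
    while (samples > 1) {
        samples >>= 1;
        idx++;
    }
    return idx;
}

/* Look up a per-sample-count value, refusing indices past the end of the table. */
static int lookup_by_samples(const int table[MAX_SAMPLES_LOG2], unsigned samples)
{
    unsigned idx = samples_log2(samples);
    assert(idx < MAX_SAMPLES_LOG2); /* 16 samples would give index 4: out of bounds */
    return table[idx];
}
```

Creating pipelines only for sample counts up to 8 keeps every index inside such a table.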
* radv: fix DCC fast clear code for intensity formatsSamuel Pitoiset2019-10-141-0/+2
| | | | | | | | | | This fixes a rendering issue with DiRT 4 on GFX10. Only GFX10 was affected because intensity formats are different. Cc: 19.2 <[email protected]> Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1923 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Expose image handle compat types for Android handles.Bas Nieuwenhuizen2019-10-101-0/+1
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Allow Android image binding.Bas Nieuwenhuizen2019-10-101-0/+5
| | | | | | Using delayed layout of images. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/android: Add android hardware buffer import/export.Bas Nieuwenhuizen2019-10-101-0/+11
| | | | | | Support does not include images yet. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Deal with Android external formats.Bas Nieuwenhuizen2019-10-101-0/+3
| | | | | | | | To abstract things a bit, this adds a helper function in radv_android.c. However, this means we have to link in radv_android.c on non-android as well, which means some scaffolding changes. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Derive android usage from create flags.Bas Nieuwenhuizen2019-10-101-0/+4
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/android: Add android hardware buffer field to device memory.Bas Nieuwenhuizen2019-10-101-0/+4
| | | | | | | You cannot go from BO to Android hardware buffer, so for export we have to remember it. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add VK_ANDROID_external_memory_android_hardware_buffer.Bas Nieuwenhuizen2019-10-101-0/+13
| | | | | | Still disabled but now we can add entrypoints. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: use a compute shader for copying timestamp query resultsSamuel Pitoiset2019-10-101-0/+1
| | | | | | | | | | | | | | When the timestamp is not ready (i.e. UINT64_MAX), the availability bit should be zero. The previous code used to copy the timestamp value as the availability bit and that's completely wrong. Because it's not that simple to emit a conditional with the CP, the driver now uses a compute shader for copying timestamp query results. Fixes dEQP-VK.pipeline.timestamp.misc_tests.reset_query_before_copy. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
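The rule stated above can be modelled on the CPU in a few lines. This is a sketch of the logic only, not the compute shader: `copy_timestamp_result()` and the two-word destination layout are hypothetical, and leaving the value slot untouched for a timestamp that is not ready is an assumption of this sketch.

```c
#include <stdint.h>

/* A timestamp slot that has not been written yet holds UINT64_MAX. */
#define TIMESTAMP_NOT_READY UINT64_MAX

/* Hypothetical CPU-side model of copying one timestamp query result.
 * dst[0] receives the timestamp (only when it is ready),
 * dst[1] receives the availability word: 1 if ready, 0 otherwise. */
static void copy_timestamp_result(uint64_t timestamp, uint64_t dst[2])
{
    uint64_t available = (timestamp != TIMESTAMP_NOT_READY) ? 1 : 0;

    if (available)
        dst[0] = timestamp;

    /* The availability word is derived from the comparison; it is never a
     * copy of the timestamp value itself. */
    dst[1] = available;
}
```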
* radv/gfx10: fix NGG streamout with triangle strips for VSSamuel Pitoiset2019-10-021-0/+1
| | | | | | | | | | The number of vertices has to be adjusted with the output primitive type. This fixes dEQP-VK.transform_feedback.simple.triangle_strip_*. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: add radv_device::use_nggSamuel Pitoiset2019-10-021-0/+3
| | | | | | | Trivial. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/aco: Setup alternate path in RADV to support the experimental ACO compilerDaniel Schürmann2019-09-191-0/+4
| | | | | | | | | | LLVM remains default and ACO can be enabled with RADV_PERFTEST=aco. Co-authored-by: Daniel Schürmann <[email protected]> Co-authored-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: allocate GDS/OA buffer objects for NGG streamoutSamuel Pitoiset2019-09-161-0/+4
| | | | | | | This allocates two BOs for GFX10 NGG streamout. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: add an option to switch from legacy to NGG streamoutSamuel Pitoiset2019-09-161-0/+3
| | | | | | | | This internal option is turned off by default because NGG streamout still hangs. It seems to be related to GDS, as on RadeonSI. The option will be turned on once all issues are resolved. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: store engine nameLionel Landwerlin2019-09-151-0/+3
| | | | | | | | | | | We'll use this later for a new driconfig matching parameter. v2: Avoid leak in device creation error case (Bas) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Cc: 19.2 <[email protected]>
* radv: do not pass all compiler options to the shader info passSamuel Pitoiset2019-09-101-1/+3
| | | | | | | Only the pipeline layout and the shader keys are needed. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: merge radv_shader_variant_info into radv_shader_infoSamuel Pitoiset2019-09-061-3/+3
| | | | | | | Having two different structs is useless. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add rbplus_allowed to ac_gpu_infoSamuel Pitoiset2019-08-271-2/+0
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_tc_compat_zrange_bug to ac_gpu_infoSamuel Pitoiset2019-08-271-1/+0
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_gfx9_scissor_bug to ac_gpu_infoSamuel Pitoiset2019-08-271-1/+0
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add cpdma_prefetch_writes_memory to ac_gpu_infoSamuel Pitoiset2019-08-271-1/+0
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_out_of_order_rast to ac_gpu_infoSamuel Pitoiset2019-08-271-1/+0
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_load_ctx_reg_pkt to ac_gpu_infoSamuel Pitoiset2019-08-271-3/+0
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_rbplus to ac_gpu_infoSamuel Pitoiset2019-08-271-1/+0
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_dcc_constant_encode to ac_gpu_infoSamuel Pitoiset2019-08-271-3/+0
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_distributed_tess to ac_gpu_infoSamuel Pitoiset2019-08-271-1/+0
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_clear_state to ac_gpu_infoSamuel Pitoiset2019-08-271-1/+0
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: add mipmap support for the clear depth/stencil valuesSamuel Pitoiset2019-08-261-0/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add mipmap support for the TC-compat zrange bugSamuel Pitoiset2019-08-261-1/+10
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Hash Wave32 settings in shader key.Bas Nieuwenhuizen2019-08-121-0/+3
| | | | | | | Can result in different shaders. Fixes: 8a86908e9a7 "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders" Reviewed-by: Dave Airlie <[email protected]>
* radv: Add device argument for dcc compression check.Bas Nieuwenhuizen2019-08-071-1/+2
| | | | | | Because it is about to be generation dependent. Reviewed-by: Dave Airlie <[email protected]>
* radv: Disable compression for compute DCC decompress store.Bas Nieuwenhuizen2019-08-071-0/+1
| | | | | | | Previously we relied on stores not using DCC but that is going to change, so disable compression explicitly. Reviewed-by: Dave Airlie <[email protected]>
* radv: Add extra struct to image view creation.Bas Nieuwenhuizen2019-08-071-1/+5
| | | | | | | For extra args. Unlike image creation, I'm not embedding the vk struct in there, so all the inline structs can be kept. Reviewed-by: Dave Airlie <[email protected]>
* radv: Pass through render loop detection to internal layout decisions.Bas Nieuwenhuizen2019-08-071-0/+5
| | | | | | | | And do nothing with it yet. Everything outside a renderpass has no render loop. Reviewed-by: Dave Airlie <[email protected]>
* radv: Add render loop detection in renderpass.Bas Nieuwenhuizen2019-08-071-0/+1
| | | | | | | | | | | | | | VK spec 7.3: "Applications must ensure that all accesses to memory that backs image subresources used as attachments in a given renderpass instance either happen-before the load operations for those attachments, or happen-after the store operations for those attachments." So the only render loops we can have are with input attachments. Detect these. Reviewed-by: Dave Airlie <[email protected]>
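One way to read "detect these" is: within a subpass, an attachment is in a render loop if it is used as an input attachment and also as a color or depth/stencil attachment. The sketch below implements that reading only; `struct subpass_desc`, `ATTACHMENT_UNUSED` and `attachment_in_render_loop()` are hypothetical stand-ins, and the actual check in the driver may differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for VK_ATTACHMENT_UNUSED in this sketch. */
#define ATTACHMENT_UNUSED UINT32_MAX

/* Hypothetical, simplified subpass description. */
struct subpass_desc {
    const uint32_t *input_attachments;
    size_t input_count;
    const uint32_t *color_attachments;
    size_t color_count;
    uint32_t depth_stencil_attachment;
};

/* True if `att` is read as an input attachment and also written as a
 * color or depth/stencil attachment in the same subpass. */
static bool attachment_in_render_loop(const struct subpass_desc *sp, uint32_t att)
{
    if (att == ATTACHMENT_UNUSED)
        return false;

    bool is_input = false;
    for (size_t i = 0; i < sp->input_count; i++) {
        if (sp->input_attachments[i] == att)
            is_input = true;
    }
    if (!is_input)
        return false;

    if (sp->depth_stencil_attachment == att)
        return true;

    for (size_t i = 0; i < sp->color_count; i++) {
        if (sp->color_attachments[i] == att)
            return true;
    }
    return false;
}
```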