aboutsummaryrefslogtreecommitdiffstats
path: root/src/amd/vulkan
Commit message (Collapse)AuthorAgeFilesLines
* radv: set noalias/dereferenceable LLVM attributes based on param typesSamuel Pitoiset2019-01-281-13/+7
| | | | | | | | Instead of using this useless array_params_mask variable. This should set these two attributes to streamout buffers too. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: simplify allocating user SGPRS for descriptor setsSamuel Pitoiset2019-01-281-68/+34
| | | | | | | | Unnecesary to check the current stages if desc_set_used_mask is used. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove radv_userdata_info::indirect fieldSamuel Pitoiset2019-01-283-12/+6
| | | | | | | Always false. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/ac: fix some fp16 handlingTimothy Arceri2019-01-281-1/+1
| | | | | | Fixes: b722b29f10d4 ("radv: add support for 16bit input/output") Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Remove unused variable.Bas Nieuwenhuizen2019-01-271-1/+0
| | | | Trivial.
* radv: add device->instance extension dependenciesNiklas Haas2019-01-271-0/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | From the vulkan spec 33.3 "Extension Dependencies": "Any device extension that has an instance extension dependency that is not enabled by vkCreateInstance is considered to be unsupported, hence it must not be returned by vkEnumerateDeviceExtensionProperties for any VkPhysicalDevice child of the instance." Therefore we need to check whether the instance-level extensions are actually enabled when deciding to support a device-level extension or not. Furthermore, we need to do this for all instance-level extensions of any (transitive) device-level extension dependency, due to the following paragraph: "If an extension is supported (as queried by vkEnumerateInstanceExtensionProperties or vkEnumerateDeviceExtensionProperties), then required extensions of that extension must also be supported for the same instance or physical device." Finally, because some of these vulkan extensions may be implicitly promoted to future vulkan core API versions, we can also satisfy the dependency if the vulkan API version is high enough. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: correctly use vulkan 1.0 by defaultNiklas Haas2019-01-271-1/+1
| | | | | | | | | | | | From the vulkan spec 3.2 "Instances": "Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing an apiVersion of 0 is equivalent to providing an apiVersion of VK_MAKE_VERSION(1,0,0)." Fixes: ffa15861ef7c924a33e1f "radv: UseEnumerateInstanceVersion for the default version." Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix computing number of user SGPRs for streamout buffersSamuel Pitoiset2019-01-251-0/+3
| | | | | | | Streamout buffers are emitted like push constants. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always pass the GFX9 fence data to si_cs_emit_cache_flush()Samuel Pitoiset2019-01-232-16/+4
| | | | | | | Remove two useless checks. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: compute the GFX9 fence VA at allocation timeSamuel Pitoiset2019-01-233-9/+8
| | | | | | | Instead of doing every time we emit cache flushes. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only allocate the GFX9 fence and EOP BOs for the gfx queueSamuel Pitoiset2019-01-231-1/+2
| | | | | | | | | | It's invalid to emit a ZPASS_DONE event on the compute queue, and the fence BO is unused on the compute queue (ie. we don't flush CB or DB caches). This saves some space in the upload BO. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove old_fence parameter from si_cs_emit_write_event_eop()Samuel Pitoiset2019-01-234-9/+7
| | | | | | | | This parameter is actually useless as the immediate value can always be zero. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: improve gathering of load_push_constants with dynamic bindingsSamuel Pitoiset2019-01-233-1/+7
| | | | | | | | | | | | For example, if a pipeline has two stages VS and FS. And if only the fragment stage needs dynamic bindings, we shouldn't allocate an extra user SGPR for the vertex stage. Of course, if the vertex stage loads constants, it needs an user SGPR. This should reduce the number of SET_SH_REG packets that are emitted. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: correct WRITE_DATA.DST_SEL definitionsMarek Olšák2019-01-223-10/+10
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: prevent dirtying of dynamic state when it does not changeRhys Perry2019-01-211-16/+75
| | | | | | | | DXVK often sets dynamic state without actually changing it. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: avoid context rolls when binding graphics pipelinesRhys Perry2019-01-213-108/+141
| | | | | | | | | | | | | | | | | | It's common in some applications to bind a new graphics pipeline without ending up changing any context registers. This has a pipline have two command buffers: one for setting context registers and one for everything else. The context register command buffer is only emitted if it differs from the previous pipeline's. v2: ensure late scissor emission is done when radv_emit_rbplus_state() is called v2: make use of cmd_buffer->state.workaround_scissor_bug v3: rename "workaround_scissor_bug" to "context_roll_without_scissor_emitted" Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add missed situations for scissor bug workaroundRhys Perry2019-01-212-24/+43
| | | | | | | | v2: rename "workaround_scissor_bug" to "context_roll_without_scissor_emitted" Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: pass radv_draw_info to radv_emit_draw_registers()Rhys Perry2019-01-211-60/+58
| | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: replace more nir_load_system_value calls with builder functionsKarol Herbst2019-01-213-10/+10
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_function to nir_var_function_tempKarol Herbst2019-01-191-4/+4
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: initialize the per-queue descriptor BO only onceSamuel Pitoiset2019-01-181-24/+23
| | | | | | | Totally useless to write the descriptors inside the loop. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not write unused descriptors to the per-queue BOSamuel Pitoiset2019-01-181-124/+128
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: reduce size of the per-queue descriptor BOSamuel Pitoiset2019-01-181-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: drop unused code related to 16 sample locationsSamuel Pitoiset2019-01-183-13/+0
| | | | | | | The driver only supports up to 8 sample locations. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use dithered alpha-to-coverageRhys Perry2019-01-161-4/+5
| | | | | | | | This matches the behaviour of AMDVLK and hides banding. It is also seems to be allowed by the Vulkan spec. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for VK_EXT_memory_budgetSamuel Pitoiset2019-01-156-1/+124
| | | | | | | | | | | | A simple Vulkan extension that allows apps to query size and usage of all exposed memory heaps. The different usage values are not really accurate because they are per drm-fd, but they should be close enough. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add two small helpers for getting VRAM and visible VRAM sizesSamuel Pitoiset2019-01-151-5/+16
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unnecessary returns in GetPhysicalDevice*Properties()Samuel Pitoiset2019-01-151-4/+4
| | | | | | | | These functions return nothing. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Set partial_vs_wave for pipelines with just GS, not tess.Bas Nieuwenhuizen2019-01-151-8/+20
| | | | | | | | | | | | | | | | Looking at -pro we need to enable it for pipelines with just a GS too. This seems to reduce the hangs from https://bugs.freedesktop.org/show_bug.cgi?id=109242 on a RX 550 to the point where I can't reproduce, after the false start with the wd_switch_on_eop patch due to flakiness. (but people are reporting it does not fix the issue completely for them on polaris 11) CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Only use 32 KiB per threadgroup on Stoney.Bas Nieuwenhuizen2019-01-141-1/+10
| | | | | | | | | | | | Causes hangs on some machines. What works for dEQP-VK.tessellation.shader_input_output.barrier: - running num_patches = 6 (which limits LDS to 32 KiB) - running num_patches = 8, and artificially cutting LDS size at 32 KiB. CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: remove a few more unnecessary KHR suffixesEric Engestrom2019-01-103-11/+11
| | | | | | Cc: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (v1)
* ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsicsRhys Perry2019-01-091-0/+2
| | | | | | | | | | | | | | Fixes artifacts in World of Warcraft when Multi-sample Alpha-Test is enabled with DXVK. It also fixes artifacts with Fallout 4's god rays with DXVK. Various piglit interpolateAt*() tests under NIR are also fixed. v2: formatting fix update commit message to include Fallout 4 and the Fixes tag Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver') Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106595 Signed-off-by: Rhys Perry <[email protected]>
* radv: skip draws with instance_count == 0Samuel Pitoiset2019-01-091-0/+13
| | | | | | | Loosely based on RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable variable pointersSamuel Pitoiset2019-01-091-1/+1
| | | | | | | | | | | | | | | | | | | The Vulkan spec 1.1.97 says: "variablePointers specifies whether the implementation supports the SPIR-V VariablePointers capability. When this feature is not enabled, shader modules must not declare the VariablePointers capability." As the SPIR-V feature is enabled, we should turn on the extension feature as well. All dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.* pass with the khronos internal repo. Note that a bunch of them fails with the public repo, but it's expected as they violate the specification. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: get rid of bunch of KHR suffixesSamuel Pitoiset2019-01-0910-164/+164
| | | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Eric Engestrom <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]>
* nir: rename global/local to private/function memoryKarol Herbst2019-01-081-4/+4
| | | | | | | | | | | | | | | | | | the naming is a bit confusing no matter how you look at it. Within SPIR-V "global" memory is memory accessible from all threads. glsl "global" memory normally refers to shader thread private memory declared at global scope. As we already use "shared" for memory shared across all thrads of a work group the solution where everybody could be happy with is to rename "global" to "private" and use "global" later for memory usually stored within system accessible memory (be it VRAM or system RAM if keeping SVM in mind). glsl "local" memory is memory only accessible within a function, while SPIR-V "local" memory is memory accessible within the same workgroup. v2: rename local to function as well v3: rename vtn_variable_mode_local as well Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Sort supported capabilitiesJason Ekstrand2019-01-071-12/+12
| | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv: Add support for using derefs for UBO/SSBO accessJason Ekstrand2019-01-081-0/+1
| | | | | | | | | For now, it's hidden behind a cap. Hopefully, we can eventually drop that along with all the manual offset code in spirv_to_nir. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* spirv: Add explicit pointer typesJason Ekstrand2019-01-081-0/+4
| | | | | | | | Instead of baking in uvec2 for UBO and SSBO pointers and uint for push constant and shared memory pointers, make it configurable. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Move propagation of cast derefs to a new nir_opt_deref passJason Ekstrand2019-01-081-1/+1
| | | | | | | | | We're going to want to do more deref optimizations going forward and this gives us a central place to do them. Also, cast propagation will get a bit more complicated with the addition of ptr_as_array derefs. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* radv: Fix rasterization precision bits.Bas Nieuwenhuizen2019-01-071-3/+3
| | | | | | | | | | | | | | | Note that these limits are exact, not a "precision is at least x", as texel coords also get snapped to a multiple of this step size before filtering. This fixes CTS tests dEQP-VK.texture.explicit_lod.2d.sizes.31x55_nearest_linear_mipmap_nearest_repeat dEQP-VK.texture.explicit_lod.2d.sizes.57x35_nearest_linear_mipmap_nearest_repeat Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109151 Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Remove unused variable.Bas Nieuwenhuizen2019-01-071-1/+0
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Remove device path.Bas Nieuwenhuizen2019-01-072-3/+0
| | | | | | | | unused and gcc complains about strncpy. (from what I can see because strncpy does not leave a 0 byte on truncate. That said we don't use it so this does not fix a real bug). Reviewed-by: Samuel Pitoiset <[email protected]>
* nir: rename nir_link_constant_varyings() nir_link_opt_varyings()Timothy Arceri2019-01-021-2/+2
| | | | | | | | | | The following patches will add support for an additional optimisation so this function will no longer just optimise varying constants. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* radv: Do a cache flush if needed before reading predicates.Bas Nieuwenhuizen2018-12-311-0/+2
| | | | | | | | | | | | | | This caused random failures for two conditional rendering tests: dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_discard dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_no_discard These wrote the predicate with the vertex shader, did a barrier and then started the conditional rendering. However the cache flushes for the barrier only happen on first draw, so after the predicate has been read. Fixes: e45ba51ea45 "radv: add support for VK_EXT_conditional_rendering" Reviewed-by: Dave Airlie <[email protected]>
* radv: Fix wrongly positioned paren.Bas Nieuwenhuizen2018-12-211-1/+1
| | | | | | Trivial. Fixes: 9f0bfbed11f "radv: Work around non-renderable 128bpp compressed 3d textures on GFX9."
* radv: enable shaderStorageImageMultisample feature on GFX8+Samuel Pitoiset2018-12-203-4/+4
| | | | | | | Untested on older chips. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for FMASK expandSamuel Pitoiset2018-12-207-0/+335
| | | | | | | Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: initialize FMASK for images in fully expanded modeSamuel Pitoiset2018-12-204-0/+39
| | | | | | | The value depends on the number of samples. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: compute optimal VM alignment for imported buffersSamuel Pitoiset2018-12-201-1/+30
| | | | | | | | | | This fixes GPU hangs on GFX9 with dEQP-VK.memory.external_memory_host.bind_image_memory_and_render.with_zero_offset.* Copied from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>