path: root/src/amd
Commit message  Author  Age  Files  Lines
* radv/winsys: fix hash when adding internal buffersSamuel Pitoiset2019-01-301-1/+1
| | | | | | | | This fixes serious stuttering in Shadow Of The Tomb Raider. Fixes: 50fd253bd6e ("radv/winsys: Add priority handling during submit.") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use the correct LLVM processor name on Raven2Marek Olšák2019-01-291-1/+1
| | | | Reviewed-by: Alex Deucher <[email protected]>
* radv: Enable VK_EXT_memory_priority.Bas Nieuwenhuizen2019-01-293-5/+20
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/winsys: Add priority handling during submit.Bas Nieuwenhuizen2019-01-293-49/+115
| | | | | | | | | Switched to the raw bo list api to avoid having to use 2 arrays for everything. This was introduced in libdrm 2.4.97 which we already depend upon. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/winsys: Set winsys bo priority on creation.Bas Nieuwenhuizen2019-01-2912-29/+82
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: re-enable fast depth clears for 16-bit surfaces on VISamuel Pitoiset2019-01-291-8/+0
  This was disabled some months ago because it introduced rendering issues with Shadow Of Warrior II (DXVK). This game is no longer affected; I wonder if 824cfc1ee5e ("radv: rework the TC-compat HTILE hardware bug with COND_EXEC") fixed the problem. I checked The Forest on my Polaris, and it renders fine too.
  According to Phillip, this gives +5.5% with Rise Of The Tomb Raider and DXVK. This is because DXVK uses 16-bit depth surfaces while the native port from Feral uses 32-bit depth surfaces.
  Unfortunately, Shadow Of The Tomb Raider isn't affected because it clears each layer of a D16 array texture individually, so it doesn't hit the fast clear path.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set noalias/dereferenceable LLVM attributes based on param typesSamuel Pitoiset2019-01-281-13/+7
| | | | | | | | Instead of using this useless array_params_mask variable. This should set these two attributes to streamout buffers too. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: simplify allocating user SGPRS for descriptor setsSamuel Pitoiset2019-01-281-68/+34
  It is unnecessary to check the current stages if desc_set_used_mask is used.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove radv_userdata_info::indirect fieldSamuel Pitoiset2019-01-283-12/+6
| | | | | | | Always false. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/ac: fix some fp16 handlingTimothy Arceri2019-01-282-2/+2
| | | | | | Fixes: b722b29f10d4 ("radv: add support for 16bit input/output") Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Remove unused variable.Bas Nieuwenhuizen2019-01-271-1/+0
| | | | Trivial.
* radv: add device->instance extension dependenciesNiklas Haas2019-01-271-0/+27
  From the Vulkan spec 33.3 "Extension Dependencies":
  "Any device extension that has an instance extension dependency that is not enabled by vkCreateInstance is considered to be unsupported, hence it must not be returned by vkEnumerateDeviceExtensionProperties for any VkPhysicalDevice child of the instance."
  Therefore we need to check whether the instance-level extensions are actually enabled when deciding to support a device-level extension or not.
  Furthermore, we need to do this for all instance-level extensions of any (transitive) device-level extension dependency, due to the following paragraph:
  "If an extension is supported (as queried by vkEnumerateInstanceExtensionProperties or vkEnumerateDeviceExtensionProperties), then required extensions of that extension must also be supported for the same instance or physical device."
  Finally, because some of these Vulkan extensions may be implicitly promoted to future Vulkan core API versions, we can also satisfy the dependency if the Vulkan API version is high enough.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
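The dependency rule described in this commit message can be sketched roughly as follows. This is not the radv code: every `sketch_`/`SKETCH_` name, the bitmask representation of enabled instance extensions, and the `promoted_in` field are assumptions made for illustration. The dependency list passed in is assumed to already contain the transitive closure of instance-level requirements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as Vulkan's VK_MAKE_VERSION, redefined here so the
 * sketch compiles without the Vulkan headers. */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
    ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))

/* Hypothetical instance state: API version plus a bitmask of the
 * instance extensions enabled at vkCreateInstance time. */
struct sketch_instance {
    uint32_t api_version;
    uint32_t enabled_instance_exts;
};

/* Hypothetical description of one instance-level dependency. */
struct sketch_dependency {
    uint32_t instance_ext_bit; /* bit of the required instance extension */
    uint32_t promoted_in;      /* core version that includes it, 0 if none */
};

/* A dependency is satisfied if the extension was promoted to a core
 * version the instance uses, or if the extension itself is enabled. */
static bool sketch_dependency_satisfied(const struct sketch_instance *inst,
                                        const struct sketch_dependency *dep)
{
    if (dep->promoted_in != 0 && inst->api_version >= dep->promoted_in)
        return true;
    return (inst->enabled_instance_exts & dep->instance_ext_bit) != 0;
}

/* A device extension is only advertised if all of its (transitive)
 * instance-level dependencies are satisfied. */
static bool sketch_device_ext_supported(const struct sketch_instance *inst,
                                        const struct sketch_dependency *deps,
                                        size_t count)
{
    assert(inst != NULL);
    for (size_t i = 0; i < count; i++) {
        if (!sketch_dependency_satisfied(inst, &deps[i]))
            return false;
    }
    return true;
}
```

A device extension with no instance-level dependencies passes trivially, since the loop body never runs.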
* radv: correctly use vulkan 1.0 by defaultNiklas Haas2019-01-271-1/+1
| | | | | | | | | | | | From the vulkan spec 3.2 "Instances": "Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing an apiVersion of 0 is equivalent to providing an apiVersion of VK_MAKE_VERSION(1,0,0)." Fixes: ffa15861ef7c924a33e1f "radv: UseEnumerateInstanceVersion for the default version." Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
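The spec rule quoted in this commit message amounts to a small defaulting step. A minimal sketch, assuming a hypothetical `sketch_app_info` struct standing in for `VkApplicationInfo` and a locally defined version macro with the same layout as `VK_MAKE_VERSION`; the actual one-line radv change is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

/* Same bit layout as Vulkan's VK_MAKE_VERSION, redefined here so the
 * sketch compiles without the Vulkan headers. */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
    ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))

/* Hypothetical stand-in for VkApplicationInfo; only the field we need. */
struct sketch_app_info {
    uint32_t api_version;
};

/* Returns the API version an instance should be created with.
 * A NULL application info or an apiVersion of 0 both mean Vulkan 1.0. */
static uint32_t sketch_effective_api_version(const struct sketch_app_info *app)
{
    if (app == NULL || app->api_version == 0)
        return SKETCH_MAKE_VERSION(1, 0, 0);
    return app->api_version;
}
```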
* ac/nir_to_llvm: fix clamp shadow reference for more hardwareTimothy Arceri2019-01-261-1/+1
| | | | | | | | | | | Fixes the following piglit test on my VEGA and matches the behaviour in the tgsi backend. tests/spec/glsl-1.10/execution/samplers/glsl-fs-shadow2D-clamp-z.shader_test Fixes: 625dcbbc4566 ("amd/common: pass address components individually to ac_build_image_intrinsic") Reviewed-by: Marek Olšák <[email protected]>
* radv: fix computing number of user SGPRs for streamout buffersSamuel Pitoiset2019-01-251-0/+3
| | | | | | | Streamout buffers are emitted like push constants. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always pass the GFX9 fence data to si_cs_emit_cache_flush()Samuel Pitoiset2019-01-232-16/+4
| | | | | | | Remove two useless checks. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: compute the GFX9 fence VA at allocation timeSamuel Pitoiset2019-01-233-9/+8
  Instead of doing it every time we emit cache flushes.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only allocate the GFX9 fence and EOP BOs for the gfx queueSamuel Pitoiset2019-01-231-1/+2
  It's invalid to emit a ZPASS_DONE event on the compute queue, and the fence BO is unused on the compute queue (i.e. we don't flush CB or DB caches).
  This saves some space in the upload BO.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove old_fence parameter from si_cs_emit_write_event_eop()Samuel Pitoiset2019-01-234-9/+7
| | | | | | | | This parameter is actually useless as the immediate value can always be zero. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: improve gathering of load_push_constants with dynamic bindingsSamuel Pitoiset2019-01-233-1/+7
  For example, if a pipeline has two stages, VS and FS, and only the fragment stage needs dynamic bindings, we shouldn't allocate an extra user SGPR for the vertex stage. Of course, if the vertex stage loads constants, it needs a user SGPR.
  This should reduce the number of SET_SH_REG packets that are emitted.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir_to_llvm: fix interpolateAt* for structsTimothy Arceri2019-01-231-12/+13
| | | | | | | This fixes the arb_gpu_shader5 interpolateAt* tests that contain structs. Acked-by: Marek Olšák <[email protected]>
* ac/nir_to_llvm: add bindless support for uniform handlesTimothy Arceri2019-01-231-0/+28
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: correct WRITE_DATA.DST_SEL definitionsMarek Olšák2019-01-224-12/+12
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* mesa: add MESA_SHADER_KERNELKarol Herbst2019-01-211-2/+2
| | | | | | | | used for CL kernels Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: prevent dirtying of dynamic state when it does not changeRhys Perry2019-01-211-16/+75
| | | | | | | | DXVK often sets dynamic state without actually changing it. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
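The idea behind this commit, skipping the dirty flag when an application sets dynamic state to the value it already has, can be illustrated with a small sketch. The struct, the flag name and the choice of blend constants as the example state are assumptions for illustration, not the radv implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical dirty bit for one piece of dynamic state. */
#define SKETCH_DIRTY_BLEND_CONSTANTS (1u << 0)

/* Hypothetical command buffer state: the current value and dirty bits. */
struct sketch_cmd_state {
    float blend_constants[4];
    uint32_t dirty;
};

/* Only mark the state dirty when the new value differs bit-for-bit from
 * the stored one, so redundant calls do not trigger a re-emit. */
static void sketch_set_blend_constants(struct sketch_cmd_state *state,
                                       const float constants[4])
{
    assert(state != NULL && constants != NULL);
    if (memcmp(state->blend_constants, constants,
               sizeof(state->blend_constants)) == 0)
        return;

    memcpy(state->blend_constants, constants, sizeof(state->blend_constants));
    state->dirty |= SKETCH_DIRTY_BLEND_CONSTANTS;
}
```

The bitwise comparison is conservative: values such as 0.0 and -0.0 compare as different and simply cause one extra emit.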
* radv: avoid context rolls when binding graphics pipelinesRhys Perry2019-01-213-108/+141
  It's common in some applications to bind a new graphics pipeline without ending up changing any context registers.
  This change makes a pipeline have two command buffers: one for setting context registers and one for everything else. The context register command buffer is only emitted if it differs from the previous pipeline's.
  v2: ensure late scissor emission is done when radv_emit_rbplus_state() is called
  v2: make use of cmd_buffer->state.workaround_scissor_bug
  v3: rename "workaround_scissor_bug" to "context_roll_without_scissor_emitted"
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
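The split described in this commit message, with a context register stream that is emitted only when it differs from the previously bound pipeline's, might look roughly like the sketch below. All `sketch_` types and functions are hypothetical, and "emitting" is modelled as counting dwords rather than writing to a real command stream.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pre-built stream of command dwords. */
struct sketch_reg_stream {
    const uint32_t *dw;
    uint32_t num_dw;
};

/* Hypothetical pipeline: context registers are kept separate from
 * everything else so they can be compared and skipped. */
struct sketch_pipeline {
    struct sketch_reg_stream ctx_regs;
    struct sketch_reg_stream other;
};

static bool sketch_stream_equal(const struct sketch_reg_stream *a,
                                const struct sketch_reg_stream *b)
{
    if (a->num_dw != b->num_dw)
        return false;
    if (a->num_dw == 0)
        return true;
    return memcmp(a->dw, b->dw, a->num_dw * sizeof(uint32_t)) == 0;
}

/* Returns how many dwords would be emitted when binding `next` after
 * `prev` (NULL if nothing was bound before). The context register
 * stream is skipped when it matches the previous pipeline's. */
static uint32_t sketch_bind_pipeline(const struct sketch_pipeline *prev,
                                     const struct sketch_pipeline *next)
{
    assert(next != NULL);
    uint32_t emitted = next->other.num_dw;
    if (prev == NULL || !sketch_stream_equal(&prev->ctx_regs, &next->ctx_regs))
        emitted += next->ctx_regs.num_dw;
    return emitted;
}
```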
* radv: add missed situations for scissor bug workaroundRhys Perry2019-01-212-24/+43
| | | | | | | | v2: rename "workaround_scissor_bug" to "context_roll_without_scissor_emitted" Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: pass radv_draw_info to radv_emit_draw_registers()Rhys Perry2019-01-211-60/+58
| | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: replace more nir_load_system_value calls with builder functionsKarol Herbst2019-01-213-10/+10
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_shared to nir_var_mem_sharedKarol Herbst2019-01-191-4/+4
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_function to nir_var_function_tempKarol Herbst2019-01-192-7/+7
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir_to_llvm: fix interpolateAt* for arraysTimothy Arceri2019-01-191-19/+58
| | | | | | | | | | | This builds on the recent interpolate fix by Rhys ee8488ea3b99. This fixes the arb_gpu_shader5 interpolateAt* tests that contain arrays. Fixes: ee8488ea3b99 ("ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsics") Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: initialize the per-queue descriptor BO only onceSamuel Pitoiset2019-01-181-24/+23
| | | | | | | Totally useless to write the descriptors inside the loop. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not write unused descriptors to the per-queue BOSamuel Pitoiset2019-01-181-124/+128
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: reduce size of the per-queue descriptor BOSamuel Pitoiset2019-01-181-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: drop unused code related to 16 sample locationsSamuel Pitoiset2019-01-183-13/+0
| | | | | | | The driver only supports up to 8 sample locations. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir_to_llvm: add support for structs to get_sampler_desc()Timothy Arceri2019-01-171-19/+26
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/nir_to_llvm: fix regression in bindless supportTimothy Arceri2019-01-171-1/+6
| | | | | | This wasn't ported over when deref support was implemented. Reviewed-by: Marek Olšák <[email protected]>
* ac/nir_to_llvm: fix type handling in image codeTimothy Arceri2019-01-171-15/+12
| | | | | | | | | | The current code only strips off arrays and cannot find the type for images that are struct members. Instead of trying to get the image type from the variable, we just get it directly from the deref instruction. Reviewed-by: Marek Olšák <[email protected]>
* radv: use dithered alpha-to-coverageRhys Perry2019-01-161-4/+5
  This matches the behaviour of AMDVLK and hides banding. It also seems to be allowed by the Vulkan spec.
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: don't trash L1 caches for store operations with writeonly memorySamuel Pitoiset2019-01-161-5/+15
| | | | | | | Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* winsys/amdgpu: fix whitespaceMarek Olšák2019-01-151-1/+1
* radv: add support for VK_EXT_memory_budgetSamuel Pitoiset2019-01-156-1/+124
| | | | | | | | | | | | A simple Vulkan extension that allows apps to query size and usage of all exposed memory heaps. The different usage values are not really accurate because they are per drm-fd, but they should be close enough. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add two small helpers for getting VRAM and visible VRAM sizesSamuel Pitoiset2019-01-151-5/+16
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unnecessary returns in GetPhysicalDevice*Properties()Samuel Pitoiset2019-01-151-4/+4
| | | | | | | | These functions return nothing. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Set partial_vs_wave for pipelines with just GS, not tess.Bas Nieuwenhuizen2019-01-151-8/+20
| | | | | | | | | | | | | | | | Looking at -pro we need to enable it for pipelines with just a GS too. This seems to reduce the hangs from https://bugs.freedesktop.org/show_bug.cgi?id=109242 on a RX 550 to the point where I can't reproduce, after the false start with the wd_switch_on_eop patch due to flakiness. (but people are reporting it does not fix the issue completely for them on polaris 11) CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: add missing 16-bit types to glsl_base_to_llvm_type()Samuel Pitoiset2019-01-141-1/+6
| | | | | | | | Fix crashes with dEQP-VK.spirv_assembly.instruction.compute.workgroup_memory.*16 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Only use 32 KiB per threadgroup on Stoney.Bas Nieuwenhuizen2019-01-141-1/+10
| | | | | | | | | | | | Causes hangs on some machines. What works for dEQP-VK.tessellation.shader_input_output.barrier: - running num_patches = 6 (which limits LDS to 32 KiB) - running num_patches = 8, and artificially cutting LDS size at 32 KiB. CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: set cache policy when loading/storing buffer imagesSamuel Pitoiset2019-01-141-14/+11
| | | | | | | This was missing. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: add get_cache_policy() helper and use itSamuel Pitoiset2019-01-141-12/+26
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>