summaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* radv: initialize the per-queue descriptor BO only onceSamuel Pitoiset2019-01-181-24/+23
| | | | | | | Totally useless to write the descriptors inside the loop. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not write unused descriptors to the per-queue BOSamuel Pitoiset2019-01-181-124/+128
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: reduce size of the per-queue descriptor BOSamuel Pitoiset2019-01-181-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: drop unused code related to 16 sample locationsSamuel Pitoiset2019-01-183-13/+0
| | | | | | | The driver only supports up to 8 sample locations. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir_to_llvm: add support for structs to get_sampler_desc()Timothy Arceri2019-01-171-19/+26
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/nir_to_llvm: fix regression in bindless supportTimothy Arceri2019-01-171-1/+6
| | | | | | This wasn't ported over when deref support was implemented. Reviewed-by: Marek Olšák <[email protected]>
* ac/nir_to_llvm: fix type handling in image codeTimothy Arceri2019-01-171-15/+12
| | | | | | | | | | The current code only strips off arrays and cannot find the type for images that are struct members. Instead of trying to get the image type from the variable, we just get it directly from the deref instruction. Reviewed-by: Marek Olšák <[email protected]>
* radv: use dithered alpha-to-coverageRhys Perry2019-01-161-4/+5
| | | | | | | | This matches the behaviour of AMDVLK and hides banding. It is also seems to be allowed by the Vulkan spec. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: don't trash L1 caches for store operations with writeonly memorySamuel Pitoiset2019-01-161-5/+15
| | | | | | | Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* winsys/amdgpu: fix whitespaceMarek Olšák2019-01-151-1/+1
|
* radv: add support for VK_EXT_memory_budgetSamuel Pitoiset2019-01-156-1/+124
| | | | | | | | | | | | A simple Vulkan extension that allows apps to query size and usage of all exposed memory heaps. The different usage values are not really accurate because they are per drm-fd, but they should be close enough. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add two small helpers for getting VRAM and visible VRAM sizesSamuel Pitoiset2019-01-151-5/+16
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unnecessary returns in GetPhysicalDevice*Properties()Samuel Pitoiset2019-01-151-4/+4
| | | | | | | | These functions return nothing. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Set partial_vs_wave for pipelines with just GS, not tess.Bas Nieuwenhuizen2019-01-151-8/+20
| | | | | | | | | | | | | | | | Looking at -pro we need to enable it for pipelines with just a GS too. This seems to reduce the hangs from https://bugs.freedesktop.org/show_bug.cgi?id=109242 on a RX 550 to the point where I can't reproduce, after the false start with the wd_switch_on_eop patch due to flakiness. (but people are reporting it does not fix the issue completely for them on polaris 11) CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: add missing 16-bit types to glsl_base_to_llvm_type()Samuel Pitoiset2019-01-141-1/+6
| | | | | | | | Fix crashes with dEQP-VK.spirv_assembly.instruction.compute.workgroup_memory.*16 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Only use 32 KiB per threadgroup on Stoney.Bas Nieuwenhuizen2019-01-141-1/+10
| | | | | | | | | | | | Causes hangs on some machines. What works for dEQP-VK.tessellation.shader_input_output.barrier: - running num_patches = 6 (which limits LDS to 32 KiB) - running num_patches = 8, and artificially cutting LDS size at 32 KiB. CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: set cache policy when loading/storing buffer imagesSamuel Pitoiset2019-01-141-14/+11
| | | | | | | This was missing. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: add get_cache_policy() helper and use itSamuel Pitoiset2019-01-141-12/+26
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: Restore v4i32 suffix for llvm.SI.load.const intrinsicMichel Dänzer2019-01-141-1/+1
| | | | | | | | It was accidentally dropped in commit e4803ab7d2b6 "amd/common: use llvm.amdgcn.s.buffer.load for LLVM 8.0", breaking the universe with LLVM 7. Trivial.
* amd/common/vi+: enable SMEM loads with GLC=1Nicolai Hähnle2019-01-141-3/+7
| | | | | | Only on LLVM 8.0+, which supports the new intrinsic. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: use llvm.amdgcn.s.buffer.load for LLVM 8.0Nicolai Hähnle2019-01-141-4/+8
| | | | | | llvm.SI.load.const is deprecated. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove a few more unnecessary KHR suffixesEric Engestrom2019-01-103-11/+11
| | | | | | Cc: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (v1)
* ac/nir,radv,radeonsi/nir: use correct indices for interpolation intrinsicsRhys Perry2019-01-093-1/+6
| | | | | | | | | | | | | | Fixes artifacts in World of Warcraft when Multi-sample Alpha-Test is enabled with DXVK. It also fixes artifacts with Fallout 4's god rays with DXVK. Various piglit interpolateAt*() tests under NIR are also fixed. v2: formatting fix update commit message to include Fallout 4 and the Fixes tag Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver') Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106595 Signed-off-by: Rhys Perry <[email protected]>
* radv: skip draws with instance_count == 0Samuel Pitoiset2019-01-091-0/+13
| | | | | | | Loosely based on RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable variable pointersSamuel Pitoiset2019-01-091-1/+1
| | | | | | | | | | | | | | | | | | | The Vulkan spec 1.1.97 says: "variablePointers specifies whether the implementation supports the SPIR-V VariablePointers capability. When this feature is not enabled, shader modules must not declare the VariablePointers capability." As the SPIR-V feature is enabled, we should turn on the extension feature as well. All dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.* pass with the khronos internal repo. Note that a bunch of them fails with the public repo, but it's expected as they violate the specification. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: get rid of bunch of KHR suffixesSamuel Pitoiset2019-01-0910-164/+164
| | | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Eric Engestrom <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]>
* nir: rename global/local to private/function memoryKarol Herbst2019-01-082-7/+7
| | | | | | | | | | | | | | | | | | the naming is a bit confusing no matter how you look at it. Within SPIR-V "global" memory is memory accessible from all threads. glsl "global" memory normally refers to shader thread private memory declared at global scope. As we already use "shared" for memory shared across all thrads of a work group the solution where everybody could be happy with is to rename "global" to "private" and use "global" later for memory usually stored within system accessible memory (be it VRAM or system RAM if keeping SVM in mind). glsl "local" memory is memory only accessible within a function, while SPIR-V "local" memory is memory accessible within the same workgroup. v2: rename local to function as well v3: rename vtn_variable_mode_local as well Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Sort supported capabilitiesJason Ekstrand2019-01-071-12/+12
| | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv: Add support for using derefs for UBO/SSBO accessJason Ekstrand2019-01-081-0/+1
| | | | | | | | | For now, it's hidden behind a cap. Hopefully, we can eventually drop that along with all the manual offset code in spirv_to_nir. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* spirv: Add explicit pointer typesJason Ekstrand2019-01-081-0/+4
| | | | | | | | Instead of baking in uvec2 for UBO and SSBO pointers and uint for push constant and shared memory pointers, make it configurable. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Move propagation of cast derefs to a new nir_opt_deref passJason Ekstrand2019-01-081-1/+1
| | | | | | | | | We're going to want to do more deref optimizations going forward and this gives us a central place to do them. Also, cast propagation will get a bit more complicated with the addition of ptr_as_array derefs. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* radv: Fix rasterization precision bits.Bas Nieuwenhuizen2019-01-071-3/+3
| | | | | | | | | | | | | | | Note that these limits are exact, not a "precision is at least x", as texel coords also get snapped to a multiple of this step size before filtering. This fixes CTS tests dEQP-VK.texture.explicit_lod.2d.sizes.31x55_nearest_linear_mipmap_nearest_repeat dEQP-VK.texture.explicit_lod.2d.sizes.57x35_nearest_linear_mipmap_nearest_repeat Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109151 Reviewed-by: Samuel Pitoiset <[email protected]>
* amd/common: Add some parentheses to silence warning.Bas Nieuwenhuizen2019-01-071-2/+2
| | | | | | | | | | | | [1/59] Compiling C object 'src/amd/common/src@amd@common@@amd_common@sta/ac_nir_to_llvm.c.o'. ../mesa/src/amd/common/ac_nir_to_llvm.c: In function ‘get_inst_tessfactor_writemask’: ../mesa/src/amd/common/ac_nir_to_llvm.c:4089:32: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses] writemask = ((1 << num_comps + 1) - 1) << first_component; ~~~~~~~~~~^~~ ../mesa/src/amd/common/ac_nir_to_llvm.c:4091:33: warning: suggest parentheses around ‘+’ inside ‘<<’ [-Wparentheses] writemask = (((1 << num_comps + 1) - 1) << first_component) << 4; Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Remove unused variable.Bas Nieuwenhuizen2019-01-071-1/+0
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Remove device path.Bas Nieuwenhuizen2019-01-072-3/+0
| | | | | | | | unused and gcc complains about strncpy. (from what I can see because strncpy does not leave a 0 byte on truncate. That said we don't use it so this does not fix a real bug). Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: remove unused variable from ac_build_ddxyMarek Olšák2019-01-071-1/+1
| | | | trivial
* radv: Implement buffer stores with less than 4 components.Bas Nieuwenhuizen2019-01-071-5/+14
| | | | | | | | | We started using it in the btoi paths for r32g32b32, and the LLVM IR checker will complain about it because we end up with intrinsics with the wrong type extension in the name. Fixes: 593996bc02 ("radv: implement buffer to image operations for R32G32B32") Reviewed-by: Samuel Pitoiset <[email protected]>
* nir: rename nir_link_constant_varyings() nir_link_opt_varyings()Timothy Arceri2019-01-021-2/+2
| | | | | | | | | | The following patches will add support for an additional optimisation so this function will no longer just optimise varying constants. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* ac/nir_to_llvm: add ac_are_tessfactors_def_in_all_invocs()Timothy Arceri2019-01-022-0/+163
| | | | | | | | | | | The following patch will use this with the radeonsi NIR backend but I've added it to ac so we can use it with RADV in future. This is a NIR implementation of the tgsi function tgsi_scan_tess_ctrl(). Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: Do a cache flush if needed before reading predicates.Bas Nieuwenhuizen2018-12-311-0/+2
| | | | | | | | | | | | | | This caused random failures for two conditional rendering tests: dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_discard dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_no_discard These wrote the predicate with the vertex shader, did a barrier and then started the conditional rendering. However the cache flushes for the barrier only happen on first draw, so after the predicate has been read. Fixes: e45ba51ea45 "radv: add support for VK_EXT_conditional_rendering" Reviewed-by: Dave Airlie <[email protected]>
* radv: Fix wrongly positioned paren.Bas Nieuwenhuizen2018-12-211-1/+1
| | | | | | Trivial. Fixes: 9f0bfbed11f "radv: Work around non-renderable 128bpp compressed 3d textures on GFX9."
* radv: enable shaderStorageImageMultisample feature on GFX8+Samuel Pitoiset2018-12-203-4/+4
| | | | | | | Untested on older chips. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for FMASK expandSamuel Pitoiset2018-12-207-0/+335
| | | | | | | Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: initialize FMASK for images in fully expanded modeSamuel Pitoiset2018-12-204-0/+39
| | | | | | | The value depends on the number of samples. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: restrict fmask lookup to image load intrinsicsSamuel Pitoiset2018-12-201-1/+1
| | | | | | | | | | | We don't ever want to do the fmask lookup on a atomic or store, the fmask should have been decompressed if the surface has been moved to IMAGE_LAYOUT. Original patch by Dave Airlie. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: compute optimal VM alignment for imported buffersSamuel Pitoiset2018-12-201-1/+30
| | | | | | | | | | This fixes GPU hangs on GFX9 with dEQP-VK.memory.external_memory_host.bind_image_memory_and_render.with_zero_offset.* Copied from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Work around non-renderable 128bpp compressed 3d textures on GFX9.Bas Nieuwenhuizen2018-12-205-8/+41
| | | | | | | | | | | Exactly what title says, the new addrlib does not allow the above with certain dimensions that the CTS seems to hit. Work around it by not allowing the app to render to it via compat with other 128bpp formats and do not render to it ourselves during copies. Fixes: 776b9113656 "amd/addrlib: update Mesa's copy of addrlib" Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: fix subpass image transitions with multiviewsSamuel Pitoiset2018-12-201-0/+11
| | | | | | | | The driver needs to decompress all image layers if a fast depth/color clear has been performed. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: drop the amdgpu-skip-threshold=1 workaround for LLVM 8Samuel Pitoiset2018-12-201-3/+9
| | | | | | | | | This workaround has been introduced by 135e4d434f6 for fixing DXVK GPU hangs with many games. It is no longer needed since LLVM r345718. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: remove the bitfield_extract workaround for LLVM 8Samuel Pitoiset2018-12-201-9/+15
| | | | | | | | This workaround has been introduced by 3d41757788a and it is no longer needed since LLVM r346422. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>