aboutsummaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
...
* radeonsi: add PKT3_CONTEXT_REG_RMWMarek Olšák2019-08-271-0/+1
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radv: make use of has_ls_vgpr_init_bugSamuel Pitoiset2019-08-273-2/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add has_ls_vgpr_init_bug to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+4
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_msaa_sample_loc_bug to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+6
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add rbplus_allowed to ac_gpu_infoSamuel Pitoiset2019-08-276-13/+13
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_tc_compat_zrange_bug to ac_gpu_infoSamuel Pitoiset2019-08-276-6/+7
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_gfx9_scissor_bug to ac_gpu_infoSamuel Pitoiset2019-08-276-8/+9
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add cpdma_prefetch_writes_memory to ac_gpu_infoSamuel Pitoiset2019-08-275-4/+4
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_out_of_order_rast to ac_gpu_infoSamuel Pitoiset2019-08-275-6/+6
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_load_ctx_reg_pkt to ac_gpu_infoSamuel Pitoiset2019-08-275-10/+8
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_rbplus to ac_gpu_infoSamuel Pitoiset2019-08-275-5/+7
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_dcc_constant_encode to ac_gpu_infoSamuel Pitoiset2019-08-275-8/+6
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_distributed_tess to ac_gpu_infoSamuel Pitoiset2019-08-275-6/+6
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_clear_state to ac_gpu_infoSamuel Pitoiset2019-08-275-16/+18
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: drop llvm8 from some load/store helpersSamuel Pitoiset2019-08-271-115/+75
| | | | | | | | Cleanup. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: add mipmap support for the clear depth/stencil valuesSamuel Pitoiset2019-08-262-27/+59
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add mipmap support for the TC-compat zrange bugSamuel Pitoiset2019-08-263-24/+51
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allocate metadata space for mipmapped depth/stencil imagesSamuel Pitoiset2019-08-261-2/+2
| | | | | | | | For each mipmaps, the driver will store the clear values (8-bytes) and the TC-compat zrange value (4-bytes). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: decompress mipmapped depth/stencil images during transitionsSamuel Pitoiset2019-08-263-12/+7
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add mipmaps support for decompress/resummarizeSamuel Pitoiset2019-08-261-25/+32
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_process_depth_image_layer() helperSamuel Pitoiset2019-08-261-63/+75
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Remove gfx9_stride_size_workaround_for_atomicConnor Abbott2019-08-263-11/+1
| | | | | | | | | The workaround was entirely in common code, and it's needed in radeonsi too so just always do it when necessary. Fixes KHR-GL45.shader_image_load_store.advanced-allStages-oneImage on gfx9 with LLVM 8. Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: add a workaround for viewing a slice of 3D as a 2D imageConnor Abbott2019-08-261-3/+32
| | | | | | | | | | | | GL and Vulkan allow you to bind a single layer of a 3D texture to a 2D image, and we weren't implementing a workaround for that on gfx9 that TGSI was. Copy it over. Fixes KHR-GL45.shader_image_load_store.non-layered_binding with radeonsi NIR. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix getting the index type size for uint8_tSamuel Pitoiset2019-08-261-1/+1
| | | | | | | | | | 16-bit and 32-bit values match hardware values but 8-bit doesn't. This fixes dEQP-VK.pipeline.input_assembly.* with 8-bit index. Fixes: 372c3dcfdb8 ("radv: implement VK_EXT_index_type_uint8") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]
* radv: Change memory type order for GPUs without dedicated VRAMAlex Smith2019-08-241-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | Put the uncached GTT type at a higher index than the visible VRAM type, rather than having GTT first. When we don't have dedicated VRAM, we don't have a non-visible VRAM type, and the property flags for GTT and visible VRAM are identical. According to the spec, for types with identical flags, we should give the one with better performance a lower index. Previously, apps which follow the spec guidance for choosing a memory type would have picked the GTT type in preference to visible VRAM (all Feral games will do this), and end up with lower performance. On a Ryzen 5 2500U laptop (Raven Ridge), this improves average FPS in the Rise of the Tomb Raider benchmark by up to ~30%. Tested a couple of other (Feral) games and saw similar improvement on those as well. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: 19.2 <[email protected]> (Bas: CCing this to 19.2-rc due to high impact and limited complexity)
* radv: additional query fixesAndres Rodriguez2019-08-171-7/+8
| | | | | | | | | Make sure we read the updated data from the gpu in cases where WAIT_BIT is not set. Cc: 19.1 19.2 <[email protected] Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Assert GS input index is constantConnor Abbott2019-08-231-0/+1
| | | | | | If it's not we silently ignore indir_index which is definitely a bug. Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Handle const array offsets in get_deref_offset()Connor Abbott2019-08-231-6/+11
| | | | | | | | | | | Some users of this function (e.g. GS inputs) currently only work with constant offsets. We got lucky since all the tests used an array index of 0, so the non-constant part was always 0. But we still need to handle this. This doesn't fix any CTS test, but was noticed while debugging one. Reviewed-by: Marek Olšák <[email protected]>
* radv/gfx10: do not use NGG with NAVI14Samuel Pitoiset2019-08-231-0/+1
| | | | | | Cc: 19.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0Samuel Pitoiset2019-08-231-1/+2
| | | | | | | | | Only gfx9 and older use it to get InstanceID in VGPR1. Ported from RadeonSI. Cc: 19.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac,radv,radeonsi: remove LLVM 7 supportSamuel Pitoiset2019-08-2310-302/+62
| | | | | | | | Now that LLVM 9 will be released soon, we will only support LLVM 8, 9 and master (10). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: Disable NGG for geometry shaders.Bas Nieuwenhuizen2019-08-221-0/+20
| | | | | | | | A bunch of remaining issues including some that affect users. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111248 Fixes: ee21bd7440c "radv/gfx10: implement NGG support (VS only)" Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: fix exclusive scans on GFX8-GFX9Samuel Pitoiset2019-08-221-4/+3
| | | | | | | | | | | | This fixes a regression introduced with scan&reduce operations on GFX10. Note that some subgroups CTS still fail on GFX10 but I assume it's a different issue. This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*. Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations" Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add RADV_DEBUG=allentrypointsAndres Rodriguez2019-08-212-8/+20
| | | | | | | | | | | | This debug option allows vkGet[Instance/Device]ProcAddr() to succeed even if the extension associated with the requested entrypoint was not enabled. This has come in handy in a few instances when debugging VR applications, so I thought it would be good to have a cleaned up version upstreamed. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Add explicit signs to image min/max intrinsicsJason Ekstrand2019-08-212-21/+38
| | | | | | | | | | | This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* radv: implement VK_AMD_shader_core_properties2Samuel Pitoiset2019-08-212-0/+10
| | | | | | | Trivial extension that matches PAL. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: force enable VK_AMD_shader_ballot for Wolfenstein YoungbloodSamuel Pitoiset2019-08-211-0/+8
| | | | | | | | | | | | | | This gives a nice boost, +20% at this time on my Vega 56. Shader ballot should be enabled by default at some point but it reduces performance a bit (-6%) with Wolfeinstein II. Enable it only for Youngblood at the moment, like what we did for Talos in the past. As a bonus point, it gets rid of some minor artifacts that only happens when ballot is disabled for some reasons. Cc: 19.2 <[email protected] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add a new debug option called RADV_DEBUG=noshaderballotSamuel Pitoiset2019-08-212-0/+2
| | | | | | | | | Shader ballot will be enabled by default for Wolfenstein Youngblood. This follows what we did for sisched. Cc: 19.2 <[email protected] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allow to enable VK_AMD_shader_ballot only on GFX8+Samuel Pitoiset2019-08-212-2/+3
| | | | | | | | Scans aren't implemented on SI/CIK. Cc: 19.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Emit VGT_GS_ONCHIP_CNTL for tess on GFX10.Bas Nieuwenhuizen2019-08-211-0/+8
| | | | | | | | Otherwise hangs are possible. This register was already set for GS and NGG. Fixes: 5eaed7ecfce "radv/gfx10: enable support for NAVI10, NAVI12 and NAVI14" Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Use correct vgpr_comp_cnt for VS if both prim_id and instance_id are ↵Bas Nieuwenhuizen2019-08-211-2/+4
| | | | | | | | | needed. Should take the max of the 2. Fixes: ea337c8b7e9 "radv/gfx10: fix VS input VGPRs with the legacy path" Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/gfx10: hardcode some depth+stencil formats in the format tableSamuel Pitoiset2019-08-211-0/+5
| | | | | | | | | | | The script doesn't handle them correctly and D16_UNORM_S8_UINT isn't supported by the hardware, mark it as invalid. This fixes warning when generating gfx10_format_table.h. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111393 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: tidy up gfx10_format_table.pySamuel Pitoiset2019-08-211-11/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: do not emit PA_SC_TILE_STEERING_OVERRIDE twiceSamuel Pitoiset2019-08-201-2/+0
| | | | | | | CLEAR_STATE emits it for us. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not emit PKT3_CONTEXT_CONTROL with AMDGPU 3.6.0+Samuel Pitoiset2019-08-202-6/+12
| | | | | | | It's emitted by the kernel. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Add Renoir support.Bas Nieuwenhuizen2019-08-192-3/+6
| | | | | | | | | Took the freedom to enable dfsm even though I don't have benchmark results yet, but it seems Raven-like. Rest is from radeonsi. Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi/nir: always lower ballot masks as 64-bit, codegen handles itMarek Olšák2019-08-194-4/+14
| | | | | | This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks. This solution is better, because the IR isn't dependent on wave32.
* ac/nir: set image=true when loading FMASK for imagesMarek Olšák2019-08-191-1/+1
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac/nir: Fix store_scratch with a non-full writemaskConnor Abbott2019-08-183-5/+42
| | | | | | | | | | | | By adding one more helper to ac_llvm_build, we can also easily keep vector stores together. Fixes the tests/spec/glsl-1.30/execution/fs-large-local-array-vec4.shader_test piglit test. Fixes: 74470baebbd ("ac/nir: Lower large indirect variables to scratch") Reviewed-by: Marek Olšák <[email protected]
* Revert "radv/gfx10: Enable DCC for storage images."Bas Nieuwenhuizen2019-08-163-18/+4
| | | | | | | Quite useless without DCC for LAYOUT_GENERAL. Fixes: b4dad3afaa0 Revert "radv: Do not decompress on LAYOUT_GENERAL." Acked-by: Dave Airlie <[email protected]>