summaryrefslogtreecommitdiffstats
path: root/src/amd/common
Commit message (Collapse)AuthorAgeFilesLines
* amd/common: Introduce ac_get_fs_input_vgpr_cnt.Timur Kristóf2019-09-262-0/+59
| | | | | | | | | | | Add a function called ac_get_fs_input_vgpr_cnt which will return the number of input VGPRs used by an AMD shader. Previously, radv and radeonsi had the same code duplicated, but this commit also allows them to share this code. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd/common: Add num_shared_vgprs to ac_shader_config for GFX10.Timur Kristóf2019-09-262-0/+20
| | | | | | | | | In GFX10 wave64 mode, shared VGPRs allow the two wave halves to share some data with each other. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd/common: Extract some helper functions to ac_shader_util.Timur Kristóf2019-09-265-117/+131
| | | | | | | | | | This commit moves ac_get_tbuffer_format, ac_get_sampler_dim and ac_get_image_dim into ac_shader_util, thus enabling them to be used by compilers other than LLVM. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd/common: Move ac_export_mrt_z to ac_llvm_build.Timur Kristóf2019-09-264-75/+76
| | | | | | | | | The aim of this commit is to keep ac_shader_util LLVM-free, since we would like to use it in ACO later. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: force unnormalized coordinates for RECTMarek Olšák2019-09-231-1/+3
| | | | | | This fixes VAAPI. Reviewed-by: Connor Abbott <[email protected]>
* ac/nir: port Z compare value clamping from radeonsiMarek Olšák2019-09-231-9/+25
| | | | | | This fixes some dEQP tests. Reviewed-by: Connor Abbott <[email protected]>
* ac: stop using PCI IDs for chip identificationMarek Olšák2019-09-231-15/+58
| | | | | | PCI IDs for amdgpu will be removed from Mesa. Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radv/aco: Setup alternate path in RADV to support the experimental ACO compilerDaniel Schürmann2019-09-191-0/+3
| | | | | | | | | | LLVM remains default and ACO can be enabled with RADV_PERFTEST=aco. Co-authored-by: Daniel Schürmann <[email protected]> Co-authored-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: move ac_get_num_physical_vgprs into radeon_infoMarek Olšák2019-09-182-10/+2
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: move ac_get_num_physical_sgprs into radeon_infoMarek Olšák2019-09-182-12/+12
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: move ac_get_max_wave64_per_simd into radeon_infoMarek Olšák2019-09-182-16/+4
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: move num_sdp_interfaces into radeon_infoMarek Olšák2019-09-182-0/+15
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: move PBB MAX_ALLOC_COUNT into radeon_infoMarek Olšák2019-09-182-0/+33
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: Remove DEBUG workaroundMichel Dänzer2019-09-171-6/+0
| | | | | | As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG. Reviewed-by: Timothy Arceri <[email protected]>
* ac: replace HAVE_LLVM with LLVM_VERSION_MAJOR for atomic-optimizationsMarek Olšák2019-09-111-1/+1
| | | | trivial
* radeonsi: move texture storage allocation outside of radeonsiMarek Olšák2019-09-092-2/+65
| | | | | | possible code sharing with radv Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: move HTILE allocation outside of radeonsiMarek Olšák2019-09-092-3/+11
| | | | | | | ac_surface computes it for amdgpu. radeon_drm_surface computes it for radeon. Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac/surface: add RADEON_SURF_NO_FMASKMarek Olšák2019-09-092-4/+8
| | | | | | This controls FMASK and CMASK computation for MSAA. Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: fix wave occupancy computationsMarek Olšák2019-09-091-3/+19
| | | | | Cc: 19.2 <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: use fma on gfx10Marek Olšák2019-09-092-1/+9
| | | | Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: enable LLVM atomic optimizationsMarek Olšák2019-09-091-1/+9
|
* amd: replace major llvm version checks with LLVM_VERSION_MAJOREric Engestrom2019-09-063-14/+18
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* ac/nir: Support load_constant intrinsicsConnor Abbott2019-09-051-0/+55
| | | | | | | Setup a constant global variable that LLVM will stick in a .rodata section and generate PC-relative loads for. Reviewed-by: Marek Olšák <[email protected]>
* radv/radeonsi: Don't count read-only data when reporting code sizeConnor Abbott2019-09-052-0/+7
| | | | | | | | | | We usually use these counts as a simple way to figure out if a change reduces the number of instructions or shrinks an instruction. However, since .rodata sections aren't executed, we shouldn't be counting their size for this analysis. Make the linker return the total executable size, and use it to report the more useful size in both drivers. Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Fix gather4 integer wa with unnormalized coordinatesConnor Abbott2019-09-031-4/+26
| | | | | | | | | | | | | | This adds a bit of unneccesary code on radeonsi, since whether unnormalized coordinates are used is known at compile time with GL, but I wasn't sure if it was worth the few instructions to plumb everything through, especially for something so rare -- my shader-db doesn't have any instances where this changes anything. Fixes CTS tests I created at https://github.com/cwabbott0/VK-GL-CTS/tree/unnorm-gather-tests Acked-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Rewrite gather4 integer workaround based on radeonsiConnor Abbott2019-09-031-65/+85
| | | | | | | | | | The workaround was originally written based on amdgpu-pro traces, but since then radeonsi has got its own slightly different version. Use the radeonsi version instead, to be consistent and because it'll be slightly more convenient for handling unnormalized coordinates. Acked-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: drop now useless lookup_interp_param from ABISamuel Pitoiset2019-08-302-8/+32
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: import linear/perspective PS input parameters from radv/radeonsiSamuel Pitoiset2019-08-301-0/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add PKT3_CONTEXT_REG_RMWMarek Olšák2019-08-271-0/+1
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: add has_ls_vgpr_init_bug to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+4
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_msaa_sample_loc_bug to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+6
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add rbplus_allowed to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+11
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_tc_compat_zrange_bug to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+4
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_gfx9_scissor_bug to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+6
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add cpdma_prefetch_writes_memory to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+3
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_out_of_order_rast to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+4
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_load_ctx_reg_pkt to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+6
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_rbplus to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+4
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_dcc_constant_encode to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+5
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_distributed_tess to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+4
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add has_clear_state to ac_gpu_infoSamuel Pitoiset2019-08-272-0/+7
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: drop llvm8 from some load/store helpersSamuel Pitoiset2019-08-271-115/+75
| | | | | | | | Cleanup. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Remove gfx9_stride_size_workaround_for_atomicConnor Abbott2019-08-262-5/+1
| | | | | | | | | The workaround was entirely in common code, and it's needed in radeonsi too so just always do it when necessary. Fixes KHR-GL45.shader_image_load_store.advanced-allStages-oneImage on gfx9 with LLVM 8. Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: add a workaround for viewing a slice of 3D as a 2D imageConnor Abbott2019-08-261-3/+32
| | | | | | | | | | | | GL and Vulkan allow you to bind a single layer of a 3D texture to a 2D image, and we weren't implementing a workaround for that on gfx9 that TGSI was. Copy it over. Fixes KHR-GL45.shader_image_load_store.non-layered_binding with radeonsi NIR. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Assert GS input index is constantConnor Abbott2019-08-231-0/+1
| | | | | | If it's not we silently ignore indir_index which is definitely a bug. Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: Handle const array offsets in get_deref_offset()Connor Abbott2019-08-231-6/+11
| | | | | | | | | | | Some users of this function (e.g. GS inputs) currently only work with constant offsets. We got lucky since all the tests used an array index of 0, so the non-constant part was always 0. But we still need to handle this. This doesn't fix any CTS test, but was noticed while debugging one. Reviewed-by: Marek Olšák <[email protected]>
* ac,radv,radeonsi: remove LLVM 7 supportSamuel Pitoiset2019-08-236-290/+58
| | | | | | | | Now that LLVM 9 will be released soon, we will only support LLVM 8, 9 and master (10). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: fix exclusive scans on GFX8-GFX9Samuel Pitoiset2019-08-221-4/+3
| | | | | | | | | | | | This fixes a regression introduced with scan&reduce operations on GFX10. Note that some subgroups CTS still fail on GFX10 but I assume it's a different issue. This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive*. Fixes: 227c29a80de "amd/common/gfx10: implement scan & reduce operations" Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Add explicit signs to image min/max intrinsicsJason Ekstrand2019-08-211-17/+30
| | | | | | | | | | | This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* radeonsi/nir: always lower ballot masks as 64-bit, codegen handles itMarek Olšák2019-08-193-2/+11
| | | | | | This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks. This solution is better, because the IR isn't dependent on wave32.