path: root/src/amd/common
Commit message (Author, Age, Files, Lines)
* ac: add radeon_info::is_amdgpu instead of checking drm_major == 3 (Marek Olšák, 2019-06-14, 2 files, -5/+9)
| and clean up
| Reviewed-by: Samuel Pitoiset <[email protected]>
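As a rough illustration of the idea in this commit title, the sketch below caches the driver check in a boolean once, so callers no longer compare the DRM major version themselves. The struct and function names are hypothetical stand-ins, not the actual radeon_info definition; the only detail taken from the log is that `drm_major == 3` was the check being replaced.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for radeon_info; only the fields
 * needed for this illustration are shown. */
struct example_gpu_info {
    uint32_t drm_major;
    uint32_t drm_minor;
    bool is_amdgpu; /* derived once when the info is filled in */
};

/* Assumption taken from the commit title: DRM major version 3 is what
 * callers previously tested to detect the amdgpu kernel driver. */
static void example_init_gpu_info(struct example_gpu_info *info,
                                  uint32_t drm_major, uint32_t drm_minor)
{
    info->drm_major = drm_major;
    info->drm_minor = drm_minor;
    info->is_amdgpu = (drm_major == 3);
}

/* Callers test the flag instead of repeating the version comparison. */
static bool example_uses_amdgpu_path(const struct example_gpu_info *info)
{
    return info->is_amdgpu;
}
```

Keeping the comparison in one place means a later change to how the driver is detected would only touch the initialization code.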
* amd/common: add support for AMD_shader_ballot functions (Daniel Schürmann, 2019-06-13, 1 file, -0/+20)
| Reviewed-by: Connor Abbott <[email protected]>
* amd/rtld: layout and relocate LDS symbols (Nicolai Hähnle, 2019-06-12, 2 files, -19/+235)
| Upcoming changes to LLVM will emit LDS objects as symbols in the ELF symbol table, with relocations that will be resolved with this change.
| Callers will also be able to define LDS symbols that are shared between shader parts. This will be used by radeonsi for the ESGS ring in gfx9+ merged shaders.
| Reviewed-by: Marek Olšák <[email protected]>
* amd/common: use ARRAY_SIZE for the LLVM command line options (Nicolai Hähnle, 2019-06-12, 1 file, -2/+2)
| This is more convenient for changing it around during debug.
| Reviewed-by: Marek Olšák <[email protected]>
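A minimal sketch of why ARRAY_SIZE helps here: the element count follows the initializer, so adding or removing an option while debugging needs no second edit. The macro is written out locally, and the option strings are placeholders, not the options Mesa actually passes to LLVM.

```c
#include <stddef.h>

/* Local definition for the sketch; counts the elements of a true array. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical option list: placeholder strings only. */
static const char *const example_llvm_options[] = {
    "example-driver",
    "-example-option-a",
    "-example-option-b",
};

/* The count tracks the initializer above instead of a hard-coded literal. */
static size_t example_option_count(void)
{
    return ARRAY_SIZE(example_llvm_options);
}
```

The macro only works on real arrays; applied to a pointer it would silently compute the wrong value.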
* amd/common: add ac_compile_module_to_elf (Nicolai Hähnle, 2019-06-12, 2 files, -7/+83)
| A new variant of ac_compile_module_to_binary that allows us to keep the entire ELF around.
| Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use ac_shader_config (Nicolai Hähnle, 2019-06-12, 1 file, -0/+2)
| Reviewed-by: Marek Olšák <[email protected]>
* amd/common: add a more powerful runtime linker (Nicolai Hähnle, 2019-06-12, 4 files, -0/+653)
| Using an explicit linker instead of just concatenating .text sections will allow us to start using .rodata sections and explicit descriptions of data on LDS that is shared between stages.
| Reviewed-by: Marek Olšák <[email protected]>
* amd/common: clarify ac_shader_binary::lds_size (Nicolai Hähnle, 2019-06-12, 1 file, -1/+1)
| Reviewed-by: Marek Olšák <[email protected]>
* amd/common: extract ac_parse_shader_binary_config (Nicolai Hähnle, 2019-06-12, 2 files, -34/+47)
| Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use the ac helper for index buffer stores in the culling shader (Marek Olšák, 2019-06-11, 3 files, -3/+5)
|
* ac/nir: Remove stale TODO (Connor Abbott, 2019-06-06, 1 file, -1/+7)
| While we're here, copy the comment explaining this from radeonsi.
| Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: rename LLVM <= 7 helpers for readability (Marek Olšák, 2019-06-04, 1 file, -37/+37)
| Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: fix a typo in ac_build_wg_scan_bottom (Marek Olšák, 2019-06-04, 1 file, -1/+1)
| Cc: 19.1 <[email protected]>
| Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: mark some texture intrinsics as convergent (Rhys Perry, 2019-06-04, 1 file, -0/+18)
| Otherwise LLVM can sink them and their texture coordinate calculations into divergent branches.
| v2: simplify the conditions on which the intrinsic is marked as convergent
| v3: only mark as convergent in FS and CS with derivative groups
| Cc: <[email protected]>
| Signed-off-by: Rhys Perry <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* ac,radv: do not emit vec3 for raw load/store on SI (Samuel Pitoiset, 2019-06-04, 3 files, -7/+19)
| Raw vec3 load/store is unsupported on SI; only the format load/store variants support vec3.
| Fixes: 6970a9a6ca9 ("ac,radv: remove the vec3 restriction with LLVM 9+")
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* ac/registers: don't use the si, cik, vi names, use gfxN (Marek Olšák, 2019-06-03, 2 files, -4/+4)
| trivial
* amd/common: use generated register header (Nicolai Hähnle, 2019-06-03, 5 files, -16341/+13)
|
* amd/common: unify PITCH_GFX6 and PITCH_GFX9 (Nicolai Hähnle, 2019-06-03, 2 files, -6/+6)
| The definition of the fields differs, but PITCH_GFX9 is a mere extension of PITCH_GFX6 that does not conflict with any other fields.
| This aligns the definitions with what will be generated from the register JSON. The information about how large the fields really are is preserved in the register database.
* amd/common: rename R_3F2_CONTROL to IB_CONTROL for disambiguation (Nicolai Hähnle, 2019-06-03, 2 files, -2/+2)
| This "register" name collides with R_370_CONTROL.
| This aligns the definitions with what will be generated from the register JSON.
* amd/common: cleanup DATA_FORMAT/NUM_FORMAT field names (Nicolai Hähnle, 2019-06-03, 3 files, -13/+13)
| The field layout wasn't actually changed in gfx9, so having the suffix isn't very useful. The field *contents* were changed, but this is reflected in the V_xxx_xxx definitions and is taken into account by the ac_debug logic based on the register JSON.
| This aligns the definitions with what will be generated from the register JSON.
* amd/common: derive ac_debug tables from register JSON (Nicolai Hähnle, 2019-06-03, 3 files, -176/+130)
|
* ac: use amdgpu-flat-work-group-size (Marek Olšák, 2019-06-03, 2 files, -0/+11)
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac,radv: remove the vec3 restriction with LLVM 9+ (Samuel Pitoiset, 2019-06-03, 3 files, -10/+15)
| This change requires LLVM r356755.
|
| 32706 shaders in 16744 tests
| Totals:
| SGPRS: 1448848 -> 1455984 (0.49 %)
| VGPRS: 1016684 -> 1016220 (-0.05 %)
| Spilled SGPRs: 25871 -> 25815 (-0.22 %)
| Spilled VGPRs: 122 -> 122 (0.00 %)
| Scratch size: 11964 -> 11956 (-0.07 %) dwords per thread
| Code Size: 55324500 -> 55301152 (-0.04 %) bytes
| Max Waves: 235660 -> 235586 (-0.03 %)
|
| Totals from affected shaders:
| SGPRS: 293704 -> 300840 (2.43 %)
| VGPRS: 246716 -> 246252 (-0.19 %)
| Spilled SGPRs: 159 -> 103 (-35.22 %)
| Scratch size: 188 -> 180 (-4.26 %) dwords per thread
| Code Size: 8653664 -> 8630316 (-0.27 %) bytes
| Max Waves: 60811 -> 60737 (-0.12 %)
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
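The percentages in the statistics above are consistent with a plain relative change, (after - before) / before. The helper below is a hypothetical sketch for reproducing them; it is not taken from the tool that generated the report.

```c
/* Relative change in percent between two totals, e.g. SGPR counts
 * before and after a compiler change. Assumes before != 0. */
static double example_percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For example, the overall SGPR totals give (1455984 - 1448848) / 1448848, roughly 0.49 %, and the spilled SGPRs of the affected shaders give (103 - 159) / 159, roughly -35.22 %.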
* ac: treat Mullins as Kabini, remove the enum (Marek Olšák, 2019-05-27, 4 files, -9/+0)
| it's the same design
* nir: Drop imov/fmov in favor of one mov instruction (Jason Ekstrand, 2019-05-24, 1 file, -2/+1)
| The difference between imov and fmov has been a constant source of confusion in NIR for years. No one really knows why we have two or when to use one vs. the other. The real reason is that they do different things in the presence of source and destination modifiers. However, without modifiers (which many back-ends don't have), they are identical. Now that we've reworked nir_lower_to_source_mods to leave one abs/neg instruction in place rather than replacing them with imov or fmov instructions, we don't need two different instructions at all anymore.
| Reviewed-by: Kristian H. Kristensen <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Vasily Khoruzhick <[email protected]>
| Acked-by: Rob Clark <[email protected]>
* radv: add a workaround for Monster Hunter World and LLVM 7&8 (Samuel Pitoiset, 2019-05-17, 2 files, -3/+5)
| The load/store optimizer pass doesn't handle WaW hazards correctly and this is the root cause of the reflection issue with Monster Hunter World. AFAIK, it's the only game that is affected by this issue.
| This is fixed with LLVM r361008, but we need a workaround for older LLVM versions unfortunately.
| Cc: "19.0" "19.1" <[email protected]>
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: match radeonsi code in ac_shader_binary_read_config (Marek Olšák, 2019-05-16, 1 file, -3/+3)
|
* r600+radeonsi: use ctx_query_reset_status on radeon (Marek Olšák, 2019-05-16, 2 files, -3/+0)
| This allows a nice cleanup, because the winsys always handles it.
* winsys/amdgpu: add a parallel compute IB coupled with a gfx IB (Marek Olšák, 2019-05-16, 2 files, -0/+9)
| Tested-by: Dieter Nützel <[email protected]>
| Acked-by: Nicolai Hähnle <[email protected]>
* ac: add LLVM code for triangle culling (Marek Olšák, 2019-05-16, 3 files, -0/+336)
| Tested-by: Dieter Nützel <[email protected]>
| Acked-by: Nicolai Hähnle <[email protected]>
* ac: rename SI-CIK-VI to GFX6-GFX7-GFX8 (Marek Olšák, 2019-05-15, 10 files, -60/+60)
| Acked-by: Dave Airlie <[email protected]>
| We already use GFX9 and I don't want us to have confusing naming in the driver. GFXn naming is better from the driver perspective, because it's the real version of the gfx portion of the hw. Also, CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI.
| It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have nothing to do with GFXn and they have their own version numbers.
* ac: add comments to chip enums (Marek Olšák, 2019-05-15, 1 file, -11/+11)
| Reviewed-by: Alex Deucher <[email protected]> (except GFX2 changes)
| Reviewed-by: Dave Airlie <[email protected]> (except <= GFX5 changes)
* ac: use 1D GEPs for descriptors and constants (Marek Olšák, 2019-05-14, 2 files, -10/+7)
| just a cleanup
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Tested-by: Bas Nieuwenhuizen <[email protected]>
* amd/common: add ac_build_opencoded_fetch_format (Nicolai Hähnle, 2019-05-13, 2 files, -0/+343)
| Implement software emulation of buffer_load_format for all types required by vertex buffer fetches.
| Reviewed-by: Marek Olšák <[email protected]>
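To give a feel for what "software emulation" of a format load means, the sketch below decodes a packed 8_8_8_8 UNORM element on the CPU: load the raw dword, then extract and normalize each channel by hand. This is only an analogy under stated assumptions. The actual helper emits LLVM IR and covers many more formats; the function names and the choice of format here are hypothetical.

```c
#include <stdint.h>

/* Extract one 8-bit channel (0 = lowest byte) from a raw dword and
 * normalize it to [0, 1], as an unsigned-normalized format would. */
static float example_unorm8_channel(uint32_t dword, unsigned channel)
{
    uint32_t bits = (dword >> (channel * 8u)) & 0xffu;
    return (float)bits / 255.0f;
}

/* Open-coded "fetch" of a 4-channel element: a raw 32-bit load followed
 * by manual per-channel conversion, in place of a typed format load. */
static void example_fetch_unorm8x4(uint32_t dword, float out[4])
{
    for (unsigned i = 0; i < 4; i++)
        out[i] = example_unorm8_channel(dword, i);
}
```

For instance, the dword `0xff000080` decodes to roughly 0.502 in channel 0, zero in channels 1 and 2, and exactly 1.0 in channel 3.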
* radv: apply the indexing workaround for atomic buffer operations on GFX9 (Samuel Pitoiset, 2019-05-03, 2 files, -5/+8)
| Because the new raw/struct intrinsics are buggy with LLVM 8 (they weren't marked as source of divergence), we fall back to the old intrinsics for atomic buffer operations only. This means we need to apply the indexing workaround for GFX9.
| The load/store operations still use the new LLVM 8 intrinsics.
| The fact that we need another workaround is painful but we should be able to clean that up a bit once LLVM 7 support is dropped.
| This fixes a GPU hang with AC Odyssey and some rendering problems with Nioh.
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573
| Fixes: 31164cf5f70 ("ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+")
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: tidy up ac_build_llvm8_tbuffer_{load,store} (Samuel Pitoiset, 2019-05-02, 1 file, -13/+13)
| For consistency with the ac_build_llvm8_buffer_{load,store}_common helpers. This will also help a bit with removing the vec3 restriction.
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* delete autotools .gitignore files (Eric Engestrom, 2019-04-29, 1 file, -1/+0)
| One special case, `src/util/xmlpool/.gitignore` is not entirely deleted, as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`).
| Signed-off-by: Eric Engestrom <[email protected]>
| Reviewed-by: Dylan Baker <[email protected]>
* ac,ac/nir: use a better sync scope for shared atomics (Rhys Perry, 2019-04-29, 3 files, -9/+72)
| https://reviews.llvm.org/rL356946 (present in LLVM 9 and later) changed the meaning of the "system" sync scope, making it no longer restricted to the memory operation's address space. So a single address space sync scope is needed for shared atomic operations (such as "system-one-as" or "workgroup-one-as") otherwise buffer_wbinvl1 and s_waitcnt instructions can be created at each shared atomic operation.
| This mostly reimplements LLVMBuildAtomicRMW and LLVMBuildAtomicCmpXchg to allow for more sync scopes and uses the new functions in ac->nir with the "workgroup-one-as" or "workgroup" sync scopes.
|
| F1 2017 (4K, Ultra High settings, TAA), avg FPS : 59 -> 59.67 (+1.14%)
| Strange Brigade (4K, ~highest settings), avg FPS : 51.5 -> 51.6 (+0.19%)
| RotTR/mountain (4K, VeryHigh settings, FXAA), avg FPS : 57.2 -> 57.2 (+0.0%)
| RotTR/tomb (4K, VeryHigh settings, FXAA), avg FPS : 42.5 -> 43.0 (+1.17%)
| RotTR/valley (4K, VeryHigh settings, FXAA), avg FPS : 40.7 -> 41.6 (+2.21%)
| Warhammer II/fallen, avg FPS : 31.63 -> 31.83 (+0.63%)
| Warhammer II/skaven, avg FPS : 37.77 -> 38.07 (+0.79%)
|
| Signed-off-by: Rhys Perry <[email protected]>
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Add support for planes. (Bas Nieuwenhuizen, 2019-04-25, 2 files, -4/+19)
| Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: add REWIND and GDS registers to register headers (Marek Olšák, 2019-04-23, 1 file, -0/+16)
| Tested-by: Dieter Nützel <[email protected]>
| Acked-by: Nicolai Hähnle <[email protected]>
* ac: add ac_get_i1_sgpr_mask (Marek Olšák, 2019-04-23, 2 files, -0/+18)
| Tested-by: Dieter Nützel <[email protected]>
| Acked-by: Nicolai Hähnle <[email protected]>
* ac: add radeon_info::is_pro_graphics (Marek Olšák, 2019-04-23, 2 files, -0/+5)
| Tested-by: Dieter Nützel <[email protected]>
| Acked-by: Nicolai Hähnle <[email protected]>
* ac: add radeon_info::marketing_name, replacing the winsys callback (Marek Olšák, 2019-04-23, 2 files, -0/+3)
| Tested-by: Dieter Nützel <[email protected]>
| Acked-by: Nicolai Hähnle <[email protected]>
* ac/nir: use the new raw/struct SSBO atomic intrinsics for comp_swap (Samuel Pitoiset, 2019-04-19, 1 file, -2/+1)
| This is actually fixed now.
| This change requires LLVM r358579. Make sure to have it in your tree, otherwise the following piglit will hang:
| tests/spec/arb_shader_storage_buffer_object/execution/ssbo-atomicCompSwap-int.shader_test
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
* ac/nir: only use the new raw/struct SSBO atomic intrinsics with LLVM 9+ (Samuel Pitoiset, 2019-04-19, 1 file, -1/+4)
| They are buggy with older LLVM versions, see r358579.
| Fixes: 78c551aca1c ("ac/nir: use new LLVM 8 intrinsics for SSBO atomics except cmpswap")
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
* ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+ (Samuel Pitoiset, 2019-04-19, 1 file, -1/+4)
| They are buggy with LLVM 8 because they weren't marked as source of divergence, see r358579.
| Fixes: dd0172e865f ("radv: Use structured intrinsics instead of indexing workaround for GFX9.")
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
* ac: use struct/raw store intrinsics for 8-bit/16-bit int with LLVM 9+ (Samuel Pitoiset, 2019-04-17, 1 file, -14/+34)
| This change requires LLVM r356465.
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* ac: use struct/raw load intrinsics for 8-bit/16-bit int with LLVM 9+ (Samuel Pitoiset, 2019-04-17, 1 file, -12/+38)
| This change requires LLVM r356465.
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* ac: add support for more types with struct/raw LLVM intrinsics (Samuel Pitoiset, 2019-04-17, 1 file, -20/+26)
| LLVM 9+ now supports 8-bit and 16-bit types.
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: add 64-bit SSBO atomic operations support (Samuel Pitoiset, 2019-04-17, 1 file, -3/+7)
| Except compare&swap which is still buggy.
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>