path: root/src/gallium/drivers/radeonsi/si_compute.c
Commit message | Author | Age | Files | Lines
* radeonsi: use simple_mtx_t instead of mtx_t | Marek Olšák | 2019-10-07 | 1 | -5/+5
    Reviewed-by: Kenneth Graunke <[email protected]>
* rename pipe_llvm_program_header to pipe_binary_program_header | Karol Herbst | 2019-09-21 | 1 | -1/+1
    We want to use it for other formats as well, so give it a more generic name
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Francisco Jerez <[email protected]>
    Reviewed-by: Pierre Moreau <[email protected]>
* gallium: add blob field to pipe_llvm_program_header | Karol Herbst | 2019-09-21 | 1 | -3/+1
    makes it easier to consume an IR_NATIVE binary
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Francisco Jerez <[email protected]>
    Reviewed-by: Pierre Moreau <[email protected]>
* radeonsi: fix scratch buffer WAVESIZE setting leading to corruption | Marek Olšák | 2019-08-27 | 1 | -1/+5
    Cc: 19.2 19.1 <[email protected]>
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: align scratch and ring buffer allocations for faster memory access | Marek Olšák | 2019-08-27 | 1 | -1/+2
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: move some global shader cache flags to per-binary flags | Marek Olšák | 2019-08-27 | 1 | -1/+1
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: fix the legacy pipeline by storing as_ngg in the shader cache | Marek Olšák | 2019-08-27 | 1 | -1/+1
    It could load an NGG shader when we want a legacy shader and vice versa.
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/nir: always lower ballot masks as 64-bit, codegen handles it | Marek Olšák | 2019-08-19 | 1 | -1/+1
    This fixes KHR-GL45.shader_ballot_tests.ShaderBallotBitmasks.
    This solution is better, because the IR isn't dependent on wave32.
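The idea in the entry above (one 64-bit ballot representation regardless of wave size) can be illustrated with a minimal sketch. This is not the driver's or LLVM's lowering; `ballot64` and its parameters are hypothetical names chosen for illustration. The only assumption is that a wave has 32 or 64 lanes and that a wave32 ballot simply leaves the upper 32 bits clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not Mesa code: build a ballot mask in a 64-bit
 * value no matter how many lanes the wave has. Callers see the same
 * type for wave32 and wave64; for wave32 the upper 32 bits stay zero. */
static uint64_t ballot64(const bool *votes, unsigned wave_size)
{
    uint64_t mask = 0;

    /* Clamp to 64 lanes so the shift below is always well defined. */
    for (unsigned lane = 0; lane < wave_size && lane < 64; lane++) {
        if (votes[lane])
            mask |= UINT64_C(1) << lane;
    }
    return mask;
}
```

Because the result type never changes, code that consumes the mask does not need a separate path for each wave size.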
* radeonsi: allocate and resize global_buffers as needed | Marek Olšák | 2019-08-19 | 1 | -2/+21
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
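A grow-on-demand array of the kind the entry above describes might look roughly like the sketch below. The struct, the function name, the starting capacity of 16 and the doubling policy are all assumptions made for illustration; the actual change in si_compute.c is not shown in this log.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container; not the driver's actual struct. */
struct buffer_table {
    void **slots;      /* bound buffer pointers, NULL when unbound */
    unsigned capacity; /* number of allocated slots */
};

/* Make sure slots [0, needed) exist. New slots are zeroed.
 * Returns false on failure and leaves the table unchanged. */
static bool buffer_table_reserve(struct buffer_table *t, unsigned needed)
{
    if (needed <= t->capacity)
        return true;

    /* Refuse absurd sizes so the doubling loop cannot overflow. */
    if (needed > (1u << 30))
        return false;

    /* Grow geometrically from a small floor (16 is an arbitrary choice). */
    unsigned new_cap = t->capacity ? t->capacity : 16;
    while (new_cap < needed)
        new_cap *= 2;

    void **p = realloc(t->slots, new_cap * sizeof(*p));
    if (!p)
        return false;

    /* Only the newly added tail needs clearing. */
    memset(p + t->capacity, 0, (new_cap - t->capacity) * sizeof(*p));
    t->slots = p;
    t->capacity = new_cap;
    return true;
}
```

A caller would reserve `first + count` slots before binding buffers into that range, and free `slots` when the owning context is destroyed.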
* radeonsi: remove the always_nir option | Marek Olšák | 2019-08-12 | 1 | -1/+1
    tgsi_to_nir is no longer optional if NIR is enabled.
* gallium: add AMD-specific compute TGSI enums | Marek Olšák | 2019-08-12 | 1 | -1/+1
    for tgsi_to_nir
* radeonsi: release NIR in the right place to fix crashes | Marek Olšák | 2019-07-30 | 1 | -1/+1
* radeonsi/nir: add an option to convert TGSI to NIR | Marek Olšák | 2019-07-30 | 1 | -1/+6
    Use at your own risk.
    Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi/gfx10: implement Wave32 | Marek Olšák | 2019-07-19 | 1 | -5/+7
    Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/rtld: add support for Wave32 | Marek Olšák | 2019-07-19 | 1 | -0/+1
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: remove what appears to be legacy compute code | Marek Olšák | 2019-07-19 | 1 | -35/+6
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi: remove si_program::use_code_object_v2 | Marek Olšák | 2019-07-19 | 1 | -5/+3
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi: add si_shader_selector into si_compute | Marek Olšák | 2019-07-19 | 1 | -57/+52
    Now we can assume that shader->selector is always set.
    This will simplify some code.
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Acked-by: Samuel Pitoiset <[email protected]>
* radeonsi: fix leaked compute shader NIR | Marek Olšák | 2019-07-19 | 1 | -0/+1
    Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Acked-by: Samuel Pitoiset <[email protected]>
* ac: import ac_get_compute_resource_limits() from RadeonSI | Samuel Pitoiset | 2019-07-12 | 1 | -33/+2
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/gfx10: launch 2 compute waves per CU before going onto the next CU | Marek Olšák | 2019-07-09 | 1 | -2/+9
    Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* radeonsi/gfx10: set HS/GS/CS.WGP_MODE | Marek Olšák | 2019-07-09 | 1 | -0/+1
    Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* radeonsi: fix and clean up shader_type passing | Marek Olšák | 2019-07-09 | 1 | -4/+3
    - don't pass it via a parameter if it can be derived from other parameters
    - set shader_type for ac_rtld_open
    - use enum pipe_shader_type instead of unsigned
    Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* radeonsi/gfx10: setup registers for OpenGL compute | Nicolai Hähnle | 2019-07-03 | 1 | -2/+11
    Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/gfx10: add si_context::emit_cache_flush | Nicolai Hähnle | 2019-07-03 | 1 | -1/+1
    The introduction of GCR_CNTL makes cache flush handling on gfx10
    sufficiently different that it makes sense to just use a separate
    function.
    Since emit_cache_flush is called quite early during context init, we
    initialize the pointer explicitly in si_create_context.
    Acked-by: Bas Nieuwenhuizen <[email protected]>
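The pattern described in the entry above, a per-generation function pointer that is set at context creation, can be sketched as follows. Everything here is a simplified stand-in: `struct ctx`, `ctx_init`, the enum values and both flush functions are hypothetical and only mirror the shape of the idea, not the real si_context or si_create_context.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real context has many more fields. */
enum gfx_level { LEVEL_GFX9 = 9, LEVEL_GFX10 = 10 };

struct ctx {
    enum gfx_level level;
    unsigned flushes_emitted;
    bool used_gfx10_path;
    void (*emit_cache_flush)(struct ctx *c);
};

/* Path for older generations. */
static void legacy_emit_cache_flush(struct ctx *c)
{
    c->flushes_emitted++;
    c->used_gfx10_path = false;
}

/* Separate path for gfx10, where flush handling differs. */
static void gfx10_emit_cache_flush(struct ctx *c)
{
    c->flushes_emitted++;
    c->used_gfx10_path = true;
}

static void ctx_init(struct ctx *c, enum gfx_level level)
{
    c->level = level;
    c->flushes_emitted = 0;
    c->used_gfx10_path = false;

    /* Choose the pointer first, so a flush issued by any later
     * initialization step already dispatches to the right function. */
    c->emit_cache_flush = level >= LEVEL_GFX10 ? gfx10_emit_cache_flush
                                               : legacy_emit_cache_flush;
}
```

Call sites then use `c->emit_cache_flush(c)` and never test the hardware generation themselves.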
* radeonsi: rename and re-document cache flush flags | Marek Olšák | 2019-06-24 | 1 | -1/+1
    SMEM and VMEM caches are L0 on gfx10.
    Tested-by: Dieter Nützel <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/rtld: layout and relocate LDS symbols | Nicolai Hähnle | 2019-06-12 | 1 | -3/+6
    Upcoming changes to LLVM will emit LDS objects as symbols in the ELF
    symbol table, with relocations that will be resolved with this change.
    Callers will also be able to define LDS symbols that are shared between
    shader parts. This will be used by radeonsi for the ESGS ring in gfx9+
    merged shaders.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use the new run-time linker for shaders | Nicolai Hähnle | 2019-06-12 | 1 | -26/+37
    v2:
    - fix a memory leak
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: return bool from si_shader_binary_upload | Nicolai Hähnle | 2019-06-12 | 1 | -3/+3
    We didn't really use error codes anyway.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: let si_shader_create return a boolean | Nicolai Hähnle | 2019-06-12 | 1 | -1/+1
    We didn't really use error codes anyway.
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use ac_shader_config | Nicolai Hähnle | 2019-06-12 | 1 | -7/+7
    Reviewed-by: Marek Olšák <[email protected]>
* amd/common: use SH{0,1}_CU_EN definitions only of COMPUTE_STATIC_THREAD_MGMT_SE0 | Nicolai Hähnle | 2019-06-03 | 1 | -5/+5
    The automatic header generation unifies identical registers in a series
    and only emits definitions for the first one. This is mostly to avoid
    emitting excessive definitions for CB registers, but special-casing an
    exception for this family of registers doesn't seem worth it.
* radeonsi: add a cs parameter into si_cp_copy_data | Marek Olšák | 2019-05-16 | 1 | -1/+1
    Tested-by: Dieter Nützel <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add threadgroups_per_cu param into si_get_compute_resource_limits | Marek Olšák | 2019-05-16 | 1 | -3/+6
    Tested-by: Dieter Nützel <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: make si_initialize_compute reusable | Marek Olšák | 2019-05-16 | 1 | -7/+7
    Tested-by: Dieter Nützel <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: extract COMPUTE_RESOURCE_LIMITS code into a helper | Marek Olšák | 2019-05-16 | 1 | -12/+20
    Tested-by: Dieter Nützel <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* ac: rename SI-CIK-VI to GFX6-GFX7-GFX8 | Marek Olšák | 2019-05-15 | 1 | -13/+13
    Acked-by: Dave Airlie <[email protected]>
    We already use GFX9 and I don't want us to have confusing naming in the
    driver. GFXn naming is better from the driver perspective, because it's
    the real version of the gfx portion of the hw.
    Also, CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI.
    It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have
    nothing to do with GFXn and they have their own version numbers.
* radeonsi/nir: call radeonsi nir opts before the scan pass | Timothy Arceri | 2019-05-01 | 1 | -0/+1
    Some of the opts are not called in the general optimisation loop in
    the state tracker's glsl -> nir conversion. We need to call the radeonsi
    specific optimisation once before scanning over the nir, otherwise we
    can end up gathering info on code that is later removed.
    Fixes an assert in the piglit test:
    ./bin/varying-struct-centroid_gles3
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add BOs after need_cs_space | Marek Olšák | 2019-04-24 | 1 | -3/+3
    need_cs_space may clear the buffer list.
    Fixes: 951d60f8cdc88 "radeonsi: delay adding BOs at the beginning of IBs until the first draw"
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: delay adding BOs at the beginning of IBs until the first draw | Marek Olšák | 2019-04-23 | 1 | -0/+3
    so that bound compute shader resources won't be added when they are not
    needed, and the same for graphics.
    Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: add si_cp_copy_data | Marek Olšák | 2019-04-23 | 1 | -16/+5
    Tested-by: Dieter Nützel <[email protected]>
    Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi: Remove implicit const cast. | Bas Nieuwenhuizen | 2019-03-17 | 1 | -1/+1
    Fixes: b9e02fe138e "gallium: add pipe_grid_info::last_block"
    Reviewed-by: Eric Engestrom <[email protected]>
* gallium: add pipe_grid_info::last_block | Marek Olšák | 2019-03-15 | 1 | -1/+1
    The OpenMAX state tracker will use this.
    RadeonSI is adapted to use pipe_grid_info::last_block instead of its
    internal state.
    Acked-by: Leo Liu <[email protected]>
* radeonsi: always use compute rings for clover on CI and newer (v2) | Marek Olšák | 2019-02-26 | 1 | -6/+9
    initialize all non-compute context functions to NULL.
    v2: fix SI
* radeonsi: rename r600_resource -> si_resource | Marek Olšák | 2019-01-22 | 1 | -15/+15
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: rename rscreen -> sscreen | Marek Olšák | 2019-01-22 | 1 | -2/+2
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: add compute_last_block to configure the partial block fields | Jiang, Sonny | 2019-01-22 | 1 | -5/+27
* ac: correct PKT3_COPY_DATA definitions | Marek Olšák | 2018-10-06 | 1 | -1/+1
* radeonsi: let internal compute dispatches tune WAVES_PER_SH | Marek Olšák | 2018-08-29 | 1 | -0/+8
* radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI | Marek Olšák | 2018-08-29 | 1 | -3/+13