summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/radeonsi
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2Marek Olšák2018-07-031-3/+27
| | | | | | | | | Cc: 18.1 <[email protected]> (cherry picked from commit 41f80373b46604f585497086f971a43aeea7f0c1) Conflicts fixed by Dylan Conflicts: src/gallium/drivers/radeonsi/si_blit.c
* radeonsi: always put persistent buffers into GTT on radeonMarek Olšák2018-06-201-1/+5
| | | | | | | | This improves performance for certain games. Cc: 18.1 <[email protected]> Tested-by: Dieter Nützel <[email protected]> (cherry picked from commit 9322974ec716b8c3b2e326559f663ff087daa38c)
* ac/gpu_info: add kernel_flushes_hdp_before_ibMarek Olšák2018-06-201-4/+2
| | | | | | | | | | Reviewed-by: Nicolai Hähnle <[email protected]> (cherry picked from commit b81149e258a492ed0c81058fb535f6bfdacb36da) Conflicts: src/amd/common/ac_gpu_info.c Conflicts resolved by Dylan
* radeonsi/gfx9: fix si_get_buffer_from_descriptors for 48-bit pointersMarek Olšák2018-06-151-2/+2
| | | | | | | | | This fixes: GL45-CTS.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations Cc: 18.0 18.1 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (cherry picked from commit 6d671078a8eb683a4a978ca4f9d4e41cbb399bf8)
* radeonsi: fix possible truncation on renderer stringTimothy Arceri2018-06-111-1/+1
| | | | | | | | Fixes truncation warning in gcc 8.1 Fixes: 8539c9bf3158 ("gallium/radeon: add the kernel version into the renderer string") Reviewed-by: Michel Dänzer <[email protected]> (cherry picked from commit 03c370d2f164847abad88c1af7c159db23014947)
* radeonsi: Fix crash on shaders using MSAA image load/storeAlex Smith2018-06-011-1/+7
| | | | | | | | | | | | | The value returned by tgsi_util_get_texture_coord_dim() does not account for the sample index. This means image_fetch_coords() will not fetch it, leading to a null deref in ac_build_image_opcode() which expects it to be present (the return value of ac_num_coords() *does* include the sample index). Signed-off-by: Alex Smith <[email protected]> Cc: "18.1" <[email protected]> Reviewed-by: Marek Olšák <[email protected]> (cherry picked from commit 01a2414045bd819267821423dbf77c3655cc214d)
* radeonsi: fix incorrect parentheses around VS-PS varying eliminationMarek Olšák2018-05-301-2/+2
| | | | | | | | I don't know if it caused issues. Cc: 18.0 18.1 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> (cherry picked from commit 92ea9329e5eacf9a44ed30b3d72038a411eb771a)
* radeonsi/gfx9: work around a GPU hang due to broken indirect indexing in LLVMMarek Olšák2018-05-111-0/+9
| | | | | | | Fixes: 6d19120da85 "radeonsi/gfx9: workaround for INTERP with indirect indexing" Cc: 18.1 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> (cherry picked from commit 597b9e881083533b987dbcbb8f679ca1eefff974)
* radeonsi/gfx9: workaround for INTERP with indirect indexingMarek Olšák2018-04-301-6/+13
| | | | | | | | and clean up the conditions. Reviewed-by: Nicolai Hähnle <[email protected]> Cc: 18.0 18.1 <[email protected]> (cherry picked from commit 6d19120da851c0d3f97376c733d674f7c8ab0457)
* radeonsi: generate image load/store/atomic ops using ac_build_image_opcodeNicolai Hähnle2018-04-201-131/+99
| | | | | | In preparation of dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <[email protected]>
* amd/common: pass address components individually to ac_build_image_intrinsicNicolai Hähnle2018-04-201-144/+78
| | | | | | This is in preparation for the new image intrinsics. Acked-by: Marek Olšák <[email protected]>
* amd/common: pass new enum ac_image_dim to ac_build_image_opcodeNicolai Hähnle2018-04-201-2/+48
| | | | | | | This is in preparation for the new, dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <[email protected]>
* radeonsi/nir: fix crash in test involving the sample maskNicolai Hähnle2018-04-201-1/+2
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi/nir: set FS properties only when scanning a fragment shaderNicolai Hähnle2018-04-201-1/+2
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: fix error paths of si_texture_transfer_mapNicolai Hähnle2018-04-201-13/+12
| | | | | | | trans is zero-initialized, but trans->resource is setup immediately so needs to be dereferenced. Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: add support for VegaMMarek Olšák2018-04-185-2/+10
| | | | Acked-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: fix a hang with an empty first IBMarek Olšák2018-04-181-3/+4
| | | | | | | | This packet causes the no-op IB detection to fail, so the IB is always submitted. Also fix the no-op IB detection by moving the begin call. Cc: 18.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't emit partial flushes for internal CS flushes onlyMarek Olšák2018-04-167-11/+14
| | | | | Tested-by: Benedikt Schemmer <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement mechanism for IBs without partial flushes at the end (v6)Marek Olšák2018-04-162-16/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | (This patch doesn't enable the behavior. It will be enabled in a later commit.) Draw calls from multiple IBs can be executed in parallel. v2: do emit partial flushes on SI v3: invalidate all shader caches at the beginning of IBs v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed, only do this for flushes invoked internally v5: empty IBs should wait for idle if the flush requires it v6: split the commit If we artificially limit the number of draw calls per IB to 5, we'll get a lot more IBs, leading to a lot more partial flushes. Let's see how the removal of partial flushes changes GPU utilization in that scenario: With partial flushes (time busy): CP: 99% SPI: 86% CB: 73: Without partial flushes (time busy): CP: 99% SPI: 93% CB: 81% Tested-by: Benedikt Schemmer <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: restore si_emit_cache_flush call at the end of IBsMarek Olšák2018-04-131-0/+2
| | | | Fixes: 918b798668c "radeonsi: make sure CP DMA is idle at the end of IBs"
* gallium: move ddebug, noop, rbug, trace to auxiliary to improve build timesMarek Olšák2018-04-132-2/+2
| | | | which also simplifies the build scripts.
* radeonsi: make sure CP DMA is idle at the end of IBsMarek Olšák2018-04-133-2/+16
|
* radeonsi: always prefetch later shaders after the draw packetMarek Olšák2018-04-133-26/+75
| | | | | | | | | so that the draw is started as soon as possible. v2: only prefetch the API VS and VBO descriptors Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: emit shader pointers before cache flushes & waitsMarek Olšák2018-04-131-13/+7
| | | | | | | | This code was written with the constant engine in mind. We can simplify it now. Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi/gfx9: don't use the workaround for gather4 + stencilMarek Olšák2018-04-131-2/+11
| | | | | | | it doesn't seem to be needed. Acked-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: disable TC-compat HTILE on Tonga and IcelandMarek Olšák2018-04-131-0/+7
| | | | | Acked-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: force 2D tiling on VI only when TC-compat HTILE is really enabledMarek Olšák2018-04-131-9/+7
| | | | | | | just pass the flag that indicates it. Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: don't flush HTILE if there is no HTILE clearMarek Olšák2018-04-131-2/+2
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: merge 2 identical if statements in si_clearMarek Olšák2018-04-131-9/+2
| | | | | | | and other cleanups Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: don't do GFX-specific texture decompression for computeMarek Olšák2018-04-131-10/+10
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: simplify generating the renderer stringMarek Olšák2018-04-131-11/+8
| | | | | | | HAVE_LLVM > 0 is a tautology. Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: use PIPE_FORMAT_P016 format for VP9 profile2Leo Liu2018-04-121-1/+2
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: cap VP9 support to progressive bufferLeo Liu2018-04-121-0/+2
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: cap VP9 support to RavenLeo Liu2018-04-121-0/+4
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: correctly parse disassembly with labelsNicolai Hähnle2018-04-111-31/+32
| | | | | | | | | | LLVM now emits labels as part of the disassembly string, which is very useful but breaks the old parsing approach. Use the semicolon to detect the boundary of instructions instead of going by line breaks. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass -O halt_waves to umr for hang debuggingNicolai Hähnle2018-04-111-2/+2
| | | | | | | | | | | This will give us meaningful wave information in the case of a hang where shaders are still running in an infinite loop. Note that we call umr multiple times for different sections of the ddebug hang dump, and so the wave information will not necessarily match up between sections. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add shader binary padding for UMRMarek Olšák2018-04-101-3/+15
|
* radeonsi: autotools: add si_build_pm4.h in dist tarballJuan A. Suarez Romero2018-04-101-0/+1
| | | | | | | | Fixes: 5777488406c ("radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h") Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* radeonsi/nir: tidy up si_nir_load_sampler_desc()Timothy Arceri2018-04-101-5/+3
| | | | | | | | This makes it easier to follow the code, and also initialises dynamic_index which will be useful for adding bindless textures support. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: set uses_bindless_images for imagesTimothy Arceri2018-04-101-1/+16
| | | | | | V2: add missing intrinsics (Spotted-by: Samuel Pitoiset) Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: don't add bindless samplers/images to declared bitmasksTimothy Arceri2018-04-101-6/+6
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: convert dispatch packet to little endianBas Vermeulen2018-04-091-12/+12
| | | | | | | | | | | | | | | The parameters for the compute engine are wrong when using an E8860 on a big endian machine. To fix this, convert the contents of struct dispatch_packet to little endian. This ensures that get_global_id(0) and similar functions in the OpenCL code get the correct endian values, and makes my simple OpenCL program work correctly. Signed-off-by: Bas Vermeulen <[email protected]> Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: correct si_vgt_param_key on big endian machinesBas Vermeulen2018-04-091-0/+13
| | | | | | | | | | | | Using mesa OpenCL failed on a big endian PowerPC machine because si_vgt_param_key is using bitfields and a 32 bit int for an index into an array. Fix si_vgt_param_key to work correctly on both little endian and big endian machines. Signed-off-by: Bas Vermeulen <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: don't set RB+ registers on GFX9 chips without RB+Marek Olšák2018-04-091-6/+1
| | | | | | CLEAR_STATE initializes them properly. Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: Reorder checks in si_check_render_feedbackJan Vesely2018-04-051-3/+3
| | | | | | | | | si_get_total_colormask accesses NULL pointer on compute shaders Fixes crashes on clover Fixes: 0669dca9c00261849cee14d69fdea0a5e323c7f7 ("radeonsi: skip DCC render feedback checking if color writes are disabled") CC: Marek Olšák <[email protected]> Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix a crash if ps_shader.cso is NULL in si_get_total_colormaskMarek Olšák2018-04-051-0/+3
|
* radeonsi: remove more R600 referencesMarek Olšák2018-04-052-2/+1
| | | | Acked-by: Timothy Arceri <[email protected]>
* radeonsi: try to fix mesonMarek Olšák2018-04-051-6/+33
| | | | | | | | | | | | | | | This is not fully tested. Meson can't link LLVM even though automake can. PATH=/usr/llvm/x86_64-linux-gnu/bin:$PATH meson build/ -Dgallium-va=false \ -Dplatforms=x11,drm -Dgallium-drivers=radeonsi -Ddri-drivers= \ -Dgallium-omx=disabled -Dgallium-xvmc=false -Dgles1=false \ -Dtexture-float=true -Dvulkan-drivers= src/gallium/auxiliary/libgallium.a(gallivm_lp_bld_misc.cpp.o): (.data.rel.ro._ZTI26DelegatingJITMemoryManager[_ZTI26DelegatingJITMemoryManager]+0x10): undefined reference to `typeinfo for llvm::RTDyldMemoryManager' Acked-by: Timothy Arceri <[email protected]>
* radeonsi: don't build libradeon.la separatelyMarek Olšák2018-04-053-2/+28
| | | | | | for better parallelism Acked-by: Timothy Arceri <[email protected]>
* radeonsi: clean up GET_MAX_VIEWPORT_RANGE definitionMarek Olšák2018-04-051-2/+2
| | | | Acked-by: Timothy Arceri <[email protected]>