| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Cc: 18.1 <[email protected]>
(cherry picked from commit 41f80373b46604f585497086f971a43aeea7f0c1)
Conflicts fixed by Dylan
Conflicts:
src/gallium/drivers/radeonsi/si_blit.c
|
|
|
|
|
|
|
|
| |
This improves performance for certain games.
Cc: 18.1 <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
(cherry picked from commit 9322974ec716b8c3b2e326559f663ff087daa38c)
|
|
|
|
|
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit b81149e258a492ed0c81058fb535f6bfdacb36da)
Conflicts:
src/amd/common/ac_gpu_info.c
Conflicts resolved by Dylan
|
|
|
|
|
|
|
|
|
| |
This fixes:
GL45-CTS.pipeline_statistics_query_tests_ARB.functional_compute_shader_invocations
Cc: 18.0 18.1 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit 6d671078a8eb683a4a978ca4f9d4e41cbb399bf8)
|
|
|
|
|
|
|
|
| |
Fixes truncation warning in gcc 8.1
Fixes: 8539c9bf3158 ("gallium/radeon: add the kernel version into the renderer string")
Reviewed-by: Michel Dänzer <[email protected]>
(cherry picked from commit 03c370d2f164847abad88c1af7c159db23014947)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The value returned by tgsi_util_get_texture_coord_dim() does not
account for the sample index. This means image_fetch_coords() will not
fetch it, leading to a null deref in ac_build_image_opcode() which
expects it to be present (the return value of ac_num_coords() *does*
include the sample index).
Signed-off-by: Alex Smith <[email protected]>
Cc: "18.1" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 01a2414045bd819267821423dbf77c3655cc214d)
|
|
|
|
|
|
|
|
| |
I don't know if it caused issues.
Cc: 18.0 18.1 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit 92ea9329e5eacf9a44ed30b3d72038a411eb771a)
|
|
|
|
|
|
|
| |
Fixes: 6d19120da85 "radeonsi/gfx9: workaround for INTERP with indirect indexing"
Cc: 18.1 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
(cherry picked from commit 597b9e881083533b987dbcbb8f679ca1eefff974)
|
|
|
|
|
|
|
|
| |
and clean up the conditions.
Reviewed-by: Nicolai Hähnle <[email protected]>
Cc: 18.0 18.1 <[email protected]>
(cherry picked from commit 6d19120da851c0d3f97376c733d674f7c8ab0457)
|
|
|
|
|
|
| |
In preparation of dimension-aware LLVM image intrinsics.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
This is in preparation for the new image intrinsics.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This is in preparation for the new, dimension-aware LLVM image
intrinsics.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
trans is zero-initialized, but trans->resource is setup immediately so
needs to be dereferenced.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
This packet causes the no-op IB detection to fail, so the IB is always
submitted. Also fix the no-op IB detection by moving the begin call.
Cc: 18.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Benedikt Schemmer <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(This patch doesn't enable the behavior. It will be enabled in a later
commit.)
Draw calls from multiple IBs can be executed in parallel.
v2: do emit partial flushes on SI
v3: invalidate all shader caches at the beginning of IBs
v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed,
only do this for flushes invoked internally
v5: empty IBs should wait for idle if the flush requires it
v6: split the commit
If we artificially limit the number of draw calls per IB to 5, we'll get
a lot more IBs, leading to a lot more partial flushes. Let's see how
the removal of partial flushes changes GPU utilization in that scenario:
With partial flushes (time busy):
CP: 99%
SPI: 86%
CB: 73:
Without partial flushes (time busy):
CP: 99%
SPI: 93%
CB: 81%
Tested-by: Benedikt Schemmer <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Fixes: 918b798668c "radeonsi: make sure CP DMA is idle at the end of IBs"
|
|
|
|
| |
which also simplifies the build scripts.
|
| |
|
|
|
|
|
|
|
|
|
| |
so that the draw is started as soon as possible.
v2: only prefetch the API VS and VBO descriptors
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
| |
This code was written with the constant engine in mind.
We can simplify it now.
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
| |
it doesn't seem to be needed.
Acked-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Acked-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
| |
just pass the flag that indicates it.
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
| |
and other cleanups
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
| |
HAVE_LLVM > 0 is a tautology.
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
LLVM now emits labels as part of the disassembly string, which is very
useful but breaks the old parsing approach.
Use the semicolon to detect the boundary of instructions instead of going
by line breaks.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This will give us meaningful wave information in the case of a hang where
shaders are still running in an infinite loop.
Note that we call umr multiple times for different sections of the ddebug
hang dump, and so the wave information will not necessarily match up
between sections.
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
Fixes: 5777488406c ("radeonsi: move r600_cs.h contents into si_pipe.h,
si_build_pm4.h")
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes it easier to follow the code, and also initialises
dynamic_index which will be useful for adding bindless textures
support.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
V2: add missing intrinsics (Spotted-by: Samuel Pitoiset)
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The parameters for the compute engine are wrong when using
an E8860 on a big endian machine.
To fix this, convert the contents of struct dispatch_packet
to little endian.
This ensures that get_global_id(0) and similar functions
in the OpenCL code get the correct endian values, and
makes my simple OpenCL program work correctly.
Signed-off-by: Bas Vermeulen <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using mesa OpenCL failed on a big endian PowerPC machine because
si_vgt_param_key is using bitfields and a 32 bit int for an
index into an array.
Fix si_vgt_param_key to work correctly on both little endian
and big endian machines.
Signed-off-by: Bas Vermeulen <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
CLEAR_STATE initializes them properly.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
si_get_total_colormask accesses NULL pointer on compute shaders
Fixes crashes on clover
Fixes: 0669dca9c00261849cee14d69fdea0a5e323c7f7 ("radeonsi: skip DCC render feedback checking if color writes are disabled")
CC: Marek Olšák <[email protected]>
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
| |
Acked-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is not fully tested. Meson can't link LLVM even though automake can.
PATH=/usr/llvm/x86_64-linux-gnu/bin:$PATH meson build/ -Dgallium-va=false \
-Dplatforms=x11,drm -Dgallium-drivers=radeonsi -Ddri-drivers= \
-Dgallium-omx=disabled -Dgallium-xvmc=false -Dgles1=false \
-Dtexture-float=true -Dvulkan-drivers=
src/gallium/auxiliary/libgallium.a(gallivm_lp_bld_misc.cpp.o):
(.data.rel.ro._ZTI26DelegatingJITMemoryManager[_ZTI26DelegatingJITMemoryManager]+0x10):
undefined reference to `typeinfo for llvm::RTDyldMemoryManager'
Acked-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
for better parallelism
Acked-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Acked-by: Timothy Arceri <[email protected]>
|