| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't create atomics with definitions if they are not used in NIR, but
our own DCE can remove the uses if an export turns out to be null.
Signed-off-by: Rhys Perry <[email protected]>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
(cherry picked from commit 69bed1c9186c3e24ad54089218d58c5f7b83befe)
Conflicts resolved by Dylan Baker
Conflicts:
src/amd/compiler/aco_opcodes.py
|
|
|
|
|
|
|
|
|
|
| |
ds_bpermute_b32/ds_permute_b32 are fine, I think
Signed-off-by: Rhys Perry <[email protected]>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
(cherry picked from commit ef8abfa7908974f571786e83b047b187af0e48c7)
|
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
(cherry picked from commit 8f291dc14600c614788301e3265ff7f0f48b8b0d)
|
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Timur Kristóf <[email protected]>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
(cherry picked from commit bbac52873f4248c2f545f12137bd24071a8043cc)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tested on Navi by using dEQP-VK.image.image_size.buffer.* and the GFX8
path with the size multiplied by the stride.
dEQP-VK.image.image_size.buffer.* was also run with the tests modified to
use a 96bit format.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
(cherry picked from commit fcd6d8324560b5897586cbf8161f9b46bff5d11f)
|
|
|
|
|
|
|
|
|
|
| |
RADV's LLVM backend and radeonsi do the same thing.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Cc: 19.3 <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3081>
(cherry picked from commit 49bcd06f974dcd8f60b4aa7d93bf1843439126a2)
|
|
|
|
|
|
|
|
| |
This might fix a hang on Navi14.
Cc: 19.2 19.3 <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
(cherry picked from commit 186335d17d69c4a6b0ad69b82fe0744e4910645e)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes a hang on Raven with Resident Evil 2.
I did not find anything more restricted to fix it:
- Setting persistent_states_per_bin to 1 fixes it too,
but likely does an internal break on any descriptor set changes
too.
- Only breaking the batch when cb_target_mask changes does not fix
it (and looking at AMDVLK comments, I suspect the code in radeonsi
should really be doing a FLUSH_DFSM).
- Always doing a FLUSH_DFSM on shader switch helps, but that is more
often than this and I don't think we should be doing that when DFSM
is disabled.
- Also emitting the existing break on framebuffer change when DFSM is
disabled does not fix the issue.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2315
CC: <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit 7cc0702bbb955010600fcb2685edb4ba703561a8)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The tiled case is nonsensical for non-base mips, but Vulkan requires
that this function handle it without requiring it to return anything
useful. So we can basically return anything.
Correct tiled pitch and offset are still required for our own WSI and,
in the future, for getting the layouts of images with DRM format modifiers.
Neither has to deal with images with more than 1 level though.
Fixes: 824bd0830e8 "radv: return the correct pitch for linear mipmaps on GFX10"
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2301
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2304
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 17741a0a05722245314e8ce9a3d5191feb63d9bd)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On GFX9, the pitch of a level is always the pitch of the entire image
but not on GFX10.
This fixes graphics glitches with Halo - The Master Chief Collection.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2188
CC: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 824bd0830e811a7b6347bbd5c30e0a76bc7daf60)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes 240 failing test cases in dEQP-VK.spirv_assembly which
were failing due to a bad s_ashr_i32 instruction. This commit
fixes the instruction format along with the definitions of the
instruction.
Fixes: 11f43caaeca166c96ae49dbd506b6f58dd4a13fb
Cc: 19.3 <[email protected]>
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
(cherry picked from commit 11e62a9734c631fa38f1e7b415f5b98f6a28589f)
|
|
|
|
|
|
|
|
|
|
| |
addrlib doesn't quite do it right, so do it ourselves.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2162
CC: <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
(cherry picked from commit 88f567b5ce3c692dbee60ba58df3af7c614e4333)
|
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Fixes: 8d43e2b2ded0fe3c82d4 ("meson: add -Werror=empty-body to disallow `if(x);`")
Reviewed-By: Timur Kristóf <[email protected]>
(cherry picked from commit 51569e525afc5e7173f12b0a3f1ba0e92425407f)
|
|
|
|
|
|
|
|
|
| |
Things work the same between float and integer.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2261
CC: <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit a435f002c40f5adc99d37e65cf6b8bd478dc8e71)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When stride is 0, it should check against the offset not the index.
This fixes black character models with Beat Saber and missing snow
with Dragon Quest.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2233
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1975
Cc: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3147>
(cherry picked from commit f3cccd05d9f6e9d05c18d1a3a5f9eb863e4f264b)
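The stride rule in this message can be illustrated with a small model. This is only a sketch, not the driver or hardware logic: the function name `vertex_fetch_in_bounds` and the interpretation of `num_records` (bytes when the stride is 0, elements otherwise) are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a vertex-buffer bounds check.
 * Assumption: num_records counts bytes when stride == 0 and
 * elements otherwise. This is not taken from the actual fix. */
static bool vertex_fetch_in_bounds(uint32_t stride, uint32_t index,
                                   uint32_t offset, uint32_t num_records)
{
    if (stride == 0) {
        /* Every index reads the same location, so the index says
         * nothing about validity; only the byte offset does. */
        return offset < num_records;
    }
    /* With a non-zero stride, the element index is what is checked. */
    return index < num_records;
}
```

In this model a fetch with stride 0 and a large index is still valid as long as the offset is inside the buffer, which is the case that checking the index would reject by mistake.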
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes a hang with geekbench.
The existence of RX 580 and NAVI10 results on the website shows that the
generations before and after this do not have the issue, so this is
likely a GFX9-only issue.
This is not something weird like LDS size since none of the shaders
seem to use LDS.
CC: <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3145>
(cherry picked from commit a9a3108be774aea620fa4fc726c33100d9a49add)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even without depth+stencil addrlib can (correctly!) decide to
disable tc compatible HTILE.
One example is 8x sampling with 32-bit depth on Stoney. The row size
on Stoney is 1024, while the tile size is 2048, which results in
tile splits which are not supported with tc-compat.
On Stoney, this fixes
dEQP-VK.glsl.builtin_var.fragdepth.*_list_d32_sfloat_multisample_8
CC: <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
(cherry picked from commit b53856aca31b1a1fde8cd87a6978934cd6ae94b1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
addrlib sometimes returns smaller sizes for tcCompat as it does
not seem to take into account the depth+stencil matching config
gymnastics with tcCompat.
This fixes
dEQP-VK.pipeline.render_to_image.core.2d_array.huge.height.r8g8b8a8_unorm_d32_sfloat_s8_uint
CC: <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3054>
(cherry picked from commit e197fb1c2fccf4719630d91a7c7f76308d88132b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
aarch64
__NR_select is not defined the same way across architectures, and sometimes
it is not defined at all, as on armhf EABI and aarch64.
Signed-off-by: Luis Mendes <[email protected]>
Acked-by: Timothy Arceri <[email protected]>
Acked-by: Samuel Pitoiset <[email protected]>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2042
(cherry picked from commit 0cb5c96a83e3da2986fc8219b10671a7caea9ee5)
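One way to cope with a syscall number that may be missing is to guard every use with the preprocessor. The sketch below is an assumption about how such a guard could look, not the code from this commit; the helper name `select_like_syscalls` is hypothetical, while `__NR_select`, `__NR__newselect` and `__NR_pselect6` are macros that the kernel headers define on some architectures only.

```c
#include <stddef.h>
#include <sys/syscall.h>

/* Hypothetical helper: collect whichever select-style syscall numbers
 * this architecture defines, e.g. to add them to an allow list.
 * Returns the number of entries written to out (at most max). */
static size_t select_like_syscalls(long *out, size_t max)
{
    size_t n = 0;
#ifdef __NR_select
    if (n < max)
        out[n++] = __NR_select;
#endif
#ifdef __NR__newselect
    if (n < max)
        out[n++] = __NR__newselect;
#endif
#ifdef __NR_pselect6
    if (n < max)
        out[n++] = __NR_pselect6;
#endif
    return n;
}
```

The result differs per architecture, which is the point: the code builds everywhere and only references numbers that exist.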
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Global load/store instructions can't know if the offset is
out-of-bound because they don't use descriptors (no range).
Fix this by clamping the offset for arrays that are indexed
with a non-constant offset that's greater than or equal to the array
size.
This fixes VM faults and GPU hangs with Dead Rising 4.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2148
Fixes: 71a67942003 ("ac/nir: Enable nir_opt_large_constants")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit a0f1a5fa051786c16de6f0062771051f8565daec)
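The clamping described above can be sketched on the CPU side as follows. The function names are hypothetical, and clamping to the last valid element is an assumption; the actual fix operates on compiler IR and may pick a different clamp value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: force a dynamic index into [0, array_len - 1].
 * Assumption: out-of-range indices are redirected to the last element. */
static uint32_t clamp_array_index(uint32_t index, uint32_t array_len)
{
    assert(array_len > 0); /* an empty array has no valid element */
    return index < array_len ? index : array_len - 1;
}

/* A load without a descriptor has no range check of its own, so the
 * index is clamped before it is used. */
static uint32_t load_constant(const uint32_t *data, uint32_t array_len,
                              uint32_t index)
{
    return data[clamp_array_index(index, array_len)];
}
```

An out-of-range index then reads a defined (if meaningless) value instead of touching unmapped memory.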
|
|
|
|
|
|
|
|
| |
This is correct per the Vulkan spec format equivalence table.
Fixes: f36b52740a0 "radv/android: Add android hardware buffer queries."
Reviewed-by: Eric Anholt <[email protected]>
(cherry picked from commit 2e44bfc14f5c2e44ed820257615c2008955bc5bf)
|
|
|
|
|
|
| |
Fixes: 3a20ef4a3299fddc886f9d5908d8b3952dd63a54 'aco: refactor value numbering'
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
| |
Fixes: 13ab63bb62b ('radv: Implement VK_EXT_buffer_device_address.')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 35fab1ba3395604f748cd13ba82991372ca0cae7)
|
|
|
|
|
|
|
| |
Fixes: 93c8ebfa780ebd1495095e794731881aef29e7d3 'aco: Initial commit of independent AMD compiler'
Reviewed-by: Rhys Perry <[email protected]>
(cherry picked from commit 8861a82be7df2a5816254b45d390ddafad7d8711)
|
|
|
|
|
|
|
|
|
| |
LLVM and the proprietary compiler seem to do this
Fixes: b01847bd9 ("aco/gfx10: Fix mitigation of VMEMtoScalarWriteHazard.")
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
(cherry picked from commit a9fc81b098ca36d063dbdb6f69ffde1ab215d34b)
|
|
|
|
|
|
|
| |
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
(cherry picked from commit 11f43caaeca166c96ae49dbd506b6f58dd4a13fb)
|
|
|
|
|
|
|
|
| |
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2156
Fixes: 93c8ebfa780 ('aco: Initial commit of independent AMD compiler')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
(cherry picked from commit ff70ccad16a2efb3be1fbc4ca03453d38721a267)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Was totally broken ...
Removed two if(point) {} because point is always non-NULL and we
were counting on that already for counting, since we NULL our
references to semaphores without active point earlier.
Fixes: 4aa75bb3bdd "radv: Add wait-before-submit support for timelines."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2137
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit 48fc65413c8607390b2ed8cdaccac490d8c8fdae)
|
|
|
|
|
|
|
|
|
| |
They were out of sync. Besides syncing, let's ensure they never diverge
again.
Fixes: 8d2654a4197 "radv: Support VK_EXT_inline_uniform_block."
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit 4cde0e04e38ad2b9212d451cb5a84ed4ceaffd03)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implementation is loosely based on ROCm.
https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/master/ockl/src/wfredscan.cl
This fixes dEQP-VK.subgroups.arithmetic.*.subgroupexclusive* on GFX10.
Fixes: 227c29a80de ("amd/common/gfx10: implement scan & reduce operations")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit c9aa843961d2c3cb34e7cb2dc843b93d723e0692)
Conflicts resolved by Dylan Baker
|
|
|
|
|
|
|
|
|
|
|
| |
When a fragment shader includes an input variable decorated with
SampleId or SamplePosition, sample shading should be enabled
because minSampleShadingFactor is expected to be 1.0.
Cc: 19.2, 19.3 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 86a5fbfd4afb4fb53ab8ea0a13dda33b32f8b79b)
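The rule in this message amounts to an override of the application-provided factor. A minimal sketch follows, assuming a hypothetical helper `effective_min_sample_shading`; returning 0.0 when sample shading is off is a convention chosen for this example only.

```c
#include <stdbool.h>

/* Hypothetical sketch: pick the sample shading factor to use.
 * Reading SampleId or SamplePosition forces per-sample shading (1.0),
 * regardless of what the pipeline state requested. */
static float effective_min_sample_shading(bool sample_shading_enable,
                                          float min_sample_shading,
                                          bool reads_sample_id_or_pos)
{
    if (reads_sample_id_or_pos)
        return 1.0f;
    /* Assumption for this sketch: 0.0 means "sample shading disabled". */
    return sample_shading_enable ? min_sample_shading : 0.0f;
}
```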
|
|
|
|
|
|
| |
Fixes: 946193ae008 "radv: add support for VK_AMD_buffer_marker"
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit 25bc9102d89f4390e0edc0a5f09fcde9de80f776)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to prevent a potentially malicious pipeline from tainting our
secure compile process and interfering with successive pipelines,
we want to create a fresh fork for each pipeline compile.
Benchmarking has shown that simply forking on each pipeline
creation doubles the total time it takes to compile a fossilize db
collection. So instead here we fork the process at device creation
so that we have a slim copy of the device and then fork this
otherwise idle and untainted process each time we compile a
pipeline. Forking this slim copy of the device results in only a
20% increase in compile time vs a 100% increase.
Fixes: cff53da3 ("radv: enable secure compile support")
(cherry picked from commit f54c4e85ce089964e4d2ed39157f07226a41d11f)
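The two-level fork described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the RADV code: all names (`slim_helper`, `fake_compile`, and so on) are hypothetical, the "compile" is a stand-in that doubles an integer, and ordinary pipes created before the first fork replace the FIFOs that the real design uses.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical handle for the idle helper forked once at "device creation". */
struct slim_helper {
    pid_t pid;
    int req_fd; /* parent writes requests here */
    int res_fd; /* parent reads results here */
};

/* Stand-in for a pipeline compile; it only ever runs in a fresh fork. */
static int fake_compile(int input) { return input * 2; }

/* Helper process: stays idle and untainted, forks one worker per request. */
static void helper_loop(int req_fd, int res_fd)
{
    int input;
    while (read(req_fd, &input, sizeof(input)) == (ssize_t)sizeof(input)) {
        pid_t worker = fork();
        if (worker < 0)
            _exit(1);
        if (worker == 0) {
            int out = fake_compile(input);
            _exit(write(res_fd, &out, sizeof(out)) == (ssize_t)sizeof(out) ? 0 : 1);
        }
        waitpid(worker, NULL, 0);
    }
    _exit(0); /* request pipe closed: the device is gone */
}

static int slim_helper_start(struct slim_helper *h)
{
    int req[2], res[2];
    assert(h != NULL);
    if (pipe(req) != 0)
        return -1;
    if (pipe(res) != 0) {
        close(req[0]); close(req[1]);
        return -1;
    }
    h->pid = fork();
    if (h->pid < 0) {
        close(req[0]); close(req[1]); close(res[0]); close(res[1]);
        return -1;
    }
    if (h->pid == 0) {
        close(req[1]); close(res[0]);
        helper_loop(req[0], res[1]); /* never returns */
    }
    close(req[0]); close(res[1]);
    h->req_fd = req[1];
    h->res_fd = res[0];
    return 0;
}

static int slim_helper_compile(struct slim_helper *h, int input, int *out)
{
    if (write(h->req_fd, &input, sizeof(input)) != (ssize_t)sizeof(input))
        return -1;
    return read(h->res_fd, out, sizeof(*out)) == (ssize_t)sizeof(*out) ? 0 : -1;
}

static void slim_helper_stop(struct slim_helper *h)
{
    close(h->req_fd); /* EOF ends helper_loop */
    close(h->res_fd);
    waitpid(h->pid, NULL, 0);
}
```

Each request is handled by a process that has never seen an earlier pipeline, while the expensive fork of the full user-facing process happens only once.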
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will be used to create a communication pipe between the user
facing device and a freshly forked (per pipeline compile) slim copy
of that device.
We can't use pipe() here because the fork will not be a direct fork
of the user facing process. Instead we use a previously forked
copy of the process that was forked at device creation in order to
reduce the resources required for the fork and avoid performance
issues.
Fixes: cff53da3748d ("radv: enable secure compile support")
(cherry picked from commit 1663bb1f772dacadaec2d80f8286cfb76c4bb200)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the following commits we want to be able to fork an existing lightweight
fork created at device creation time. In order for the user facing process
to communicate with this new fresh fork we create some members here to hold
FIFO file descriptors and a unique id.
Here we also add a new fork enum that we use to tell the lightweight
process to create a fresh fork.
For more information on why we create a fresh fork see the following
commits.
(cherry picked from commit ef54f15da9ac11fafcbd6c91a7fcdac734436db8)
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the scratch ringbuffer settings are changed, the shader unit has
to be idle or we will have shaders using old and new settings.
That combination is not supported on the HW (likely the offset is
ringbuffer idx * WAVESIZE * 1024).
CC: <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit 4eb2a1dc6fc32a047d53620a929eae0bb255f9da)
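If the offset really is ringbuffer idx * WAVESIZE * 1024, as the message guesses, mixing old and new settings can make two waves share scratch memory. The sketch below models only that guess; the names and the unit of the result are assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open range [start, end) in the scratch ring. */
struct scratch_range {
    uint64_t start;
    uint64_t end;
};

/* Hypothetical model of the guessed formula: idx * WAVESIZE * 1024. */
static struct scratch_range scratch_range_for(uint32_t wave_idx, uint32_t wavesize)
{
    uint64_t size = (uint64_t)wavesize * 1024;
    struct scratch_range r = { wave_idx * size, (wave_idx + 1) * size };
    return r;
}

static bool scratch_ranges_overlap(struct scratch_range a, struct scratch_range b)
{
    return a.start < b.end && b.start < a.end;
}
```

With one setting, slots 1 and 2 never overlap. A wave in slot 1 using WAVESIZE 2 covers [2048, 4096), while a wave in slot 2 using WAVESIZE 1 covers [2048, 3072), so the two collide.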
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <[email protected]>
(cherry picked from commit be1d11249bde1e041f6eb9c0acedb041ab450c4b)
|
|
|
|
|
|
|
|
|
| |
No pipeline-db changes
Signed-off-by: Rhys Perry <[email protected]>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
Reviewed-by: Daniel Schürmann <[email protected]>
(cherry picked from commit b062b92ab1a6504772a63a6b44f89b4579aef9a3)
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
(cherry picked from commit 2c98d79d114d3ed82a9e60519d666f51a1172cd3)
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
(cherry picked from commit 5a1bacb6f916d9a46a3d44830a4eb4bd3dca7d23)
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Fixes: 93c8ebfa ('aco: Initial commit of independent AMD compiler')
(cherry picked from commit f97d9334263a4dd8878c4e259fb5afcdc1334904)
|
|
|
|
|
|
|
| |
Fixes: 93c8ebfa780ebd1495095e794731881aef29e7d3 aco: Initial commit of independent AMD compiler
Reviewed-by: Rhys Perry <[email protected]>
(cherry picked from commit b6f5085dfee81d9c54fcda883d2b06742134084a)
|
|
|
|
|
|
|
| |
Fixes: 93c8ebfa780ebd1495095e794731881aef29e7d3 aco: Initial commit of independent AMD compiler
Reviewed-by: Rhys Perry <[email protected]>
(cherry picked from commit a2a6880743d7370a6425593f22d9e98317bfc3b2)
|
|
|
|
|
|
|
|
|
|
|
| |
It happens that some games try to access a vertex buffer without
a valid format. This case was incorrectly handled by
ac_get_tbuffer_format which made ACO emit an invalid instruction.
Signed-off-by: Timur Kristóf <[email protected]>
Cc: 19.3 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit 911a8261419f48dcd756f78832fa5a5f4c5b8d93)
|
|
|
|
|
|
|
|
|
| |
The workaround got accidentally moved to the wrong place
Fixes: 08d510010b7586387e363460b98e6a45bbe97164 aco: increase accuracy of SGPR limits
Reviewed-by: Samuel Pitoiset <[email protected]>
(cherry picked from commit a47e232ccd1df7a3f5dd1f92722772e8b81c90ed)
|
|
|
|
|
|
|
| |
Fixes: 86786999189c43b4a2c8e1c1a18b55cd2f369fff "aco: implement VGPR spilling"
Reviewed-by: Rhys Perry <[email protected]>
(cherry picked from commit efe737fc4f8f76f7d0b3bd8655eafc3196576a3d)
|
|
|
|
|
|
|
| |
Fixes: 86786999189c43b4a2c8e1c1a18b55cd2f369fff "aco: implement VGPR spilling"
Reviewed-by: Rhys Perry <[email protected]>
(cherry picked from commit 5c7dcb15e0cc98fe9fa5fa25f320f2bdd71187c3)
|
|
|
|
|
|
|
| |
Fixes: 86786999189c43b4a2c8e1c1a18b55cd2f369fff "aco: implement VGPR spilling"
Reviewed-by: Rhys Perry <[email protected]>
(cherry picked from commit d97c0bdd5558e4e00ede38afac879606aff5f04b)
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an app first creates a compute pipeline with
VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT set, then re-creates it
without that flag, the driver should re-compile the compute shader.
Otherwise, it will return the unoptimized one.
Fixes: ce188813bfe ("radv: add initial support for VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
(cherry picked from commit 9ab27647ff5379e8095a70c23dd16792f074c8c7)
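The stale result described above is what happens when a cache key ignores the flag. A minimal sketch, assuming a hypothetical key function built on FNV-1a (RADV's real key layout and hash are different): the flag state is mixed into the key so the two variants never share an entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Same value as VK_PIPELINE_CREATE_DISABLE_OPTIMIZATION_BIT; redefined
 * locally so the sketch does not need the Vulkan headers. */
#define DISABLE_OPTIMIZATION_BIT 0x00000001u

/* 64-bit FNV-1a, used here only as a simple stand-in hash. */
static uint64_t fnv1a(uint64_t h, const uint8_t *data, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hypothetical cache key: shader bytes plus the optimization state. */
static uint64_t compute_pipeline_key(const uint8_t *shader, size_t len,
                                     uint32_t create_flags)
{
    uint64_t h = fnv1a(14695981039346656037ULL, shader, len);
    uint8_t opt_disabled = (create_flags & DISABLE_OPTIMIZATION_BIT) ? 1 : 0;
    return fnv1a(h, &opt_disabled, 1);
}
```

Flags that do not affect code generation are deliberately left out of the key in this sketch, so they still hit the same cache entry.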
|
|
|
|
|
|
|
|
|
| |
The seccomp filter allows read/write, let us make sure nobody can
do anything with this.
Fixes: cff53da3748 "radv: enable secure compile support"
Reviewed-by: Timothy Arceri <[email protected]>
(cherry picked from commit 8efb8f55a617bebe5f33b9745cc22a2490828db8)
|