| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 .
That MR keeps the clip and cull arrays split.
So we have to handle
- compact arrays with location_frac != 0
- VARYING_SLOT_CLIP_DIST1
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
Random unused vars.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Fixes: 4bb6c49375e "radv: Allow ETC2 on RAVEN and VEGA10 instead of all GFX9."
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL
leaves it undefined). Performing fpow lowering in NIR would break this
behavior, preventing us from using prog_to_nir.
According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common
expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>,
which presumably does a zero-wins multiply.
Lowering in NIR results in a non-legacy multiply, where:
pow(0, 0) = 2^(log2(0) * 0)
= 2^(-INF * 0)
= 2^(-NaN)
= -NaN
which isn't the desired result.
This reverts:
- commit d6b75392067712908bdc372f1007e085439bf9f5
(ac/nir: remove emission of nir_op_fpow)
- commit 22430224fec31591432d4a3e65c6f457ba1c1653
(radeonsi/nir: enable lowering of fpow)
and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir
after enabling prog_to_nir in st/mesa later in this series.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
v2: don't use ac_get_onef()
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
v2: don't use ac_get_onef()
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
v2: don't use ac_get_onef()
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
v2: don't use ac_get_zero(), ac_get_one() and ac_int_of_size()
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: don't use ac_get_zerof() and ac_get_onef()
v3: rename "intr" to "name"
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
So that the signature is correct and consistent, the inputs to a export
intrinsic should always be 32-bit floats.
This and the previous commit fixes a large amount crashes from
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_int_*
tests
Fixes: b722b29f10d ('radv: add support for 16bit input/output')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
16-bit outputs are stored as 16-bit floats in the outputs array, so they
have to be bitcast.
Fixes: b722b29f10d ('radv: add support for 16bit input/output')
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
This version is better and safer.
Cc: 18.3 19.0 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Trivial.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109597
Cc: 18.3 19.0 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
It uses the new LLVM intrinsics.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The elements added into a vector should have the same type as the
first one, otherwise this hits an assertion in LLVM.
Fixes: 4b3549c0846 ("radv: reduce the number of loaded channels for vertex input fetches")
reported-by: Philip Rebohle <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
normalized and scaled formats also return floats.
Fixes: 4b3549c0846 ("radv: reduce the number of loaded channels for vertex input fetches")
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
We should check that num_channels is 4, otherwise that breaks
the world. Sorry for the short breakage.
Fixes: 4b3549c0846 ("radv: reduce the number of loaded channels for vertex input fetches")
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's unnecessary to load more channels than the vertex attribute
format. The remaining channels are filled with 0 for y and z,
and 1 for w.
29077 shaders in 15096 tests
Totals:
SGPRS: 1321605 -> 1318869 (-0.21 %)
VGPRS: 935236 -> 932252 (-0.32 %)
Spilled SGPRs: 24860 -> 24776 (-0.34 %)
Code Size: 49832348 -> 49819464 (-0.03 %) bytes
Max Waves: 242101 -> 242611 (0.21 %)
Totals from affected shaders:
SGPRS: 93675 -> 90939 (-2.92 %)
VGPRS: 58016 -> 55032 (-5.14 %)
Spilled SGPRs: 172 -> 88 (-48.84 %)
Code Size: 2862740 -> 2849856 (-0.45 %) bytes
Max Waves: 15474 -> 15984 (3.30 %)
This mostly helps Croteam games (Talos/Sam2017).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
The formats will be used for reducing the number of loaded channels.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
And make ac_build_expand() a static function.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For some reasons, this breaks trees rendering in Project Cars.
Fixes: 85010585cde ("radv: only enable gl_SampleMask if MSAA is enabled too")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109401
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Fixes: 50fd253bd6e ("radv/winsys: Add priority handling during submit.")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes a critical issue.
Cc: <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109575
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes some scalar loads from shaders, but it increases
the number of SET_SH_REG packets. This is currently basic but
it could be improved if needed. Inlining dynamic offsets might
also help.
Original idea from Dave Airlie.
29077 shaders in 15096 tests
Totals:
SGPRS: 1321325 -> 1357101 (2.71 %)
VGPRS: 936000 -> 932576 (-0.37 %)
Spilled SGPRs: 24804 -> 24791 (-0.05 %)
Code Size: 49827960 -> 49642232 (-0.37 %) bytes
Max Waves: 242007 -> 242700 (0.29 %)
Totals from affected shaders:
SGPRS: 290989 -> 326765 (12.29 %)
VGPRS: 244680 -> 241256 (-1.40 %)
Spilled SGPRs: 1442 -> 1429 (-0.90 %)
Code Size: 8126688 -> 7940960 (-2.29 %) bytes
Max Waves: 80952 -> 81645 (0.86 %)
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This is needed in order to inline some push constants when possible.
This also adds a new helper for initializing the pass.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"The C standard says that compound literals which occur inside of
the body of a function have automatic storage duration associated
with the enclosing block. Older GCC releases were putting such
compound literals into the scope of the whole function, so their
lifetime actually ended at the end of containing function. This
has been fixed in GCC 9. Code that relied on this extended lifetime
needs to be fixed, move the compound literals to whatever scope
they need to accessible in."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109543
Cc: <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Gustaw Smolarczyk <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
clang points out this isn't used.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Fixes coverity warning
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This partially reverts a change from b7a93cbdede05a ("radv: Handle
VK_ATTACHMENT_UNUSED in CmdClearAttachment") which fixed actual issues
but also started to accept invalid values for the colorAttachment
field.
This change asserts that the field is valid for the current pass.
Signed-off-by: Lionel Landwerlin <[email protected]>
Fixes: b7a93cbdede05a ("radv: Handle VK_ATTACHMENT_UNUSED in CmdClearAttachment")
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
v2: Also update the release notes.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
The kernel already does it for us.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Needed for VK_EXT_buffer_device_address.
The pointers are implmemented as i8*, since I could not figure
out how to emulate setting struct offsets in LLVM based on the
SPIR-V offsets (and more weird stuff like row major matrices).
Acked-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We use a straight glsl->llvm type conversion so types should already be right.
Also even though the writemasks were changed we we not actually doing 32-bit
things, so this fails miserably.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
Can happen e.g. after a phi.
Fixes: a2b5cc3c399 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Fixes: a2b5cc3c399 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Fixes: a2b5cc3c399 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
The check was for 1 bit being set, which is clearly not what we want.
CC: <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
For example with VK_EXT_buffer_device_address or
VK_KHR_variable_pointers.
Fixes: a2b5cc3c399 "radv: enable variable pointers"
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For the implicit casts inherent in nir.
This should probably have been done for shared memory for
VK_KHR_variable_pointers.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|