| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
So we can delay the layout until later in some import cases.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
Less duplication/complexity.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
Unused stride/offset args.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the timestamp is not ready (ie. UINT64_MAX), the availabily bit
should be zero. The previous code used to copy the timestamp value
as the availabily bit and that's completely wrong.
Because it's not that simple to emit a conditional with the CP, the
driver now uses a compute shader for copying timestamp query results.
Fixes dEQP-VK.pipeline.timestamp.misc_tests.reset_query_before_copy.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise, the GPU might write timestamp queries after the reset
operation. This is similar to other query operations.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These are not needed anymore, since PhyReg has an implicit
conversion operator that can convert it to unsigned int,
which is equivalent to accessing this field.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Not sure if this instruction actually writes anything, but LLVM
disassembles a destination and sets it to NULL.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Currently just breaks up SMEM groups and fixes
FeatureVMEMtoScalarWriteHazard (name from LLVM).
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
According to LLVM, branches with an offset of 0x3f are buggy.
v2: (by Timur Kristóf)
- extract the GFX10 specific part to its own function
Signed-off-by: Rhys Perry <[email protected]>
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-By: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Reviewed-By: Timur Kristóf <[email protected]>
|
|
|
|
|
|
|
| |
These are currently not used, but could be useful later.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The DLC bit is now set to 1 for all loads when GLC is also set,
but cleared to 0 for all stores (otherwise it causes issues),
and also cleared to 0 for atomics.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
| |
Also remove img_format from aco_ir, since it can be calculated
from dfmt and nfmt. So only the assember needs to deal with it.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We'd like to use some functions, for example some
ac_shader_util functions in ACO, so we need to link
ACO to AC.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We'd like to include some of these in C++ code later.
Specifically, ACO is written in C++ and we would like to use
some of this code in ACO in order to avoid code duplication.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SGPRS: 880272 -> 838936 (-4.70 %)
VGPRS: 705316 -> 680988 (-3.45 %)
Spilled SGPRs: 1032 -> 832 (-19.38 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 252 -> 252 (0.00 %) dwords per thread
Code Size: 55150788 -> 55172436 (0.04 %) bytes
LDS: 451 -> 451 (0.00 %) blocks
Max Waves: 66178 -> 68706 (3.82 %)
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
| |
use the ones from addrlib
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
And use a new p_discard_early_exit instruction. This fixes some cases
where a definition having the same register as an operand causes issues.
v2: rename instruction to p_exit_early_if
v2: modify the existing instruction instead of creating a new one
v3: merge the "i == num - 1" IFs
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Daniel Schürmann <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The spec has probably been misinterpreted during RADV bringup.
This fixes GPU hangs with dEQP-VK.binding_model.*offset_nonzero*.
Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
NIR->LLVM and ACO already support nir_intrinsic_shader_clock.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
[212/893] Compiling C object 'src/amd/llvm/ce8261c@@amd_common_llvm@sta/ac_nir_to_llvm.c.o'.
../mesa/src/amd/llvm/ac_nir_to_llvm.c: In function ‘visit_image_atomic’:
../mesa/src/amd/llvm/ac_nir_to_llvm.c:2636:17: warning: unused variable ‘format’ [-Wunused-variable]
2636 | const GLenum format = nir_intrinsic_format(instr);
| ^~~~~~
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit is a step towards the goal of being able to build RADV
without LLVM. In the future we would like to offer the option to
use RADV solely with ACO. There is still a need for the common AMD
code located in amd/common but the LLVM specific parts need to be
separated.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Marek Olšák <[email protected]>
Acked-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I thought I fixed this, but I guess I must have broken it again.
Fixes various dEQP-VK.draw.* tests
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This simplifies ACO and allows the lowered code to be optimized (in
particular, constant folded).
Totals from affected shaders:
SGPRS: 1776 -> 1776 (0.00 %)
VGPRS: 1436 -> 1436 (0.00 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 203452 -> 203564 (0.06 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 103 -> 103 (0.00 %)
At least some of the code size increase seems to be from literals being
applied to instructions as a result of constant folding.
v2: remove fmod/frem handling in init_context()
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
| |
RADV and RadeonSI both lower these two NIR instructions.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This lowers fmod and frem at NIR level like RadeonSI. fmod is
already lowered directly in NIR->LLVM, and frem will be lowered by
LLVM anyways.
This fixes a LLVM crash with:
dEQP-VK.glsl.builtin.precision_fp16_storage32b.frem.compute.scalar.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
uintptr_t is 32 bits in a 32-bits build, resulting in shifting out
of bounds.
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|