| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes:
dEQP-VK.glsl.texture_gather.basic.2d.depth32f.*
Cc: "13.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
This one is easy to miss, because it's not documented in any internal doc.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
With nir_intrinsic_ssbo_atomic_comp_swap we run out of params.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to force persample shading when
a) shader uses sample_id
b) shader uses sample_position
c) shader uses sample qualifier.
Also since ps_iter_samples can now change independently of the
rasterizer samples we need to move setting the regs more often.
This fixes:
dEQP-VK.pipeline.multisample_interpolation.centroid_interpolate_at_consistency.*
dEQP-VK.pipeline.multisample_interpolation.centroid_qualifier_inside_primitive.137_191_1.*
dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.*
dEQP-VK.pipeline.multisample_interpolation.sample_qualifier_distinct_values.128_128_1.*
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Not sure about the deprecation path, but this intrinsic
can be lowered to SMEM loads. This results in a significant
Talos performance improvement.
v2: Fix for LLVM attribute changes.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes the fix:
radv/ac/llvm: fix regression with shadow samplers fix
Signed-off-by: Dave Airlie <[email protected]>
Cc: "13.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes b56b54cbf1d8e70c87a434da5350d11533e5fed8:
radv/ac/llvm: shadow samplers only return one value
It makes sure we only do that for shadow sampling, as
opposed to sizing requests.
Signed-off-by: Dave Airlie <[email protected]>
Cc: "13.0" <[email protected]>
|
|
|
|
|
|
|
|
| |
The intrinsic engine asserts in llvm due to this.
Reported-by: Christoph Haag <[email protected]>
Cc: "13.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code didn't limit the offsets to the number supplied, so
if we expected 3 but only got 2 we were accessing undefined memory.
This fixes random failures in:
dEQP-VK.glsl.texture_functions.texelfetchoffset.sampler2darray_*
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "13.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Ported from radeonsi.
Note that si_make_texture_descriptor() already sets img7 to the mask
value referred to in the comment.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
The sample id is packed into bits 8-12, so adjust
things properly.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This fixes a bunch of crashes in CTS tests looking for this.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This cleans up the ddxy emission along the same lines as
radeonsi. It also means we don't use LDS on VI chips we
use the dspermute interface, it also removes some duplicated
code.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The translation to llvm was failing here due to required lod.
This fixes some new SteamVR shaders.
Cc: "13.0" <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes a number of CTS tests like:
dEQP-VK.glsl.texture_gather.basic.2d.rgba8ui.size_npot.clamp_to_edge_repeat
Cc: "13.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This pulls amd/common build rules into upper level makefile,
along with amd/addlib which is already there.
v2: [Emil Velikov]
- Move NEED_RADEON_LLVM conditional, drop amd/common from SUBDIRS
- Drop AM_ from common_libamd_common_la*
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes a NULL pointer dereference for intrinsics with more than
one function attribute introduced in commit 2fdaf38.
The fix is ported from the lp_build_intrinsic changes in commit 8bdd52c.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can only read the valid samples if this is an MSAA
texture, which means the type field must be 0x14 or 0x15.
This fixes:
dEQP-VK.glsl.texture_functions.query.texturesamples.*
Cc: "13.0" <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reported-by: Jan Vesely <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Ported from corresponding changes to gallivm.
tested build against 3.9 and master.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We are going to start lowering to this in NIR code,
so prepare radv for it.
v2: handle conversion to kilp properly (nha)
Cc: "13.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was getting a random GPU hang in the renderpass simple tests,
it turns out sometimes radv emitted the wrong thing "last".
This fixes the logic to emit Z/stencil last if they occur,
and not mark a color output as last. Also this relies on the
Z/STENCIL being the first two fragment outputs, which they are
so yay.
Fixes: dEQP-VK.renderpass.simple.color_depth (random hangs)
Cc: "13.0" <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
At least when LLVM is used, which is basically always (unless you're only
building r600 without OpenCL).
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intrinsic engine asserts in llvm due to this,
as we put a vec4 into a vec1, and the next instruction
isn't expecting it.
So trim the vector at the end before inserting it.
Reported-by: Christoph Haag <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "13.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When restoring something from shader cache we won't have and don't
want to create a nir_shader this change detaches the two.
There are other advantages such as being able to reuse the
shader info populated by GLSL IR.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
On a debug llvm build we'd assert on the next compare
when the return from samples_identical was i1 instead
of i32.
Cc: "13.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This was returning an inversion, so not doing as it should have.
We need to compare the fmask value with 0, and return the result
from that.
|
|
|
|
| |
We were using the wrong descriptor in the fmask picking code.
|
|
|
|
|
|
|
|
| |
This adds some comments and adds defines for the user sgprs,
so that we can move them around easier later and not have
to change/revalidate every one of these.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These were changed in LLVM r284024.
v2:
- Only use float types for vdata of llvm.amdgcn.image.store. LLVM doesn't
support integer types for this intrinsic.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This squashes all the radv development up until now into
one for merging.
History can be found:
https://github.com/airlied/mesa/tree/semi-interesting
This requires llvm 3.9 and is in no way considered
a conformant vulkan implementation. It can run a number
of vulkan applications, and supports all GPUs using
the amdgpu kernel driver.
Thanks to Intel for providing anv and spirv->nir,
and Emil Velikov for reviewing build integration.
Parts of this are:
Reviewed-by: Nicolai Hähnle <[email protected]>
Acked-by: Edward O'Callaghan <[email protected]>
Authors: Bas Nieuwenhuizen and Dave Airlie
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch switches non-TGSI compute shaders over to using the HSA
ABI described here:
https://github.com/RadeonOpenCompute/ROCm-Docs/blob/master/AMDGPU-ABI.md
The HSA ABI provides a much cleaner interface for compute shaders and allows
us to share more code in the compiler with the HSA stack.
The main changes in this patch are:
- We now pass the scratch buffer resource into the shader via user sgprs
rather than using relocations.
- Grid/Block sizes are now passed to the shader via the dispatch packet
rather than at the beginning of the kernel arguments.
Typically for HSA, the CP firmware will create the dispatch packet and set
up the user sgprs automatically. However, in Mesa we let the driver do
this work. The main reason for this is that I haven't researched how to
get the CP to do all these things, and I'm not sure if it is supported
for all GPUs.
v2:
- Add comments explaining why we are setting certain bits of the scratch
resource descriptor.
v3:
- Use amdgcn-mesa-mesa3d triple instead of amdgcn--mesa3d.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Acked-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This just moves these to a common header file.
Acked-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
Step one to merging radv would be to move some files around.
This only adds the include path to r600/radeonsi, because later
we want to avoid having to add it to the generic target paths.
Acked-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|