path: root/src/amd/common
Commit message | Author | Age | Files | Lines
* ac/nir: Only use the first component for SSBO atomics. | Bas Nieuwenhuizen | 2016-12-05 | 1 | -2/+2
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: fix another regression since shadow fixes. | Dave Airlie | 2016-12-05 | 1 | -1/+1
  This fixes: dEQP-VK.glsl.texture_gather.basic.2d.depth32f.*
  Cc: "13.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: document a CP DMA bug that doesn't need a workaround yet | Marek Olšák | 2016-12-01 | 1 | -1/+5
  This one is easy to miss, because it's not documented in any internal doc.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/nir: Fix out of bounds array access. | Bas Nieuwenhuizen | 2016-11-30 | 1 | -1/+1
  With nir_intrinsic_ssbo_atomic_comp_swap we run out of params.
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: force persample shading when required. | Dave Airlie | 2016-11-29 | 2 | -3/+15
  We need to force persample shading when
  a) shader uses sample_id
  b) shader uses sample_position
  c) shader uses sample qualifier.
  Also since ps_iter_samples can now change independently of the rasterizer samples we need to move setting the regs more often.
  This fixes:
  dEQP-VK.pipeline.multisample_interpolation.centroid_interpolate_at_consistency.*
  dEQP-VK.pipeline.multisample_interpolation.centroid_qualifier_inside_primitive.137_191_1.*
  dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.*
  dEQP-VK.pipeline.multisample_interpolation.sample_qualifier_distinct_values.128_128_1.*
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
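The three conditions in the message above amount to a single predicate. A minimal sketch follows, assuming a hypothetical `fs_usage` struct; the struct and its field names are illustrative only and are not taken from radv.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of what a fragment shader uses (not radv's real struct). */
struct fs_usage {
    bool uses_sample_id;        /* a) shader reads the sample id */
    bool uses_sample_position;  /* b) shader reads the sample position */
    bool uses_sample_qualifier; /* c) shader has an input with the sample qualifier */
};

/* Returns true when per-sample shading has to be forced. */
static bool needs_persample_shading(const struct fs_usage *u)
{
    assert(u != NULL);
    return u->uses_sample_id ||
           u->uses_sample_position ||
           u->uses_sample_qualifier;
}
```

Any one of the three uses is enough; the predicate is false only when the shader touches none of them.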
* ac/nir: Fix accessing an unitialized value. | Bas Nieuwenhuizen | 2016-11-29 | 1 | -1/+2
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Use different intrinsic for ubo loads. | Bas Nieuwenhuizen | 2016-11-29 | 1 | -1/+29
  Not sure about the deprecation path, but this intrinsic can be lowered to SMEM loads. This results in a significant Talos performance improvement.
  v2: Fix for LLVM attribute changes.
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: brown-paper bag for a forgotten else. | Dave Airlie | 2016-11-28 | 1 | -1/+1
  This fixes the fix: radv/ac/llvm: fix regression with shadow samplers fix
  Signed-off-by: Dave Airlie <[email protected]>
  Cc: "13.0" <[email protected]>
* radv/ac/llvm: fix regression with shadow samplers fix | Dave Airlie | 2016-11-28 | 1 | -3/+3
  This fixes b56b54cbf1d8e70c87a434da5350d11533e5fed8: radv/ac/llvm: shadow samplers only return one value
  It makes sure we only do that for shadow sampling, as opposed to sizing requests.
  Signed-off-by: Dave Airlie <[email protected]>
  Cc: "13.0" <[email protected]>
* radv/ac/llvm: shadow samplers only return one value. | Dave Airlie | 2016-11-27 | 1 | -1/+3
  The intrinsic engine asserts in llvm due to this.
  Reported-by: Christoph Haag <[email protected]>
  Cc: "13.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: fix texel fetch offset with 2d arrays. | Dave Airlie | 2016-11-24 | 1 | -3/+4
  The code didn't limit the offsets to the number supplied, so if we expected 3 but only got 2 we were accessing undefined memory.
  This fixes random failures in: dEQP-VK.glsl.texture_functions.texelfetchoffset.sampler2darray_*
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: "13.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
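The failure mode described above (expecting 3 offset components but receiving 2) can be avoided by bounding the loop with the number of components actually supplied. The sketch below is plain C rather than the LLVM-building code the commit touches, and `apply_texel_offsets` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Add offset components to the coordinates, but never read past the
 * num_offsets components that were actually supplied (hypothetical helper). */
static void apply_texel_offsets(int *coords, unsigned num_coords,
                                const int *offsets, unsigned num_offsets)
{
    assert(coords != NULL && offsets != NULL);
    unsigned n = num_coords < num_offsets ? num_coords : num_offsets;
    for (unsigned i = 0; i < n; i++)
        coords[i] += offsets[i];
}
```

With a 2D array texture the coordinate has three components (x, y, layer) while the offset has two, so the layer is left untouched.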
* radv: add support for anisotropic filtering on SI-CI | Fredrik Höglund | 2016-11-24 | 1 | -0/+31
  Ported from radeonsi.
  Note that si_make_texture_descriptor() already sets img7 to the mask value referred to in the comment.
  Reviewed-by: Dave Airlie <[email protected]>
* radv: fix sample id loading | Dave Airlie | 2016-11-22 | 1 | -1/+18
  The sample id is packed into bits 8-12, so adjust things properly.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
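Reading a packed field like this is a shift followed by a mask. The helper below is a generic sketch; `unpack_bits` is a hypothetical name, and the field width is deliberately left as a parameter because the message gives only the bit range "8-12" and does not say whether the upper bound is inclusive.

```c
#include <assert.h>
#include <stdint.h>

/* Extract `width` bits starting at bit `shift` from a packed 32-bit value. */
static uint32_t unpack_bits(uint32_t value, unsigned shift, unsigned width)
{
    assert(width > 0 && width < 32 && shift + width <= 32);
    return (value >> shift) & ((1u << width) - 1u);
}
```

For the sample id the shift would be 8; the width to pass should be checked against the hardware documentation rather than taken from this sketch.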
* radv/ac: add implementation of load_sample_pos intrinsic. | Dave Airlie | 2016-11-22 | 1 | -0/+12
  This fixes a bunch of crashes in CTS tests looking for this.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: cleanup ddxy emission | Dave Airlie | 2016-11-22 | 1 | -93/+43
  This cleans up the ddxy emission along the same lines as radeonsi. It also means we don't use LDS on VI chips we use the dspermute interface, it also removes some duplicated code.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: spir-v allows texture size query with and without lod. | Dave Airlie | 2016-11-21 | 1 | -1/+4
  The translation to llvm was failing here due to required lod.
  This fixes some new SteamVR shaders.
  Cc: "13.0" <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* ac/nir/llvm: fix channel in texture gather lowering code. | Dave Airlie | 2016-11-16 | 1 | -1/+1
  This fixes a number of CTS tests like: dEQP-VK.glsl.texture_gather.basic.2d.rgba8ui.size_npot.clamp_to_edge_repeat
  Cc: "13.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* amd: flatten amd/common makefile structure | Mauro Rossi | 2016-11-15 | 2 | -88/+0
  This pulls amd/common build rules into upper level makefile, along with amd/addlib which is already there.
  v2: [Emil Velikov]
  - Move NEED_RADEON_LLVM conditional, drop amd/common from SUBDIRS
  - Drop AM_ from common_libamd_common_la*
  Signed-off-by: Emil Velikov <[email protected]>
* ac/nir/llvm: Fix setting function attributes for intrinsics | Daniel Scharrer | 2016-11-11 | 1 | -1/+5
  This fixes a NULL pointer dereference for intrinsics with more than one function attribute introduced in commit 2fdaf38.
  The fix is ported from the lp_build_intrinsic changes in commit 8bdd52c.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix texturesamples to handle single sample case | Dave Airlie | 2016-11-11 | 1 | -2/+10
  We can only read the valid samples if this is an MSAA texture, which means the type field must be 0x14 or 0x15.
  This fixes: dEQP-VK.glsl.texture_functions.query.texturesamples.*
  Cc: "13.0" <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
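The check described above can be sketched as a guard around the sample-count read. This is an illustration under stated assumptions: the two type values are copied from the commit message as written and have not been checked against the hardware headers, the names are hypothetical, and returning 1 for non-MSAA textures is inferred from the commit title ("single sample case").

```c
#include <assert.h>

/* Type-field values as quoted in the commit message (unverified). */
#define TYPE_MSAA_A 0x14u
#define TYPE_MSAA_B 0x15u

/* Return the sample count to report for a texture. `msaa_samples` is the
 * already-decoded count from the descriptor; it is only meaningful for
 * MSAA texture types, so anything else reports a single sample. */
static unsigned texture_samples(unsigned type_field, unsigned msaa_samples)
{
    assert(msaa_samples >= 1);
    if (type_field == TYPE_MSAA_A || type_field == TYPE_MSAA_B)
        return msaa_samples;
    return 1;
}
```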
* radv: fixup botched llvm API changes. | Dave Airlie | 2016-11-10 | 1 | -4/+3
  Reported-by: Jan Vesely <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* ac/nir/llvm: adopt to new LLVM attribute API. | Dave Airlie | 2016-11-10 | 1 | -36/+108
  Ported from corresponding changes to gallivm.
  tested build against 3.9 and master.
  Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: add support for discard_if intrinsic (v2) | Dave Airlie | 2016-11-10 | 1 | -0/+21
  We are going to start lowering to this in NIR code, so prepare radv for it.
  v2: handle conversion to kilp properly (nha)
  Cc: "13.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: emit correct last export when Z/stencil export is enabled | Dave Airlie | 2016-11-09 | 1 | -3/+5
  I was getting a random GPU hang in the renderpass simple tests, it turns out sometimes radv emitted the wrong thing "last".
  This fixes the logic to emit Z/stencil last if they occur, and not mark a color output as last.
  Also this relies on the Z/STENCIL being the first two fragment outputs, which they are so yay.
  Fixes: dEQP-VK.renderpass.simple.color_depth (random hangs)
  Cc: "13.0" <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* amd/common: add ac_is_sgpr_param helper | Nicolai Hähnle | 2016-11-03 | 2 | -0/+12
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* amd/common: build also for gallium drivers | Nicolai Hähnle | 2016-11-03 | 2 | -1/+9
  At least when LLVM is used, which is basically always (unless you're only building r600 without OpenCL).
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* amd/common: move llvm helper prototype to ac_llvm_util.h | Nicolai Hähnle | 2016-11-03 | 4 | -7/+13
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* amd: fix a typo in PIXEL_PIPE_STAT_RESET definition | Marek Olšák | 2016-11-01 | 1 | -1/+1
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radv/ac/llvm: trim texture return values | Dave Airlie | 2016-10-27 | 1 | -1/+2
  The intrinsic engine asserts in llvm due to this, as we put a vec4 into a vec1, and the next instruction isn't expecting it.
  So trim the vector at the end before inserting it.
  Reported-by: Christoph Haag <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Cc: "13.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* gallium/radeon: fix a ZPASS comment, EVENT_WRITE_EOP fixups | Marek Olšák | 2016-10-26 | 1 | -1/+1
  Reviewed-by: Nicolai Hähnle <[email protected]>
* nir/i965/anv/radv/gallium: make shader info a pointer | Timothy Arceri | 2016-10-26 | 1 | -2/+2
  When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two.
  There are other advantages such as being able to reuse the shader info populated by GLSL IR.
  Reviewed-by: Jason Ekstrand <[email protected]>
* radv: use emit_icmp for samples_identical | Dave Airlie | 2016-10-20 | 1 | -1/+1
  On a debug llvm build we'd assert on the next compare when the return from samples_identical was i1 instead of i32.
  Cc: "13.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* radv: fix samples_identical return value. | Dave Airlie | 2016-10-19 | 1 | -0/+3
  This was returning an inversion, so not doing as it should have.
  We need to compare the fmask value with 0, and return the result from that.
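The corrected logic is a comparison of the fetched fmask value against zero. A minimal sketch in plain C, assuming the result is a 32-bit boolean in which all bits set means true; that encoding is an assumption made for this illustration, and the function name merely mirrors the commit title.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed boolean encoding for this sketch: all ones = true, zero = false. */
#define BOOL32_TRUE  0xFFFFFFFFu
#define BOOL32_FALSE 0u

/* True when the fmask value is 0, per the commit message above. */
static uint32_t samples_identical(uint32_t fmask_value)
{
    return fmask_value == 0 ? BOOL32_TRUE : BOOL32_FALSE;
}
```

Returning the comparison result directly, instead of its negation, is the whole fix the message describes.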
* radv: fix fmask ptr issue | Dave Airlie | 2016-10-19 | 1 | -4/+18
  We were using the wrong descriptor in the fmask picking code.
* radv: start using defines for the user sgpr offsets | Dave Airlie | 2016-10-19 | 2 | -2/+22
  This adds some comments and adds defines for the user sgprs, so that we can move them around easier later and not have to change/revalidate every one of these.
  Signed-off-by: Dave Airlie <[email protected]>
* radv: Use new image load/store intrinsic signatures v2 | Tom Stellard | 2016-10-14 | 1 | -25/+108
  These were changed in LLVM r284024.
  v2:
  - Only use float types for vdata of llvm.amdgcn.image.store. LLVM doesn't support integer types for this intrinsic.
  Signed-off-by: Dave Airlie <[email protected]>
* radv: Fix incorrect comment | Tom Stellard | 2016-10-14 | 1 | -2/+2
  Signed-off-by: Dave Airlie <[email protected]>
* radv: add initial non-conformant radv vulkan driver | Dave Airlie | 2016-10-07 | 9 | -0/+5324
  This squashes all the radv development up until now into one for merging.
  History can be found: https://github.com/airlied/mesa/tree/semi-interesting
  This requires llvm 3.9 and is in no way considered a conformant vulkan implementation. It can run a number of vulkan applications, and supports all GPUs using the amdgpu kernel driver.
  Thanks to Intel for providing anv and spirv->nir, and Emil Velikov for reviewing build integration.
  Parts of this are:
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Acked-by: Edward O'Callaghan <[email protected]>
  Authors: Bas Nieuwenhuizen and Dave Airlie
  Signed-off-by: Dave Airlie <[email protected]>
* radeonsi/compute: Use the HSA abi for non-TGSI compute shaders v3 | Tom Stellard | 2016-09-16 | 1 | -0/+534
  This patch switches non-TGSI compute shaders over to using the HSA ABI described here:
  https://github.com/RadeonOpenCompute/ROCm-Docs/blob/master/AMDGPU-ABI.md
  The HSA ABI provides a much cleaner interface for compute shaders and allows us to share more code in the compiler with the HSA stack.
  The main changes in this patch are:
  - We now pass the scratch buffer resource into the shader via user sgprs rather than using relocations.
  - Grid/Block sizes are now passed to the shader via the dispatch packet rather than at the beginning of the kernel arguments.
  Typically for HSA, the CP firmware will create the dispatch packet and set up the user sgprs automatically. However, in Mesa we let the driver do this work. The main reason for this is that I haven't researched how to get the CP to do all these things, and I'm not sure if it is supported for all GPUs.
  v2:
  - Add comments explaining why we are setting certain bits of the scratch resource descriptor.
  v3:
  - Use amdgcn-mesa-mesa3d triple instead of amdgcn--mesa3d.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* amd/addrlib: move addrlib from amdgpu winsys to common code | Dave Airlie | 2016-09-06 | 1 | -0/+173
  Acked-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeon: move radeon_family/chip_class defintions to common | Dave Airlie | 2016-09-06 | 1 | -0/+111
  This just moves these to a common header file.
  Acked-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move sid.h/r600d_common.h to a common place. | Dave Airlie | 2016-09-06 | 2 | -0/+9309
  Step one to merging radv would be to move some files around.
  This only adds the include path to r600/radeonsi, because later we want to avoid having to add it to the generic target paths.
  Acked-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>