path: root/src/amd
Commit message | Author | Age | Files | Lines
* ac: fix get_image_coords() for radeonsi | Timothy Arceri | 2018-09-15 | 1 | -1/+2
| | | | | | | | | | Because this was setting image to true we would end up calling si_load_image_desc() when we should be calling si_load_sampler_desc(). This fixes an assert() in Deus Ex: MD Reviewed-by: Marek Olšák <[email protected]>
* radv: emit the initial config only once in the preambles | Samuel Pitoiset | 2018-09-14 | 4 | -50/+48
| | | | | | | | | It shouldn't be needed to emit the initial graphics or compute state when beginning a new command buffer. Emitting them in the preamble should be enough and this will reduce IB sizes. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix setting global locations for indirect descriptors | Samuel Pitoiset | 2018-09-14 | 1 | -1/+0
| | | | | | | | | | | | Indirect descriptors only need one entry; we don't have to emit a location for every descriptor. Fixes GPU hangs with new CTS: dEQP-VK.binding_model.descriptorset_random.* CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix flushing indirect descriptors | Samuel Pitoiset | 2018-09-14 | 1 | -3/+9
| | | | | | | | | | | | | | Let's say we first bind a graphics pipeline that needs indirect descriptor sets. The userdata pointers will be emitted at draw time. Then if we bind a compute pipeline that doesn't need any indirect descriptors, the driver will re-emit them for all graphics stages. To avoid this, just check the bind point type. CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix GPU hangs with 32-bit indirect descriptors | Samuel Pitoiset | 2018-09-14 | 1 | -3/+5
| | | | | | | | | | | LLVM 6 isn't affected. Fixes GPU hangs with new CTS: dEQP-VK.binding_model.descriptorset_random.* CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: handle loc->indirect correctly for the first descriptor | Samuel Pitoiset | 2018-09-14 | 2 | -11/+10
| | | | | | | | | | | | | This was wrong for descriptor #0 when all of them are indirect. This is because indirect_offset was 0 and we emitted a "normal" descriptor pointer for nothing. While we are at it remove radv_userdata_info::indirect_offset which is useless. CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: bump the maximum number of arguments to 64 | Samuel Pitoiset | 2018-09-14 | 1 | -1/+1
| | | | | | | | | | | Bumping to 64 should be safe enough. Fixes some crashes with new CTS: dEQP-VK.binding_model.descriptorset_random.* CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: tidy up ac_setup_rings() for the GSVS rings | Samuel Pitoiset | 2018-09-14 | 1 | -13/+34
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix setting the number of entries for GSVS on VI+ | Samuel Pitoiset | 2018-09-14 | 1 | -3/+0
| | | | | | | | According to RadeonSI, it's unnecessary to multiply by the stride. That field seems to always be 64. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always compute the number of components from the output mask | Samuel Pitoiset | 2018-09-14 | 1 | -12/+2
| | | | | | | That removes two special cases for clip/cull distances. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: emit data contiguously in the GS->VS ring buffer | Samuel Pitoiset | 2018-09-14 | 1 | -16/+12
| | | | | | | | Instead of having holes. The other ring parameters like offset and stride can be updated later. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make use of the output usage mask in GS copy shader | Samuel Pitoiset | 2018-09-14 | 1 | -0/+3
| | | | | | | | This is just for consistency because LLVM can detect and remove unused loads. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: improve a comment in si_emit_set_predication_state() | Samuel Pitoiset | 2018-09-14 | 1 | -8/+6
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix VK_EXT_conditional_rendering visibility | Samuel Pitoiset | 2018-09-14 | 1 | -4/+12
| | | | | | | | | | It's actually just the opposite. This fixes the new Sascha conditionalrender demo. CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make use of ac_unpack_param() instead of ac_build_bfe() | Samuel Pitoiset | 2018-09-14 | 1 | -15/+6
| | | | | | | | The same code is generated because LLVM ends up using bfe, but that seems cleaner to me. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix function names for VK_EXT_conditional_rendering | Samuel Pitoiset | 2018-09-13 | 1 | -2/+2
| | | | | | | | | Otherwise they are not exported. CC: 18.2 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]>
* radv: adjust ESGS ring buffer size computation on VI+ | Samuel Pitoiset | 2018-09-11 | 1 | -1/+5
| | | | | | | Noticed while working in this area. Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Support v3 of VK_EXT_vertex_attribute_divisor. | Bas Nieuwenhuizen | 2018-09-10 | 2 | -1/+8
| | | | | | Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> CC: 18.2 <[email protected]>
* radeonsi: adjust and simplify max_alloc_size determination | Marek Olšák | 2018-09-10 | 1 | -8/+8
| | | | Tested-by: Dieter Nützel <[email protected]>
* radeonsi: fix GPU hangs with bindless textures and LLVM 7.0 | Marek Olšák | 2018-09-10 | 2 | -5/+51
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac: remove deprecated use of LLVMInt1Type() | Marek Olšák | 2018-09-10 | 1 | -1/+1
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac: use iN_0/1 constants | Marek Olšák | 2018-09-10 | 2 | -14/+13
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac: add radeon_info::num_good_cu_per_sh | Marek Olšák | 2018-09-10 | 2 | -0/+4
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac: revert new LLVM 7.0 behavior for fdiv | Marek Olšák | 2018-09-10 | 1 | -1/+8
| | | | | Cc: 18.2 <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* nir: Drop the vs_inputs_dual_locations option | Jason Ekstrand | 2018-09-06 | 1 | -1/+0
| | | | | | | | | | | | | It was very inconsistently handled; the only things that made use of it were glsl_to_nir, glspirv, and nir_gather_info. In particular, nir_lower_io completely ignored it so anyone using nir_lower_io on 64-bit vertex attributes was going to be in for a shock. Also, as of the previous commit, it's set by every driver that supports 64-bit vertex attributes. There's no longer any reason to have it be an option so let's just delete it. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radv: Fix CMASK dimensions. | Bas Nieuwenhuizen | 2018-09-03 | 1 | -2/+2
| | | | | | | | | | Mirrors 1e40f694831 "ac/surface: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI/VI" CC: <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Use a lower max offchip buffer count. | Bas Nieuwenhuizen | 2018-09-03 | 1 | -2/+22
| | | | | | | | No clue what gets fixed by this but both radeonsi and amdvlk do it. CC: <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add VEGA20 support. | Bas Nieuwenhuizen | 2018-09-03 | 2 | -0/+2
| | | | | | | | | Just mirror the radeonsi bits. Since this is just adding the extra switch entries for new HW I think this should be fine for stable. CC: <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: don't expose linear depth surfaces on SI/CIK/VI either. | Dave Airlie | 2018-09-03 | 1 | -3/+2
| | | | | | | | | | | ac_surface.c: gfx6_compute_surface says /* DB doesn't support linear layouts. */ Now if we expose linear depth and create a linear depth image and use CmdCopyImage to copy into it, we can't map the underlying memory and read it linearly which I think should work. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add missing support for protected memory properties | Samuel Pitoiset | 2018-08-31 | 1 | -0/+6
| | | | | | | Fixes Vulkan CTS CL#2849. Similar to the ANV driver. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove dead code in scan_shader_output_decl() | Samuel Pitoiset | 2018-08-31 | 1 | -6/+0
| | | | | | | Never used. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: remove radv_shader_context::num_output_{clips,culls} | Samuel Pitoiset | 2018-08-31 | 1 | -7/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: adjust the cull dist mask in scan_shader_output_decl() | Samuel Pitoiset | 2018-08-31 | 1 | -3/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: get length of the clip/cull distances array from usage mask | Samuel Pitoiset | 2018-08-31 | 1 | -9/+40
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: do not recompute the output usage mask for clipdist twice | Samuel Pitoiset | 2018-08-31 | 1 | -4/+1
| | | | | | | The shader info pass takes care of this now. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: gather the output usage mask for clip/cull distances correctly | Samuel Pitoiset | 2018-08-31 | 1 | -0/+8
| | | | | | | It's a special case because both are combined into a single array. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: add set_output_usage_mask() helper | Samuel Pitoiset | 2018-08-31 | 1 | -17/+26
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: fix passing clip/cull distances from VS to PS | Samuel Pitoiset | 2018-08-31 | 4 | -1/+51
| | | | | | | | | | | | | | | | | CTS doesn't test input clip/cull distances for the fragment shader stage, which explains why this was totally broken. I wrote a simple test locally that works now. This fixes a crash with GTA V and DXVK. Note that we are exporting unused parameters from the vertex shader now, but this can't be optimized easily because we don't keep the fragment shader info... Cc: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107477 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/radeonsi: fix CIK copy max size | Dave Airlie | 2018-08-31 | 1 | -1/+3
| | | | | | | | | | | | While adding transfer queues to radv, I started writing some tests, the first test I wrote fell over copying a buffer larger than this limit. Checked AMDVLK and found the correct limit. Cc: <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv/meta: Set num_components on image_store intrinsics | Jason Ekstrand | 2018-08-30 | 3 | -0/+6
| | | | | | | | | | | | Now that image load/store intrinsics are variable-width, we need to set num_components accordingly. In 15d39f474b890, both glsl_to_nir and spirv_to_nir were updated to properly set num_components but radv meta was left behind. Fixes: 15d39f474b890 "nir: Make image load/store intrinsics..." Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* radv: Add missing checks in radv_get_image_format_properties. | Bas Nieuwenhuizen | 2018-08-30 | 1 | -0/+19
| | | | | CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA for measuring SDMA performance | Marek Olšák | 2018-08-29 | 1 | -0/+4
|
* radeonsi: add flag L2_STREAM for minimal cache usage | Marek Olšák | 2018-08-29 | 1 | -0/+2
|
* nir: Use a bitfield for image access qualifiers | Jason Ekstrand | 2018-08-29 | 1 | -2/+2
| | | | | | | | | | This commit expands the current memory access enum to contain the extra two bits provided for images. We choose to follow the SPIR-V convention of NonReadable and NonWriteable because readonly implies that you *can* read so readonly + writeonly doesn't make as much sense as NonReadable + NonWriteable. Reviewed-by: Kenneth Graunke <[email protected]>
* ac/surface: fix CMASK fast clear for NPOT textures with mipmapping on SI/CI/VI | Marek Olšák | 2018-08-28 | 1 | -2/+2
| | | | | | | This fixes VM faults and corruption. Cc: 18.1 18.2 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* Revert "configure: allow building with python3" | Emil Velikov | 2018-08-24 | 2 | -6/+6
| | | | | | | | | | | | | | This reverts commit ae7898dfdbe5c8dab7d11c71862353f1ae43feb0. Turns out the python scripts are _not_ fully python 3 compatible. As Ilia reported using get_xmlpool.py with LANG=C produces some weird output - see the link for details. Even though the issue was spotted with the autoconf build, it exposes a genuine problem with the script (and lack of lang handling of the meson build.) https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
* configure: allow building with python3 | Emil Velikov | 2018-08-23 | 2 | -6/+6
| | | | | | | | | | | | Pretty much all of the scripts are python2+3 compatible. Check and allow using python3, while adjusting the PYTHON2 refs. Note: - python3.4 is used as it's the earliest supported version - python3 chosen prior to python2 Signed-off-by: Emil Velikov <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* radv/gfx9: implement coherent shaders for VK_ACCESS_SHADER_READ_BIT | Samuel Pitoiset | 2018-08-23 | 1 | -1/+20
| | | | | | | | Single-sample color and single-sample depth (not stencil) are coherent with shaders. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/addrlib: Fix include path for c99_compat.h | Mariusz Ceier | 2018-08-22 | 1 | -1/+1
| | | | | | | | | | | | | | | | Without this patch mesa doesn't compile: In file included from ../mesa-9999/src/amd/addrlib/addrinterface.cpp:39: ../mesa-9999/src/util/macros.h:29:10: fatal error: c99_compat.h: No such file or directory #include "c99_compat.h" ^~~~~~~~~~~~~~ compilation terminated. Fixes: 15ca5ce99a80d9ebb5ef2b1aca6ea00784931de4 ("amd/addrlib: mark returnCode as MAYBE_UNUSED in") Signed-off-by: Mariusz Ceier <[email protected]> Acked-by: Kai Wasserbäch <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* radv: use different builtin shader cache for 32bit | Grazvydas Ignotas | 2018-08-23 | 1 | -9/+7
| | | | | | | | Currently if 64bit and 32bit programs are used interchangeably, radv will keep overwriting the cache. Use separate cache files to avoid that. Reviewed-by: Bas Nieuwenhuizen <[email protected]>