mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	radv: Set the RADEON_SURF_OPTIMIZE_FOR_SPACE flag for images	Alex Smith	2017-07-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This looks like a regression from df301237940 ("radv: use ac_compute_surface"). Before that, the opt4Space addrlib flag was set to true unless the image has FMASK (ac_compute_surface will similarly only set that flag for images without FMASK). This saves multiple gigabytes of VRAM on one of our games, and brings its VRAM utilisation on RADV in line with AMDGPU-PRO and NVIDIA. Signed-off-by: Alex Smith <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radv: don't shadow meta_va.	Dave Airlie	2017-07-18	1	-1/+1
\| \| \| \| \| \|	Coverity warned about dead code below, as meta_va was being shadowed. Signed-off-by: Dave Airlie <[email protected]>
*	ac/nir: rewrite shared variable handling (v2)	Connor Abbott	2017-07-17	1	-87/+158
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Translate the NIR variables directly to LLVM instead of lowering to a TGSI-style giant array of vec4's and then back to a variable. This should fix indirect dereferences, make shared variables more tightly packed, and make LLVM's alias analysis more precise. This should fix an upcoming Feral title, which has a compute shader that was failing to compile because the extra padding made us run out of LDS space. v2: Combine the previous two patches into one, only use this for shared variables for now until LLVM becomes smarter. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen> Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Alex Smith <[email protected]>
*	ac/gpu_info: if clock crystal frequency is 0, print an error and set 1	Marek Olšák	2017-07-17	1	-0/+4
\| \| \| \| \| \| \| \|	During bring-up, this is often 0. Prevent automatic disablement of ARB_timer_query and demotion of the OpenGL version to 3.2 by setting a non-zero frequency. Print an error message instead. Reviewed-by: Nicolai Hähnle <[email protected]>
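A minimal sketch of the fallback described in this commit, assuming the frequency is a plain 32-bit value. The function name and the wording of the message are hypothetical and are not taken from the Mesa source.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: a zero crystal frequency is reported on stderr and
 * replaced with 1 so that code depending on a non-zero value keeps working. */
static uint32_t sanitize_clock_crystal_freq_sketch(uint32_t freq)
{
    if (freq == 0) {
        fprintf(stderr, "clock crystal frequency is 0, using 1 instead\n");
        return 1;
    }
    return freq;
}
```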
*	ac/surface/gfx9: flags.texture currently refers to TC-compatible HTILE	Marek Olšák	2017-07-17	1	-1/+3
\| \| \| \| \| \|	This should lead to better MSAA performance on GFX9. Reviewed-by: Nicolai Hähnle <[email protected]>
*	radeonsi: merge si_llvm_get_amdgpu_target into ac_get_llvm_target	Marek Olšák	2017-07-17	2	-8/+11
\| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]>
*	radv: advertise v6 of the wayland surface extension	Emil Velikov	2017-07-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Jason updated the Khronos spec to explicitly state that Wayland surfaces must support VK_PRESENT_MODE_MAILBOX_KHR. ANV did so since day one (back in 2015) Cc: [email protected] Cc: Bas Nieuwenhuizen <[email protected]> Cc: Dave Airlie <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	radv: predicate cmask eliminate when using DCC.	Dave Airlie	2017-07-17	6	-7/+153
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When using DCC some clear values don't require a cmask eliminate step. This patch adds support for black and black with alpha 1, there are other values, but I don't have access to a comprehensive list. This works by setting the cmask eliminate predicate when doing the fast clear, and later when doing the cmask elimination making sure the draws are predicated. This increases the fps on Sascha Willems deferred. Tonga: 580fps->670fps on a Tonga PRO card. Polaris 730->850fps Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radv/clear: add r32g32b32a32 fast clear support (v2)	Dave Airlie	2017-07-17	2	-2/+26
\| \| \| \| \| \| \| \| \| \| \| \| \|	We can only fast clear 128-bit images if the r/g/b channels are the same, and we are using DCC. For DCC we'll bail out on translate if this isn't true, and we catch cmask clears explicitly. v2: remove 64-bit block (Bas), add uint32 as well. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
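The condition stated in this commit can be sketched as a small predicate. The function name and parameters are hypothetical; the real driver check may look at more state than this.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical predicate: a 128-bit (r32g32b32a32) clear colour qualifies
 * for a fast clear only when DCC is in use and r, g and b are identical.
 * The alpha channel (color[3]) is not constrained here. */
static bool can_fast_clear_128bit_sketch(const uint32_t color[4], bool uses_dcc)
{
    if (!uses_dcc)
        return false;
    return color[0] == color[1] && color[1] == color[2];
}
```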
*	amd/addrlib: fix typo in api name.	Dave Airlie	2017-07-17	11	-16/+16
\| \| \| \| \| \| \| \|	This fixes the misspelling of ALIGNMENTS in addrlib. Reviewed-by: Eduardo Lima Mitev <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radv: set cb base tile swizzles for MRT speedups (v4)	Dave Airlie	2017-07-17	5	-2/+34
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch uses addrlib to work out the tile swizzles according to the surface index. It seems to produce the same values as amdgpu-pro for the deferred test. v2: don't apply swizzle to CMASK. The eg docs don't mention it, and we clearly don't align cmask for that. v3: disable surf index for dedicated images, as these will most likely be shared, and I don't think the metadata has space for this info in it yet. v4: update for shareable images, rename combined_swizzle to tile_swizzle. This gets the deferred demo from 730->950fps on my rx480. (dcc cmask elim predication patches get it further) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radv: allow clear merging for depth/stencil with no care stencil	Dave Airlie	2017-07-17	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Some of the Sascha Willems demos pick a D32/S8 format for the depth buffer, then do a LOAD_OP_CLEAR/LOAD_OP_DONT_CARE on it, which means we don't get to merge the undefined->depth and clear htile transitions. This adds the stencil aspect to the pending clears if there is a depth clear pending and the stencil aspect is don't care. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radv: Remove NV dedicated alloc extension.	Bas Nieuwenhuizen	2017-07-15	1	-4/+0
\| \| \| \| \| \| \|	To not confuse apps in thinking it might be faster. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Andres Rodriguez <[email protected]>
*	radv: Use the KHR dedicated alloc for the WSI.	Bas Nieuwenhuizen	2017-07-15	1	-2/+2
\| \| \| \| \| \| \| \|	NV isn't valid for external images anymore. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Fixes: 6ddc64b93ea "radv: Add support for VK_KHR_dedicated_allocation." Reviewed-by: Andres Rodriguez <[email protected]>
*	radv: Implement VK_KHR_external_memory	Jason Ekstrand	2017-07-15	5	-3/+199
\| \| \| \| \| \| \| \| \|	This effectively reverts commit 43a171878bb4b5aedb36a. Technically, VK_KHR_get_memory_requirements2 and VK_KHR_dedicated_allocation are required for the KHR version but this at least restores the removed functionality. This patch builds but has received zero testing. Acked-by: Dave Airlie <[email protected]>
*	radv: Add support for VK_KHR_dedicated_allocation.	Bas Nieuwenhuizen	2017-07-15	2	-2/+35
\| \| \| \| \| \|	Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Acked-by: Dave Airlie <[email protected]>
*	radv: Add support for VK_KHR_get_memory_requirements2.	Bas Nieuwenhuizen	2017-07-15	2	-0/+32
\| \| \| \| \| \| \| \| \|	Fished the SparseImage call out of the headers as the spec missed the definition. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Acked-by: Dave Airlie <[email protected]>
*	radv: Drop support for VK_KHX_external_semaphore_*	Jason Ekstrand	2017-07-15	4	-188/+2
\| \| \| \| \| \| \|	These have been formally deprecated by Khronos never to be shipped again. The KHR versions should be implemented/used instead. Acked-by: Dave Airlie <[email protected]>
*	radv: Fix descriptors for cube images with VK_IMAGE_USAGE_STORAGE_BIT	Alex Smith	2017-07-13	3	-29/+74
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a cube image has VK_IMAGE_USAGE_STORAGE_BIT set, the type in an image view's descriptor was set to a 2D array (and a few other fields adjusted accordingly). This is correct when the image view is actually bound as a storage image, but not when bound as a sampled image. In that case the type should be set as a cube. Fix by generating 2 sets of descriptors at view creation time for both storage and non-storage usage, and then choose between them based on descriptor type when writing descriptor sets. v2: Generate storage descriptors for images with TRANSFER_DST, since those may be used as storage images internally. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: Fix possible invalid free of dynamic descriptors	Alex Smith	2017-07-13	1	-1/+0
\| \| \| \| \| \| \| \| \|	This free was left in after dynamic descriptors were changed to not be allocated separately from the descriptor set, and can cause a crash. Fixes: 39644fa40a3 ("radv: Don't allocate dynamic descriptors separately") Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv/ac: drop setting xnack	Dave Airlie	2017-07-09	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \|	Since radv uses compute rings and we can't know when we are setting up the shaders what ring they are to be used on, we should just use the default xnack setting. This may be suboptimal in some places, but if we hit a problem, we likely should try and address this between llvm and mesa. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radv: add support for using addrlib max alignment.	Dave Airlie	2017-07-09	5	-4/+14
\| \| \| \| \| \| \| \|	Rather than using 64k, use what addrlib returns as the base alignment for vulkan allocations. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radv: Add compute htile clear for combined depth+stencil surfaces.	Bas Nieuwenhuizen	2017-07-08	1	-9/+7
\| \| \| \| \| \| \| \|	Figured out the clear value when we have a combined depth stencil surface. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	ac/nir: Fix ordering of parameters for image atomic cmpswap intrinsics	Alex Smith	2017-07-07	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	The NIR parameters are ordered "compare, data", matching GLSL, but both the image and buffer LLVM intrinsics take them the other way around. This is already handled correctly for SSBO atomics. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
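The operand swap described in this commit can be illustrated with a plain C model. Both functions are hypothetical stand-ins: the first models an operation that takes (data, compare) as the message says the LLVM intrinsics do, and the second accepts the NIR order (compare, data) and forwards the operands swapped.

```c
#include <stdint.h>

/* Hypothetical model of a compare-and-swap that takes "data" before
 * "compare". It returns the previous memory value. */
static uint32_t cmpswap_data_first_sketch(uint32_t *mem, uint32_t data,
                                          uint32_t compare)
{
    uint32_t old = *mem;
    if (old == compare)
        *mem = data;
    return old;
}

/* Hypothetical wrapper taking the NIR order (compare, data): the operands
 * must be swapped when forwarding to the data-first operation. */
static uint32_t cmpswap_from_nir_order_sketch(uint32_t *mem, uint32_t compare,
                                              uint32_t data)
{
    return cmpswap_data_first_sketch(mem, data, compare);
}
```

Passing the operands through unswapped would compare against the new data and store the old comparison value, which is the kind of bug the commit fixes.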
*	radv: don't overallocate depth/stencil formats	Dave Airlie	2017-07-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	For depth/stencil formats the surface layer allocates the stencil separately, so we don't need to include it in the bpe. This reduces the size of d32s8 allocations to something closer to pro. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radv: enable sisched toggle in perftest flags.	Dave Airlie	2017-07-06	5	-2/+10
\| \| \| \| \| \| \| \| \|	RADV_PERFTEST=sisched to enable it. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac/llvm: set xnack like radeonsi does.	Dave Airlie	2017-07-06	1	-1/+3
\| \| \| \| \| \| \|	Use family, but only set xnack+ for gfx9. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac/llvm: create features list using snprintf.	Dave Airlie	2017-07-06	1	-2/+5
\| \| \| \| \| \| \|	Just more moving code around before adding things to it. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac/radv: change api to create target machine	Dave Airlie	2017-07-06	3	-7/+14
\| \| \| \| \| \| \| \|	This just modifies the API to make it easier to add other flags to target machine creation. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	radv: add support for cmd predication.	Dave Airlie	2017-07-06	5	-30/+64
\| \| \| \| \| \| \| \|	This doesn't get used yet, it just adds support to various PKT3 emissions to enable it later. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac/nir: Move VS position exports before param exports.	Bas Nieuwenhuizen	2017-07-05	1	-55/+54
\| \| \| \| \| \| \| \|	According to Nicolai the SX can already start work when all the position exports are done, so do those first. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	radv: Always set depthbuffer using image format instead of iview format.	Bas Nieuwenhuizen	2017-07-05	1	-2/+2
\| \| \| \| \| \| \| \|	We have some cases where changing between depth and stencil only aspect was causing hangs. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]>
*	radv: Disable depth & stencil tests when the depthbuffer doesn't support it.	Bas Nieuwenhuizen	2017-07-05	6	-11/+36
\| \| \| \| \|	Signed-off-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]>
*	radv: enable Int64 capability (v2)	Dave Airlie	2017-07-03	2	-1/+2
\| \| \| \| \| \| \| \| \|	I'm not 100% sure this is all wired up but it looks like it is. v2: actually enable extension. Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: fix 64-bit shifts	Connor Abbott	2017-07-03	1	-3/+12
\| \| \| \| \| \| \| \| \|	NIR always makes the shift amount 32 bits, but LLVM asserts if the two sources aren't the same type. Zero-extend the shift amount to make LLVM happy. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
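As a plain C analogy for the fix (not the LLVM builder code itself), the 32-bit shift amount is widened to the type of the 64-bit value before shifting. The helper name is hypothetical, and masking the amount to 0..63 is an added assumption that keeps the shift well defined in C.

```c
#include <stdint.h>

/* Hypothetical helper: shift a 64-bit value left by an amount that arrives
 * as a 32-bit integer, as the commit says NIR provides it. */
static uint64_t shl64_u32_amount_sketch(uint64_t value, uint32_t amount)
{
    /* Zero-extend the amount so both operands have the same width. */
    uint64_t wide_amount = (uint64_t)amount;

    /* Assumption: mask to 0..63 to avoid undefined behaviour in C. */
    return value << (wide_amount & 63u);
}
```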
*	ac/nir: implement 64-bit packing and unpacking	Connor Abbott	2017-07-03	2	-0/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We implement the split opcodes, and tell NIR to lower the original ones. The lowering to LLVM is a little more complicated, but NIR can optimize the split ones a little better, and some NIR lowering passes that we might want to use (particularly for doubles) emit the split ones. This should fix pack/unpackDouble2x32, which seems to have been broken since we enabled the Float64 capability. It will also fix pack/unpackInt2x32 when we enable the Int64 capability. Fixes: 798ae37c ("radv: Enable Float64 support.") Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
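The arithmetic behind the split pack/unpack operations can be sketched in C. The function names below are hypothetical and only mirror the idea of split opcodes; they assume the first component is the low 32 bits and the second is the high 32 bits.

```c
#include <stdint.h>

/* Hypothetical sketch: combine two 32-bit halves into one 64-bit value. */
static uint64_t pack_64_2x32_split_sketch(uint32_t lo, uint32_t hi)
{
    return (uint64_t)lo | ((uint64_t)hi << 32);
}

/* Hypothetical sketch: extract the low 32 bits (the "x" component). */
static uint32_t unpack_64_2x32_split_x_sketch(uint64_t v)
{
    return (uint32_t)v;
}

/* Hypothetical sketch: extract the high 32 bits (the "y" component). */
static uint32_t unpack_64_2x32_split_y_sketch(uint64_t v)
{
    return (uint32_t)(v >> 32);
}
```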
*	radv: Use v4i32 variant of llvm.SI.load.const.	Bas Nieuwenhuizen	2017-06-30	1	-3/+1
\| \| \| \| \| \| \| \| \| \|	We apparently still used v16i8 .... As radeonsi doesn't use it with LLVM version checks I don't think we need them either. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	ac/nir: remove last remnants of v16i8	Dave Airlie	2017-06-28	3	-9/+3
\| \| \| \| \| \| \|	llvm doesn't need this workaround anymore. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac/nir: Use correct LLVM intrinsics for atomic ops on imageBuffers	Alex Smith	2017-06-28	1	-29/+34
\| \| \| \| \| \| \| \|	The buffer intrinsics should be used instead of the image ones. Signed-off-by: Alex Smith <[email protected]> Cc: <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: assert printfs will fit	James Legg	2017-06-28	1	-5/+12
\| \| \| \| \|	Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac/nir: Make intrinsic_name buffer long enough	James Legg	2017-06-28	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	When using cmpswap on an image, it was being truncated to llvm.amdgcn.image.atomic.cmpswa, with the coords type missing entirely. v2: Add stable CC CC: <[email protected]> Reviewed-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
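The truncation can be reproduced with `snprintf`, which reports the length it wanted to write. This is a sketch only: the helper name, the format string, and the `v4i32` suffix are assumptions for illustration, as is the 32-byte buffer size used in the comments.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: build an intrinsic name into buf and report whether
 * it fit. snprintf returns the length the full string needs, so a result
 * >= size means the output was truncated. */
static int build_intrinsic_name_sketch(char *buf, size_t size,
                                       const char *op, const char *coords_type)
{
    int len = snprintf(buf, size, "llvm.amdgcn.image.atomic.%s.%s",
                       op, coords_type);
    return len >= 0 && (size_t)len < size;
}

/* With an assumed 32-byte buffer, "cmpswap" plus "v4i32" needs 38
 * characters, so only the first 31 are kept and the name ends in "cmpswa".
 * A 64-byte buffer holds the whole name. */
```

Checking the return value, for example with an `assert`, turns a silent truncation into an immediate failure.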
*	ac/nir: convert emit helpers to ac_llvm_context	Nicolai Hähnle	2017-06-27	1	-117/+118
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac/nir: remove unused nir_to_llvm_context::has_ddxy	Nicolai Hähnle	2017-06-27	1	-2/+0
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac/nir: implement nir_op_f2b	Nicolai Hähnle	2017-06-27	1	-0/+12
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac/nir: implement nir_op_{b2i,i2b}	Nicolai Hähnle	2017-06-27	1	-0/+20
\| \| \| \| \| \| \|	Booleans in NIR are ~0 for true, b2i returns 0/1. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
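The two conversions can be modelled in C on 32-bit values, assuming true is represented as all bits set (~0) as the message states. The names are hypothetical and only illustrate the value mapping.

```c
#include <stdint.h>

/* Assumed representation of a true boolean: all 32 bits set. */
#define BOOL32_TRUE_SKETCH 0xFFFFFFFFu

/* Hypothetical b2i model: map a ~0/0 boolean to the integer 1/0. */
static uint32_t b2i_sketch(uint32_t b)
{
    return b & 1u;
}

/* Hypothetical i2b model: map any non-zero integer to ~0, zero to 0. */
static uint32_t i2b_sketch(uint32_t i)
{
    return i != 0 ? BOOL32_TRUE_SKETCH : 0u;
}
```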
*	ac/nir: convert type helpers to ac_llvm_context	Nicolai Hähnle	2017-06-27	1	-95/+95
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac/llvm: fix type of second llvm.cttz.* parameter	Nicolai Hähnle	2017-06-27	1	-1/+1
\| \| \| \| \| \| \| \|	LLVM has required an i1 here for a long time. llvm.ctlz.* was fixed in commit edd23e06067 ("ac/llvm: fix various findMSB bugs"). Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac/shader_info: fix a comment	Nicolai Hähnle	2017-06-27	1	-2/+6
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac: add ac_llvm_context::v8i32	Nicolai Hähnle	2017-06-27	2	-0/+2
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	ac: add ac_llvm_context::{i,f}32_{0,1}	Nicolai Hähnle	2017-06-27	2	-0/+10
\| \| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>