mesa.git

	Commit message	Author	Age	Files	Lines
*	anv: do not open random render node(s)	Emil Velikov	2017-03-15	2	-17/+39
	drmGetDevices2() provides us with enough flexibility to build heuristics upon. Opening a random node, on the other hand, will wake up the device, regardless of whether it's the one we're interested in.
	v2: Rebase, explicitly require/check for libdrm
	v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia)
	v4: Rebase
	Cc: Jason Ekstrand <[email protected]>
	Signed-off-by: Emil Velikov <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]> (v1)
	Tested-by: Mike Lothian <[email protected]>
*	radv: do not open random render node(s)	Emil Velikov	2017-03-15	1	-12/+36
	drmGetDevices2() provides us with enough flexibility to build heuristics upon. Opening a random node, on the other hand, will wake up the device, regardless of whether it's the one we're interested in.
	v2: Rebase.
	v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia)
	Cc: Michel Dänzer <[email protected]>
	Cc: Dave Airlie <[email protected]>
	Signed-off-by: Emil Velikov <[email protected]>
	Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1)
	Reviewed-by: Eric Engestrom <[email protected]> (v1)
	Tested-by: Mike Lothian <[email protected]>
*	radv/winsys: use drmGetDevice2 API	Emil Velikov	2017-03-15	2	-2/+3
	Analogous to previous commit
	v2: Add explicit require_libdrm check.
	Cc: Dave Airlie <[email protected]>
	Signed-off-by: Emil Velikov <[email protected]>
	Reviewed-by: Michel Dänzer <[email protected]> (v1)
	Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1)
	Reviewed-by: Eric Engestrom <[email protected]> (v1)
	Tested-by: Mike Lothian <[email protected]>
*	winsys/amdgpu: use drmGetDevice2 API	Emil Velikov	2017-03-15	1	-2/+2
	Analogous to previous commit
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98502
	Signed-off-by: Emil Velikov <[email protected]>
	Reviewed-by: Michel Dänzer <[email protected]>
	Reviewed-by: Marek Olšák <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
	Tested-by: Mike Lothian <[email protected]>
*	loader: use drmGetDevice[s]2 API	Emil Velikov	2017-03-15	1	-3/+3
	This allows us to fetch the device list/info without the revision field. At the moment, retrieving the latter wakes up the device.
	Note: kernel patch to resolve that should be in 4.10.
	Signed-off-by: Emil Velikov <[email protected]>
	Reviewed-by: Michel Dänzer <[email protected]>
	Reviewed-by: Eric Engestrom <[email protected]>
	Tested-by: Mike Lothian <[email protected]>
*	autoconf/scons: bump libdrm to 2.4.75	Emil Velikov	2017-03-15	2	-2/+2
	We'll be using the drmGetDevice[s]2 API in src/loader with next patch.
	v2: Rebase.
	Signed-off-by: Emil Velikov <[email protected]>
	Reviewed-by: Michel Dänzer <[email protected]> (v1)
	Reviewed-by: Eric Engestrom <[email protected]> (v1)
	Tested-by: Mike Lothian <[email protected]>
*	util/sha1: drop _mesa_sha1_{update, format} return type	Emil Velikov	2017-03-15	4	-16/+14
	Unused/unchecked by any of the callers.
	v2: Fix the glsl cases that have crept in since v1
	Signed-off-by: Emil Velikov <[email protected]>
	Reviewed-by: Grazvydas Ignotas <[email protected]>
*	util/sha1: rework _mesa_sha1_{init,final}	Emil Velikov	2017-03-15	6	-64/+44
	Rather than having an extra memory allocation [that we currently do not check and act accordingly] just make the API take a pointer to a stack allocated instance. This and follow-up steps will effectively make the _mesa_sha1_foo simple define/inlines around their SHA1 counterparts.
	Signed-off-by: Emil Velikov <[email protected]>
	Reviewed-by: Grazvydas Ignotas <[email protected]>
*	util/sha1: add non-typedef name for the SHA1_CTX struct	Emil Velikov	2017-03-15	2	-1/+4
	Using typedef(s) is not always the answer and makes it harder for people to do clever (or one might call nasty) things with the code. Add a struct name which we will use with follow-up commit.
	Signed-off-by: Emil Velikov <[email protected]>
	Reviewed-by: Grazvydas Ignotas <[email protected]>
*	radv: Remove unused descriptor set field.	Bas Nieuwenhuizen	2017-03-15	1	-1/+0
	Trivial.
	Signed-off-by: Bas Nieuwenhuizen <[email protected]>
*	r600: refactor binding code for attach buffer to CB.	Dave Airlie	2017-03-15	1	-33/+78
	This refactors out the code and fixes it up to be used for images later. It uses the code in the current RAT binding for compute.
	Reviewed-by: Edward O'Callaghan <[email protected]>
	Signed-off-by: Dave Airlie <[email protected]>
*	r600: refactor out CB setup.	Dave Airlie	2017-03-15	1	-104/+143
	This moves the code to create CB info out into a separate function so it can be reused in images code to create RATs.
	Reviewed-by: Edward O'Callaghan <[email protected]>
	Signed-off-by: Dave Airlie <[email protected]>
*	r600: refactor texture resource words setup code.	Dave Airlie	2017-03-15	1	-88/+131
	This refactors out the code to set up a texture resource so we can reuse it later from the images code.
	Reviewed-by: Edward O'Callaghan <[email protected]>
	Signed-off-by: Dave Airlie <[email protected]>
*	r600: factor out the code to initialise a buffer resource.	Dave Airlie	2017-03-15	1	-29/+51
	This takes the code required to initialise a buffer resource out of the texture buffer code, into its own function. This is going to be used for the image support later.
	Reviewed-by: Edward O'Callaghan <[email protected]>
	Signed-off-by: Dave Airlie <[email protected]>
*	r600g: make framebuffer atom rely on dual src blend state.	Dave Airlie	2017-03-15	4	-2/+7
	In order to support ARB_shader_image_load_store, we have to share the CB space with RATs, so we should only steal the dual src space if we have dual src enabled.
	Signed-off-by: Dave Airlie <[email protected]>
*	intel/debug: Add a common INTEL_DEBUG=nohiz option	Jason Ekstrand	2017-03-14	5	-6/+4
	The GL driver had a driconf option (which doesn't make much sense) and the Vulkan driver had a hand-rolled environment variable. Instead, let's tie both into the INTEL_DEBUG mechanism and unify things.
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	anv/image: Move handling of INTEL_VK_HIZ	Jason Ekstrand	2017-03-14	1	-2/+2
	This makes it so that you don't get an "Implement gen7 HiZ" perf warning when you manually disable HiZ on gen8.
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	radv: trivial tidy ups	Timothy Arceri	2017-03-15	2	-5/+2
	Reviewed-by: Edward O'Callaghan <[email protected]>
*	util/disk_cache: scale cache according to filesystem size	Alan Swanson	2017-03-15	1	-3/+8
	Select the higher of the current 1G default or 10% of the filesystem where the cache is located.
	Acked-by: Timothy Arceri <[email protected]>
	Reviewed-by: Grazvydas Ignotas <[email protected]>
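The sizing rule in this commit message (the larger of the 1G default and 10% of the filesystem) can be sketched roughly as below. This is only an illustration: the function name `pick_max_cache_size`, the plain byte-count parameter, and reading "1G" as 2^30 bytes are assumptions, not taken from the Mesa sources.

```c
#include <stdint.h>

/* Hypothetical helper, not Mesa's actual code: return the larger of the
 * 1G default (assumed here to mean 2^30 bytes) and 10% of the size of the
 * filesystem that holds the cache. */
static uint64_t
pick_max_cache_size(uint64_t fs_total_bytes)
{
   const uint64_t default_size = 1024ull * 1024 * 1024;
   const uint64_t tenth_of_fs = fs_total_bytes / 10;

   return tenth_of_fs > default_size ? tenth_of_fs : default_size;
}
```

A caller would presumably obtain the filesystem size from something like POSIX `statvfs()` before applying the rule; that step is left out of the sketch.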
*	util/disk_cache: actually enforce cache size	Alan Swanson	2017-03-15	2	-4/+24
	Currently eviction is only one-in, one-out, so if the cache is at max_size and cache files were to constantly increase in size, then so would the cache. Restrict to a limit of 8 evictions per new cache entry.
	V2: (Timothy Arceri) fix make check tests
	Reviewed-by: Grazvydas Ignotas <[email protected]>
*	util/disk_cache: use LRU eviction rather than random eviction	Alan Swanson	2017-03-15	1	-43/+34
	Still using fast random selection of two-character subdirectory in which to check cache files rather than scanning entire cache.
	v2: Factor out double strlen call
	v3: C99 declaration of variables where used
	Reviewed-by: Grazvydas Ignotas <[email protected]>
	Reviewed-by: Timothy Arceri <[email protected]>
*	util/disk_cache: don't fallback to an empty cache dir on evict	Timothy Arceri	2017-03-15	1	-6/+27
	If we fail to randomly select a two letter cache dir, don't select an empty dir on fallback. In real world use we should never hit the fallback path but it can be hit by tests when the cache is set to a very small max value.
	Reviewed-by: Grazvydas Ignotas <[email protected]>
*	util/disk_cache: use a thread queue to write to shader cache	Timothy Arceri	2017-03-15	2	-13/+75
	This should help reduce any overhead added by the shader cache when programs are not found in the cache. To avoid creating any special function just for the sake of the tests we add a one second delay whenever we call disk_cache_put() to give it time to finish.
	V2: poll for file when waiting for thread in test
	V3: fix poll delay to really be 100ms, and simplify the wait function
	Reviewed-by: Grazvydas Ignotas <[email protected]>
*	util/disk_cache: add helpers for creating/destroying disk cache put jobs	Timothy Arceri	2017-03-15	1	-0/+40
	V2: Make a copy of the data so we don't have to worry about it being freed before we are done compressing/writing.
	Reviewed-by: Grazvydas Ignotas <[email protected]>
*	util/disk_cache: add thread queue to disk cache	Timothy Arceri	2017-03-15	1	-1/+15
	Reviewed-by: Marek Olšák <[email protected]>
	Reviewed-by: Grazvydas Ignotas <[email protected]>
*	radv/ac: workaround regression in llvm 4.0 release	Dave Airlie	2017-03-15	1	-1/+12
	LLVM 4.0 released with a pretty messy regression, that will hopefully get fixed in the future. This workaround was proposed by Tom, and it fixes the CTS regressions here at least. I'm not sure if this will cause any major side effects, but correctness over speed and all that. radeonsi should possibly consider the same workaround until an llvm fix can be found.
	Acked-by: Bas Nieuwenhuizen <[email protected]>
	Signed-off-by: Dave Airlie <[email protected]>
*	radv/ac: gather4 cube workaround integer	Dave Airlie	2017-03-15	1	-1/+71
	This fix is extracted from amdgpu-pro shader traces. It appears the gather4 workaround for integer types doesn't work for cubes, so instead it forces a float scaled sample, then converts to integer. It modifies the descriptor before calling the gather. This also produces some ugly asm code for reasons specified in the patch, llvm could probably do better than dumping sgprs to vgprs.
	This fixes: dEQP-VK.glsl.texture_gather.basic.cube.rgba8*
	Acked-by: Bas Nieuwenhuizen <[email protected]>
	Signed-off-by: Dave Airlie <[email protected]>
*	radv: Set driver version to mesa version;	Bas Nieuwenhuizen	2017-03-15	1	-1/+23
	I couldn't really find an encoding in the spec. I'm not sure it prescribes VK_MAKE_VERSION format, but vulkan.gpuinfo.org interprets it that way by default. vulkaninfo gives the raw number, so we could alternatively do something like 17001000, but that doesn't show up right on vulkan.gpuinfo.org again. Looking at that site, the -pro driver also uses VK_MAKE_VERSION, so keeping consistency is probably best.
	Signed-off-by: Bas Nieuwenhuizen <[email protected]>
	Acked-by: Dave Airlie <[email protected]>
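For readers unfamiliar with the VK_MAKE_VERSION format mentioned above, here is a minimal sketch of the packing it performs. The bit layout (major in bits 31..22, minor in bits 21..12, patch in bits 11..0) matches the macro in the Vulkan 1.0 headers; the function names below are made up for illustration, and the 17.1.0 value used in the usage note is only an example, not a statement about what radv reports.

```c
#include <stdint.h>

/* Same bit layout as the Vulkan 1.0 header's VK_MAKE_VERSION macro:
 * major in bits 31..22, minor in bits 21..12, patch in bits 11..0.
 * The function names here are hypothetical. */
static uint32_t
make_version(uint32_t major, uint32_t minor, uint32_t patch)
{
   return (major << 22) | (minor << 12) | patch;
}

/* Unpack the three fields again, as a tool decoding the value would. */
static uint32_t version_major(uint32_t v) { return v >> 22; }
static uint32_t version_minor(uint32_t v) { return (v >> 12) & 0x3ff; }
static uint32_t version_patch(uint32_t v) { return v & 0xfff; }
```

With this layout, an example version 17.1.0 packs to the raw number 71307264, which is what a tool printing the undecoded field would show, whereas a decimal scheme such as 17001000 would decode to nonsense under the same unpacking.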
*	radv: Increase api version to 1.0.42.	Bas Nieuwenhuizen	2017-03-15	1	-1/+1
	I've skimmed the changes from 1.0.5 to 1.0.42 and I think we have all changes. We're still not conformant of course, but this should not regress stuff.
	Signed-off-by: Bas Nieuwenhuizen <[email protected]>
	Acked-by: Dave Airlie <[email protected]>
*	util/vk: Add helpers for finding an extension struct	Jason Ekstrand	2017-03-15	1	-0/+17
	Reviewed-by: Dave Airlie <[email protected]>
*	radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBuffer	Alex Smith	2017-03-14	1	-0/+2
	Need to flush before updating the buffer to ensure that the copy is ordered after previous accesses (assuming the app has performed the appropriate barriers). This fixes potential issues due to draws prior to an update reading the new buffer content, despite having the necessary barriers between them.
	Signed-off-by: Alex Smith <[email protected]>
	Cc: 17.0 <[email protected]>
	Reviewed-by: Bas Nieuwenhuizen <[email protected]>
	Reviewed-by: Dave Airlie <[email protected]>
*	radv: Emit cache flushes before CP DMA.	Bas Nieuwenhuizen	2017-03-14	1	-0/+3
	The flushes could be due to TRANSFER barriers.
	Signed-off-by: Bas Nieuwenhuizen <[email protected]>
	Cc: 17.0 <[email protected]>
	Reviewed-by: Dave Airlie <[email protected]>
*	Convert sed(1) syntax to be compatible with FreeBSD and OpenBSD	Jan Beich	2017-03-14	1	-10/+10
	BSD regex library doesn't support extended RE escapes (e.g. \+) and shorthand character classes (e.g. \s, \S) and SVR4-style word delimiters[1] (on DragonFly and NetBSD). Both GNU and BSD sed support -E and -r to enable extended RE but OS X still lacks -r.
	[1] https://www.illumos.org/issues/516
	Reviewed-by: Eric Engestrom <[email protected]>
	Tested-by: Eric Engestrom <[email protected]> (GNU sed)
*	anv: Properly enumerate physical devices when none are present	Jason Ekstrand	2017-03-14	1	-2/+5
*	nir/constant_expressions: Refactor helper functions	Jason Ekstrand	2017-03-14	1	-24/+27
	Apart from avoiding some unneeded size cases, this shouldn't have any actual functional impact.
	Reviewed-by: Dylan Baker <[email protected]>
	Reviewed-by: Lionel Landwerlin <[email protected]>
*	nir: Rework conversion opcodes	Jason Ekstrand	2017-03-14	22	-308/+218
	The NIR story on conversion opcodes is a mess. We've had way too many of them, naming is inconsistent, and which ones have explicit sizes was sort-of random. This commit re-organizes things and makes them all consistent:
	 - All non-bool conversion opcodes now have the explicit size in the destination and are named <src_type>2<dst_type><size>.
	 - Integer <-> integer conversion opcodes now only come in i2i and u2u forms (i2u and u2i have been removed) since the only difference between the different integer conversions is whether or not they sign-extend when up-converting.
	 - Boolean conversion opcodes all have the explicit size on the bool and are named <src_type>2<dst_type>.
	Making things consistent also allows nir_type_conversion_op to be moved to nir_opcodes.c and auto-generated using mako. This will make adding int8, int16, and float16 versions much easier when the time comes.
	Reviewed-by: Eric Anholt <[email protected]>
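The <src_type>2<dst_type><size> naming scheme for non-bool conversions can be illustrated with a tiny formatter. This is a sketch only: `conversion_op_name` and its one-letter type codes ('f', 'i', 'u') are assumptions made for the example and are not part of NIR's API, and the boolean forms (which carry no size suffix on the destination) are not covered.

```c
#include <stdio.h>

/* Hypothetical formatter, not NIR code: writes a name of the form
 * "<src_type>2<dst_type><size>" into buf, e.g. 'f', 'i', 32 -> "f2i32".
 * Returns the number of characters snprintf would have written. */
static int
conversion_op_name(char *buf, size_t len, char src_type, char dst_type,
                   unsigned dst_bit_size)
{
   return snprintf(buf, len, "%c2%c%u", src_type, dst_type, dst_bit_size);
}
```

Under this scheme a float-to-32-bit-int conversion is spelled f2i32, while an integer widening to 64 bits is i2i64 or u2u64 depending on whether it sign-extends.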
*	i965/fs: Re-arrange conversion operations	Jason Ekstrand	2017-03-14	1	-36/+31
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965/vec4: Get rid of the type parameter from to/from_double	Jason Ekstrand	2017-03-14	2	-24/+15
	Reviewed-by: Topi Pohjolainen <[email protected]>
*	glsl/nir: Use nir_type_conversion_op	Jason Ekstrand	2017-03-14	1	-37/+32
	Using the helper is way better than hand-coding the universe.
	Reviewed-by: Eric Anholt <[email protected]>
*	nir: Rewrite nir_type_conversion_op	Jason Ekstrand	2017-03-14	1	-63/+92
	The original version was very convoluted and tried way too hard to not just have the nested switch statement that it needs. Let's just write the obvious code and then we know it's correct. This fixes a bunch of missing cases particularly with int64.
	Reviewed-by: Plamena Manolova <[email protected]>
*	nir: Add a get_nir_type_for_glsl_base_type helper	Jason Ekstrand	2017-03-14	1	-2/+8
	Reviewed-by: Eric Anholt <[email protected]>
*	nir/validate: Rework ALU bit-size rule validation	Jason Ekstrand	2017-03-14	1	-32/+33
	The original bit-size validation wasn't capable of properly dealing with instructions with variable bit sizes. An attempt was made to handle it by looking at source and destinations but, because the validation was done in validate_alu_(src|dest), it didn't really have the needed information. The new validation code is much more straightforward and should be more correct.
	Reviewed-by: Eric Anholt <[email protected]>
*	nir/validate: Validate that bit sizes and components always match	Jason Ekstrand	2017-03-14	1	-38/+63
	We've always required bit sizes to match but the rules for number of components have been a bit loose. You've never been allowed to source from something with fewer components than you consume, but more has always been fine. This changes the validator to require that they match exactly. The fact that they don't always match has been a source of confusion in NIR for quite some time and it's time we got rid of it.
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Connor Abbott <[email protected]>
*	nir: Make image_size a variable-width intrinsic	Jason Ekstrand	2017-03-14	3	-11/+16
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Connor Abbott <[email protected]>
*	i965/fs: Use num_components from the SSA def in image intrinsics	Jason Ekstrand	2017-03-14	1	-2/+1
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Connor Abbott <[email protected]>
*	nir/lower_tex: Use tex_instr_dest_size for txs destinations	Jason Ekstrand	2017-03-14	1	-1/+2
	Using coord_components of the source texture is correct for everything except cube maps where it's off by one.
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Connor Abbott <[email protected]>
*	nir/spirv: Restrict the number of channels in texture coordinates	Jason Ekstrand	2017-03-14	1	-1/+2
	Some SPIR-V texturing instructions pack more than the texture coordinate into the coordinate source. We need to mask off the unused channels.
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Connor Abbott <[email protected]>
*	nir/copy_prop: Respect the source's number of components	Jason Ekstrand	2017-03-14	1	-33/+96
	In the near future we are going to require that the num_components in a src dereference match the num_components of the SSA value being dereferenced. To do that, we need copy_prop to not remove our MOVs from a larger SSA value into an instruction that uses fewer channels.
	Because we suddenly have to know how many components each source has, this makes the pass a bit more complicated. Fortunately, copy propagation is the only pass that cares about the number of components read by any given source so it's fairly contained.
	Shader-db results on Sky Lake:
	   total instructions in shared programs: 13318947 -> 13320265 (0.01%)
	   instructions in affected programs: 260633 -> 261951 (0.51%)
	   helped: 324
	   HURT: 1027
	Looking through the hurt programs, about a dozen are hurt by 3 instructions and the rest are all hurt by 2 instructions. From a spot-check of the shaders, the story is always the same: They get a vec4 from somewhere (frequently an input) and use the first two or three components as a texture coordinate. Because of the vector component mismatch, we have a mov or, more likely, a vecN sitting between the texture instruction and the input. This means that the back-end inserts a bunch of MOVs and split_virtual_grfs() goes to town. Because the texture coordinate is also used by some other calculation, register coalesce can't combine them back together and we end up with an extra 2 MOV instructions in our shader.
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Connor Abbott <[email protected]>
*	nir/intrinsics: Make load_barycentric_input take a 2-component coor	Jason Ekstrand	2017-03-14	1	-1/+3
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Connor Abbott <[email protected]>
	Cc: "17.0 13.0" <[email protected]>
*	anv/blorp: Only set a clear color for resolves if fast-cleared	Jason Ekstrand	2017-03-14	1	-1/+2
	Reviewed-by: Eric Anholt <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Connor Abbott <[email protected]>
	Cc: "17.0" <[email protected]>