summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* gallium: add TGSI opcodes TEX_LZ and TXF_LZMarek Olšák2017-03-154-5/+39
| | | | for better code generation in radeonsi
* gallium: add PIPE_CAP_TGSI_TEX_TXF_LZMarek Olšák2017-03-1517-0/+18
|
* radeonsi: disable sinking common instructions down to the end blockSamuel Pitoiset2017-03-151-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | Initially this was a workaround for a bug introduced in LLVM 4.0 in the SimplifyCFG pass that caused image instrinsics to disappear (because they were badly sunk). Finally, this is a win because it decreases SGPR spilling and increases the number of waves a bit. Although, shader-db results are good I think we might want to remove it in the future once the issue is fixed. For now, enable it for LLVM >= 4.0. This also fixes a rendering issue with the speedometer in Dirt Rally. More information can be found here https://reviews.llvm.org/D26348. Thanks to Dave Airlie for the patch. v2: - add a FIXME comment - use if (HAVE_LLVM >= 0x0400) instead Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99484 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97988 Signed-off-by: Samuel Pitoiset <[email protected]> Cc: 17.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* tgsi: add missing compute shader entry in tgsi_get_processor_name()Samuel Pitoiset2017-03-151-0/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: clean up tex_fetch_ptrs()Samuel Pitoiset2017-03-151-6/+4
| | | | | | | | Will also help when the src sampler register will be TGSI_FILE_CONSTANT for bindless. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* configure.ac: bump pthread-stubs requirementEmil Velikov2017-03-151-1/+1
| | | | | | | | | | | | | On platforms that require it, we bump the requirement to 0.4 or later. Due to an issue with the project [design] any version earlier than it, is bound to cause issues. For the specifics see the pthread-stubs README Cc: Uli Schlachter <[email protected]> Cc: Jonathan Gray <[email protected]> Cc: Jean-Sébastien Pédron <[email protected]> Cc: François Tigeot <[email protected]> Cc: Tobias Nygren <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* glx: don't expose systemTimeExtension for DRI2/DRI3/DRISWEmil Velikov2017-03-153-4/+0
| | | | | | Used/applicable to only dri1 drivers. Signed-off-by: Emil Velikov <[email protected]>
* anv: do not open random render node(s)Emil Velikov2017-03-152-17/+39
| | | | | | | | | | | | | | | drmGetDevices2() provides us with enough flexibility to build heuristics upon. Opening a random node on the other hand will wake up the device, regardless if it's the one we're interested or not. v2: Rebase, explicitly require/check for libdrm v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia) v4: Rebase Cc: Jason Ekstrand <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (v1) Tested-by: Mike Lothian <[email protected]>
* radv: do not open random render node(s)Emil Velikov2017-03-151-12/+36
| | | | | | | | | | | | | | | | drmGetDevices2() provides us with enough flexibility to build heuristics upon. Opening a random node on the other hand will wake up the device, regardless if it's the one we're interested or not. v2: Rebase. v3: Return VK_ERROR_INCOMPATIBLE_DRIVER for no devices (Ilia) Cc: Michel Dänzer <[email protected]> Cc: Dave Airlie <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1) Reviewed-by: Eric Engestrom <[email protected]> (v1) Tested-by: Mike Lothian <[email protected]>
* radv/winsys: use drmGetDevice2 APIEmil Velikov2017-03-152-2/+3
| | | | | | | | | | | | | Analogous to previous commit v2: Add explicit require_libdrm check. Cc: Dave Airlie <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> (v1) Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1) Reviewed-by: Eric Engestrom <[email protected]> (v1) Tested-by: Mike Lothian <[email protected]>
* winsys/amdgpu: use drmGetDevice2 APIEmil Velikov2017-03-151-2/+2
| | | | | | | | | | | Analogous to previous commit Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98502 Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Mike Lothian <[email protected]>
* loader: use drmGetDevice[s]2 APIEmil Velikov2017-03-151-3/+3
| | | | | | | | | | | | By this allows us to fetch the device list/info w/o the revision field. At the moment retrieving the latter wakes up the device. Note: kernel patch to resolve that should be in 4.10. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Mike Lothian <[email protected]>
* autoconf/scons: bump libdrm to 2.4.75Emil Velikov2017-03-152-2/+2
| | | | | | | | | | | We'll be using the drmGetDevice[s]2 API in src/loader with next patch. v2: Rebase. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> (v1) Reviewed-by: Eric Engestrom <[email protected]> (v1) Tested-by: Mike Lothian <[email protected]>
* util/sha1: drop _mesa_sha1_{update, format} return typeEmil Velikov2017-03-154-16/+14
| | | | | | | | | Unused/unchecked by any of the callers. v2: Fix the glsl cases that have crept in since v1 Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Grazvydas Ignotas <[email protected]>
* util/sha1: rework _mesa_sha1_{init,final}Emil Velikov2017-03-156-64/+44
| | | | | | | | | | | | Rather than having an extra memory allocation [that we currently do not and act accordingly] just make the API take an pointer to a stack allocated instance. This and follow-up steps will effectively make the _mesa_sha1_foo simple define/inlines around their SHA1 counterparts. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Grazvydas Ignotas <[email protected]>
* util/sha1: add non-typedef name for the SHA1_CTX structEmil Velikov2017-03-152-1/+4
| | | | | | | | | | Using typedef(s) is not always the answer and makes it harder for people to do clever (or one might call nasty) things with the code. Add a struct name which we will use with follow-up commit. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Grazvydas Ignotas <[email protected]>
* radv: Remove unused descriptor set field.Bas Nieuwenhuizen2017-03-151-1/+0
| | | | | | Trivial. Signed-off-by: Bas Nieuwenhuizen <[email protected]>
* r600: refactor binding code for attach buffer to CB.Dave Airlie2017-03-151-33/+78
| | | | | | | | | This refactors out the code and fixes it up to be used for images later. It uses the code in the current RAT binding for compute. Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: refactor out CB setup.Dave Airlie2017-03-151-104/+143
| | | | | | | | | This moves the code to create CB info out into a separate function so it can be reused in images code to create RATs. Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: refactor texture resource words setup code.Dave Airlie2017-03-151-88/+131
| | | | | | | | This refactors out the code to setup a texture resource so we can reuse it later from the images code. Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600: factor out the code to initialise a buffer resource.Dave Airlie2017-03-151-29/+51
| | | | | | | | | | This takes the code required to initialise a buffer resource out of the texture buffer code, into it's own function. This is going to be used for the image support later. Reviewed-by: Edward O'Callaghan <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600g: make framebuffer atom rely on dual src blend state.Dave Airlie2017-03-154-2/+7
| | | | | | | | In order to make ARB_shader_image_load_store, we have to share the CB space with RATs, so we should only steal the dual src space if we have dual src enabled. Signed-off-by: Dave Airlie <[email protected]>
* intel/debug: Add a common INTEL_DEBUG=nohiz optionJason Ekstrand2017-03-145-6/+4
| | | | | | | | The GL driver had a driconf option (which doesn't make much sense) and the Vulkan driver had a hand-rolled environment variable. Instead, let's tie both into the INTEL_DEBUG mechanism and unify things. Reviewed-by: Topi Pohjolainen <[email protected]>
* anv/image: Move handling of INTEL_VK_HIZJason Ekstrand2017-03-141-2/+2
| | | | | | | This makes it so that you don't get an "Implement gen7 HiZ" perf warning when you manually disable HiZ on gen8. Reviewed-by: Topi Pohjolainen <[email protected]>
* radv: trivial tidy upsTimothy Arceri2017-03-152-5/+2
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* util/disk_cache: scale cache according to filesystem sizeAlan Swanson2017-03-151-3/+8
| | | | | | | | Select higher of current 1G default or 10% of filesystem where cache is located. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Grazvydas Ignotas <[email protected]>
* util/disk_cache: actually enforce cache sizeAlan Swanson2017-03-152-4/+24
| | | | | | | | | | Currently only a one in one out eviction so if at max_size and cache files were to constantly increase in size then so would the cache. Restrict to limit of 8 evictions per new cache entry. V2: (Timothy Arceri) fix make check tests Reviewed-by: Grazvydas Ignotas <[email protected]>
* util/disk_cache: use LRU eviction rather than random evictionAlan Swanson2017-03-151-43/+34
| | | | | | | | | | | Still using fast random selection of two-character subdirectory in which to check cache files rather than scanning entire cache. v2: Factor out double strlen call v3: C99 declaration of variables where used Reviewed-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* util/disk_cache: don't fallback to an empty cache dir on evictTimothy Arceri2017-03-151-6/+27
| | | | | | | | | | If we fail to randomly select a two letter cache dir, don't select an empty dir on fallback. In real world use we should never hit the fallback path but it can be hit by tests when the cache is set to a very small max value. Reviewed-by: Grazvydas Ignotas <[email protected]>
* util/disk_cache: use a thread queue to write to shader cacheTimothy Arceri2017-03-152-13/+75
| | | | | | | | | | | | | | This should help reduce any overhead added by the shader cache when programs are not found in the cache. To avoid creating any special function just for the sake of the tests we add a one second delay whenever we call dick_cache_put() to give it time to finish. V2: poll for file when waiting for thread in test V3: fix poll delay to really be 100ms, and simplify the wait function Reviewed-by: Grazvydas Ignotas <[email protected]>
* util/disk_cache: add helpers for creating/destroying disk cache put jobsTimothy Arceri2017-03-151-0/+40
| | | | | | | V2: Make a copy of the data so we don't have to worry about it being freed before we are done compressing/writing. Reviewed-by: Grazvydas Ignotas <[email protected]>
* util/disk_cache: add thread queue to disk cacheTimothy Arceri2017-03-151-1/+15
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Grazvydas Ignotas <[email protected]>
* radv/ac: workaround regression in llvm 4.0 releaseDave Airlie2017-03-151-1/+12
| | | | | | | | | | | | | | | LLVM 4.0 released with a pretty messy regression, that hopefully get fixed in the future. This work around was proposed by Tom, and it fixes the CTS regressions here at least, I'm not sure if this will cause any major side effects, but correctness over speed and all that. radeonsi should possibly consider the same workaround until an llvm fix can be found. Acked-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: gather4 cube workaround integerDave Airlie2017-03-151-1/+71
| | | | | | | | | | | | | | | | | | | | This fix is extracted from amdgpu-pro shader traces. It appears the gather4 workaround for integer types doesn't work for cubes, so instead if forces a float scaled sample, then converts to integer. It modifies the descriptor before calling the gather. This also produces some ugly asm code for reasons specified in the patch, llvm could probably do better than dumping sgprs to vgprs. This fixes: dEQP-VK.glsl.texture_gather.basic.cube.rgba8* Acked-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Set driver version to mesa version;Bas Nieuwenhuizen2017-03-151-1/+23
| | | | | | | | | | | | | I couldn't really find an encoding in the spec. I'm not sure it prescribes VK_MAKE_VERSION format, but vulkan.gpuinfo.org interprets it that way by default. vulkaninfo gives the raw number, so we could alternatively do something like 17001000, but that doesn't show up right on vulkan.gpuinfo.org again. Looking at that site, the -pro driver also uses VK_MAKE_VERSION, so keeping consistency is probably best. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]>
* radv: Increase api version to 1.0.42.Bas Nieuwenhuizen2017-03-151-1/+1
| | | | | | | | | I've skimmed to changes from 1.0.5 to 1.0.42 and I think we have all changes. We're still not conformant ofcourse, but this should not regress stuff, Signed-off-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]>
* util/vk: Add helpers for finding an extension structJason Ekstrand2017-03-151-0/+17
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: Flush before copying with PKT3_WRITE_DATA in CmdUpdateBufferAlex Smith2017-03-141-0/+2
| | | | | | | | | | | | | | | Need to flush before updating the buffer to ensure that the copy is ordered after previous accesses (assuming the app has performed the appropriate barriers). This fixes potential issues due to draws prior to an update reading the new buffer content, despite having the necessary barriers between them. Signed-off-by: Alex Smith <[email protected]> Cc: 17.0 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Emit cache flushes before CP DMA.Bas Nieuwenhuizen2017-03-141-0/+3
| | | | | | | | The flushes could be due to TRANSFER barriers. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Cc: 17.0 <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* Convert sed(1) syntax to be compatible with FreeBSD and OpenBSDJan Beich2017-03-141-10/+10
| | | | | | | | | | | | BSD regex library doesn't support extended RE escapes (e.g. \+) and shorthand character classes (e.g. \s, \S) and SVR4-style word delimiters[1] (on DragonFly and NetBSD). Both GNU and BSD sed support -E and -r to enable extended RE but OS X still lacks -r. [1] https://www.illumos.org/issues/516 Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Eric Engestrom <[email protected]> (GNU sed)
* anv: Properly enumerate physical devices when none are presentJason Ekstrand2017-03-141-2/+5
|
* nir/constant_expressions: Refactor helper functionsJason Ekstrand2017-03-141-24/+27
| | | | | | | | Apart from avoiding some unneeded size cases, this shouldn't have any actual functional impact. Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Rework conversion opcodesJason Ekstrand2017-03-1422-308/+218
| | | | | | | | | | | | | | | | | | | | | | | | The NIR story on conversion opcodes is a mess. We've had way too many of them, naming is inconsistent, and which ones have explicit sizes was sort-of random. This commit re-organizes things and makes them all consistent: - All non-bool conversion opcodes now have the explicit size in the destination and are named <src_type>2<dst_type><size>. - Integer <-> integer conversion opcodes now only come in i2i and u2u forms (i2u and u2i have been removed) since the only difference between the different integer conversions is whether or not they sign-extend when up-converting. - Boolean conversion opcodes all have the explicit size on the bool and are named <src_type>2<dst_type>. Making things consistent also allows nir_type_conversion_op to be moved to nir_opcodes.c and auto-generated using mako. This will make adding int8, int16, and float16 versions much easier when the time comes. Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Re-arrange conversion operationsJason Ekstrand2017-03-141-36/+31
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/vec4: Get rid of the type parameter from to/from_doubleJason Ekstrand2017-03-142-24/+15
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* glsl/nir: Use nir_type_conversion_opJason Ekstrand2017-03-141-37/+32
| | | | | | Using the helper is way better than hand-coding the universe. Reviewed-by: Eric Anholt <[email protected]>
* nir: Rewrite nir_type_conversion_opJason Ekstrand2017-03-141-63/+92
| | | | | | | | | The original version was very convoluted and tried way too hard to not just have the nested switch statement that it needs. Let's just write the obvious code and then we know it's correct. This fixes a bunch of missing cases particularly with int64. Reviewed-by: Plamena Manolova <[email protected]>
* nir: Add a get_nir_type_for_glsl_base_type helperJason Ekstrand2017-03-141-2/+8
| | | | Reviewed-by: Eric Anholt <[email protected]>
* nir/validate: Rework ALU bit-size rule validationJason Ekstrand2017-03-141-32/+33
| | | | | | | | | | | The original bit-size validation wasn't capable of properly dealing with instructions with variable bit sizes. An attempt was made to handle it by looking at source and destinations but, because the validation was done in validate_alu_(src|dest), it didn't really have the needed information. The new validation code is much more straightforward and should be more correct. Reviewed-by: Eric Anholt <[email protected]>
* nir/validate: Validate that bit sizes and components always matchJason Ekstrand2017-03-141-38/+63
| | | | | | | | | | | | | We've always required bit sizes to match but the rules for number of components have been a bit loose. You've never been allowed to source from something with less components than you consume, but more has always been fine. This changes the validator to require that they match exactly. The fact that they don't always match has been a source of confusion in NIR for quite some time and it's time we got rid of it. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]>