Commit message | Author | Age | Files | Lines
* i965/sbe: fix active components for SSO programs with over 16 inputs | Iago Toral Quiroga | 2017-10-19 | 1 | -8/+2
    When we have up to 16 FS inputs, the SF unit will reorder our inputs to be consecutive. However, when we have more than 16, we need to read our inputs from the URB exactly as they have been output from the previous stage. This means that for SSO we have to consider whether we have URB padding due to unused input locations.

    Specifically, this affects gen9 active components programming: for things to work in scenarios with over 16 inputs that have padded regions, we need to ensure that we program active components for the padded regions too. If we don't do this, the hardware won't read the URB properly for inputs located after padded regions.

    Found empirically.

    Fixes (these also require a patch in CTS):
    KHR-GL45.enhanced_layouts.varying_locations
    KHR-GL45.enhanced_layouts.varying_array_locations

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Do not log a perf warning when mapping an idle bo | Chris Wilson | 2017-10-19 | 1 | -2/+3
    We only want to scare the user away from causing a GPU stall by mapping a busy bo. The time taken to instantiate the set of pages for a buffer and to mmap them is unavoidable, and flagging an idle bo as being busy is "crying wolf".

    Reported-by: Tvrtko Ursulin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use a union to bitcast a float | Matt Turner | 2017-10-18 | 1 | -1/+2
    ... which does not break C's aliasing rules.
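    The union approach mentioned above can be sketched roughly as follows. This is not the code from the commit: the helper name float_bits() is hypothetical, and the sketch assumes a 32-bit IEEE 754 float.

```c
#include <assert.h>
#include <stdint.h>

/* The bitcast only makes sense if both types have the same size. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits");

/* Hypothetical helper (not from the commit): return the bit pattern of f.
 * Writing one union member and reading another is permitted in C99 and
 * later, whereas *(uint32_t *)&f would violate the strict aliasing rules. */
static inline uint32_t
float_bits(float f)
{
   union {
      float f;
      uint32_t u;
   } fi;

   fi.f = f;
   return fi.u;
}
```

    With IEEE 754 floats, float_bits(1.0f) yields 0x3f800000. A memcpy() between the two objects is another aliasing-safe option.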
* drirc: Group a few games in the glthread whitelist together. | Darren Salt | 2017-10-19 | 1 | -6/+21
    Signed-off-by: Marek Olšák <[email protected]>
* drirc: Enable glthread for more games (Saints Row 4 & Gat out of Hell). | Darren Salt | 2017-10-19 | 1 | -0/+6
    “Saints Row: Gat out of Hell” benefits from this on slower CPUs in that usage spikes on individual cores are avoided, which in turn makes it harder to hit a bug which causes broken audio and the game to hang on exit.

    “Saints Row IV” appears to be fine either way, but also exhibits the audio breakage bug: glthread is therefore being enabled on the grounds that it should make it a little harder to hit that bug.

    Signed-off-by: Marek Olšák <[email protected]>
* radv: reset dirty flags after flushing all states | Samuel Pitoiset | 2017-10-18 | 1 | -2/+2
    Move it to radv_cmd_buffer_flush_state() because if rasterizerDiscardEnable is true, the flags are not cleared.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not re-emit the index buffer for every draw call | Samuel Pitoiset | 2017-10-18 | 1 | -29/+28
    It can only be changed when CmdBindIndexBuffer() is called or when a secondary buffer is used. It does not always change in those cases, but let's re-emit the packets in these situations for now.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove useless mask operation in radv_cs_emit_draw_indexed_packet() | Samuel Pitoiset | 2017-10-18 | 1 | -1/+1
    This saves a few CPU cycles when CmdDrawIndexed() is used a lot.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Do not read from the disk cache with RADV_DEBUG=nocache. | Bas Nieuwenhuizen | 2017-10-18 | 1 | -1/+2
    Otherwise the flag is borderline useless.

    Reviewed-by: Timothy Arceri <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: Set active_stages after getting cached shaders | Alex Smith | 2017-10-18 | 1 | -1/+6
    Fixes: 7d45d22fdd2e ("radv: switch to using radv_create_shaders()")
    Signed-off-by: Alex Smith <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Don't free NIR shaders if tracing | Alex Smith | 2017-10-18 | 1 | -1/+1
    Fixes a crash while generating a hang report.

    Fixes: 7d45d22fdd2e ("radv: switch to using radv_create_shaders()")
    Signed-off-by: Alex Smith <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* Revert "egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}" | Marek Olšák | 2017-10-18 | 5 | -18/+31
    This reverts commit 8cb84c8477a57ed05d703669fee1770f31b76ae6.

    This fixes crashing shader-db/run.
* Revert "egl: drop EGL driver `name`" | Marek Olšák | 2017-10-18 | 5 | -1/+10
    This reverts commit 6414d6bd8d2897f4ba643357fe3037f3acd60879.

    This is needed to apply the next revert.
* st/mesa: set dimension for constants in ATI_fragment_shader | Miklós Máté | 2017-10-18 | 1 | -0/+4
    This fixes an assertion failure introduced by 30a2f0dfd46de.

    Fixes: 30a2f0dfd46 ("radeonsi: add an assertion that only
    Signed-off-by: Miklós Máté <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* st/osmesa: include u_inlines.h for pipe_resource_reference | Michel Dänzer | 2017-10-18 | 1 | -0/+1
    Fixes build failure due to unresolved symbol.

    Fixes: 7561da367bae "st/mesa: Initialize textures array in st_framebuffer_validate"

    Trivial.
* st/mesa: Initialize textures array in st_framebuffer_validate | Michel Dänzer | 2017-10-18 | 6 | -12/+7
    And just reference pipe_resources to it in the validate callbacks.

    Avoids pipe_resource leaks when st_framebuffer_validate ends up calling the validate callback multiple times, e.g. when a window is resized.

    v2:
    * Use generic stable tag instead of Fixes: tag, since the problem could already happen before the commit referenced in v1 (Thomas Hellstrom)
    * Use memset to initialize the array on the stack instead of allocating the array with os_calloc.

    Cc: [email protected]
    Reviewed-by: Thomas Hellstrom <[email protected]>
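    The leak-avoidance pattern described above can be illustrated with a small stand-alone sketch. It does not use the real gallium types: struct resource, resource_reference(), validate() and MAX_ATTACHMENTS are hypothetical stand-ins for pipe_resource, pipe_resource_reference() and the validate callback, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ATTACHMENTS 4 /* hypothetical array size */

/* Stand-in for a refcounted resource. */
struct resource {
   int refcount;
};

/* Point *dst at src and release whatever *dst referenced before. */
static void
resource_reference(struct resource **dst, struct resource *src)
{
   assert(dst != NULL);
   if (src)
      src->refcount++;
   if (*dst)
      (*dst)->refcount--;
   *dst = src;
}

/* Hypothetical validate callback; it may run more than once on the
 * same output array, e.g. when a window is resized. */
static void
validate(struct resource **out, struct resource *current, size_t count)
{
   for (size_t i = 0; i < count; i++)
      resource_reference(&out[i], &current[i]);
}

/* Zero the stack array, run the callback twice, then drop the references.
 * Returns the refcount of current[0] while the array still holds it. */
static int
validate_twice_peak_refcount(struct resource *current)
{
   struct resource *textures[MAX_ATTACHMENTS];

   /* All slots start out as NULL (all-bits-zero is assumed to be a
    * null pointer on the target platform). */
   memset(textures, 0, sizeof(textures));

   validate(textures, current, MAX_ATTACHMENTS);
   validate(textures, current, MAX_ATTACHMENTS);
   int peak = current[0].refcount;

   for (size_t i = 0; i < MAX_ATTACHMENTS; i++)
      resource_reference(&textures[i], NULL);

   return peak;
}
```

    Because each slot is replaced through the reference helper rather than overwritten, the second callback run releases the reference taken by the first, so the count does not grow with repeated calls.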
* egl: set UseFallback if LIBGL_ALWAYS_SOFTWARE is set | Eric Engestrom | 2017-10-18 | 4 | -6/+7
    Suggested-by: Emil Velikov <[email protected]>
    Signed-off-by: Eric Engestrom <[email protected]>
* egl: drop EGL driver `name` | Eric Engestrom | 2017-10-18 | 5 | -10/+1
    The "DRI2" name was reported as confusing when printing EGL info (one user reported thinking DRI3 was not working on his X server), and the only alternative is Haiku, which can only be used on a Haiku machine. The name therefore doesn't add any information that the user wouldn't know already, so let's just drop it.

    Cc: Kai Wasserbäch <[email protected]>
    Suggested-by: Emil Velikov <[email protected]>
    Related-to: b174a1ae720cb404738c ("egl: Simplify the "driver" interface")
    Signed-off-by: Eric Engestrom <[email protected]>
* egl: drop always-false TestOnly option | Eric Engestrom | 2017-10-18 | 5 | -18/+9
    Signed-off-by: Eric Engestrom <[email protected]>
* Fix the xf86vm meson dependency | Nicholas Miell | 2017-10-18 | 1 | -1/+1
    The pkg-config file is called xxf86vm.

    Signed-off-by: Nicholas Miell <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku} | Eric Engestrom | 2017-10-18 | 5 | -31/+18
    Note: dropping the EGL_BAD_ALLOC in egl_haiku because it's overwritten by the EGL_NOT_INITIALIZED in eglInitialize().

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* egl_dri2: drop dri2_egl_driver struct | Eric Engestrom | 2017-10-18 | 2 | -58/+50
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* egl_dri2: move glFlush out of struct dri2_egl_driver | Eric Engestrom | 2017-10-18 | 2 | -27/+22
    There's no reason to store this there; it doesn't depend on the driver.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* llvmpipe: handle shader sample mask output | Roland Scheidegger | 2017-10-18 | 1 | -2/+24
    This probably isn't all that useful for GL, but there are APIs where sample_mask is a valid output even without MSAA. Just discard the pixel if the sample_mask doesn't include the bit for sample 0.

    Reviewed-by: Brian Paul <[email protected]>
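    The discard rule described above amounts to a test of bit 0 of the shader-written mask. The following is a plain C sketch of that rule, not the LLVM IR that llvmpipe generates; sample0_covered() and apply_sample_mask() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Without MSAA only sample 0 exists, so a pixel is covered only if
 * bit 0 of its sample mask is set. */
static inline bool
sample0_covered(uint32_t sample_mask)
{
   return (sample_mask & 1u) != 0;
}

/* Hypothetical helper: kill every pixel whose shader-written mask
 * excludes sample 0. Pixels that were already dead stay dead.
 * Returns the number of pixels still alive afterwards. */
static size_t
apply_sample_mask(const uint32_t *sample_masks, bool *alive, size_t count)
{
   assert(sample_masks != NULL && alive != NULL);

   size_t survivors = 0;
   for (size_t i = 0; i < count; i++) {
      if (!sample0_covered(sample_masks[i]))
         alive[i] = false;
      if (alive[i])
         survivors++;
   }
   return survivors;
}
```

    A mask such as 0x2 sets a bit for a sample that does not exist without MSAA, so the pixel is discarded just as it would be for a mask of 0.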
* anv: Fix instance typos. | Vinson Lee | 2017-10-18 | 2 | -2/+2
    Fix build error.

      CC vulkan/vulkan_libvulkan_common_la-anv_device.lo
    In file included from vulkan/anv_device.c:33:0:
    vulkan/anv_device.c: In function ‘anv_AllocateMemory’:
    vulkan/anv_device.c:1562:37: error: ‘struct anv_device’ has no member named ‘instace’; did you mean ‘instance’?
       result = vk_errorf(device->instace, device,
       ^
    vulkan/anv_private.h:317:17: note: in definition of macro ‘vk_errorf’
       __vk_errorf(instance, obj, REPORT_OBJECT_TYPE(obj), error,\
       ^~~~~~~~

    Fixes: 9775894f1025 ("anv: Move size check from anv_bo_cache_import() to caller (v2)")
    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* mesa: fix trivial typo in _mesa_PixelMapusv() error string | Brian Paul | 2017-10-18 | 1 | -1/+1
    Signed-off-by: Brian Paul <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103323
* meson: move expat dependency where it's needed | Eric Engestrom | 2017-10-18 | 2 | -2/+2
    Suggested-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
    Signed-off-by: Eric Engestrom <[email protected]>
* automake: intel: move expat handling where it's used | Hongxu Jia | 2017-10-18 | 2 | -5/+2
    Linking libvulkan_intel.so can fail, due to unresolved references to libexpat.so.

    EXPAT_CFLAGS should be moved as well.

    Signed-off-by: Hongxu Jia <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* radv: don't create dummy fs when compiling compute stage | Timothy Arceri | 2017-10-18 | 1 | -1/+1
    Fixes: d1c9f30d7ff7 "radv: add radv_create_shaders() helper"
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use the dispatch initiator for indirect dispatches | Samuel Pitoiset | 2017-10-18 | 1 | -11/+13
    Missed that when I allowed waves to be launched out-of-order.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove XtoY_temps structs | Samuel Pitoiset | 2017-10-18 | 1 | -36/+26
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv: Install as Vulkan HAL module in Android.mk build | Tapani Pälli | 2017-10-18 | 1 | -1/+3
    Now that anvil fully implements the Vulkan HAL interface, we can install it as the vendor HAL module at /vendor/lib/hw/vulkan.${board}.so. To do so:
    - Rename LOCAL_MODULE to vulkan.$(TARGET_BOARD_PLATFORM).
    - Use LOCAL_PROPRIETARY_MODULE to install under vendor path.

    Tested by running different Sascha Willems demos on Android-IA.

    Signed-off-by: Tapani Pälli <[email protected]>
    [chadv: Extract this hunk from Tapani's patch, and embed it as stand-alone patch in my arc-vulkan series].
    Signed-off-by: Chad Versace <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Implement VK_ANDROID_native_buffer (v9) | Chad Versace | 2017-10-18 | 8 | -6/+459
    This implementation is correct (afaict), but takes two shortcuts regarding the import/export of Android sync fds.

    Shortcut 1. When Android calls vkAcquireImageANDROID to import a sync fd into a VkSemaphore or VkFence, the driver instead simply blocks on the sync fd, then puts the VkSemaphore or VkFence into the signalled state. Thanks to implicit sync, this produces correct behavior (with extra latency overhead, perhaps) despite its ugliness.

    Shortcut 2. When Android calls vkQueueSignalReleaseImageANDROID to export a collection of wait semaphores as a sync fd, the driver instead submits the semaphores to the queue, then returns sync fd -1, which informs the caller that no additional synchronization is needed. Again, thanks to implicit sync, this produces correct behavior (with extra batch submission overhead) despite its ugliness.

    I chose to take the shortcuts instead of properly importing/exporting the sync fds for two reasons:

    Reason 1. I've already tested this patch with dEQP and with demo apps. It works. I wanted to get the tested patches into the tree now, and polish the implementation afterwards.

    Reason 2. I want to run this on a 3.18 kernel (gasp!). In 3.18, i915 supports neither Android's sync_fence, nor upstream's sync_file, nor drm_syncobj. Again, I tested these patches on Android with a 3.18 kernel and they work.

    I plan to quickly follow up with patches that remove the shortcuts and properly import/export the sync fds.

    Non-Testing
    ===========
    I did not test at all using the Android.mk buildsystem. I may have broken it. Please test and review that.

    Testing
    =======
    I tested with 64-bit ARC++ on a Skylake Chromebook and a 3.18 kernel. The following pass (as of patchset v9):
    - a little spinning cube demo APK
    - several Sascha demos
    - dEQP-VK.info.*
    - dEQP-VK.api.wsi.android.* (except dEQP-VK.api.wsi.android.swapchain.*.image_usage, because dEQP wants to create swapchains with VK_IMAGE_USAGE_STORAGE_BIT)
    - dEQP-VK.api.smoke.*
    - dEQP-VK.api.info.instance.*
    - dEQP-VK.api.info.device.*

    v2:
    - Reject VkNativeBufferANDROID if the dma-buf's size is too small for the VkImage.
    - Stop abusing VkNativeBufferANDROID by passing it to vkAllocateMemory during vkCreateImage. Instead, directly import its dma-buf during vkCreateImage with anv_bo_cache_import(). [for jekstrand]
    - Rebase onto Tapani's VK_EXT_debug_report changes.
    - Drop `CPPFLAGS += $(top_srcdir)/include/android`. The dir does not exist.

    v3:
    - Delete duplicate #include "anv_private.h". [per Tapani]
    - Try to fix the Android-IA build in Android.vulkan.mk by following Tapani's example.

    v4:
    - Unset EXEC_OBJECT_ASYNC and set EXEC_OBJECT_WRITE on the imported gralloc buffer, just as we do for all other winsys buffers in anv_wsi.c. [found by Tapani]

    v5:
    - Really fix the Android-IA build by ensuring that Android.vulkan.mk uses Mesa's vulkan.h and not Android's. Insert -I$(MESA_TOP)/include before -Iframeworks/native/vulkan/include. [for Tapani]
    - In vkAcquireImageANDROID, submit signal operations to the VkSemaphore and VkFence. [for zhou]

    v6:
    - Drop copy-paste duplication in vkGetSwapchainGrallocUsageANDROID(). [found by zhou]
    - Improve comments in vkGetSwapchainGrallocUsageANDROID().

    v7:
    - Fix vkGetSwapchainGrallocUsageANDROID() to inspect its VkImageUsageFlags parameter. [for tfiga]
    - This fix regresses dEQP-VK.api.wsi.android.swapchain.*.image_usage because dEQP wants to create swapchains with VK_IMAGE_USAGE_STORAGE_BIT.

    v8:
    - Drop unneeded goto in vkAcquireImageANDROID. [for tfiga]

    v8.1: (minor changes)
    - Drop errant hunks added by rerere in anv_device.c.
    - Drop explicit mention of VK_ANDROID_native_buffer in anv_entrypoints_gen.py. [for jekstrand]

    v9:
    - Isolate as much Android code as possible, moving it from anv_image.c to anv_android.c. Connect the files with anv_image_from_gralloc(). Remove VkNativeBufferANDROID params from all anv_image.c funcs. [for krh]
    - Replace some intel_loge() with vk_errorf() in anv_android.c.
    - Use © in copyright line. [for krh]

    Reviewed-by: Tapani Pälli <[email protected]> (v5)
    Reviewed-by: Kristian H. Kristensen <[email protected]> (v9)
    Reviewed-by: Jason Ekstrand <[email protected]> (v9)
    Cc: zhoucm1 <[email protected]>
    Cc: Tomasz Figa <[email protected]>
* anv: Move size check from anv_bo_cache_import() to caller (v2) | Chad Versace | 2017-10-17 | 5 | -23/+46
    This change prepares for VK_ANDROID_native_buffer. When the user imports a gralloc handle into a VkImage using VK_ANDROID_native_buffer, the user provides no size. The driver must infer the size from the internals of the gralloc buffer.

    The patch is essentially a refactor patch, but it does change behavior in some edge cases, described below. In what follows, the "nominal size" of the bo refers to anv_bo::size, which may not match the bo's "actual size" according to the kernel.

    Post-patch, the nominal size of the bo returned from anv_bo_cache_import() is always the size of the imported dma-buf according to lseek(). Pre-patch, the bo's nominal size was difficult to predict. If the imported dma-buf's gem handle was not resident in the cache, then the bo's nominal size was align(VkMemoryAllocateInfo::allocationSize, 4096). If it *was* resident, then the bo's nominal size was whatever the cache returned. As a consequence, the first cache insert decided the bo's nominal size, which could be significantly smaller compared to the dma-buf's actual size, as the nominal size was determined by VkMemoryAllocateInfo::allocationSize and not lseek().

    I believe this patch cleans up that messy behavior. For an imported or exported VkDeviceMemory, anv_bo::size should now be the true size of the bo, if I correctly understand the problem (which I possibly don't).

    v2:
    - Preserve behavior of aligning size to 4096 before checking. [for jekstrand]
    - Check size with < instead of <=, to match behavior of commit c0a4f56 "anv: bo_cache: allow importing a BO larger than needed". [for chadv]
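    The v2 notes above (align to 4096, then compare with <) can be sketched as below. This is an illustration only, not the anv code: align_u64() and import_size_ok() are hypothetical helpers, and bo_size stands for the dma-buf size the driver would obtain from lseek().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round value up to the next multiple of a power-of-two alignment. */
static inline uint64_t
align_u64(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical caller-side check: the import is rejected only when the
 * bo is strictly smaller than the page-aligned requested size. A bo that
 * is exactly as large, or larger than needed, is accepted. */
static inline bool
import_size_ok(uint64_t bo_size, uint64_t requested_size)
{
   uint64_t aligned_size = align_u64(requested_size, 4096);

   return !(bo_size < aligned_size);
}
```

    With these definitions, a 4096-byte bo satisfies a 100-byte request but not a 4097-byte one, which rounds up to 8192.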
* meson: turn on pl111 not vc4 when pl111 driver specified | Dylan Baker | 2017-10-17 | 1 | -1/+1
    Reviewed-by: Eric Anholt <[email protected]>
    fixes: 1918c9b1627d5403 ("meson: Add support for the pl111 driver.")
    Signed-off-by: Dylan Baker <[email protected]>
* radv: Link shaders. | Bas Nieuwenhuizen | 2017-10-18 | 3 | -1/+46
    Here we make use of the NIR linking helpers to remove unused varyings.

    Sascha Willems demo results:
    computecullandlod 39 -> 41 fps
    pipelines ~6100 -> ~6200 fps

    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Signed-off-by: Timothy Arceri <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* radv: reuse the multiple shader store & load functions for gs copy variant | Timothy Arceri | 2017-10-18 | 3 | -149/+17
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove some now unused shader compile code | Timothy Arceri | 2017-10-18 | 3 | -254/+0
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: switch to using radv_create_shaders() | Timothy Arceri | 2017-10-18 | 1 | -85/+29
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_create_shaders() helper | Bas Nieuwenhuizen | 2017-10-18 | 1 | -0/+130
    This is a combined shader creation helper that will help us to create the shaders for each stage at once. This will allow us to do some link time optimisations.

    Signed-off-by: Timothy Arceri <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* radv: add radv_hash_shaders() helper | Bas Nieuwenhuizen | 2017-10-18 | 2 | -0/+40
    This will be used to create a hash of the combined shaders in the pipeline.

    Signed-off-by: Timothy Arceri <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* radv: Add multiple shader cache store & load functions. | Bas Nieuwenhuizen | 2017-10-18 | 2 | -0/+170
    Signed-off-by: Timothy Arceri <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* radv: Change cache datastructures for combined pipelines. | Bas Nieuwenhuizen | 2017-10-18 | 1 | -38/+64
    Signed-off-by: Timothy Arceri <[email protected]>
    Acked-by: Dave Airlie <[email protected]>
* radv: reorder init function calls | Timothy Arceri | 2017-10-18 | 1 | -2/+2
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* meson: Add support for the vc5 driver. | Eric Anholt | 2017-10-17 | 8 | -2/+217
    v2: Default vc5 to off, since it requires the simulator currently. Add missing dep on the XML generation from libbroadcom_vc5.

    Reviewed-by: Dylan Baker <[email protected]> (v1)
* meson: Add support for the pl111 driver. | Eric Anholt | 2017-10-17 | 5 | -2/+41
    Reviewed-by: Dylan Baker <[email protected]>
* meson: Add support for the vc4 driver. | Eric Anholt | 2017-10-17 | 9 | -4/+234
    Reviewed-by: Dylan Baker <[email protected]>
* radeonsi: if there's just const buffer 0, set it in place of CONST/SSBO pointer | Marek Olšák | 2017-10-17 | 4 | -13/+87
    SI_SGPR_CONST_AND_SHADER_BUFFERS now contains the pointer to const buffer 0 if there is no other buffer there.

    Benefits:
    - there is no constbuf descriptor upload and shader load

    It's assumed that all constant addresses are within bounds. Non-constant addresses are clamped against the last declared CONST variable. This only works if the state tracker ensures the bound constant buffer matches what the shader needs.

    Once we get 32-bit pointers, we can only do this for user constant buffers where the driver is in charge of the upload so that it can guarantee a 32-bit address.

    The real performance benefit might not be measurable.

    These apps get 100% theoretical benefit in all shaders (except where noted):
    - antichamber
    - batman arkham origins
    - borderlands 2
    - borderlands pre-sequel
    - brutal legend
    - civilization BE
    - CS:GO
    - deadcore
    - dota 2 -- most shaders
    - europa universalis
    - grid autosport -- most shaders
    - left 4 dead 2
    - legend of grimrock
    - life is strange
    - payday 2
    - portal
    - rocket league
    - serious sam 3 bfe
    - talos principle
    - team fortress 2
    - thea
    - unigine heaven
    - unigine valley -- also sanctuary and tropics
    - wasteland 2
    - xcom: enemy unknown & enemy within
    - tesseract
    - unity (engine)

    Changed stats only:
    SGPRS: 2059998 -> 2086238 (1.27 %)
    VGPRS: 1626888 -> 1626904 (0.00 %)
    Spilled SGPRs: 7902 -> 7865 (-0.47 %)
    Code Size: 60924520 -> 60982660 (0.10 %) bytes
    Max Waves: 374539 -> 374526 (-0.00 %)

    Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: clean up ac_build_indexed_load function interfaces | Marek Olšák | 2017-10-17 | 5 | -55/+61
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: handle 64-bit loads earlier in fetch_constant | Marek Olšák | 2017-10-17 | 1 | -16/+10
    Reviewed-by: Nicolai Hähnle <[email protected]>