Commit message | Author | Age | Files | Lines
* radv: fix GPU hangs when loading depth/stencil clear values on SI/CIK | Samuel Pitoiset | 2018-11-08 | 1 | -5/+19
    HTILE is supported on these chips, not sure how I missed that. This restores using PFP_SYNC_ME when LOAD_CONTEXT_REG is not used.
    Fixes: f425d9ee74 ("radv: use LOAD_CONTEXT_REG when loading fast clear values")
    Signed-off-by: Samuel Pitoiset <[email protected]>
* radv: use LOAD_CONTEXT_REG when loading fast clear values | Samuel Pitoiset | 2018-11-08 | 2 | -19/+27
    This avoids syncing the Micro Engine. This is only supported for VI+ currently. There is probably a way for using LOAD_CONTEXT_REG on previous chips but that could be done later.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: only expose VK_SUBGROUP_FEATURE_ARITHMETIC_BIT for VI+ | Samuel Pitoiset | 2018-11-08 | 1 | -1/+1
    Inclusive and exclusive scans are missing because older chips don't have llvm.amdgcn.update.dpp.
    This fixes crashes with dEQP-VK.subgroups.arithmetic.*.
    CC: [email protected]
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* glx: Demand success from CreateContext requests (v2) | Adam Jackson | 2018-11-07 | 1 | -38/+55
    GLXCreate{,New}Context, like most X resource creation requests, does not emit a reply and therefore is emitted into the X stream asynchronously. However, unlike most resource creation requests, the GLXContext we return is a handle to library state instead of an XID. So if context creation fails for any reason - say, the server doesn't support indirect contexts - then we will fail in strange places for strange reasons.
    We could make every GLX entrypoint robust against half-created contexts, or we could just verify that context creation worked. Reuse the __glXIsDirect code to do this, as a cheap way of verifying that the XID is real.
    glXCreateContextAttribsARB solves this by using the _checked version of the xcb command, so effectively this change makes the classic context creation paths as robust as CreateContextAttribs.
    v2: Better use of Bool, check that error != NULL first (Olivier Fourdan)
    Signed-off-by: Adam Jackson <[email protected]>
    Reviewed-by: Michel Dänzer <[email protected]>
* gm107/ir: fix compile time warning in getTEXSMask | Karol Herbst | 2018-11-07 | 1 | -0/+1
    In function 'uint8_t nv50_ir::getTEXSMask(uint8_t)':
    warning: control reaches end of non-void function [-Wreturn-type]
    Reported-by: Moiman@freenode
    Fixes: f821e80213e38e93f96255b3deacb737a600ed40 "gm107/ir: use scalar tex instructions where possible"
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
* winsys/amdgpu: Stop using amdgpu_bo_handle_type_kms_noimport | Michel Dänzer | 2018-11-07 | 1 | -3/+3
    It only behaves differently from amdgpu_bo_handle_type_kms with libdrm 2.4.93, and it breaks if an older version is picked up.
    Bugzilla: https://bugs.freedesktop.org/108096
    Reviewed-by: Marek Olšák <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* intel/dump_gpu: add platform option | Lionel Landwerlin | 2018-11-07 | 2 | -6/+29
    Got tired of remembering the PCI ids.
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* intel/dump_gpu: move output option together | Lionel Landwerlin | 2018-11-07 | 1 | -5/+5
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* radv: disable conditional rendering for vkCmdCopyQueryPoolResults() | Samuel Pitoiset | 2018-11-07 | 1 | -0/+10
    VK_EXT_conditional_rendering says that copy commands should not be affected by conditional rendering.
    Cc: 18.2 18.3 <[email protected]>
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: allocate enough space in CS when copying query results with compute | Samuel Pitoiset | 2018-11-07 | 1 | -0/+4
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* ac/nir_to_llvm: fix b2f for f64 | Timothy Arceri | 2018-11-07 | 1 | -3/+12
    Fixes: d7e0d47b9de3 ("nir: Add a bunch of b2[if] optimizations")
    Reviewed-by: Dave Airlie <[email protected]>
* gm107/ir: use scalar tex instructions where possible | Karol Herbst | 2018-11-06 | 2 | -3/+317
    TEXS, TLD4 and TLD4S are variants of tex instructions which are more scalar, which gives RA more freedom and is less likely to insert silly MOVs to satisfy quad registers.
    shader-db changes:
    total instructions in shared programs : 7687265 -> 7614782 (-0.94%)
    total gprs used in shared programs    : 803620 -> 798045 (-0.69%)
    total shared used in shared programs  : 639636 -> 639636 (0.00%)
    total local used in shared programs   : 24648 -> 24648 (0.00%)
    total bytes used in shared programs   : 82103400 -> 81330696 (-0.94%)
               local  shared  gpr   inst   bytes
    helped     0      0       3648  10647  10647
    hurt       0      0       464   205    205
    Reviewed-by: Ilia Mirkin <[email protected]>
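The percentages in the shader-db summary above can be re-derived from the raw before/after totals. A minimal sketch in C, assuming a hypothetical helper named `pct_change` (it is not part of shader-db or Mesa):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: relative change from `before` to `after`, in percent.
 * Negative values mean the total shrank. */
double pct_change(uint64_t before, uint64_t after)
{
    assert(before != 0); /* a zero baseline has no meaningful percentage */
    return 100.0 * ((double)after - (double)before) / (double)before;
}
```

For instance, `pct_change(7687265, 7614782)` comes out at roughly -0.94, which agrees with the instruction-count line of the summary.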
* nv50/ir: add scalar field to TexInstructions | Karol Herbst | 2018-11-06 | 2 | -1/+6
    Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ra: add condenseDef overloads for partial condenses | Karol Herbst | 2018-11-06 | 1 | -8/+21
    Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: print color masks of tex instructions | Karol Herbst | 2018-11-06 | 1 | -4/+33
    v2: print the mask for TXG as well
        make the printed mask look more mask-like
    Reviewed-by: Ilia Mirkin <[email protected]>
* vulkan: Update the XML and headers to 1.1.91 | Jason Ekstrand | 2018-11-06 | 4 | -500/+514
    The biggest change here is the rename of VK_NVX_ray_tracing to VK_NV_ray_tracing and the total removal of VK_KHR_mir_surface.
    Acked-by: Samuel Pitoiset <[email protected]>
    Acked-by: Lionel Landwerlin <[email protected]>
* r600: Add support for EXT_texture_sRGB_R8 | Gert Wollny | 2018-11-06 | 1 | -0/+1
    Enables the extension on R600 and makes the following tests pass:
      dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
      dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8*
    v2: remove chunk for dri/radeon (Emil)
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* anv/android: mark gralloc allocated BOs as external | Lionel Landwerlin | 2018-11-06 | 1 | -1/+1
    Allocating through Gralloc implies buffers are going to be used outside the driver. We have special MOCS settings for external BOs and we probably want to use them here too.
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Fixes: a1220e73116bad7 ("anv/android: Set the BO flags in bo_cache_import (v2)")
    Reviewed-by: Tapani Pälli <[email protected]>
* anv: stub internal android code | Lionel Landwerlin | 2018-11-06 | 7 | -11/+80
    This reduces the amount of #ifdef ANDROID we'll have to have inside the driver, potentially offering better coverage of the android extensions.
    v2: Move anv_android.h include before anv_entrypoints.h (Tapani)
        Fix autotools android build (Lionel)
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* freedreno/a6xx: Clear z32 and separate stencil with blitter | Kristian H. Kristensen | 2018-11-06 | 2 | -27/+50
    Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: fix VSC bug with larger # of tiles | Rob Clark | 2018-11-06 | 1 | -5/+2
    At higher resolutions with the addition of MSAA, the number of tiles can increase to the point where we use more than one VSC pipe per tile, which would cause us to calculate an out-of-bounds offset for VSC_SIZE_ADDRESS.
    So don't try to be clever, just always put it at a fixed offset assuming the max 32 VSC pipes in use.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2018-11-06 | 7 | -29/+51
    Signed-off-by: Rob Clark <[email protected]>
* wayland/egl: Resize EGL surface on update buffer for swrast | Olivier Fourdan | 2018-11-06 | 1 | -2/+2
    After commit a9fb331ea ("wayland/egl: update surface size on window resize"), the surface size is updated as soon as the resize is done, and `update_buffers()` would resize only if the surface size differs from the attached size.
    However, in the case of swrast, there is no resize callback and the attached size is updated in `dri2_wl_swrast_commit_backbuffer()` prior to the `swrast_update_buffers()` so the attached size is always up to date when it reaches `swrast_update_buffers()` and the surface is never resized.
    This can be observed with "totem" using the GDK backend on Wayland (the default) when running on software rendering:
      $ LIBGL_ALWAYS_SOFTWARE=true CLUTTER_BACKEND=gdk totem
    Resizing the window would leave the EGL surface size unchanged.
    To avoid the issue, partially revert the part of commit a9fb331ea for `swrast_update_buffers()` and resize on the win size and not the attached size.
    Fixes: a9fb331ea - wayland/egl: update surface size on window resize
    Signed-off-by: Olivier Fourdan <[email protected]>
    CC: Daniel Stone <[email protected]>
    CC: Juan A. Suarez Romero <[email protected]>
    CC: [email protected]
    Reviewed-by: Juan A. Suarez <[email protected]>
* intel/decoders: fix instruction base address parsing | Lionel Landwerlin | 2018-11-05 | 2 | -2/+2
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Fixes: 00103db04ab879 ("intel: Fix decoding for partial STATE_BASE_ADDRESS updates.")
    Reviewed-by: Kenneth Graunke <[email protected]>
* egl/glvnd: correctly report errors when vendor cannot be found | Emil Velikov | 2018-11-05 | 1 | -0/+5
    If the user provides an invalid display or device the ToVendor lookup will fail. In this case, the local [Mesa vendor] error code will be set.
    Thus on a subsequent eglGetError(), the error will be EGL_SUCCESS. To be more specific, GLVND remembers the last vendor and calls back into its eglGetError, although there's no guarantee to ever have had one.
    v2:
     - Add _eglError call, so the debug callback is executed (Kyle)
     - Drop XXX comment.
    Piglit: tests/egl/spec/egl_ext_device_query
    Fixes: ce562f9e3fa ("EGL: Implement the libglvnd interface for EGL (v3)")
    Cc: Eric Engestrom <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
    Reviewed-by: Kyle Brenneman <[email protected]>
* egl: add EGL_EXT_device_base entrypoints | Emil Velikov | 2018-11-05 | 1 | -0/+7
    eglQueryDevicesEXT (unlike the other three functions) does not depend on the display. It is implemented in GLVND, which calls into each driver collecting the list of devices and presenting it to the user.
    For the other entrypoints, GLVND acts as a pass-through stub calling into the vendor library. The vendor implementation calls back into GLVND to get the vendor dispatch. Then the driver proceeds to call itself via the said dispatch.
    This design makes it possible to keep using "old" GLVND with newer vendor drivers, since effectively all the extension code is within the latter itself.
    Without said entrypoints, any user will outright crash - as reported in the bug report.
    Note: there's a follow-up fix needed to our GLVND code, to make piglit happy.
    v2: add some beefy documentation in the commit message.
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108635
    Fixes: 7552fcb7b9b ("egl: add base EGL_EXT_device_base implementation")
    Reported-by: [email protected]
    Cc: [email protected]
    Acked-by: Eric Engestrom <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
    Tested-by: Emil Velikov <[email protected]>
* docs: mention EXT_shader_implicit_conversions | Emil Velikov | 2018-11-05 | 1 | -1/+1
    Reviewed-by: Erik Faye-Lund <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
* st/va: fix incorrect use of resource_destroy | Marek Olšák | 2018-11-05 | 1 | -4/+2
    Fixes: 4373dd32154 ("st/va: Support YUV formats in vaCreateSurfaces")
    Cc: Drew Davenport <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Michel Dänzer <[email protected]>
* i965/batch/debug: Allow log be dumped before assert | Sergii Romantsov | 2018-11-05 | 1 | -1/+1
    A message that may reveal the culprit of an assert is now dumped before the assert fires, for debugging purposes.
    Signed-off-by: Sergii Romantsov <[email protected]>
    Reviewed-by: Lionel G Landwerlin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/sanitize_gpu: add debug message on mmap fail | Lionel Landwerlin | 2018-11-05 | 1 | -1/+3
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* intel/sanitize_gpu: deal with non page multiple buffer sizes | Lionel Landwerlin | 2018-11-05 | 1 | -4/+7
    We can only map at page-aligned offsets. We got that wrong with buffer sizes where (size % 4096) != 0 (anv has a WA buffer of 1024).
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
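The page-alignment constraint described above is commonly handled by rounding sizes up to a page multiple before mapping. A minimal sketch, assuming the 4096-byte page from the message; `align_up` is a hypothetical helper and not necessarily how the actual patch does it:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u /* assumption: page size quoted in the message */

/* Hypothetical helper: round `value` up to the next multiple of `alignment`.
 * `alignment` must be a non-zero power of two for the mask trick to work. */
uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}
```

With this helper, a 1024-byte workaround buffer such as the one mentioned for anv would occupy one full page: `align_up(1024, PAGE_SIZE_BYTES)` is 4096, while a size that is already a page multiple is returned unchanged.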
* intel/sanitize_gpu: add help/gdb options to wrapper | Lionel Landwerlin | 2018-11-05 | 1 | -1/+54
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* intel/dump_gpu: add missing gdb option | Lionel Landwerlin | 2018-11-05 | 1 | -0/+2
    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Tapani Pälli <[email protected]>
* wsi/wayland: only finish() a successfully init()ed display | Eric Engestrom | 2018-11-05 | 1 | -1/+2
    Fixes: 43691024982b3ea734ad0 "vulkan/wsi/wayland: Stop caching Wayland displays"
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Philipp Zabel <[email protected]>
* wsi/wayland: use proper VkResult type | Eric Engestrom | 2018-11-05 | 1 | -2/+2
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* autotools: library-dependency when no sse and 32-bit | Sergii Romantsov | 2018-11-05 | 2 | -2/+3
    Building 32-bit Mesa may fail if __SSE__ is not specified. Added the missing dependency on libm.
    v2: avoided dependency on any flag, just link
    v3: meson doesn't fail, but added the dependency on libm as well
    CC: Dylan Baker <[email protected]>
    CC: Lionel G Landwerlin <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108560
    Signed-off-by: Sergii Romantsov <[email protected]>
    Reviewed-by: Dylan Baker <[email protected]>
* radv: more use of radv_cp_wait_mem() | Samuel Pitoiset | 2018-11-05 | 1 | -22/+9
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: replace si_emit_wait_fence() with radv_cp_wait_mem() | Samuel Pitoiset | 2018-11-05 | 4 | -10/+14
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: add missing TFB queries support to CmdCopyQueryPoolsResults() | Samuel Pitoiset | 2018-11-05 | 2 | -0/+278
    Cc: 18.3 <[email protected]>
    Fixes: b4eb029062a ("radv: implement VK_EXT_transform_feedback")
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radv: remove useless sync after copying query results with compute | Samuel Pitoiset | 2018-11-05 | 1 | -4/+0
    The spec says:
      "vkCmdCopyQueryPoolResults is considered to be a transfer operation, and its writes to buffer memory must be synchronized using VK_PIPELINE_STAGE_TRANSFER_BIT and VK_ACCESS_TRANSFER_WRITE_BIT before using the results."
    VK_PIPELINE_STAGE_TRANSFER_BIT will wait for compute to be idle, while VK_ACCESS_TRANSFER_WRITE_BIT will invalidate both L1 vector caches and L2. So, it's useless to set those flags internally.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* r600/sb: Fix constant logical operand in assert. | Vinson Lee | 2018-11-04 | 1 | -1/+1
    Fixes: da977ad90747 ("r600/sb: start adding GDS support")
    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-By: Gert Wollny <[email protected]>
* st/mesa: Don't record garbage streamout information in the non-SSO case. | Kenneth Graunke | 2018-11-03 | 3 | -31/+15
    In the non-SSO case, where multiple shader stages are linked together, we were recording garbage pipe_stream_output_info structures for all but the last enabled geometry-processing stage.
    Specifically, we were using the gl_transform_feedback_info from shader_program->last_vert_prog (the stage whose outputs will be recorded)...but were pairing it with the output varying mappings from the current shader stage. For example, a program with a VS and GS, the VS's pipe_shader_state would have a pipe_stream_output_info based on the GS transform feedback info, but the VS output mapping.
    This generally worked out okay because only the pipe_stream_output_info for the last stage really matters - the others can be ignored. However, we'd like to avoid confusing the pipe driver. In particular, my new driver translates the stream out information to hardware packets at bind_{vs,tes,gs}_state() time...and was hitting asserts about garbage varyings that didn't exist.
    This patch changes st/mesa to record a blank pipe_stream_output_info with num_outputs = 0 for all stages prior to last_vert_prog. The last one is captured as normal. (In the fully-SSO case, nothing should change - each program contains a single shader stage, so last_vert_prog *is* the current shader.)
    Tested with llvmpipe (piglit's gpu profile), and freedreno (a3xx, gpu profile with -t transform.feedback). Fixes several hundred CTS tests on my new driver.
    Reviewed-by: Timothy Arceri <[email protected]>
* st/nir: Drop unused parameter from st_nir_assign_uniform_locations(). | Kenneth Graunke | 2018-11-03 | 1 | -2/+1
    ARB programs won't have one of these, and we don't use it anyway.
    Reviewed-by: Rob Clark <[email protected]>
* st/mesa: Pull nir_lower_wpos_ytransform work into a helper function. | Kenneth Graunke | 2018-11-03 | 2 | -29/+40
    This will let me use it in the ARB program code as well.
    Reviewed-by: Rob Clark <[email protected]>
* intel: Use a URB start offset of 0 for disabled stages. | Kenneth Graunke | 2018-11-03 | 1 | -3/+9
    There are some cases where the VS is the only stage enabled, it uses the entire URB, and the URB is large enough that placing later stages after the VS exceeds the number of bits for "URB Starting Address".
    For example, on Icelake GT2, "varying-packing-simple mat2x4 array" from Piglit is getting a starting offset of 128 for the GS/HS/DS. But the field is only large enough to hold an offset of 127.
    i965 doesn't hit any genxml assertions because it's still using the old OUT_BATCH mechanism. 128 << GEN7_URB_STARTING_ADDRESS_SHIFT (57) == 0, with the extra bit falling off the end. So we place the disabled stage at the beginning of the URB (overlapping with push constants). This is likely okay since it's a zero size region (0 entries).
    It seems like the Vulkan driver might hit this assertion, however, and the situation seems harmless. To work around this, always place disabled stages at the start of the URB, so the last enabled stage can fill the remaining space without overflowing the field.
    Reviewed-by: Jordan Justen <[email protected]>
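The overflow described above can be reproduced with plain integer arithmetic. The sketch below takes the shift value 57 and the 0..127 field range from the message at face value and assumes the value is packed into a 64-bit word; all names are hypothetical and none of this is the driver's actual code:

```c
#include <assert.h>
#include <stdint.h>

#define URB_START_SHIFT 57u /* shift quoted in the message (assumed 64-bit word) */
#define URB_START_BITS   7u /* field can hold offsets 0..127 */

/* Pack an offset with no range check, as a raw shift would. */
uint64_t pack_urb_start(uint64_t offset)
{
    return offset << URB_START_SHIFT; /* unsigned shift: high bits are dropped */
}

/* Range check of the kind an assertion-based packer would perform. */
int urb_start_fits(uint64_t offset)
{
    return offset < (UINT64_C(1) << URB_START_BITS);
}

/* Sketch of the workaround: a stage with zero entries is disabled and is
 * placed at offset 0 instead of after the previous stage. */
uint64_t stage_urb_start(unsigned entries, uint64_t next_free_offset)
{
    return entries == 0 ? 0 : next_free_offset;
}
```

An offset of 128 does not fit the field, and packing it without a check silently yields 0, which is why the unchecked path appeared to work while an asserting path would trip.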
* android: radv: add libmesa_git_sha1 static dependency | Mauro Rossi | 2018-11-03 | 1 | -1/+2
    The libmesa_git_sha1 whole static dependency is added to get the git_sha1.h header and avoid the following build error:
      external/mesa/src/amd/vulkan/radv_device.c:46:10: fatal error: 'git_sha1.h' file not found
               ^
      1 error generated.
    Fixes: 9d40ec2cf6 ("radv: Add support for VK_KHR_driver_properties.")
    Signed-off-by: Mauro Rossi <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
* vc4: Use the normal simulator ioctl path for CL submit as well. | Eric Anholt | 2018-11-02 | 3 | -13/+5
    The simulator no longer needs to look back into the gallium structs.
* vc4: Maintain a separate GEM mapping of BOs in the simulator. | Eric Anholt | 2018-11-02 | 2 | -42/+58
    This will let us avoid looking back into the gallium driver's vc4_bo.
* vc4: Take advantage of _mesa_hash_table_remove_key() in the simulator. | Eric Anholt | 2018-11-02 | 1 | -4/+2
* v3d: Remove the special path for simulaton of the submit ioctl. | Eric Anholt | 2018-11-02 | 5 | -19/+13
    Now that it doesn't need to find the struct v3d_bos, it can just take the normal v3d_ioctl() path.