Commit message [Author, Age, Files, Lines]
* gallium/tgsi: start adding hw atomics (v3.2) [Dave Airlie, 2017-11-10, 6 files, -3/+121]
| This adds support for hw atomic counters to TGSI.
|
| A new register file for storing atomic counters is added, along with a new atomic counter semantic, along with docs for both.
|
| v2: drop semantic, move hw counter to backend, Ilia pointed out SSO would have busted my plan, and he was right.
| v3: drop BUFFER decls. (Marek)
| v3.1: minor fixups for whitespace, set ureg error if we overflow the hw atomic limits. (nha)
| v3.2: fix some docs inconsistencies (Ilia)
|
| Reviewed-by: Marek Olšák <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Tested-By: Gert Wollny <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* gallium: add CAPs to support HW atomic counters. (v3) [Dave Airlie, 2017-11-10, 15 files, -1/+34]
| This looks like an evergreen specific feature, but with atomic counters AMD have hw specific counters they use instead of operating on buffers directly. These are separate to the buffer atomics, so require different limits and code paths.
|
| I've left the CAP for atomic type extensible in case someone else has a variant on this sort of thing (freedreno maybe?) and needs to change it.
|
| This adds all the CAPs required to add support for those atomic counters, along with a related CAP for limiting the number of output resources.
|
| I'd like to land this and the st patch then I can start to upstream the evergreen support for these and other GL4.x features.
|
| v2: drop the ATOMIC_COUNTER_MODE cap, just use the return from the HW counters. If 0 we use the current mode.
| v3: fix some rebase errors (Gert Wollny)
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Tested-By: Gert Wollny <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* r600/query: drop rest of vi workaround code. [Dave Airlie, 2017-11-10, 2 files, -37/+13]
| This isn't needed in r600 anymore.
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* docs: Fix GL_MESA_program_debug enums [Roland Scheidegger, 2017-11-09, 1 file, -18/+8]
| 13b303ff9265b89bdd9100e32f905e9cdadfad81 added the actual enums but didn't remove the already existing XXXX ones. (And also duplicated the "fragment" names instead of using the "vertex" names.)
|
| Fixes: 13b303ff9265b89bdd91 "docs: Update the list of used MESA GL enums."
| Reviewed-by: Eric Engestrom <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* st/mesa: remove 'struct' keyword on function parameter [Brian Paul, 2017-11-09, 1 file, -2/+1]
| st_src_reg is a class, not a struct. Simply remove 'struct' to silence an MSVC compiler warning (class vs. struct mismatch).
|
| Reviewed-by: Charmaine Lee <[email protected]>
* threads: fix MinGW build breakage [Brian Paul, 2017-11-09, 1 file, -1/+4]
| Fixes: f1a364878431c8 ("threads: update for late C11 changes")
| Reviewed-by: Roland Scheidegger <[email protected]>
* mesa: s/GLint/gl_buffer_index/ for _ColorDrawBufferIndexes [Brian Paul, 2017-11-09, 9 files, -26/+27]
| Also fix local variable declarations and replace -1 with BUFFER_NONE.
|
| No Piglit changes.
|
| Reviewed-by: Charmaine Lee <[email protected]>
* mesa: s/GLint/gl_buffer_index/ for _ColorReadBufferIndex [Brian Paul, 2017-11-09, 1 file, -1/+1]
| BUFFER_NONE is -1 so no reason for GLint.
|
| Reviewed-by: Charmaine Lee <[email protected]>
* mesa: minor reformatting, add const to gl_external_samplers() [Brian Paul, 2017-11-09, 1 file, -1/+4]
| This function should probably be moved elsewhere, too.
|
| Reviewed-by: Charmaine Lee <[email protected]>
* st/mesa: whitespace clean-up in st_mesa_to_tgsi.c [Brian Paul, 2017-11-09, 1 file, -167/+169]
| Remove trailing whitespace, fix indentation, wrap lines to 78 columns, etc.
|
| Reviewed-by: Charmaine Lee <[email protected]>
* meson: implement default driver arguments [Dylan Baker, 2017-11-09, 2 files, -6/+44]
| This allows drivers to be set by OS/arch in a sane manner.
|
| v2:
| - set _drivers to a list of drivers instead of manually assigning each with_*
| v3:
| - Use "auto" instead of "default", which matches the value of other automatically configured options.
| - Set vulkan drivers as well
| - Add error message if no automatic drivers are known for a given arch/OS combo
| - use not(darwin or windows) instead of (linux or *bsd), which is probably more accurate (that way Solaris and other *nix systems aren't excluded)
| - rename softpipe to swrast, as swrast is the actual option name
|
| Signed-off-by: Dylan Baker <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
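The "auto" resolution described in the v3 notes can be sketched as below. This is an illustration in Python rather than Meson, and the per-architecture driver table is a placeholder invented for the example, not the list the commit actually adds; the function names are likewise hypothetical.

```python
def pick_auto_drivers(system, cpu_family, table):
    """Resolve the "auto" driver option for a host (illustrative only).

    `table` maps a CPU family to a driver list. Its contents are supplied by
    the caller and are placeholders, not Mesa's real defaults.
    """
    # The commit tests not(darwin or windows) rather than (linux or *bsd),
    # so Solaris and other *nix systems are not excluded.
    if system in ("darwin", "windows"):
        raise ValueError(f"no automatic drivers known for OS {system!r}")
    if cpu_family not in table:
        raise ValueError(
            f"no automatic drivers known for {cpu_family!r} on {system!r}"
        )
    return list(table[cpu_family])


# Placeholder table: these driver names are examples chosen for the sketch.
EXAMPLE_TABLE = {"x86_64": ["swrast", "i965"], "arm": ["swrast", "vc4"]}


def resolve(option, system, cpu_family, table=EXAMPLE_TABLE):
    # An explicit list from the user wins; only "auto" triggers the lookup.
    if option != "auto":
        return option.split(",")
    return pick_auto_drivers(system, cpu_family, table)
```

An explicit option value bypasses the lookup entirely, so an unsupported OS only produces an error when the user leaves the option at "auto".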
* i965: Pretend there are 4 subslices for compute shader threads on Gen9+. [Kenneth Graunke, 2017-11-09, 1 file, -1/+13]
| Similar to what we did for pixel shader threads - see gen_device_info.c.
|
| We don't want to bump the actual Maximum Number of Threads though, so we adjust it here. For pixel shaders, we don't use max_wm_threads, so we could just bump it globally.
|
| Supposedly fixes Piglit tests:
| arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec3-int64_t
| arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-i64vec4-int64_t
| arb_gpu_shader_int64/execution/built-in-functions/cs-op-div-u64vec4-uint64_t
|
| Reviewed-by: Jordan Justen <[email protected]>
* meson: Add script to use VERSION file for getting version [Dylan Baker, 2017-11-09, 2 files, -1/+38]
| Meson has up until this point set its version in the root meson.build script, while the other build systems read the VERSION file. This is just "one more thing" to duplicate between meson and every other build system.
|
| This script is a simple "read, strip, print" sort of deal to allow meson to read the VERSION file. I chose to implement this in python since python is portable, and to keep the meson.build script clean. This is also complicated by the fact that the project() call *must* be the first non-comment, non-blank statement in the toplevel meson.build script.
|
| v2:
| - Move from scripts/ to bin/
| - use python explicitly to run the scripts to support windows
|
| Signed-off-by: Dylan Baker <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
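A "read, strip, print" helper of the kind the message describes might look like the following. This is a minimal sketch, not the script added by the commit: the function name is hypothetical, and the demonstration writes a throwaway file with a made-up version string instead of touching a real VERSION file.

```python
import pathlib
import tempfile


def read_version(path):
    """Return the contents of a VERSION-style file without surrounding whitespace."""
    return pathlib.Path(path).read_text().strip()


# Demonstration with a temporary file; the version string is invented.
with tempfile.TemporaryDirectory() as tmp:
    version_file = pathlib.Path(tmp) / "VERSION"
    version_file.write_text("1.2.3-devel\n")
    demo_version = read_version(version_file)
    print(demo_version)
```

Stripping matters because the build system consumes the script's standard output verbatim, and a trailing newline would otherwise end up inside the version string.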
* broadcom/vc4: Mark BOs as purgeable when they enter the BO cache [Boris Brezillon, 2017-11-09, 3 files, -48/+86]
| This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all BOs placed in the mesa BO cache as purgeable so that the system can reclaim this memory under memory pressure.
|
| v2:
| - Removed BOs from the cache when they've been purged by the kernel
| - Check whether the madvise ioctl is supported or not before using it
| v3: Don't walk the whole list when we find a busy BO (by anholt, acked by Boris)
|
| Signed-off-by: Boris Brezillon <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
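The cache policy in this message can be modeled with a small simulation. Everything below is a stand-in written for illustration: `Bo`, `BoCache`, `madvise` and `simulate_memory_pressure` are hypothetical names, and the real driver issues DRM_IOCTL_VC4_GEM_MADVISE against the kernel rather than flipping Python attributes.

```python
class Bo:
    def __init__(self, size):
        self.size = size
        self.purgeable = False  # hint given to the (simulated) kernel
        self.purged = False     # set once the kernel has reclaimed the pages


def madvise(bo, willneed):
    """Stand-in for the madvise ioctl: returns True if the contents were retained."""
    bo.purgeable = not willneed
    return not bo.purged


class BoCache:
    def __init__(self, has_madvise=True):
        self.has_madvise = has_madvise  # v2: check for support before using it
        self.free = []

    def put(self, bo):
        if self.has_madvise:
            madvise(bo, willneed=False)  # cached BOs may now be reclaimed
        self.free.append(bo)

    def get(self, size):
        for bo in list(self.free):
            if bo.size != size:
                continue
            self.free.remove(bo)
            if self.has_madvise and not madvise(bo, willneed=True):
                continue  # v2: purged by the kernel, so drop it from the cache
            return bo
        return None  # the caller would allocate a fresh BO here


def simulate_memory_pressure(cache):
    # The simulated kernel reclaims every BO that was marked purgeable.
    for bo in cache.free:
        if bo.purgeable:
            bo.purged = True
```

The key point is the round trip: a BO is marked purgeable on entry to the cache, and on reuse the cache must ask whether the contents survived before handing it back.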
* drm-uapi: Update vc4 header from drm-next [Boris Brezillon, 2017-11-09, 1 file, -0/+19]
| Taken from drm-next d65d31388a23 ("Merge tag 'drm-misc-next-fixes-2017-11-07' of git://anongit.freedesktop.org/drm/drm-misc into drm-next")
|
| v2: Add the NOTSUPP definition from the final drm-next version, not the commit (anholt).
|
| Signed-off-by: Boris Brezillon <[email protected]>
* meson: Enable VC4's NEON assembly support. [Eric Anholt, 2017-11-09, 2 files, -2/+15]
| Reviewed-by: Dylan Baker <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
| Tested-by: Timothy Arceri <[email protected]>
* meson: Always link libgallium_dri.so against dep_thread. [Eric Anholt, 2017-11-09, 1 file, -0/+1]
| Somehow on my cross build the -pthread is getting lost. All the other deps seem to work out fine.
|
| Reviewed-by: Dylan Baker <[email protected]>
| Tested-by: Timothy Arceri <[email protected]>
* meson: Drop stale comment about making valgrind conditional. [Eric Anholt, 2017-11-09, 1 file, -1/+0]
| It was fixed in 5c2ff5773a707519f6a773126f201c4e1e8a42d7.
|
| Reviewed-by: Dylan Baker <[email protected]>
| Tested-by: Timothy Arceri <[email protected]>
* meson: Leave dep_llvm empty if !with_llvm [Eric Anholt, 2017-11-09, 1 file, -3/+4]
| The gallium auxiliary build would link against llvm, for the gallivm code that it didn't build. This broke the build on my armhf cross, where libLLVM-3.9.so is not multiarch and thus points to x86-64 libs.
|
| Reviewed-by: Dylan Baker <[email protected]>
| Tested-by: Timothy Arceri <[email protected]>
* Revert "glx: Implement GLX_EXT_no_config_context (v2)" [Adam Jackson, 2017-11-09, 6 files, -31/+13]
| Pushed ahead of things actually working.
|
| This reverts commit 5293b96b160b904c0e53cbce93679c3aa090f846.
* radeonsi: pack r600_surface better [Marek Olšák, 2017-11-09, 1 file, -11/+11]
| 160 -> 136 bytes
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: pack r600_texture better [Marek Olšák, 2017-11-09, 1 file, -27/+26]
| 1752 -> 1736 bytes
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
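The byte counts in the two "pack ... better" commits above come from changing the struct layout. The general effect can be reproduced with Python's `ctypes`, which lays out `Structure` fields using the platform's native C alignment. The two structures below are invented for illustration and have nothing to do with the real r600_surface or r600_texture layout.

```python
import ctypes


class Loose(ctypes.Structure):
    # Small fields interleaved with 64-bit ones force padding after each small field.
    _fields_ = [
        ("flag_a", ctypes.c_uint8),
        ("offset", ctypes.c_uint64),
        ("flag_b", ctypes.c_uint8),
        ("size", ctypes.c_uint64),
    ]


class Packed(ctypes.Structure):
    # Same fields, largest first, so the two small fields share one padded slot.
    _fields_ = [
        ("offset", ctypes.c_uint64),
        ("size", ctypes.c_uint64),
        ("flag_a", ctypes.c_uint8),
        ("flag_b", ctypes.c_uint8),
    ]


loose_size = ctypes.sizeof(Loose)
packed_size = ctypes.sizeof(Packed)
print(loose_size, "->", packed_size)
```

The exact sizes depend on the platform's alignment rules, but grouping the small members together should never make the structure larger than the interleaved version.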
* radeonsi: clean up r600_surface [Marek Olšák, 2017-11-09, 2 files, -29/+11]
| 216 -> 160 bytes
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove r600_texture::non_disp_tiling [Marek Olšák, 2017-11-09, 2 files, -9/+0]
| Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove DBG_NO_DISCARD_RANGE [Marek Olšák, 2017-11-09, 3 files, -5/+0]
| Reviewed-by: Nicolai Hähnle <[email protected]>
* glx: Implement GLX_EXT_no_config_context (v2) [Adam Jackson, 2017-11-09, 6 files, -13/+31]
| This more or less ports EGL_KHR_no_config_context to GLX.
|
| v2: Enable the extension only for those backends that support it.
|
| Khronos: https://github.com/KhronosGroup/OpenGL-Registry/pull/102
| Reviewed-by: Kenneth Graunke <[email protected]>
| Signed-off-by: Adam Jackson <[email protected]>
* glx: Prepare the DRI backends for GLX_EXT_no_config_context [Adam Jackson, 2017-11-09, 3 files, -6/+7]
| This should be safe as these backends already support the EGL version of this extension. DRI1 is not affected because it does not support GLX_ARB_create_context anyway. DRI-Windows is not prepared to implement this as there's no equivalent WGL extension, and wglCreateContextAttribs seems to really want the HDC's pixel format to be set.
|
| Signed-off-by: Adam Jackson <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
* glx: Relax validate_renderType_against_config for EXT_no_config_context [Adam Jackson, 2017-11-09, 1 file, -13/+17]
| Reviewed-by: Eric Anholt <[email protected]>
| Signed-off-by: Adam Jackson <[email protected]>
* anv: fix build failure [Nicolai Hähnle, 2017-11-09, 1 file, -2/+2]
| Fixes: e3a8013de8ca ("util/u_queue: add util_queue_fence_wait_timeout")
* mesa: flush and wait after creating a fallback texture [Nicolai Hähnle, 2017-11-09, 1 file, -0/+5]
| Fixes non-deterministic failures in dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_sync.images.texture_source.teximage2d_render and others in dEQP-EGL.functional.sharing.gles2.multithread.*
|
| Reviewed-by: Marek Olšák <[email protected]>
* mesa: increase MaxServerWaitTimeout [Nicolai Hähnle, 2017-11-09, 1 file, -1/+1]
| The current value was introduced in commit a27180d0d8666, which claims that it represents ~1.11 years. However, it is interpreted in nanoseconds, so it actually only represents ~9.8 hours.
|
| That seems a bit short. Use the largest value consistent with both int32 and int64. It corresponds to ~292 years in nanoseconds.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
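The figures in this message can be checked with a few lines of arithmetic. The two constants below are assumptions: the message does not quote either value, and they were picked here because they reproduce the stated ~1.11 years, ~9.8 hours and ~292 years. The new value is read as "both 32-bit halves equal to INT32_MAX", which is one interpretation of "consistent with both int32 and int64".

```python
NS_PER_SECOND = 10**9
US_PER_SECOND = 10**6
SECONDS_PER_HOUR = 3600
SECONDS_PER_YEAR = 365.25 * 24 * SECONDS_PER_HOUR

# Assumed values, not quoted in the commit message.
OLD_TIMEOUT = 0x1FFF7FFFFFFF
NEW_TIMEOUT = 0x7FFFFFFF7FFFFFFF

# Read as nanoseconds (how the value is actually interpreted).
old_hours = OLD_TIMEOUT / NS_PER_SECOND / SECONDS_PER_HOUR

# Read as microseconds, which would explain the original "~1.11 years" claim.
old_years_if_us = OLD_TIMEOUT / US_PER_SECOND / SECONDS_PER_YEAR

new_years = NEW_TIMEOUT / NS_PER_SECOND / SECONDS_PER_YEAR

print(f"old value as ns: {old_hours:.2f} hours")
print(f"old value as us: {old_years_if_us:.2f} years")
print(f"new value as ns: {new_years:.1f} years")
```

The two readings of the old value differ by exactly a factor of 1000, which is the gap between a bit over a year and a bit under ten hours.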
* st/mesa: remove redundant flushes from st_flush [Nicolai Hähnle, 2017-11-09, 3 files, -3/+6]
| st_flush should flush state tracker-internal state and the pipe, but not mesa/main state. Of the four callers:
| - glFlush/glFinish already call FLUSH_{VERTICES,STATE}.
| - st_vdpau doesn't need to call them.
| - st_manager will now call them explicitly.
|
| Reviewed-by: Marek Olšák <[email protected]>
* st/dri: use stapi flush instead of pipe flush when creating fences [Nicolai Hähnle, 2017-11-09, 1 file, -5/+6]
| There may be pending operations (e.g. vertices) that need to be flushed by the state tracker.
|
| Found by inspection.
|
| Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use a threaded context even for debug contexts [Nicolai Hähnle, 2017-11-09, 1 file, -9/+2]
| Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: record and dump time of flush [Nicolai Hähnle, 2017-11-09, 3 files, -1/+8]
| Reviewed-by: Marek Olšák <[email protected]>
* ddebug: optionally handle transfer commands like draws [Nicolai Hähnle, 2017-11-09, 4 files, -66/+288]
| Transfer commands can have associated GPU operations.
|
| Enabled by passing GALLIUM_DDEBUG=transfers.
|
| Reviewed-by: Marek Olšák <[email protected]>
* ddebug: dump context and before/after times of draws [Nicolai Hähnle, 2017-11-09, 2 files, -0/+10]
| Reviewed-by: Marek Olšák <[email protected]>
* ddebug: generalize print_named_xxx via a PRINT_NAMED macro [Nicolai Hähnle, 2017-11-09, 1 file, -15/+10]
| Reviewed-by: Marek Olšák <[email protected]>
* ddebug: rewrite to always use a threaded approach [Nicolai Hähnle, 2017-11-09, 4 files, -515/+546]
| This patch has multiple goals:
|
| 1. Off-load the writing of records in 'always' mode to another thread for performance.
|
| 2. Allow using ddebug with threaded contexts. This really forces us to move some of the "after_draw" handling into another thread.
|
| 3. Simplify the different modes of ddebug, both in the code and in the user interface, i.e. GALLIUM_DDEBUG. In particular, there's no 'pipelined' anymore, since we're always pipelined; and 'noflush' is replaced by 'flush', since we no longer flush by default.
|
| 4. Fix the fences in pipelining mode. They previously relied on writes via pipe_context::clear_buffer. However, on radeonsi, those could (quite reasonably) end up in the SDMA buffer. So we use the newly added PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE fences instead.
|
| 5. Improve pipelined mode overall, using the finer grained information provided by the new fences.
|
| Overall, the result is that pipelined mode should be more useful, and using ddebug in default mode is much less invasive, in the sense that it changes the overall driver behavior less (which is kind of crucial for a driver debugging tool).
|
| An example of the new hang debug output:
|
|   Gallium debugger active.
|   Hang detection timeout is 1000ms.
|   GPU hang detected, collecting information...
|
|   Draw #   driver  prev BOP  TOP  BOP  dump file
|   -------------------------------------------------------------
|   2        YES     YES       YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000000
|   3        YES     NO        YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000001
|   4        YES     NO        YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000002
|   5        YES     NO        YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000003
|
|   Done.
|
| We can see that there were almost certainly 4 draws in flight when the hang happened: the top-of-pipe fence was signaled for all 4 draws, the bottom-of-pipe fence for none of them.
|
| In virtually all cases, we'd expect the first draw in the list to be at fault, but due to the GPU parallelism, it's possible (though highly unlikely) that one of the later draws causes a component to get stuck in a way that prevents the earlier draws from making progress as well.
|
| (In the above example, there were actually only 3 draws truly in flight: the last draw is a blit that waits for the earlier draws; however, its top-of-pipe fence is emitted before the cache flush and wait, and so the fact that the draw hasn't truly started yet can only be seen from a closer inspection of GPU state.)
|
| Acked-by: Marek Olšák <[email protected]>
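The reading of the hang table given in the message (top-of-pipe signaled, bottom-of-pipe not signaled means "in flight") can be written down as a short parser. This is a sketch for illustration only: the row data is copied from the example output above, while the function names and the dictionary layout are made up here and are not part of ddebug.

```python
HANG_ROWS = """\
2 YES YES YES NO /home/nha/ddebug_dumps/shader_runner_19919_00000000
3 YES NO YES NO /home/nha/ddebug_dumps/shader_runner_19919_00000001
4 YES NO YES NO /home/nha/ddebug_dumps/shader_runner_19919_00000002
5 YES NO YES NO /home/nha/ddebug_dumps/shader_runner_19919_00000003
"""


def parse_rows(text):
    """Parse 'draw driver prev_bop top bop dump_file' rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        draw, driver, prev_bop, top, bop, dump = line.split()
        rows.append(
            {
                "draw": int(draw),
                "prev_bop": prev_bop == "YES",
                "top": top == "YES",
                "bop": bop == "YES",
                "dump": dump,
            }
        )
    return rows


def draws_in_flight(rows):
    # Started (top-of-pipe fence signaled) but not finished (bottom-of-pipe not).
    return [row["draw"] for row in rows if row["top"] and not row["bop"]]


def first_suspect(rows):
    # The message expects the first in-flight draw to be at fault in most cases.
    flying = draws_in_flight(rows)
    return flying[0] if flying else None


rows = parse_rows(HANG_ROWS)
print(draws_in_flight(rows), first_suspect(rows))
```

As the message notes, this is only a first approximation: a draw whose top-of-pipe fence has signaled may still be waiting on earlier work, which the table alone cannot show.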
* ddebug: use an atomic increment when numbering files [Nicolai Hähnle, 2017-11-09, 1 file, -1/+3]
| Reviewed-by: Marek Olšák <[email protected]>
* dd/util: extract dd_get_debug_filename_and_mkdir [Nicolai Hähnle, 2017-11-09, 1 file, -12/+18]
| Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_dump: add and use util_dump_transfer_usage [Nicolai Hähnle, 2017-11-09, 4 files, -16/+61]
| Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_dump: add util_dump_ns [Nicolai Hähnle, 2017-11-09, 2 files, -0/+13]
| Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_dump: export util_dump_ptr [Nicolai Hähnle, 2017-11-09, 2 files, -2/+5]
| Change format to %p while we're at it.
|
| Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE [Nicolai Hähnle, 2017-11-09, 1 file, -1/+88]
| v2: use uncached system memory for the fence, and use the CPU to clear it so we never read garbage when checking the fence
|
| Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: document some subtle details of fence_finish & fence_server_sync [Nicolai Hähnle, 2017-11-09, 1 file, -0/+22]
| v2: remove the change to si_fence_server_sync, we'll handle that more robustly
|
| Reviewed-by: Marek Olšák <[email protected]>
* gallium: add pipe_context::callback [Nicolai Hähnle, 2017-11-09, 3 files, -0/+58]
| For running post-draw operations inside the driver thread. ddebug will use it.
|
| Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_threaded: implement pipe_context::set_log_context [Nicolai Hähnle, 2017-11-09, 1 file, -0/+11]
| Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_threaded: avoid syncs for get_query_result [Nicolai Hähnle, 2017-11-09, 1 file, -17/+48]
| Queries should still get marked as flushed when flushes are executed asynchronously in the driver thread.
|
| To this end, the management of the unflushed_queries list is moved into the driver thread.
|
| Reviewed-by: Marek Olšák <[email protected]>
* gallium/u_threaded: implement asynchronous flushes [Nicolai Hähnle, 2017-11-09, 6 files, -27/+238]
| This requires out-of-band creation of fences, and will be signaled to the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag.
|
| v2:
| - remove an incorrect assertion
| - handle fence_server_sync for unsubmitted fences by relying on the improved cs_add_fence_dependency
| - only implement asynchronous flushes on amdgpu
|
| Reviewed-by: Marek Olšák <[email protected]>
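The idea of an asynchronous flush with an out-of-band fence can be sketched with a worker thread and a queue. This is a toy model under stated assumptions, not the u_threaded implementation: `Fence`, `ThreadedContext` and their methods are hypothetical names, and the real code passes a TC_FLUSH_ASYNC flag to pipe_context::flush instead of queueing Python tuples.

```python
import queue
import threading


class Fence:
    """Created on the application thread, signaled later by the driver thread."""

    def __init__(self):
        self._signaled = threading.Event()

    def signal(self):
        self._signaled.set()

    def wait(self, timeout=None):
        return self._signaled.wait(timeout)


class ThreadedContext:
    def __init__(self):
        self._calls = queue.Queue()
        self.executed = []
        self._worker = threading.Thread(target=self._driver_thread, daemon=True)
        self._worker.start()

    def _driver_thread(self):
        while True:
            item = self._calls.get()
            if item is None:
                break
            name, fence = item
            self.executed.append(name)
            if fence is not None:
                fence.signal()  # the flush itself runs here, off the app thread

    def draw(self, name):
        self._calls.put((name, None))

    def flush_async(self):
        fence = Fence()  # out of band: exists before the flush has executed
        self._calls.put(("flush", fence))
        return fence

    def destroy(self):
        self._calls.put(None)
        self._worker.join()
```

A caller would do `fence = ctx.flush_async()` and continue recording work, calling `fence.wait()` only when the result is actually needed. The application thread never blocks on the flush itself, which is why the fence has to be created before the driver thread has seen the flush.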