path: root/src/gallium/drivers/vc4/vc4_screen.h
Commit message | Author | Age | Files | Lines
* broadcom/vc4: Native fence fd support | Stefan Schake | 2018-05-17 | 1 | -2/+2
  With the syncobj support in place, let's use it to implement the EGL_ANDROID_native_fence_sync extension. This mostly follows previous implementations in freedreno and etnaviv.

  v2: Drop the flags (Eric)
      Handle in_fence_fd already in job_submit (Eric)
      Drop extra vc4_fence_context_init (Eric)
      Dup fds with CLOEXEC (Eric)
      Mention exact extension name (Eric)

  Signed-off-by: Stefan Schake <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
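The "Dup fds with CLOEXEC" item can be illustrated with plain POSIX calls. This is a minimal sketch, not the code from the commit: the helper name `dup_fence_fd_cloexec` is hypothetical, and the minimum descriptor number of 3 is an assumption chosen only to stay clear of stdin/stdout/stderr.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: duplicate a fence fd so that the copy has
 * close-on-exec set from the moment it exists. Doing dup() followed by
 * fcntl(F_SETFD) would leave a window in which a fork()+exec() in
 * another thread could leak the descriptor.
 *
 * Returns the new descriptor, or -1 with errno set on failure.
 */
static int
dup_fence_fd_cloexec(int fd)
{
        /* F_DUPFD_CLOEXEC returns the lowest free descriptor >= 3 here. */
        return fcntl(fd, F_DUPFD_CLOEXEC, 3);
}
```

The caller owns the returned descriptor and is expected to close() it once the fence is no longer needed.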
* broadcom/vc4: Detect syncobj support | Stefan Schake | 2018-05-17 | 1 | -0/+1
  We need to know if the kernel supports syncobj submission since otherwise all the DRM syncobj calls fail.

  v2: Use drmGetCap to detect syncobj support (Eric)

  Signed-off-by: Stefan Schake <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* broadcom/vc4: Add support for HW perfmon | Boris Brezillon | 2018-03-05 | 1 | -0/+1
  The V3D engine provides several perf counters. Implement ->get_driver_query_[group_]info() so that these counters are exposed through the GL_AMD_performance_monitor extension.

  Signed-off-by: Boris Brezillon <[email protected]>
  Signed-off-by: Eric Anholt <[email protected]>
* broadcom/vc4: Mark BOs as purgeable when they enter the BO cache | Boris Brezillon | 2017-11-09 | 1 | -0/+1
  This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all BOs placed in the mesa BO cache as purgeable so that the system can reclaim this memory under memory pressure.

  v2: - Removed BOs from the cache when they've been purged by the kernel
      - Check whether the madvise ioctl is supported or not before using it
  v3: Don't walk the whole list when we find a busy BO (by anholt, acked by Boris)

  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* vc4: Set shareable BOs as T tiled if possible | Eric Anholt | 2017-07-12 | 1 | -0/+1
  X11 and GL compositor performance on VC4 has been terrible because of our SHARED-usage buffers all being forced to linear. This swaps SHARED && !LINEAR buffers over to being tiled.

  This is an expected win for all GL compositors during rendering (a full copy of each shared texture per draw call), allows X11 to be used with decent performance without a GL compositor, and improves X11 windowed swapbuffers performance as well. It also halves the memory usage of shared buffers that get textured from. The only cost should be idle systems with a scanout-only buffer that isn't flagged as LINEAR, in which case the memory bandwidth cost of scanout goes up ~25%.

  This implements the EGL_EXT_image_dma_buf_import_modifiers extension, supporting the VC4 T_TILED modifier.

  v2: Added modifier support to resource creation/import, and advertisement (by daniels).
  v3: Fix old-kernel fallback path, fix compiler error and warnings, and comment touchups (by anholt).

  Reviewed-by: Daniel Stone <[email protected]>
* vc4: Make the miptree debug code available under VC4_DEBUG=surf | Eric Anholt | 2017-07-12 | 1 | -0/+1
  I kept flipping the bool on for debug, so let's just make it available.

  Reviewed-by: Daniel Stone <[email protected]>
* gallium: Add renderonly-based support for pl111+vc4. | Eric Anholt | 2017-06-15 | 1 | -1/+4
  This follows the model of imx (display) and etnaviv (render): pl111 is a display-only device, so when asked to do GL for it, we see if we have a vc4 renderer, make the vc4 screen, and have vc4 call back to pl111 to do scanout allocations.

  The difference from etnaviv is that we share the same BO between vc4 and pl111, rather than having a vc4 bo and a pl111 bo and copies between the two. The only mismatch between their requirements is that vc4 requires 4-pixel (at 32bpp) stride alignment, while pl111 requires that stride match width. The kernel will reject any modesets to an incorrect stride, so the 3D driver doesn't need to worry about that.

  v2: Rebase on Android rework, drop unused include.
  v3: Fix another Android bug, from Rob Herring's build-testing.

  Reviewed-by: Christian Gmeiner <[email protected]>
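The stride mismatch described above can be worked through numerically. The sketch below is only an illustration under stated assumptions: it reads "4-pixel (at 32bpp) stride alignment" as 16-byte alignment, and both function names are hypothetical rather than taken from either driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 pixels at 32bpp (4 bytes each) means 16-byte alignment. */
#define RENDER_STRIDE_ALIGN_BYTES 16u

/* Stride the renderer would pick: width * cpp rounded up to 16 bytes. */
static uint32_t
render_stride(uint32_t width, uint32_t cpp)
{
        uint32_t packed = width * cpp;
        return (packed + RENDER_STRIDE_ALIGN_BYTES - 1) &
               ~(RENDER_STRIDE_ALIGN_BYTES - 1);
}

/*
 * The display side wants stride == width * cpp exactly, so a shared
 * buffer satisfies both sides only if the alignment added no padding.
 */
static bool
stride_fits_display(uint32_t width, uint32_t cpp)
{
        return render_stride(width, cpp) == width * cpp;
}
```

With these assumptions, an 800-pixel-wide 32bpp buffer has a 3200-byte stride that suits both sides, while an 802-pixel-wide one is padded from 3208 to 3216 bytes and would be one of the modesets the kernel rejects.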
* vc4: Drop pointless indirections around BO import/export. | Eric Anholt | 2017-05-17 | 1 | -7/+0
  I've since found them to be more confusing by adding indirections than clarifying by screening off resources from the handle/fd import/export process.
* gallium: s/unsigned/enum pipe_shader_type/ for get_compiler_options() | Brian Paul | 2017-03-08 | 1 | -1/+2
  Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/util: replace pipe_mutex with mtx_t | Timothy Arceri | 2017-03-07 | 1 | -2/+2
  pipe_mutex was made unnecessary with fd33a6bcd7f12.

  Reviewed-by: Marek Olšák <[email protected]>
* vc4: Try compiling our FSes in multithreaded mode on new kernels. | Eric Anholt | 2016-11-16 | 1 | -0/+1
  Multithreaded fragment shaders let us hide texturing latency by a hyperthreading-style switch to another fragment shader. This gets us up to 20% framerate improvements on glmark2 tests.
* vc4: Add support for ETC1 textures if the kernel is new enough. | Eric Anholt | 2016-11-16 | 1 | -0/+5
  The kernel changes for exposing the param have now been merged, so we can expose it here.
* vc4: Move simulator memory management to a u_mm.h heap. | Eric Anholt | 2016-10-21 | 1 | -0/+4
  Now we aren't limited to 256MB total allocated across a driver instance, just 256MB at one time. We're still copying in and out, which should get fixed.
* vc4: Move simulator globals into a struct. | Eric Anholt | 2016-10-21 | 1 | -3/+0
  I would like to put a couple more things in here, so it's time to package it up.
* vc4: use the new parent/child pools for transfers | Nicolai Hähnle | 2016-10-05 | 1 | -0/+3
  Reviewed-by: Marek Olšák <[email protected]>
* vc4: Tell state_tracker that we would prefer NIR. | Eric Anholt | 2016-08-22 | 1 | -0/+4
  Before this series, the code generation path was:

      GLSL IR -> TGSI -> NIR -> NIR clone -> QIR -> QPU

  Now it's (generally)

      GLSL IR -> NIR -> NIR clone -> QIR -> QPU
* vc4: add hash table look-up for exported dmabufs | Rob Herring | 2016-07-26 | 1 | -0/+3
  It is necessary to reuse existing BOs when dmabufs are imported. There are 2 cases that need to be handled. dmabufs can be created/exported and imported by the same process and can be imported multiple times. Copying other drivers, add a hash table to track exported BOs so the BOs get reused.

  v2: Whitespace fixup (by anholt)

  Signed-off-by: Rob Herring <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* vc4: Return V3D version details in the GL renderer info. | Eric Anholt | 2016-07-20 | 1 | -0/+2
  This is as close as we get to a name for the 3D blocks.
* vc4: Check the V3D version reported by the kernel. | Eric Anholt | 2016-07-20 | 1 | -0/+2
  We don't want to bring up an old userspace driver on a kernel for newer hardware. We'll also want to look at the other ident fields in the future.
* vc4: Add a flag in the screen to track control flow support. | Eric Anholt | 2016-07-12 | 1 | -0/+1
  For now it's still always false, but I need it in place for kernel backwards compat support as I extend the backend for control flow.
* vc4: Add support for dumping executed commands to a file. | Eric Anholt | 2015-12-15 | 1 | -0/+1
  The VC4_DEBUG=cl,qpu is nice and all, but I want to be able to get more detailed dumps, and to replay the same exact commands in simulation. For that I need a dump with all of the VBOs, shaders, shader recs, etc. This dump can be parsed by vc4-gpu-tools.

  For now this is only doable from simulator mode, because otherwise we don't have access to the RCL contents generated by the kernel.
* vc4: Track the number of BOs allocated and their size. | Eric Anholt | 2015-06-17 | 1 | -0/+6
  This is useful for BO leak debugging.
* vc4: Drop qir include from vc4_screen.h | Eric Anholt | 2015-06-09 | 1 | -1/+1
  We didn't need any of it except for the list header, and qir.h pulls in nir.h, which is not really interesting to winsys.
* vc4: Convert from simple_list.h to list.h | Eric Anholt | 2015-05-29 | 1 | -2/+2
  list.h is a nicer and more familiar set of list functions/macros.
* vc4: Convert to consuming NIR. | Eric Anholt | 2015-04-01 | 1 | -0/+1
  NIR brings us better optimization than I would have bothered to write within the driver, developers sharing future optimization work, and the ability to share device-specific lowering code that we and other GLES2-level drivers need.

  total uniforms in shared programs: 13421 -> 13422 (0.01%)
  uniforms in affected programs: 62 -> 63 (1.61%)
  total instructions in shared programs: 39961 -> 39707 (-0.64%)
  instructions in affected programs: 15494 -> 15240 (-1.64%)

  v2: Add missing imov support, and assert that there are no dest saturates.
  v3: Rebase on the target-specific algebraic series.
  v4: Rebase on gallium-includes-from-NIR changes in master.
  v5: Rebase on variables being in lists instead of hash tables.
  v6: Squash in intermediate changes that used the NIR-to-TGSI pass (which I'm not committing)
* vc4: Add a userspace BO cache. | Eric Anholt | 2014-12-17 | 1 | -0/+12
  Since our kernel BOs require CMA allocation, and the use of them requires new mmaps, it's pretty expensive and we should avoid it if possible. Copying my original design for Intel, make a userspace cache that reuses BOs that haven't been shared to other processes but frees BOs that have sat in the cache for over a second.

  Improves glxgears framerate on RPi by around 30%.
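The caching policy in this message (reuse unshared BOs, drop anything idle for more than a second) can be sketched as a toy model. Everything here is hypothetical: the `toy_` names, the fixed 16-slot array, exact-size matching, and `malloc()` standing in for a kernel allocation plus mmap are assumptions made for illustration, not the driver's actual data structures.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

#define CACHE_SLOTS 16

struct toy_bo {
        size_t size;
        bool shared;      /* exported to another process: never cached */
        time_t free_time; /* when the BO entered the cache */
        void *map;        /* malloc() stands in for CMA allocation + mmap */
};

struct toy_bo_cache {
        struct toy_bo *slots[CACHE_SLOTS];
};

static void
toy_bo_destroy(struct toy_bo *bo)
{
        free(bo->map);
        free(bo);
}

/* Really free every cached BO that has sat idle for over a second. */
static void
toy_cache_expire(struct toy_bo_cache *cache, time_t now)
{
        for (int i = 0; i < CACHE_SLOTS; i++) {
                struct toy_bo *bo = cache->slots[i];
                if (bo && now - bo->free_time > 1) {
                        toy_bo_destroy(bo);
                        cache->slots[i] = NULL;
                }
        }
}

/* Prefer a cached BO of the same size; otherwise allocate a new one. */
static struct toy_bo *
toy_bo_alloc(struct toy_bo_cache *cache, size_t size)
{
        for (int i = 0; i < CACHE_SLOTS; i++) {
                struct toy_bo *bo = cache->slots[i];
                if (bo && bo->size == size) {
                        cache->slots[i] = NULL;
                        return bo;
                }
        }

        struct toy_bo *bo = calloc(1, sizeof(*bo));
        if (!bo)
                return NULL;
        bo->size = size;
        bo->map = malloc(size);
        if (!bo->map) {
                free(bo);
                return NULL;
        }
        return bo;
}

/* Releasing a BO parks it in the cache unless it is shared or the cache is full. */
static void
toy_bo_free(struct toy_bo_cache *cache, struct toy_bo *bo, time_t now)
{
        toy_cache_expire(cache, now);
        for (int i = 0; !bo->shared && i < CACHE_SLOTS; i++) {
                if (!cache->slots[i]) {
                        bo->free_time = now;
                        cache->slots[i] = bo;
                        return;
                }
        }
        toy_bo_destroy(bo);
}
```

Passing `now` in explicitly keeps the sketch deterministic; a caller would normally supply `time(NULL)`. Expiry runs on every free, so stale entries are dropped without a background thread.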
* vc4: Drop a weird argument in the BOs-from-handles API. | Eric Anholt | 2014-12-17 | 1 | -2/+1
* vc4: Add a debug flag for waiting for sync on submit. | Eric Anholt | 2014-12-05 | 1 | -0/+1
  This is nice when you're tracking down which command list is hanging the GPU.
* vc4: Update for new kernel ABI with async execution and waits. | Eric Anholt | 2014-11-20 | 1 | -0/+13
  Our submits now return immediately and you have to manually wait for things to complete if you want to (like a normal driver).
* vc4: Add a debug flag for flushing after every draw. | Eric Anholt | 2014-09-09 | 1 | -0/+1
  It was useful on i965, but it's even more useful for debugging tiled renderers.
* vc4: Add support for texture tiling. | Eric Anholt | 2014-08-22 | 1 | -2/+0
  This still treats everything as RGBA8888 for the most part, same as before. This is a prerequisite for handling other texture formats, since only RGBA8888 has a raster-layout mode.
* vc4: Add support for swizzling of texture colors. | Eric Anholt | 2014-08-18 | 1 | -0/+1
  Fixes swapped colors on the copypix demo and some piglit tests like pbo-teximage-tiling.
* vc4: Fix off-by-one in texture maximum levels. | Eric Anholt | 2014-08-12 | 1 | -1/+1
  It's 2048x2048 that's the max, not 1024x1024.
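The off-by-one is easy to reproduce by counting mip levels by hand. The sketch below assumes the limit is expressed as a level count (level 0 at full size, each further level half as wide, ending at 1x1); that reading and the function name are assumptions, since the diff itself is not shown here.

```c
/*
 * Number of mip levels in a full chain whose level 0 is max_size texels
 * wide (max_size is expected to be a power of two, at least 1).
 */
static unsigned
mip_levels_for_size(unsigned max_size)
{
        unsigned levels = 1; /* level 0 always exists */

        while (max_size > 1) {
                max_size >>= 1;
                levels++;
        }
        return levels;
}
```

Under that assumption a 2048x2048 maximum corresponds to 12 levels and a 1024x1024 maximum to 11, so a level count that is one too small silently halves the largest supported texture.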
* vc4: Switch simulator to using kernel validator | Eric Anholt | 2014-08-11 | 1 | -1/+0
  This ensures that when I'm using the simulator, I get a closer match to what behavior on real hardware will be. It lets me rapidly iterate on the kernel validation code (which otherwise has a several-minute turnaround time), and helps catch buffer overflow bugs in the userspace driver faster.
* vc4: Add VC4_DEBUG env option | Eric Anholt | 2014-08-08 | 1 | -1/+9
  v2: Fix an accidental deletion of some characters from the copyright message (caught by Ilia Mirkin)
* vc4: Initial skeleton driver import. | Eric Anholt | 2014-08-08 | 1 | -0/+63
  This mostly just takes every draw call and turns it into a sequence of commands that clear the FBO and draw a single shaded triangle to it, regardless of the actual input vertices or shaders. I copied the initial driver skeleton mostly from freedreno, and I've preserved Rob Clark's copyright for those. I also based my initial hardcoded shaders and command lists on Scott Mansell (phire)'s "hackdriver" project, though the bit patterns of the shaders emitted end up being different.

  v2: Rebase on gallium megadrivers changes.
  v3: Rebase on PIPE_SHADER_CAP_MAX_CONSTS change.
  v4: Rely on simpenrose actually being installed when building for simulation.
  v5: Add more header duplicate-include guards.
  v6: Apply Emil's review (protection against vc4 sim and ilo at the same time, and dropping the dricommon drm bits) and fix a copyright header (thanks, Roland)