path: root/src/gallium
Commit message | Author | Age | Files | Lines
* iris: Fix bad external BO hash table and zombie list interactions | Kenneth Graunke | 2019-08-05 | 1 | -12/+21
    A while ago, we started deferring GEM object closure and VMA release until buffers were idle. This had some unforeseen interactions with external buffers.
    We keep imported buffers in hash tables, so if we have repeated imports of the same GEM object, we map those to the same iris_bo structure. This is critical for several reasons.
    Unfortunately, we broke this assumption. When freeing a non-idle external buffer, we would drop it from the hash tables, then move it to the zombie list. If someone reimported the same GEM object, we would not find it in the hash tables, and go ahead and make a second iris_bo for that GEM object. But the old iris_bo would still be in the zombie list, and so we would eventually call GEM_CLOSE on it - closing a BO that should have still been live.
    To work around this, we defer removing a BO from the hash tables until it's actually fully closed. This has the strange effect that an external BO may be on the zombie list, and yet be resurrected before it can be properly cleaned up. In this case, we remove it from the list so it won't be freed.
    Fixes severe instability in Weston, which was hitting EINVALs and ENOENTs from execbuf2, due to batches referring to a GEM object that had been closed, or at least had its VMA torched.
    Fixes: 457a55716ea ("iris: Defer closing and freeing VMA until buffers are idle.")
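The lifetime rule described in this commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the iris code: every `sketch_*` name is hypothetical, a small array indexed by GEM handle stands in for the real hash tables, and a `zombie` flag stands in for membership in the zombie list. The point it shows is that a BO leaves the lookup table only when it is actually closed, and that a reimport of a zombie BO resurrects it instead of creating a second object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_HANDLES 16

/* Hypothetical, simplified stand-in for an external buffer object. */
struct sketch_bo {
    unsigned handle;  /* GEM handle, used as the table key */
    int refcount;
    bool busy;        /* GPU may still be using the buffer */
    bool zombie;      /* queued for deferred close */
    bool live;        /* still present in the lookup table */
};

struct sketch_bufmgr {
    /* Stand-in for the handle hash table: index == GEM handle. */
    struct sketch_bo bos[SKETCH_MAX_HANDLES];
    unsigned close_count;  /* how many times we "called GEM_CLOSE" */
};

/* The only place where a BO is dropped from the table. */
static void sketch_close(struct sketch_bufmgr *mgr, struct sketch_bo *bo)
{
    bo->live = false;
    bo->zombie = false;
    mgr->close_count++;
}

/* Import: reuse the existing BO if the handle is still in the table,
 * resurrecting it from the zombie list when necessary. */
static struct sketch_bo *sketch_import(struct sketch_bufmgr *mgr, unsigned handle)
{
    assert(handle < SKETCH_MAX_HANDLES);
    struct sketch_bo *bo = &mgr->bos[handle];

    if (bo->live) {
        bo->zombie = false;  /* take it off the zombie list */
        bo->refcount++;
        return bo;
    }

    *bo = (struct sketch_bo){ .handle = handle, .refcount = 1, .live = true };
    return bo;
}

/* Dropping the last reference on a busy BO defers the close, but the
 * BO stays findable in the table until sketch_close() runs. */
static void sketch_unreference(struct sketch_bufmgr *mgr, struct sketch_bo *bo)
{
    if (--bo->refcount > 0)
        return;
    if (bo->busy)
        bo->zombie = true;
    else
        sketch_close(mgr, bo);
}

/* Close every zombie that has become idle in the meantime. */
static void sketch_reap_zombies(struct sketch_bufmgr *mgr)
{
    for (unsigned i = 0; i < SKETCH_MAX_HANDLES; i++) {
        struct sketch_bo *bo = &mgr->bos[i];
        if (bo->live && bo->zombie && !bo->busy)
            sketch_close(mgr, bo);
    }
}
```

In this sketch, importing handle 5, dropping the last reference while the buffer is busy, and importing handle 5 again returns the same `sketch_bo`, and a later `sketch_reap_zombies()` call leaves it open.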
* iris/bufmgr: Move iris_bo_reference into hash_find_bo, rename it | Kenneth Graunke | 2019-08-05 | 1 | -14/+16
    Everybody importing an external buffer was looking it up in the hash table, then referencing it. We can just do that in the helper instead, which also gives us a convenient spot to stash extra code shortly.
* gallium: add stm DRM entry point | Ahmad Fatoum | 2019-08-05 | 3 | -0/+3
    The STM32MP157 features a Vivante GC400 GPU supported by etnaviv. Add a DRM entry point for the STM display controller, so mesa can be used with it.
    Signed-off-by: Ahmad Fatoum <[email protected]>
    Signed-off-by: Lucas Stach <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
* lima/ppir: simplify load uni/temp op lowering and scheduling | Erico Nunes | 2019-08-04 | 2 | -34/+33
    The load uniform/temporary operations output only to a pipeline register, which must be consumed by another op in the same instruction later. The current implementation delays the decision of who will consume this result until the scheduling step. If the consumer node is not able to use the pipeline register, a mov node may have to be created during the scheduling step.
    As part of the ppir scheduler simplification, and now that the ppir scheduler supports pipeline register dependencies, this can be simplified by always creating a single mov node outputting to a normal register that can be used directly by all consumers.
    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: simplify select op lowering and scheduling | Erico Nunes | 2019-08-04 | 5 | -11/+15
    The select operation relies on the select condition coming from the result of the alu scalar mult slot, in the same instruction. The current implementation creates a mov node to be the predecessor of select, and then relies on an exception during scheduling to ensure that both ops are inserted in the same instruction.
    Now that the ppir scheduler supports pipeline register dependencies, this can be simplified by making the mov explicitly output to the fmul pipeline register, and the scheduler can place it without an exception. Since the select condition can only be placed in the scalar mult slot, unlike a regular mov, define a separate op for it.
    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: support pipeline registers in scheduler | Erico Nunes | 2019-08-04 | 2 | -46/+65
    The ppir scheduler grew rather complicated and contains many exceptions, as it also has to take care of inserting additional nodes when it is mandatory for nodes to be in the same instruction. As such, the lima lowering and scheduling process can be difficult to understand and maintain. The ppir lowering step created nodes hoping that the scheduler would notice the exception and do the right thing.
    This proposal adds a simple refactor to the scheduler so that it places nodes with pipeline registers in the same instruction. With the scheduler handling this in a general way, it is possible to create same-instruction dependencies by using pipeline registers during the lowering stage. This is simpler to maintain because now we can make these dependencies explicit in a single place (lowering), and we can drop exceptions from scheduling.
    Reducing the complexity of the scheduler is also useful as preparatory work to support control flow in ppir.
    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: move alu vec to scalar lowering into NIR | Vasily Khoruzhick | 2019-08-04 | 2 | -107/+14
    Utgard PP is vec4, but some operations are scalar. Use the NIR vec to scalar lowering pass and indicate the operations that we want to lower.
    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* iris: Fix handling of SIMD32 fragment shaders | Jason Ekstrand | 2019-08-03 | 1 | -44/+50
    The brw_wm_prog_data_dispatch_grf_start_reg and _prog_offset helpers read the _NPixelDispatchEnable fields from 3DSTATE_PS to figure out which bits to pull out of the prog data and stuff where. Therefore, they need to be called with the final set of _NPixelDispatchEnable bits after we've done the workaround for SIMD32 and 16x MSAA. Otherwise, if you end up with a somewhat odd combination of enables, the GRF start reg and KSP data ends up in the wrong slots. In particular, running SIMD32-only is broken but several other combinations are as well.
    Fixes: 5445c176e27ba "iris: Disable SIMD32 when using a 16x MSAA..."
    Reviewed-by: Kenneth Graunke <[email protected]>
* etnaviv: s/boolean/bool | Christian Gmeiner | 2019-08-03 | 2 | -2/+2
    Signed-off-by: Christian Gmeiner <[email protected]>
    Reviewed-by: Philipp Zabel <[email protected]>
* lima/ppir: Add gl_FrontFace handling | Andreas Baierl | 2019-08-03 | 6 | -13/+35
    Signed-off-by: Andreas Baierl <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
* meson: remove unused field | Eric Engestrom | 2019-08-03 | 1 | -9/+9
    Signed-off-by: Eric Engestrom <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Tested-by: Vinson Lee <[email protected]>
* meson: replace last uses of libxmlconfig with idep_xmlconfig | Eric Engestrom | 2019-08-03 | 2 | -5/+5
    Signed-off-by: Eric Engestrom <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Tested-by: Vinson Lee <[email protected]>
* meson: drop unused dep_{thread,dl} | Eric Engestrom | 2019-08-03 | 14 | -14/+12
    Unused as of last commit.
    Signed-off-by: Eric Engestrom <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Tested-by: Vinson Lee <[email protected]>
* meson: replace libmesa_util with idep_mesautil | Eric Engestrom | 2019-08-03 | 20 | -35/+40
    This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually.
    Next commit will remove the ones that were only added for that reason.
    Signed-off-by: Eric Engestrom <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Tested-by: Vinson Lee <[email protected]>
* panfrost: Allocate polygon lists on-demand | Alyssa Rosenzweig | 2019-08-02 | 6 | -10/+36
    Rather than allocating a huge (64MB) polygon list on context creation and sharing it across framebuffers, we instead allocate polygon lists as BOs (which consistently hit the cache) sized appropriately; for about a month, we've known how to calculate the polygon list size so this has only recently become possible.
    The good news is we can render to truly massive framebuffers without crashing and, more importantly, we eliminate the 64MB upfront overhead. If a list that size isn't actually needed, it's not allocated.
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Handle the bo == NULL case in panfrost_bo_[un]reference() | Boris Brezillon | 2019-08-02 | 1 | -1/+5
    Allows us to pass BOs without checking if they're NULL or not.
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
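The idea of NULL-tolerant reference helpers can be illustrated with a short C sketch. It is a simplified assumption, not the panfrost implementation: the `sketch_*` names are hypothetical, the counter is a plain `int` rather than an atomic, and "freeing" is modeled by setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified buffer object with a non-atomic refcount. */
struct sketch_bo {
    int refcount;
    bool freed;  /* stands in for actually releasing the buffer */
};

/* Take a reference; a NULL BO is silently ignored. */
static void sketch_bo_reference(struct sketch_bo *bo)
{
    if (!bo)
        return;
    bo->refcount++;
}

/* Drop a reference; a NULL BO is silently ignored. */
static void sketch_bo_unreference(struct sketch_bo *bo)
{
    if (!bo)
        return;
    assert(bo->refcount > 0);
    if (--bo->refcount == 0)
        bo->freed = true;
}
```

With the check inside the helpers, a caller holding an optional BO pointer can call `sketch_bo_unreference(ptr)` unconditionally instead of wrapping every call in `if (ptr)`.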
* panfrost: Get rid of the skippable param in attach_vt_framebuffer() | Boris Brezillon | 2019-08-02 | 1 | -3/+3
    The only user of this function always passes true.
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't emit a new FB desc when setting a new FB state | Boris Brezillon | 2019-08-02 | 1 | -1/+5
    The FB desc will be emitted/attached on the first draw targeting this new FB.
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Bail out early when doing a wallpaper blit | Boris Brezillon | 2019-08-02 | 1 | -2/+14
    The wallpaper blit is a bit special in that the operation is targeting the current FB, but the u_blitter logic creates a new surface for it which makes util_framebuffer_state_equal() return false. In that case we don't want a new FB descriptor to be emitted/attached, so let's just copy the new state into ctx->pipe_framebuffer and exit the function.
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Bail out early when new and current FB states are equal | Boris Brezillon | 2019-08-02 | 1 | -0/+4
    If the current FB matches the new one there's nothing to be done in panfrost_set_framebuffer_state(). By bailing out early in that case we avoid emitting new FB descriptors (the old ones are still valid).
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Delay FB descriptor allocation | Boris Brezillon | 2019-08-02 | 2 | -18/+6
    No need to emit SFBD/MFBD at frame invalidation. They can be emitted when the framebuffer is attached, which saves us a potential FB desc re-allocation if a new FB is bound after the swap.
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove job from ctx->jobs at submission time | Boris Brezillon | 2019-08-02 | 1 | -0/+8
    This guarantees that new draws targeting the same framebuffer will get a new job instance.
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Make ctx->job useful | Boris Brezillon | 2019-08-02 | 2 | -1/+23
    ctx->job is supposed to serve as a cache to avoid a hash table lookup every time we access the job attached to the currently bound FB, except it was never assigned to anything but NULL.
    Fix that by adding the missing assignment in panfrost_get_job_for_fbo(). Also add a missing NULL assignment in the ->set_framebuffer_state() path.
    While at it, add extra assert()s to make sure ctx->job is consistent.
    Fixes: 59c9623d0a75 ("panfrost: Import job data structures from v3d")
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
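The caching pattern this message describes can be sketched as follows. This is a hedged illustration rather than the panfrost code: all `sketch_*` names are hypothetical, framebuffers are identified by a small integer, and an array indexed by that integer stands in for the real hash table. It shows the two assignments the commit mentions (filling the cache on lookup, clearing it when the framebuffer changes) plus a consistency assert.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_FBS 8

/* Hypothetical job record tied to one framebuffer id. */
struct sketch_job {
    unsigned fb;
    bool in_use;
};

struct sketch_ctx {
    unsigned bound_fb;                       /* currently bound framebuffer */
    struct sketch_job *job;                  /* cached job for bound_fb, or NULL */
    struct sketch_job jobs[SKETCH_MAX_FBS];  /* stand-in for the hash table */
    unsigned slow_lookups;                   /* counts table lookups */
};

/* The "expensive" path: find or create the job for a framebuffer. */
static struct sketch_job *sketch_lookup_job(struct sketch_ctx *ctx, unsigned fb)
{
    assert(fb < SKETCH_MAX_FBS);
    ctx->slow_lookups++;

    struct sketch_job *job = &ctx->jobs[fb];
    if (!job->in_use) {
        job->fb = fb;
        job->in_use = true;
    }
    return job;
}

/* Return the job for the bound framebuffer, using the cache if set. */
static struct sketch_job *sketch_get_job_for_fbo(struct sketch_ctx *ctx)
{
    if (ctx->job) {
        /* The cached job must belong to the bound framebuffer. */
        assert(ctx->job->fb == ctx->bound_fb);
        return ctx->job;
    }

    ctx->job = sketch_lookup_job(ctx, ctx->bound_fb);  /* fill the cache */
    return ctx->job;
}

/* Binding another framebuffer invalidates the cached job. */
static void sketch_set_framebuffer(struct sketch_ctx *ctx, unsigned fb)
{
    ctx->bound_fb = fb;
    ctx->job = NULL;
}
```

In this sketch, repeated calls to `sketch_get_job_for_fbo()` with the same bound framebuffer perform a single table lookup; only rebinding triggers another one.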
* ac/nir,radv: Optimize bounds check for 64 bit CAS. | Bas Nieuwenhuizen | 2019-08-02 | 1 | -0/+1
    This applies when the application does not ask for robust buffer access. The check is only implemented in radv.
    Reviewed-by: Samuel Pitoiset <[email protected]>
* gallivm: fix issue with AtomicCmpXchg wrapper on llvm 3.5-3.8 | Roland Scheidegger | 2019-08-02 | 1 | -1/+3
    These versions still need the wrapper but already have both success and failure ordering. (Compile tested on llvm 3.3, 3.7, 3.8.)
    v2: don't duplicate whole function (suggested by Brian).
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111102
    Reviewed-by: Charmaine Lee <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* gallium: deduplicate os detection logic by using detect_os.h | Eric Engestrom | 2019-08-02 | 1 | -28/+19
    This allows us to avoid having to rename all the PIPE_OS_* at once while still making sure PIPE_OS_* and DETECT_OS_* are always in sync.
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* gallium/utils: drop PIPE_SUBSYSTEM_WINDOWS_USER | Eric Engestrom | 2019-08-02 | 7 | -29/+12
    This is basically just an alias for PIPE_OS_WINDOWS.
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* scons: rename PIPE_SUBSYSTEM_EMBEDDED to EMBEDDED_DEVICE | Eric Engestrom | 2019-08-02 | 3 | -3/+3
    It has nothing to do with the PIPE_SUBSYSTEM_* stuff from gallium.
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* gallium: remove never-used PIPE_SUBSYSTEM_DRI | Eric Engestrom | 2019-08-02 | 1 | -4/+0
    PIPE_SUBSYSTEM_DRI was introduced in dacfef158943665fc0d1 ("gallium: New configuration header.") 11 years ago, and was never used.
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* freedreno/batch: fix dependency loop detection | Rob Clark | 2019-08-02 | 1 | -11/+10
    We can have a scenario like:
        A -> B
        A -> C -> B
    When adding the A->C dependency, it doesn't really matter that C depends on something that A depends on; that isn't a necessary condition for a dependency loop.
    Instead what we want to know is that nothing C depends on, directly or indirectly, depends on A. We can detect this by recursively OR'ing the dependents_mask of C and all its dependencies.
    Signed-off-by: Rob Clark <[email protected]>
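The check described above can be sketched in C. This is an illustration under stated assumptions, not the freedreno code: the `sketch_*` names are hypothetical, batches live in a fixed global pool of 32 slots so that one `uint32_t` can hold one bit per batch, and, following the commit message's usage, `dependents_mask` has bit `i` set when the batch depends directly on batch `i`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_BATCHES 32

/* Hypothetical, simplified batch. */
struct sketch_batch {
    unsigned idx;              /* slot in the pool, also its bit index */
    uint32_t dependents_mask;  /* bit i set: depends directly on batch i */
};

static struct sketch_batch sketch_batches[SKETCH_MAX_BATCHES];

/* Reset the pool so that every batch knows its own index. */
static void sketch_batch_pool_init(void)
{
    for (unsigned i = 0; i < SKETCH_MAX_BATCHES; i++) {
        sketch_batches[i].idx = i;
        sketch_batches[i].dependents_mask = 0;
    }
}

/* OR together the batch's own bit and the masks of everything it
 * depends on, directly or indirectly. Terminates because the graph
 * is kept acyclic by sketch_add_dep(). */
static uint32_t sketch_recursive_mask(const struct sketch_batch *batch)
{
    uint32_t mask = UINT32_C(1) << batch->idx;

    for (unsigned i = 0; i < SKETCH_MAX_BATCHES; i++) {
        if (batch->dependents_mask & (UINT32_C(1) << i))
            mask |= sketch_recursive_mask(&sketch_batches[i]);
    }
    return mask;
}

/* Record "batch depends on dep". Returns false, and changes nothing,
 * if dep already reaches batch, since the new edge would close a loop. */
static bool sketch_add_dep(struct sketch_batch *batch, struct sketch_batch *dep)
{
    if (batch->dependents_mask & (UINT32_C(1) << dep->idx))
        return true;  /* dependency already recorded */

    if (sketch_recursive_mask(dep) & (UINT32_C(1) << batch->idx))
        return false;

    batch->dependents_mask |= UINT32_C(1) << dep->idx;
    return true;
}
```

For the scenario in the message, after recording A -> B and C -> B, adding A -> C succeeds even though A and C share the dependency B, while adding B -> A is rejected because A already reaches B.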
* freedreno/a6xx: add missing flush/invalidates for blit | Rob Clark | 2019-08-02 | 2 | -15/+9
    Various things we were missing for multiple blits in a single batch.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: skip tiles with no geometry | Rob Clark | 2019-08-02 | 3 | -3/+66
    If no clear, and no geometry according to VSC_STATE[pipe] we can skip the tile entirely. If there is a fast-clear, we can't skip restore (clear) or resolve IBs, but we can still skip draw IB.
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a6xx: VSC overflow detection/handling | Rob Clark | 2019-08-02 | 3 | -34/+266
    Check VSC_SIZE/VSC_SIZE2 regs from cmdstream to detect overflow, and skip use of VSC visibility stream when overflow is detected, to avoid GPU hangs. This is done w/ introduction of some CP_REG_TEST/CP_COND_REG_EXEC packet pairs.
    In addition, eventually (after a frame or two) detect the condition and resize the VSC buffers until overflow no longer happens.
    Note that this significantly reduces the initial size of the VSC buffers, backing out a previous hack to make them 16x larger than what should be typically required (the previous "solution" for VSC overflow).
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a6xx: remove USE/IGNORE_VISIBILITY draw patching | Rob Clark | 2019-08-02 | 2 | -23/+9
    Seems this isn't needed anymore on a6xx to control whether visibility stream is used. And it would be hard to deal with if it was, for disabling use of VSC stream in draw pass. So just remove it and simplify things.
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a6xx: cleanup "blit_mem" | Rob Clark | 2019-08-02 | 4 | -14/+25
    Rename to "control_mem", and switch to using a struct to manage the layout, rather than just ad-hoc hard-coded offsets.
    For recovering from VSC stream overflow, we'll need to add more, but best to clean it up first.
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* freedreno: refresh tile debug | Rob Clark | 2019-08-02 | 1 | -15/+22
    Fix some #ifdef'd bitrot, and get rid of #ifdef so it doesn't bitrot again. Also add prints for per-tile state.
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* freedreno/gmem: small cleanup | Rob Clark | 2019-08-02 | 1 | -2/+2
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* freedreno/batch: always update last_fence | Rob Clark | 2019-08-02 | 1 | -0/+2
    Not all flush paths come through fd_context_flush(), so we should also set last_fence in the batch flush path. This avoids some no-op flushes just to get a fence, for example when pctx->flush_resource() triggers a flush.
    We should probably keep the last_fence update in fd_context_flush() as well to handle the deferred flush case.
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* freedreno: drop unused fd_fence_ref param | Rob Clark | 2019-08-02 | 8 | -17/+22
    The pscreen param was just there to satisfy pipe_screen::fence_reference. But some of the internal uses passed NULL for screen, which is a bit ugly. Instead drop the param and add a shim function to plug into the screen.
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* freedreno: a2xx: implement texture tiling | Jonathan Marek | 2019-08-02 | 6 | -4/+23
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: use nir_lower_alu_to_scalar instead of lowering pass | Jonathan Marek | 2019-08-02 | 4 | -178/+12
    nir_lower_alu_to_scalar can now be used to only lower certain ops, so we don't need the custom pass. And we can lower fall_equal/fany_nequal with lower_vector_cmp instead.
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix HW binning for batches with >256K vertices | Jonathan Marek | 2019-08-02 | 1 | -8/+8
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix fneg/fabs/fsat opcodes | Jonathan Marek | 2019-08-02 | 1 | -0/+12
    Previously we would get a fmov with modifiers, but now that mov has no type these opcodes need to be supported.
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix order of NIR opts | Jonathan Marek | 2019-08-02 | 1 | -2/+2
    int_to_float needs to come after bool_to_float, and lower_to_source_mods needs to come after both, since they don't deal with source mods.
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix non-etc1 cubemaps | Jonathan Marek | 2019-08-02 | 5 | -15/+2
    Not sure how this happened, but apparently all cubemaps need swapped XY.
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix fast clear not being used for Z24X8 buffers | Jonathan Marek | 2019-08-02 | 1 | -7/+11
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: align renderonly scanout buffers | Jonathan Marek | 2019-08-02 | 1 | -0/+3
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* iris: bump compat profile support to 4.6 | Timothy Arceri | 2019-08-02 | 1 | -2/+1
    All of the current piglit compat profile tests pass.
    Reviewed-by: Kenneth Graunke <[email protected]>
* gallium: Implement GL_EXT_shader_samples_identical via a new capability | Kenneth Graunke | 2019-08-01 | 5 | -0/+5
    This exposes the textureSamplesIdenticalEXT function in GLSL.
    We enable it for iris and radeonsi, because their compilers already have support for this. Tested on Intel Kabylake and AMD Vega 64.
    Reviewed-by: Marek Olšák <[email protected]>
* iris/screen: use initialization routine for gen_device_info | Mark Janes | 2019-08-01 | 1 | -5/+3
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>