path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* Revert "iris: Hack up a SKL/Gen9LP PS push constant fifo depth workaround"Kenneth Graunke2019-10-071-23/+0
| | | | | | | This reverts commit 4f857423b3c095516e553b976b41969c2b9721fa. It caused GPU hangs on all affected platforms, in e.g. Piglit bin/stencil-twoside -auto -fbo.
* Revert "Revert "st/dri2: Implement DRI2bufferDamageExtension""Boris Brezillon2019-10-071-0/+1
| | | | | | | This reverts commit 19546108d3dd5541a189e36df4ea83b3f519e48f. This commit breaks the build because lima implements ->set_damage_region(). I guess we'll need more discussion before removing the ->set_damage_region() hook.
* Revert "st/dri2: Implement DRI2bufferDamageExtension"Boris Brezillon2019-10-071-1/+0
| | | | | | | | | | | | | | This reverts commit 492ffbed63a2a62759224b1c7d45aa7923d8f542. BACK_LEFT attachment can be outdated when the user calls KHR_partial_update(), leading to a damage region update on the wrong pipe_resource object. Let's not expose the ->set_damage_region() method until the core is fixed to handle that properly. Cc: [email protected] Signed-off-by: Boris Brezillon <[email protected]> Acked-by: Daniel Stone <[email protected]>
* gitlab-ci: Move LAVA-related files into top-level ci dirTomeu Vizoso2019-10-069-1741/+0
| | | | | | | In preparation for testing drivers other than Panfrost in LAVA labs. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* gitlab-ci: Run dEQP on devices with PanfrostTomeu Vizoso2019-10-063-53/+40
| | | | | | | | | | | Include Panfrost's gitlab.ci.yml file from Mesa's main .gitlab-ci.yml so we test on devices with Panfrost. This uses LAVA to schedule jobs in the devices and will be the base for testing Etnaviv, Lima, etc. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* iris: Hack up a SKL/Gen9LP PS push constant fifo depth workaroundKenneth Graunke2019-10-051-0/+23
| | | | | | | | | | | This is a port of Nanley's 904c2a617d86944fbdc2c955f327aacd0b3df318 from i965 to iris. One concern is that iris uses larger batches, and also emits far fewer commands, so we may come closer to the 500 limit within a batch, and could need to supplement this with actual counting. Manhattan 3.0 had 239 3DSTATE_CONSTANT_PS packets in a batch, Unigine Valley had 155. So it seems like we're still in the realm of safety.
* iris: Refactor push constant allocation so we can reuse itKenneth Graunke2019-10-051-9/+22
| | | | | We'll need this for a workaround shortly. While refactoring, also improve the comment slightly.
* etnaviv: set texture INT_FILTER bitJonathan Marek2019-10-051-1/+2
| | | | | | | This should improve texture sampling performance on GC3000. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: implement texture comparatorJonathan Marek2019-10-056-5/+51
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: update headers from rnndbJonathan Marek2019-10-053-31/+40
| | | | | | | Update to etna_viv commit 7ff8029. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* r600: Fix interpolateAtCentroidGert Wollny2019-10-044-1/+14
| | | | | | | | | | If the instruction interpolateAtCentroid is used the extra interpolator must also be enabled in the state. Fixes: fs-interpolateatcentroid-block Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* panfrost: Get rid of the flush in panfrost_set_framebuffer_state()Boris Brezillon2019-10-031-43/+3
| | | | | | | | | Now that we track inter-batch dependencies, the flush done in panfrost_set_framebuffer_state() is no longer needed. Let's get rid of it. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Kill the explicit serialization in panfrost_batch_submit()Boris Brezillon2019-10-031-12/+0
| | | | | | | | | Now that we have all the pieces in place to support pipelining batches we can get rid of the drmSyncobjWait() at the end of panfrost_batch_submit(). Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Do fine-grained flushing when preparing BO for CPU accessesBoris Brezillon2019-10-032-19/+12
| | | | | | | | | | | | | We don't have to flush all batches when we're only interested in reading/writing a specific BO. Thanks to the panfrost_flush_batches_accessing_bo() and panfrost_bo_wait() helpers we can now flush only the batches touching the BO we want to access from the CPU. This fixes the dEQP-GLES2.functional.fbo.render.texsubimage.* tests. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Make sure the BO is 'ready' when picked from the cacheBoris Brezillon2019-10-033-24/+110
| | | | | | | | This is needed if we want to free the panfrost_batch object at submit time in order to not have to GC the batch on the next job submission. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add flags to reflect the BO imported/exported stateBoris Brezillon2019-10-032-2/+8
| | | | | | | | | Will be useful to make the ioctl(WAIT_BO) call conditional on BOs that are not exported/imported (meaning that all GPU accesses are known by the context). Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add a panfrost_flush_batches_accessing_bo() helperBoris Brezillon2019-10-032-0/+35
| | | | | | | | This will allow us to only flush batches touching a specific resource, which is particularly useful when the CPU needs to access a BO. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
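The selective flush described above can be sketched as follows. This is a minimal illustration, not the driver's code: `struct sketch_batch`, `MAX_BOS`, `batch_uses_bo()` and the array-based bookkeeping are hypothetical stand-ins, and only the function name `flush_batches_accessing_bo()` echoes the helper named in the commit.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_BOS 8 /* hypothetical per-batch limit, for the sketch only */

/* Hypothetical, simplified batch: the BOs it touches and a flushed flag. */
struct sketch_batch {
   const void *bos[MAX_BOS];
   size_t num_bos;
   bool flushed;
};

/* Returns true if the batch references the given BO. */
static bool batch_uses_bo(const struct sketch_batch *batch, const void *bo)
{
   for (size_t i = 0; i < batch->num_bos; i++) {
      if (batch->bos[i] == bo)
         return true;
   }
   return false;
}

/* Flush only the pending batches that touch `bo`; returns how many were
 * flushed. Batches that never reference the BO are left untouched. */
static size_t flush_batches_accessing_bo(struct sketch_batch *batches,
                                         size_t num_batches, const void *bo)
{
   size_t flushed = 0;

   for (size_t i = 0; i < num_batches; i++) {
      if (!batches[i].flushed && batch_uses_bo(&batches[i], bo)) {
         batches[i].flushed = true; /* stand-in for the real submission */
         flushed++;
      }
   }
   return flushed;
}
```

With three pending batches where only two reference a BO, a CPU access to that BO would flush those two and leave the third queued.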
* panfrost: Add a panfrost_flush_all_batches() helperBoris Brezillon2019-10-035-15/+64
| | | | | | | | | | | And use it in panfrost_flush() to flush all batches, and not only the one currently bound to the context. We also replace all internal calls to panfrost_flush() by panfrost_flush_all_batches() ones. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Prepare panfrost_fence for batch pipeliningBoris Brezillon2019-10-035-58/+59
| | | | | | | | | | | | | | The panfrost_fence logic currently waits on the last submitted batch, but the batch serialization that was enforced in panfrost_batch_submit() is about to go away, allowing for several batches to be pipelined, and the last submitted one is not necessarily the one that will finish last. We need to make sure the fence logic waits on all flushed batches, not only the last one. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Start tracking inter-batch dependenciesBoris Brezillon2019-10-033-5/+356
| | | | | | | | | | The idea is to track which BO are being accessed and the type of access to determine when a dependency exists. Thanks to that we can build a dependency graph that will allow us to flush batches in the correct order. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
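As a rough sketch of the rule behind this tracking: two accesses to the same BO need ordering unless both are reads. The flag names and the helper below are assumptions made for illustration and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical access flags; the driver's actual names may differ. */
enum {
   SKETCH_ACCESS_READ = 1u << 0,
   SKETCH_ACCESS_WRITE = 1u << 1,
};

/* A later access to a BO depends on an earlier one when both batches really
 * touch the BO and at least one of them writes it. Two readers can run in
 * parallel. */
static bool access_creates_dependency(uint32_t earlier, uint32_t later)
{
   if (earlier == 0 || later == 0)
      return false; /* one side does not access the BO at all */

   return ((earlier | later) & SKETCH_ACCESS_WRITE) != 0;
}
```

Each dependency found this way would become an edge in the batch graph, so that a batch is flushed only after the batches it depends on.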
* panfrost: Add a panfrost_freeze_batch() helperBoris Brezillon2019-10-031-18/+44
| | | | | | | | | We'll soon need to freeze a batch not only when it's flushed, but also when another batch depends on us, so let's add a helper to avoid duplicating the logic. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use the per-batch fences to wait on the last submitted batchBoris Brezillon2019-10-034-15/+47
| | | | | | | | | We just replace the per-context out_sync object by a pointer to the fence of the last submitted batch. Pipelining of batches will come later. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add a batch fenceBoris Brezillon2019-10-032-1/+94
| | | | | | | So we can implement fine-grained dependency tracking between batches. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Make panfrost_batch->bos a hash tableBoris Brezillon2019-10-032-12/+23
| | | | | | | | So we can store the flags as data and keep the BO as a key. This way we keep track of the type of access done on BOs. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Extend the panfrost_batch_add_bo() API to pass access flagsBoris Brezillon2019-10-038-23/+114
| | | | | | | | | | | | | | | | | The type of access being done on a BO has impacts on job scheduling (shared resources being written enforce serialization while those being read only allow for job parallelization) and BO lifetime (the fragment job might last longer than the vertex/tiler ones, if we can, it's good to release BOs earlier so that others can re-use them through the BO re-use cache). Let's pass extra access flags to panfrost_batch_add_bo() and panfrost_batch_create_bo() so the batch submission logic can take the appropriate action when submitting batches. Note that this information is not used yet, we're just patching callers to pass the correct flags here. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add the shader BO to the batch in patch_shader_state()Boris Brezillon2019-10-031-6/+5
| | | | | | | | | | We know a shader will be used by a batch when panfrost_patch_shader_state() is called, so let's add the shader BO at that time. Suggested-by: Alyssa Rosenzweig <[email protected]> Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* virgl: modify internal structures to track winsys-supplied dataGurchetan Singh2019-10-022-43/+52
| | | | | | | | | | | | | The winsys might supply dimensions that are different than those we calculate. In addition, it may supply virtualized modifiers. In practice, a stride != bpp * width and virtualized modifiers don't happen yet, but the plan is to move in that direction. Also make virgl_resource_layout static. Reviewed by: Robert Tarasov <[email protected]>
* virgl: modify resource_create_from_handle(..) callbackGurchetan Singh2019-10-022-2/+13
| | | | | | | This commit makes no functional changes, just adds the relevant plumbing. Reviewed by: Robert Tarasov <[email protected]>
* etnaviv: enable triangle strips only when the hardware supports itGert Wollny2019-10-021-1/+7
| | | | | | | | | | | | | | Some hardware has a bug with triangle strips and it is signalled by the flag BUG_FIXED8 whether this bug has been fixed. So only enable triangle strips when this flag is set. Thanks: Jonathan Marek and Christian Gmeiner for the pointers v2: Add TODO to indicate that the handling should be refined (Jonathan & Christian) Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* iris: Enable EXT_demote_to_helper_invocationCaio Marcelo de Oliveira Filho2019-09-301-0/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Fix iris_rebind_buffer() for VBOs with non-zero offsets.Kenneth Graunke2019-09-301-2/+6
| | | | | | | | We can't just check for the BO base address, we need to check for the full address including any offset we may have applied. When updating the address, we need to include the offset again. Fixes: 5ad0c88dbe3 ("iris: Replace buffer backing storage and rebind to update addresses.")
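The address check described above can be illustrated with a small sketch. `struct sketch_binding` and `rebind_if_matches()` are hypothetical names invented here; the real code works on packed vertex buffer state, which this sketch does not model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified vertex buffer binding. */
struct sketch_binding {
   uint64_t offset;  /* offset of the VBO within its BO */
   uint64_t address; /* address programmed into the state: base + offset */
};

/* Rebind only if the binding points into the old BO. The comparison uses the
 * full address (base plus offset); comparing against the bare base address
 * would miss every binding with a non-zero offset. */
static bool rebind_if_matches(struct sketch_binding *binding,
                              uint64_t old_base, uint64_t new_base)
{
   if (binding->address != old_base + binding->offset)
      return false;

   /* Include the offset again when writing the new address. */
   binding->address = new_base + binding->offset;
   return true;
}
```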
* ac: add ac_build_image_get_sample_count from radeonsiMarek Olšák2019-09-301-17/+7
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* iris: Disable CCS_E for 32-bit floating point textures.Kenneth Graunke2019-09-301-1/+23
| | | | | | | | | | | | | | | | | | | | | A while back, Michael Larabel noticed that Paraview's Wavelet Volume case runs significantly slower on iris than i965. It turns out this is because we enable CCS_E for 32-bit floating point formats, while i965 disables it, with an oblique comment saying that we benchmarked it (on what exactly?) and determined that it was a loss. Paraview uses both R32_FLOAT and R32G32B32A32_FLOAT, and I observed large framerate drops when enabling CCS_E for either format. However, several other benchmarks (Aztec Ruins, many Synmark cases) use 16-bit floating point formats, with no apparent ill effects. So, disable compression for 32-bit float formats for now, but leave it enabled for 16-bit float formats as they seem to be working fine. Improves performance in Paraview's Wavelet Volume test by 62% on a Skylake GT4e. Fixes: 3cfc6a207bd ("iris: Fill out res->aux.possible_usages")
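The resulting policy amounts to a small format predicate, sketched below. `struct sketch_format` and `allow_ccs_e()` are hypothetical and only mirror the rule stated in the message; they are not the driver's format-query API.

```c
#include <stdbool.h>

/* Hypothetical format description, reduced to what the rule needs. */
struct sketch_format {
   unsigned bits_per_channel;
   bool is_float;
};

/* Keep CCS_E for everything except 32-bit floating point formats, which is
 * the case the commit message reports as a performance loss. */
static bool allow_ccs_e(struct sketch_format format)
{
   return !(format.is_float && format.bits_per_channel == 32);
}
```

Under this rule a 16-bit float format keeps compression, while R32_FLOAT and R32G32B32A32_FLOAT do not.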
* radeonsi/gfx10: fix corruption for chips with harvested TCCsMarek Olšák2019-09-301-2/+6
| | | | | Cc: 19.2 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/gfx10: fix L2 cache rinse programmingMarek Olšák2019-09-301-5/+17
| | | | | Cc: 19.2 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* etnaviv: fix bitmask typoEric Engestrom2019-09-301-1/+1
| | | | | | Fixes: d92689c46f0d2da05ae6 ("etnaviv: nir: add native integers (HALTI2+)") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jonathan Marek <[email protected]>
* nouveau: set lower_sub = trueDaniel Schürmann2019-09-303-6/+2
| | | | | | Subtractions are already implemented as additions anyway. Reviewed-by: Connor Abbott <[email protected]>
* vc4: Enable the nir_opt_algebraic_late() pass.Eric Anholt2019-09-301-0/+15
| | | | | | | | | | Upcoming changes to sub optimization will make this pass required. Over the course of that series, we see uniforms +.46%, instructions -.24% (seems like a fine tradeoff -- uniforms are 1/2 the size of instructions as far as cache occupancy) Reviewed-by: Daniel Schürmann <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* scons: Fix MSYS2 Mingw-w64 build.pal10002019-09-291-2/+2
| | | | | | | | | | | | | | | | Reviewed-by: Jose Fonseca <[email protected]> This patch is based on https://github.com/msys2/MINGW-packages/blob/28e3f85e09b6947ea80036c49f6c38f1394f93ca/mingw-w64-mesa/link-ole32.patch but with tweaks to avoid MSVC build break when applied. v2: Create Mingw platform alias pointing to windows host platform define to avoid spurious crosscompilation; v3: Fix obviously wrong compiler flags for swr driver; v4: Update original patch URL because it has been relocated; v5: Don't bother patching autotools stuff as it's not used by MSYS2 Mingw-w64 build and its days are numbered anyway; v6: After Mingw posix flag fix in 295851eb things are far simpler as we don't need more linking of uuid, ole32, version and shell32 than what is already in place.
* lima: set uniforms_address lower bits properlyVasily Khoruzhick2019-09-281-0/+8
| | | | | | | | | | | | | | | | | | | | Looks like the blob uses the following values for the uniforms buffer: 0 for 8 bytes, 1 for 16 bytes, 2 for 24 bytes, 2 for 32 bytes, 3 for 40 bytes, 3 for 48 bytes, 3 for 56 bytes, 3 for 64 bytes, 4 for 72 bytes. It all looks like log2(size / 8) rounded up, so let's do the same. Fixes: 931fc2a7b3f9("lima: do not set the PP uniforms address lowest bits") Reviewed-by: Icenowy Zheng <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
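The observed mapping can be reproduced with a few lines of integer arithmetic. The helper name `uniforms_size_field()` is made up for this sketch, and it assumes sizes are counted in whole 8-byte units; it is not the code from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: ceil(log2(size / 8)), with the size first rounded up
 * to whole 8-byte units. */
static unsigned uniforms_size_field(uint32_t size_bytes)
{
   assert(size_bytes > 0);

   uint32_t units = (size_bytes + 7) / 8; /* number of 8-byte units */
   unsigned bits = 0;

   /* Smallest `bits` such that 2^bits >= units. */
   while ((UINT32_C(1) << bits) < units)
      bits++;

   return bits;
}
```

For the sizes listed in the message this gives 0 for 8 bytes, 1 for 16, 2 for 24 and 32, 3 for 40 through 64, and 4 for 72.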
* etnaviv: nir: fix gl_FragDepthJonathan Marek2019-09-281-3/+17
| | | | | | | Fixes the following piglit test: fragdepth_gles2 (for ETNA_MESA_DEBUG=nir) Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: disable earlyZ when shader writes fragment depthJonathan Marek2019-09-283-3/+8
| | | | | | | Fixes the following piglit test: fragdepth_gles2 Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: nir: make lower_alu easier to followJonathan Marek2019-09-281-32/+36
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: remove extra allocation for shader codeJonathan Marek2019-09-281-1/+1
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: nir: remove "options" structJonathan Marek2019-09-282-41/+24
| | | | | | | It just makes things more complicated for no reason. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: nir: use store_deref instead of store_outputJonathan Marek2019-09-282-70/+59
| | | | | | | Allows some simplification. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: nir: add native integers (HALTI2+)Jonathan Marek2019-09-285-34/+170
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* qetnaviv: nir: use new immediates when possibleJonathan Marek2019-09-281-1/+21
| | | | | | | | | Note it can still be improved a bit: * Use alu swizzle to determine if src is scalar * Take into account new immediates in the multiple uniform src lowering Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: nir: set num_components for inputs/outputsJonathan Marek2019-09-281-3/+3
| | | | | | | | | This can improve performance by allowing the LAST_VARYING_2X bit to be set when possible (and possibly more benefits on HALTI5 where the number of components is set for each varying). Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: nir: allocate contiguous components for LOAD destinationJonathan Marek2019-09-281-8/+53
| | | | | | | | | LOAD starts reading into the first enabled destination component, and doesn't skip disabled components, so we need to allocate a destination with contiguous components. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
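One way to picture the constraint is a search for a run of adjacent free components in a four-component register. The sketch below assumes a 4-bit mask with bit 0 = x through bit 3 = w; `alloc_contiguous_components()` is a hypothetical helper and not the allocator changed by this commit.

```c
/* Returns a 4-bit mask covering `count` adjacent free components, picking the
 * lowest possible position, or 0 if no such run exists in `free_mask`. */
static unsigned alloc_contiguous_components(unsigned free_mask, unsigned count)
{
   if (count == 0 || count > 4)
      return 0;

   unsigned want = (1u << count) - 1; /* `count` low bits set */

   for (unsigned shift = 0; shift + count <= 4; shift++) {
      unsigned mask = want << shift;

      if ((free_mask & mask) == mask)
         return mask; /* every component in the run is free */
   }
   return 0;
}
```

A register with only y and w free (mask 0xA) can hold two scalar values but not a two-component LOAD result, which is why the allocation has to look for adjacent components rather than just count free ones.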