path: root/src
* vulkan: Update the XML and headers to 1.1.124
  Caio Marcelo de Oliveira Filho, 2019-10-07 (1 file changed, -8/+119)
  Acked-by: Lionel Landwerlin <[email protected]>
* Revert "iris: Hack up a SKL/Gen9LP PS push constant fifo depth workaround"Kenneth Graunke2019-10-071-23/+0
| | | | | | | This reverts commit 4f857423b3c095516e553b976b41969c2b9721fa. It caused GPU hangs on all affected platforms, in e.g. Piglit bin/stencil-twoside -auto -fbo.
* Revert "Revert "st/dri2: Implement DRI2bufferDamageExtension""Boris Brezillon2019-10-073-0/+53
| | | | | | | This reverts commit 19546108d3dd5541a189e36df4ea83b3f519e48f. This commit breaks the build because lima implements ->set_damage_region(). I guess we'll need more discussion before removing the ->set_damage_region() hook.
* Revert "st/dri2: Implement DRI2bufferDamageExtension"Boris Brezillon2019-10-073-53/+0
| | | | | | | | | | | | | | This reverts commit 492ffbed63a2a62759224b1c7d45aa7923d8f542. BACK_LEFT attachment can be outdated when the user calls KHR_partial_update(), leading to a damage region update on the wrong pipe_resource object. Let's not expose the ->set_damage_region() method until the core is fixed to handle that properly. Cc: [email protected] Signed-off-by: Boris Brezillon <[email protected]> Acked-by: Daniel Stone <[email protected]>
* gitlab-ci: Move LAVA-related files into top-level ci dir
  Tomeu Vizoso, 2019-10-06 (9 files changed, -1741/+0)
  In preparation for testing drivers other than Panfrost in LAVA labs.
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* gitlab-ci: Run dEQP on devices with Panfrost
  Tomeu Vizoso, 2019-10-06 (3 files changed, -53/+40)
  Include Panfrost's gitlab.ci.yml file from Mesa's main .gitlab-ci.yml so we test
  on devices with Panfrost. This uses LAVA to schedule jobs on the devices and
  will be the base for testing Etnaviv, Lima, etc.
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* iris: Hack up a SKL/Gen9LP PS push constant fifo depth workaround
  Kenneth Graunke, 2019-10-05 (1 file changed, -0/+23)
  This is a port of Nanley's 904c2a617d86944fbdc2c955f327aacd0b3df318 from i965 to
  iris. One concern is that iris uses larger batches, and also emits far fewer
  commands, so we may come closer to the 500 limit within a batch, and could need
  to supplement this with actual counting. Manhattan 3.0 had 239
  3DSTATE_CONSTANT_PS packets in a batch, Unigine Valley had 155. So it seems like
  we're still in the realm of safety.
* iris: Refactor push constant allocation so we can reuse it
  Kenneth Graunke, 2019-10-05 (1 file changed, -9/+22)
  We'll need this for a workaround shortly. While refactoring, also improve the
  comment slightly.
* intel/isl: set vertical surface alignment on null surfaces
  Lionel Landwerlin, 2019-10-05 (1 file changed, -0/+13)
  Just following the spec. Somewhat unclear whether this applies to NULL surfaces.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/isl: set surface array appropriately
  Lionel Landwerlin, 2019-10-05 (1 file changed, -1/+1)
  This doesn't seem to affect anything.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/isl: Set null surface format to R32_UINT
  Lionel Landwerlin, 2019-10-05 (1 file changed, -1/+6)
  It appears we never had a test in piglit or deqp sampling from a null surface...
  It turns out this triggers a hang on IVB only. Updating the null surface format
  to R32_UINT fixes the hang on ivb and doesn't affect other platforms, so set it
  by default for all platforms.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1872
  Cc: <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* etnaviv: set texture INT_FILTER bit
  Jonathan Marek, 2019-10-05 (1 file changed, -1/+2)
  This should improve texture sampling performance on GC3000.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: implement texture comparator
  Jonathan Marek, 2019-10-05 (6 files changed, -5/+51)
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: update headers from rnndb
  Jonathan Marek, 2019-10-05 (3 files changed, -31/+40)
  Update to etna_viv commit 7ff8029.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* intel: fix subslice computation from topology data
  Lionel Landwerlin, 2019-10-05 (1 file changed, -1/+1)
  We're missing the offset of the slice in the subslice mask... This worked for
  most platforms that don't have first slice fused off because we would reread the
  same mask from slice0 again and again...
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: c1900f5b0f ("intel: devinfo: add helper functions to fill fusing masks values")
  Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1869
  Reviewed-by: Mark Janes <[email protected]>
* dri: Avoid swapbuffer throttling in glXCopySubBufferMESA
  Kenneth Graunke, 2019-10-05 (2 files changed, -2/+2)
  We were supplying __DRI2_THROTTLE_SWAPBUFFER, rather than the obvious choice of
  __DRI2_THROTTLE_COPYSUBBUFFER. This meant that we hit the swap-based frame
  throttling. glXCopySubBuffer doesn't seem like it's intended to be a frame
  boundary, so we'd like to avoid this throttling.
  Tested-by: Michel Dänzer <[email protected]> # DRI3 only
  Reviewed-by: Michel Dänzer <[email protected]>
* st/dri: Perform MSAA downsampling for __DRI2_THROTTLE_COPYSUBBUFFER
  Kenneth Graunke, 2019-10-05 (1 file changed, -2/+4)
  glXCopySubBufferMESA copies data from the back buffer to the front, so it needs
  to perform a MSAA downsampling operation just like glXSwapBuffers would.
  Currently, the CopySubBuffer implementations supply a throttle reason of
  __DRI2_THROTTLE_SWAPBUFFERS, so they hit this path and work today. But we'd like
  to avoid swapbuffer throttling in this case, so the next patch will change that
  reason.
  Tested-by: Michel Dänzer <[email protected]> # DRI3 only
  Reviewed-by: Michel Dänzer <[email protected]>
* intel/error2aub: add support for platforms without PPGTT
  Lionel Landwerlin, 2019-10-04 (1 file changed, -15/+24)
  Not much to do to enable this, just make sure to always write to the GGTT :)
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* aco: fix load_constant with multiple arrays
  Rhys Perry, 2019-10-04 (1 file changed, -3/+3)
  I thought I fixed this, but I guess I must have broken it again.
  Fixes various dEQP-VK.draw.* tests.
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
* nir: Fix some wonky whitespace in nir_search.h.
  Eric Anholt, 2019-10-04 (1 file changed, -2/+2)
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* nir: Factor out most of the algebraic passes C code to .c/.h.
  Eric Anholt, 2019-10-04 (3 files changed, -146/+173)
  Working on the algebraic implementation, I was being driven nuts by my editor
  not highlighting and handling indentation for the C code. It turns out that it's
  basically not pass-specific code, and we can move it over to the relevant .c
  file. Replaces 30KB of code with 34KB of data on my i965 build. No perf diff on
  shader-db (n=3).
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* nir: Keep the range analysis HT around intra-pass until we make a change.
  Eric Anholt, 2019-10-04 (7 files changed, -38/+52)
  This lets us memoize range analysis work across instructions. Reduces runtime of
  shader-db on Intel by -30.0288% +/- 2.1693% (n=3).
  Fixes: 405de7ccb6cb ("nir/range-analysis: Rudimentary value range analysis pass")
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* nir: Skip emitting no-op movs from the builder.
  Eric Anholt, 2019-10-04 (2 files changed, -3/+12)
  Having passes generate these is just making more work for copy propagation (and
  thus probably calling more optimization passes) later. Noticed while trying to
  debug nir_opt_algebraic() top-to-bottom having O(n^2) behavior due to not
  finding new matches in replacement code.
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* nir: Make nir_search's dumping go to stderr.
  Eric Anholt, 2019-10-04 (1 file changed, -16/+16)
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* surfaceless: Support EGL_WL_bind_wayland_display
  Adam Jackson, 2019-10-04 (1 file changed, -0/+4)
  Feature parity with the drm, x11, and wayland platforms.
  Gitlab: https://gitlab.freedesktop.org/mesa/mesa/issues/1870
  Tested-by: Pekka Paalanen <[email protected]>
* nir/print: always use the right FILE *
  Rhys Perry, 2019-10-04 (1 file changed, -2/+4)
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir: initialize needs_helper_invocations as well
  Erik Faye-Lund, 2019-10-04 (1 file changed, -0/+1)
  Similar to the previous commit, we should also initialize
  needs_helper_invocations here.
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: initialize uses_discard to false
  Erik Faye-Lund, 2019-10-04 (1 file changed, -0/+1)
  This matches what we do for uses_sample_qualifier, and what we do in
  ir_set_program_inouts.cpp as well.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* radv/aco,aco: set lower_fmod
  Rhys Perry, 2019-10-04 (3 files changed, -31/+1)
  This simplifies ACO and allows the lowered code to be optimized (in particular,
  constant folded).
  Totals from affected shaders:
    SGPRS: 1776 -> 1776 (0.00 %)
    VGPRS: 1436 -> 1436 (0.00 %)
    Spilled SGPRs: 0 -> 0 (0.00 %)
    Spilled VGPRs: 0 -> 0 (0.00 %)
    Private memory VGPRs: 0 -> 0 (0.00 %)
    Scratch size: 0 -> 0 (0.00 %) dwords per thread
    Code Size: 203452 -> 203564 (0.06 %) bytes
    LDS: 0 -> 0 (0.00 %) blocks
    Max Waves: 103 -> 103 (0.00 %)
  At least some of the code size increase seems to be from literals being applied
  to instructions as a result of constant folding.
  v2: remove fmod/frem handling in init_context()
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Daniel Schürmann <[email protected]>
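
  A minimal illustration of the identity behind lower_fmod, written out as
  standalone C. This assumes NIR's fmod follows GLSL mod() floor-based semantics
  and that the lowering rewrites it as shown; it is not Mesa code, and the
  function name is a hypothetical stand-in.

      #include <math.h>
      #include <stdio.h>

      /* Spells out the rewrite fmod(a, b) -> a - b * floor(a / b). Once a and b
       * are compile-time constants, every step of this chain can be constant
       * folded, which is the optimization opportunity mentioned above. */
      static float
      lowered_fmod(float a, float b)
      {
         return a - b * floorf(a / b);
      }

      int
      main(void)
      {
         printf("%f\n", lowered_fmod(7.5f, 2.0f)); /* prints 1.500000 */
         return 0;
      }
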
* dri3: Pass __DRI2_THROTTLE_COPYSUBBUFFER from loader_dri3_copy_drawable
  Michel Dänzer, 2019-10-04 (1 file changed, -1/+1)
  0 is __DRI2_THROTTLE_SWAPBUFFER, which doesn't really make sense here. Avoids
  dri_flush() throttling twice for the same glFlush call with front buffer
  rendering, as described in
  https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2057 .
  Reviewed-by: Kenneth Graunke <[email protected]>
* r600: Fix interpolateAtCentroid
  Gert Wollny, 2019-10-04 (4 files changed, -1/+14)
  If the instruction interpolateAtCentroid is used, the extra interpolator must
  also be enabled in the state.
  Fixes: fs-interpolateatcentroid-block
  Signed-off-by: Gert Wollny <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* pan/midgard: Replace mir_is_live_after with new pass
  Alyssa Rosenzweig, 2019-10-03 (1 file changed, -57/+15)
  Now that we have live_out calculated per block as metadata, calculating the
  liveness of an instruction at a given point in the program becomes O(n) in the
  size of the block in the worst case, rather than O(n) in the size of the
  program.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Calculate temp_count for liveness
  Alyssa Rosenzweig, 2019-10-03 (2 files changed, -1/+3)
  This needs to be correct or the analysis fails.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Invalidate liveness for mir_is_live_after
  Alyssa Rosenzweig, 2019-10-03 (3 files changed, -0/+6)
  Callers should have liveness info ready. Ideally we'd have a nice metadata
  tracking framework like NIR to handle this automatically, but for now this will
  allow us to make forward progress... when we're about to do something with
  liveness, invalidate everything ahead to force a clean calculation.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Begin tracking liveness metadata
  Alyssa Rosenzweig, 2019-10-03 (4 files changed, -5/+39)
  This will allow us to explicitly invalidate liveness analysis results so we can
  cache liveness results.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Don't try to OR live_in of successors
  Alyssa Rosenzweig, 2019-10-03 (1 file changed, -6/+2)
  By definition, once liveness analysis has occurred:
    live_out = OR {succ} succ->live_in
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
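
  For reference, a minimal sketch of that dataflow rule in standalone C. The
  struct block, live_in/live_out fields, and LIVE_WORDS constant are hypothetical
  stand-ins, not the actual Midgard compiler data structures.

      #include <stdint.h>

      #define LIVE_WORDS 8   /* size of the liveness bitset, arbitrary here */

      struct block {
         struct block *successors[2];
         unsigned nr_successors;
         uint64_t live_in[LIVE_WORDS];
         uint64_t live_out[LIVE_WORDS];
      };

      /* live_out(blk) = OR over all successors succ of live_in(succ) */
      static void
      compute_live_out(struct block *blk)
      {
         for (unsigned w = 0; w < LIVE_WORDS; ++w)
            blk->live_out[w] = 0;

         for (unsigned s = 0; s < blk->nr_successors; ++s)
            for (unsigned w = 0; w < LIVE_WORDS; ++w)
               blk->live_out[w] |= blk->successors[s]->live_in[w];
      }
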
* pan/midgard: Move RA's liveness analysis into midgard_liveness.c
  Alyssa Rosenzweig, 2019-10-03 (3 files changed, -122/+129)
  There are unfortunately two distinct liveness analysis passes in the compiler
  right now -- one good (but complex) pass used by RA based on solving data flow
  equations, and one awful (but simple) pass used for dead code elimination and
  bundling based on an abstract walk of the AST. Let's move RA's pass into shared
  code so we can work on unifying.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_calculate_temp_count helper
  Alyssa Rosenzweig, 2019-10-03 (2 files changed, -0/+19)
  This allows us to fill in ctx->temp_count explicitly, even if we haven't
  squished down the MIR.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove mir_has_multiple_writes
  Alyssa Rosenzweig, 2019-10-03 (3 files changed, -18/+0)
  We already enforce this with the SSA/register distinction in the backend. There
  is no need to duplicate this logic merely for an assert.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Get rid of the flush in panfrost_set_framebuffer_state()
  Boris Brezillon, 2019-10-03 (1 file changed, -43/+3)
  Now that we track inter-batch dependencies, the flush done in
  panfrost_set_framebuffer_state() is no longer needed. Let's get rid of it.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Kill the explicit serialization in panfrost_batch_submit()
  Boris Brezillon, 2019-10-03 (1 file changed, -12/+0)
  Now that we have all the pieces in place to support pipelining batches we can
  get rid of the drmSyncobjWait() at the end of panfrost_batch_submit().
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Do fine-grained flushing when preparing BO for CPU accesses
  Boris Brezillon, 2019-10-03 (2 files changed, -19/+12)
  We don't have to flush all batches when we're only interested in reading/writing
  a specific BO. Thanks to the panfrost_flush_batches_accessing_bo() and
  panfrost_bo_wait() helpers we can now flush only the batches touching the BO we
  want to access from the CPU.
  This fixes the dEQP-GLES2.functional.fbo.render.texsubimage.* tests.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Make sure the BO is 'ready' when picked from the cache
  Boris Brezillon, 2019-10-03 (3 files changed, -24/+110)
  This is needed if we want to free the panfrost_batch object at submit time in
  order to not have to GC the batch on the next job submission.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add flags to reflect the BO imported/exported state
  Boris Brezillon, 2019-10-03 (2 files changed, -2/+8)
  Will be useful to make the ioctl(WAIT_BO) call conditional on BOs that are not
  exported/imported (meaning that all GPU accesses are known by the context).
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add a panfrost_flush_batches_accessing_bo() helper
  Boris Brezillon, 2019-10-03 (2 files changed, -0/+35)
  This will allow us to only flush batches touching a specific resource, which is
  particularly useful when the CPU needs to access a BO.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add a panfrost_flush_all_batches() helper
  Boris Brezillon, 2019-10-03 (5 files changed, -15/+64)
  And use it in panfrost_flush() to flush all batches, and not only the one
  currently bound to the context. We also replace all internal calls to
  panfrost_flush() by panfrost_flush_all_batches() ones.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Prepare panfrost_fence for batch pipelining
  Boris Brezillon, 2019-10-03 (5 files changed, -58/+59)
  The panfrost_fence logic currently waits on the last submitted batch, but the
  batch serialization that was enforced in panfrost_batch_submit() is about to go
  away, allowing for several batches to be pipelined, and the last submitted one
  is not necessarily the one that will finish last. We need to make sure the
  fence logic waits on all flushed batches, not only the last one.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Start tracking inter-batch dependencies
  Boris Brezillon, 2019-10-03 (3 files changed, -5/+356)
  The idea is to track which BOs are being accessed and the type of access to
  determine when a dependency exists. Thanks to that we can build a dependency
  graph that will allow us to flush batches in the correct order.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add a panfrost_freeze_batch() helper
  Boris Brezillon, 2019-10-03 (1 file changed, -18/+44)
  We'll soon need to freeze a batch not only when it's flushed, but also when
  another batch depends on us, so let's add a helper to avoid duplicating the
  logic.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use the per-batch fences to wait on the last submitted batch
  Boris Brezillon, 2019-10-03 (4 files changed, -15/+47)
  We just replace the per-context out_sync object by a pointer to the fence of the
  last submitted batch. Pipelining of batches will come later.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>