path: root/src
Commit message | Author | Age | Files | Lines
* panfrost: ci: Fix parsing of crashed tests | Tomeu Vizoso | 2019-06-21 | 1 | -2/+2
  Without this fix, LAVA isn't parsing crashes as failed tests, because the shell logging is interspersed within the fake deqp output.

  Signed-off-by: Tomeu Vizoso <[email protected]>
  Acked-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Conditionally submit fragment job | Alyssa Rosenzweig | 2019-06-21 | 1 | -1/+4
  If there are no tiling jobs and no clears, there is no need to submit a fragment job (relevant for transform feedback).

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
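The condition this commit describes can be sketched in C as follows. This is only an illustration: `sketch_batch`, its fields, and `sketch_needs_fragment_job` are hypothetical names introduced here, not identifiers from the driver.

```c
#include <stdbool.h>

/* Hypothetical per-batch summary; field names are assumptions. */
struct sketch_batch_summary {
    unsigned tiler_job_count;  /* number of tiling jobs queued in the batch */
    unsigned clear_mask;       /* nonzero if any buffer is cleared */
};

/* A fragment job is only worth submitting when something would be drawn
 * (at least one tiler job) or a clear has to be performed. */
static bool sketch_needs_fragment_job(const struct sketch_batch_summary *batch)
{
    return batch->tiler_job_count > 0 || batch->clear_mask != 0;
}
```

A batch that only runs vertex shaders for transform feedback has neither tiler jobs nor clears, so the check returns false for it.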
* panfrost: Implement rasterizer discard | Alyssa Rosenzweig | 2019-06-21 | 1 | -2/+12
  D'aww, look how cute that is now that scoreboarding is set up.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Track buffer initialization | Alyssa Rosenzweig | 2019-06-21 | 4 | -2/+43
  We want to know if a given slice of a buffer is initialized at a particular point in the execution of the program. This is accomplished easily enough -- start out uninitialized and upon an operation writing to the buffer, mark it initialized.

  The motivation is to optimize away expensive operations (like wallpaper blits) when reading from an uninitialized buffer; since it's uninitialized, the results of these operations are undefined, and it's legal to take the fast path ^_^

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
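A minimal sketch of the tracking scheme described in this message, written in C. The struct, the slice limit, and the function names are all hypothetical and chosen for illustration; the real driver's data structures may look quite different.

```c
#include <stdbool.h>

#define SKETCH_MAX_SLICES 16  /* hypothetical limit for this sketch */

/* Hypothetical resource with one flag per slice. Zero-initializing the
 * struct gives the "start out uninitialized" state. */
struct sketch_resource {
    bool slice_initialized[SKETCH_MAX_SLICES];
};

/* Any operation that writes a slice marks it initialized. */
static void sketch_mark_written(struct sketch_resource *rsrc, unsigned slice)
{
    rsrc->slice_initialized[slice] = true;
}

/* Reading an uninitialized slice yields undefined contents, so an expensive
 * preload (such as a wallpaper blit) is only needed after a write. */
static bool sketch_needs_preload(const struct sketch_resource *rsrc, unsigned slice)
{
    return rsrc->slice_initialized[slice];
}
```

The fast path is simply the case where `sketch_needs_preload` returns false and the preload is skipped.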
* panfrost: Implement command stream scoreboarding | Alyssa Rosenzweig | 2019-06-21 | 7 | -143/+558
  This is a rather complex change, adding a lot of code but ideally cleaning up quite a bit as we go.

  Within a batch (single frame), there are multiple distinct Mali job types: SET_VALUE, VERTEX, TILER, FRAGMENT for the few that we emit right now (eventually more for compute and geometry shaders). Each hardware job has a mali_job_descriptor_header, which contains three fields of interest: job index, a dependencies list, and a next job pointer.

  The next job pointer in each job is used to form a linked list of submitted jobs. Easy enough.

  The job index and dependencies list, however, are used to form a dependency graph (a DAG, where each hardware job is a node and each dependency is a directed edge). Internally, this sets up a scoreboarding data structure for the hardware to dispatch jobs in parallel, enabling (for example) vertex shaders from different draws to execute in parallel while there are strict dependencies between tiling the geometry of a draw and running that vertex shader.

  For a while, we got by with an incredible series of total hacks, manually coding indices, lists, and dependencies. That worked for a moment, but combinatorial kaboom kicked in and it became an unmaintainable mess of spaghetti code. We can do better.

  This commit explicitly handles the scoreboarding by providing high-level manipulation for jobs. Rather than a command like "set dependency #2 to index 17", we can express quite naturally "add a dependency from job T on job V". Instead of some open-coded logic to copy a draw pointer into a delicate context array, we now have an elegant exposed API to simply "queue a job of type XYZ".

  The design is influenced by both our current requirements (standard ES2 draws and u_blitter) as well as the need for more complex scheduling in the future. For instance, blits can be optimized to use only a tiler job, without a vertex job first (since the screen-space vertices are known ahead-of-time) -- causing tiler-only jobs. Likewise, when using transform feedback with rasterizer discard enabled, vertex jobs are created (to run vertex shaders) with no corresponding tiler job. Both of these cases break the original model and could not be expressed with the open-coded logic. More generally, this will make it easier to add support for compute shaders, geometry shaders, and fused jobs (an optimization available on Bifrost).

  Incidentally, this moves quite a bit of state from the driver context to the batch, which helps with Rohan's refactor to eventually permit pipelining across framebuffers (one important outstanding optimization for FBO-heavy workloads).

  v2: Add comment explaining the meaning of "primary batch" as suggested by Tomeu (trivial - not reviewed).

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Rohan Garg <[email protected]>
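To make the "queue a job" and "add a dependency from job T on job V" operations concrete, here is a small C sketch. It is not the API this commit adds: every name is hypothetical, the struct only mirrors the three header fields the message mentions, and the fixed number of dependency slots is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DEPS 2  /* assumption: slot count chosen for this sketch */

enum sketch_job_type { SKETCH_SET_VALUE, SKETCH_VERTEX, SKETCH_TILER, SKETCH_FRAGMENT };

/* Hypothetical stand-in for the three header fields of interest. */
struct sketch_job {
    enum sketch_job_type type;
    uint16_t index;                  /* job index, 1-based; 0 means "none" */
    uint16_t deps[SKETCH_MAX_DEPS];  /* indices of jobs this job waits on */
    struct sketch_job *next;         /* linked list of submitted jobs */
};

struct sketch_batch {
    struct sketch_job *first, *last;
    uint16_t job_count;
};

/* "Queue a job of type XYZ": assign the next index and append to the chain. */
static void sketch_queue_job(struct sketch_batch *batch, struct sketch_job *job,
                             enum sketch_job_type type)
{
    job->type = type;
    job->index = ++batch->job_count;
    for (int i = 0; i < SKETCH_MAX_DEPS; i++)
        job->deps[i] = 0;
    job->next = NULL;

    if (batch->last)
        batch->last->next = job;
    else
        batch->first = job;
    batch->last = job;
}

/* "Add a dependency from job T on job V": record V's index in a free slot
 * of T. Returns 0 on success, -1 if every slot is already taken. */
static int sketch_add_dependency(struct sketch_job *dependent,
                                 const struct sketch_job *dependency)
{
    for (int i = 0; i < SKETCH_MAX_DEPS; i++) {
        if (dependent->deps[i] == dependency->index)
            return 0;  /* edge already present */
        if (dependent->deps[i] == 0) {
            dependent->deps[i] = dependency->index;
            return 0;
        }
    }
    return -1;
}
```

With helpers of this shape, a tiler-only blit is just a queued tiler job with no dependency, and a transform feedback draw with rasterizer discard is a queued vertex job with no tiler job following it.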
* anv: Implement "pop-free" clipping | Jason Ekstrand | 2019-06-21 | 2 | -4/+86
  This is the preferred clipping mode: points no longer disappear the moment part of the point crosses the edge of the viewport, and lines no longer have weird endpoints at viewport edges. We've just never bothered to hook it up until now.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Enable the guardband clip test | Jason Ekstrand | 2019-06-21 | 2 | -3/+21
  In workloads where there is a lot of geometry drawn that crosses over the edge of the viewport, this should substantially improve clipper performance. Not really sure why it's taken 3 years to turn it on but we never got around to it.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* i965,iris: Move guardband calculations to a common location | Jason Ekstrand | 2019-06-21 | 5 | -176/+126
  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* android: virgl: fix libmesa_winsys_virgl_common build and dependencies | Mauro Rossi | 2019-06-21 | 4 | -2/+6
  Fixes the following build errors and resolves Bug 110922.
  Fixes gallium_dri target missing symbols at linking.

  external/mesa/src/gallium/winsys/virgl/drm/Android.mk: error: libmesa_winsys_virgl (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64)
  ...
  external/mesa/src/gallium/winsys/virgl/vtest/Android.mk: error: libmesa_winsys_virgl_vtest (STATIC_LIBRARIES android-x86_64) missing libmesa_winsys_virgl_common (STATIC_LIBRARIES android-x86_64)
  ...
  build/core/main.mk:728: error: exiting from previous errors.

  In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_socket.c:34:
  external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10: fatal error: 'virgl_resource_cache.h' file not found
           ^~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

  In file included from external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.c:32:
  external/mesa/src/gallium/winsys/virgl/vtest/virgl_vtest_winsys.h:35:10: fatal error: 'virgl_resource_cache.h' file not found
  #include "virgl_resource_cache.h"
           ^~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

  Fixes: b18f09a ("virgl: Introduce virgl_resource_cache")
  Signed-off-by: Mauro Rossi <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
  Tested-by: Clayton Craft <[email protected]>
* android: winsys/amdgpu,radv: fix generated amdgfxregs.h header dependencies | Mauro Rossi | 2019-06-21 | 3 | -3/+4
  Fix Android build errors in winsys/amdgpu and radv due to 'amdgfxregs.h' not found.

  Changelog:
  amd/common - generated $(intermediated)/common path is added to exports
  winsys/amdgpu - libmesa_amd_common static dependency is added
  radv - correct generated $(intermediated)/common path is added to includes

  Fixes: f480b8a ("amd/common: use generated register header")
  Signed-off-by: Mauro Rossi <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radv: add support for VK_KHR_depth_stencil_resolve | Samuel Pitoiset | 2019-06-21 | 2 | -0/+22
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: pass sample locations for transitions before depth/stencil resolves | Samuel Pitoiset | 2019-06-21 | 3 | -1/+34
  HTILE decompressions need the user sample locations if specified in the current subpass.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clear the depth/stencil resolve attachment if necessary | Samuel Pitoiset | 2019-06-21 | 1 | -18/+55
  The driver might need to clear one aspect of the depth/stencil resolve attachment before performing the resolve itself.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: decompress HTILE if the resolve src image is compressed | Samuel Pitoiset | 2019-06-21 | 1 | -1/+17
  HTILE must be decompressed before resolving with the compute path.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: select the depth/stencil resolve method based on some conditions | Samuel Pitoiset | 2019-06-21 | 1 | -13/+65
  Only fall back to the compute path for layers.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement all depth/stencil resolve modes using compute | Samuel Pitoiset | 2019-06-21 | 2 | -0/+522
  This path supports layers but it requires decompressing HTILE before resolving. The driver also needs to fix up HTILE after the resolve. This path is probably slower than the graphics one.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement all depth/stencil resolve modes using graphics | Samuel Pitoiset | 2019-06-21 | 2 | -0/+614
  When using graphics, the driver doesn't need to decompress HTILE before resolving. This path currently doesn't support layers so we have to fall back to the compute path.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: record if a render pass has depth/stencil resolve attachments | Samuel Pitoiset | 2019-06-21 | 2 | -1/+29
  Only supported with vkCreateRenderPass2().

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: rename has_resolve to has_color_resolve | Samuel Pitoiset | 2019-06-21 | 3 | -5/+5
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: emit framebuffer state from primary if secondary doesn't inherit it | Samuel Pitoiset | 2019-06-21 | 1 | -0/+9
  Otherwise fast color/depth clears can't work because they depend on the framebuffer.

  This fixes the following CTS tests (when the small hint is disabled):
  - dEQP-VK.geometry.layered.1d_array.secondary_cmd_buffer
  - dEQP-VK.geometry.layered.2d_array.secondary_cmd_buffer
  - dEQP-VK.geometry.layered.cube.secondary_cmd_buffer
  - dEQP-VK.geometry.layered.cube_array.secondary_cmd_buffer

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110810
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107986
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* drisw: move build logic to build systems | Eric Engestrom | 2019-06-21 | 1 | -6/+4
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* panfrost: ci: Exclude two more flip-flops from results | Tomeu Vizoso | 2019-06-21 | 1 | -1/+4
  These three tests pass on RK3399, but fail on RK3288:

  dEQP-GLES2.functional.shaders.matrix.div.const_lowp_mat2_mat2_vertex
  dEQP-GLES2.functional.shaders.operator.unary_operator.pre_increment_effect.highp_ivec4_vertex
  dEQP-GLES2.functional.shaders.texture_functions.vertex.texture2dprojlod_vec3

  They reliably pass when run individually, but reliably fail when run in a full CI run.

  Signed-off-by: Tomeu Vizoso <[email protected]>
* gallium/st: Add Gallium hud to swrast drivers | Gert Wollny | 2019-06-21 | 1 | -0/+3
  Signed-off-by: Gert Wollny <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* v3d: flush jobs writing to vertex buffers used in the current draw call | Iago Toral Quiroga | 2019-06-21 | 1 | -0/+9
  This can happen when any of our vertex buffers was written by a previous transform feedback draw.

  Fixes the following piglit tests:
  spec/ext_transform_feedback/position-render-bufferbase
  spec/ext_transform_feedback/position-render-bufferbase-discard
  spec/ext_transform_feedback/position-render-bufferoffset
  spec/ext_transform_feedback/position-render-bufferoffset-discard
  spec/ext_transform_feedback/position-render-bufferrange
  spec/ext_transform_feedback/position-render-bufferrange-discard

  Reviewed-by: Eric Anholt <[email protected]>
* v3d: flush jobs reading from transform feedback output buffers | Iago Toral Quiroga | 2019-06-21 | 1 | -2/+24
  If we are about to write to a transform feedback buffer, we should make sure that we flush any prior work that intended to read from any of these buffers.

  Fixes piglit test:
  spec/ext_transform_feedback/immediate-reuse

  Reviewed-by: Eric Anholt <[email protected]>
* v3d: add a helper to check if transform feedback is enabled | Iago Toral Quiroga | 2019-06-21 | 2 | -2/+8
  v2: We should be safe assuming that bind_vs != NULL (Eric)

  Reviewed-by: Eric Anholt <[email protected]>
* llvmpipe: make remove_shader_variant static. | Dave Airlie | 2019-06-21 | 2 | -5/+1
  This isn't used outside this file.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* util/os_file: resize buffer to what was actually needed | Eric Engestrom | 2019-06-20 | 1 | -0/+11
  Fixes: 316964709e21286c2af5 "util: add os_read_file() helper"
  Reported-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* panfrost: ci: Update expectations | Tomeu Vizoso | 2019-06-20 | 1 | -54/+0
  These tests have been fixed recently.

  Signed-off-by: Tomeu Vizoso <[email protected]>
* panfrost/midgard: Broadcast swizzle | Alyssa Rosenzweig | 2019-06-20 | 1 | -12/+38
  Fixes regression in shaders using ball/etc by explicitly passing through the number of channels in the NIR op and broadcasting the last components of the channel appropriately, as the Midgard ops are all vec4 implicitly but NIR can be vec2/3.

  v2: Don't also regress every other swizzle in Equestria.

  v3: Don't regress the swizzles at Canterlot High either.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Acked-by: Tomeu Vizoso <[email protected]>
* iris: Use stream uploader for shader draw parameters. | Kenneth Graunke | 2019-06-20 | 1 | -2/+2
  Most vertex data lives in user VBOs in IRIS_MEMZONE_OTHER, which typically have high bits set to 0xffff. The shader draw parameters were being uploaded in IRIS_MEMZONE_DYNAMIC, which have high bits set to 0x2. This was causing a lot of ping-ponging of high bits, leading to unnecessary VF cache flushing.

  Cuts 7.2% of the flushes in the Civilization VI demo on Kabylake GT2.
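The ping-ponging this message describes can be illustrated with a small C sketch. It assumes that "high bits" means the bits above bit 31 of a 48-bit buffer address, which is an assumption of this example; the struct and function names are hypothetical and not taken from iris.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker: remembers the high bits of the last vertex buffer
 * address that was bound. */
struct sketch_vf_state {
    uint16_t last_high_bits;
};

/* Returns true when binding a buffer at `address` would need a VF cache
 * flush because its high bits differ from the previous buffer's.
 * Assumption: the high bits are address bits 47:32. */
static bool sketch_vf_needs_flush(struct sketch_vf_state *st, uint64_t address)
{
    uint16_t high = (uint16_t)(address >> 32);
    bool flush = high != st->last_high_bits;

    st->last_high_bits = high;
    return flush;
}
```

Alternating between a 0xffff buffer and a 0x2 buffer makes every call return true, while keeping all vertex data in zones that share the same high bits makes the flushes disappear.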
* iris: Don't check VF address high bits when there is no buffer. | Kenneth Graunke | 2019-06-20 | 1 | -1/+1
  If there is no buffer, then it doesn't matter. Leave the old stale high bits in place (for next time) and don't bother invalidating.

  Cuts 5.6% of the flushes in the Civilization VI demo on Kabylake GT2.
* iris: Drop RT flushes from depth stencil clearing flushes. | Kenneth Graunke | 2019-06-20 | 4 | -9/+8
  These clears write depth and stencil, not color, so there's no need to flush the render target.
* iris: Don't bother with PIPE_CONTROLs for CPU writes and no history | Kenneth Graunke | 2019-06-20 | 1 | -6/+9
  If a buffer has no usage history, we don't have any read only cache invalidates to do. If we've written it with the CPU, we don't need to flush the render cache. The only bit remaining is the CS stall from iris_flush_bits_for_history. We can just skip the PIPE_CONTROL in this case.

  This is pretty common - an app creates a buffer, fills it with data, and then binds it for some purpose.

  Cuts 36% of the flushes in Manhattan 3.0 on Kabylake GT2.
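The reasoning in this message boils down to a two-input decision, sketched here in C. The struct and its fields are hypothetical simplifications made up for this example; the driver's actual bookkeeping is not shown in the log and is likely more detailed.

```c
#include <stdbool.h>

/* Hypothetical description of a buffer at the moment it is updated. */
struct sketch_buffer {
    bool has_usage_history;  /* has been bound for some GPU use before */
    bool last_write_by_gpu;  /* new data went through the render cache */
};

/* No usage history: no read-only cache can hold stale data to invalidate.
 * CPU write: nothing is sitting in the render cache to flush.
 * When both hold, the PIPE_CONTROL can be skipped entirely. */
static bool sketch_needs_pipe_control(const struct sketch_buffer *buf)
{
    return buf->has_usage_history || buf->last_write_by_gpu;
}
```

The common case named in the message, a freshly created buffer filled by the CPU, has both fields false and therefore needs no flush.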
* iris: Only do an RT flush for transfer maps if using copy_region. | Kenneth Graunke | 2019-06-20 | 1 | -1/+1
  If we wrote the data via the CPU, there's no point in doing a render target flush. If using BLORP, we do want a render target flush so the data lands.
* iris: Use iris_flush_bits_for_history in iris_transfer_flush_region | Kenneth Graunke | 2019-06-20 | 1 | -5/+12
  Instead of using the combined iris_flush_and_dirty_for_history, use iris_flush_bits_for_history directly - we were already using the split out iris_dirty_for_history. There's no need to dirty twice, and we can avoid the looping altogether for non-buffers.
* iris: Avoid double flushing in iris_transfer_flush_region when copying. | Kenneth Graunke | 2019-06-20 | 1 | -4/+3
  My intention was to have iris_copy_region not do flushing, and leave that up to the callers. iris_resource_copy_region needs to do this, but iris_transfer_flush_region was already doing it. The net result was that we were doing it twice for transfers.

  So, move the flushing from iris_copy_region to iris_resource_copy_region so that it only happens in the callers as I intended.
* iris: Fix iris_flush_and_dirty_history to actually dirty history. | Kenneth Graunke | 2019-06-20 | 1 | -0/+2
  When I split iris_flush_and_dirty_history into two helper functions, I accidentally made it stop dirtying. Which was...sort of the point.

  Fixes: 21688a306b2 iris: Split iris_flush_and_dirty_for_history into two helpers.
* iris: Add maybe_flush calls to texture_barrier and memory_barrier | Kenneth Graunke | 2019-06-20 | 1 | -0/+3
  Otherwise, tests which loop on glMemoryBarrier may run us out of batch space with piles of flushing. (Ideally, we'd elide those bonus PIPE_CONTROLs, but presumably this isn't that common of a case...)

  Piglit's arb_pipeline_statistics_query-comp would hit this case after some of the next patches remove other PIPE_CONTROLs with maybe_flushes.
* iris: Implement INTEL_DEBUG=pc for pipe control logging. | Kenneth Graunke | 2019-06-20 | 13 | -57/+170
  This prints a log of every PIPE_CONTROL flush we emit, noting which bits were set, and also the reason for the flush. That way we can see which are caused by hardware workarounds, render-to-texture, buffer updates, and so on. It should make it easier to determine whether we're doing too many flushes and why.
* panfrost: Skip shading unaffected tiles | Alyssa Rosenzweig | 2019-06-20 | 5 | -2/+51
  Looking at the scissor, we can discard some tiles. We specifically don't care about the scissor on the wallpaper, since that's a no-op if the entire tile is culled.

  v2: Clarify clear comment (not reviewed but trivial).

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Tomeu Vizoso <[email protected]>
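One way to picture "looking at the scissor, we can discard some tiles" is to convert the scissor rectangle into the range of tiles it touches, as in the C sketch below. The tile size and all names are assumptions made for this example and are not taken from the driver.

```c
#include <stdint.h>

#define SKETCH_TILE_SIZE 16  /* assumption: tile edge length in pixels */

/* Inclusive range of tile coordinates touched by a rectangle. */
struct sketch_tile_range {
    uint32_t min_x, min_y, max_x, max_y;
};

/* Convert a non-empty scissor [minx, maxx) x [miny, maxy), given in pixels,
 * into the tiles that need shading. Tiles outside the range are skipped. */
static struct sketch_tile_range
sketch_scissor_to_tiles(uint32_t minx, uint32_t miny, uint32_t maxx, uint32_t maxy)
{
    struct sketch_tile_range r = {
        .min_x = minx / SKETCH_TILE_SIZE,
        .min_y = miny / SKETCH_TILE_SIZE,
        /* maxx and maxy are exclusive, so the last covered pixel is max - 1 */
        .max_x = (maxx - 1) / SKETCH_TILE_SIZE,
        .max_y = (maxy - 1) / SKETCH_TILE_SIZE,
    };
    return r;
}
```

For a scissor covering pixels 20 to 39 horizontally in the first 16 rows, only tile columns 1 and 2 of tile row 0 fall inside the range.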
* glx: fix glvnd pointer types | Eric Engestrom | 2019-06-20 | 2 | -3/+3
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110709
  Fixes: 22a9e00aab66d3dd6890 ("glx: Implement the libglvnd interface.")
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* glx: drop misleading comment about the file being "generated" | Eric Engestrom | 2019-06-20 | 1 | -4/+0
  This `gen_scrn_dispatch.pl` has never existed, in the sense that NVIDIA never published it.

  There have been a number (6) of commits to fix various things in there over the years, and never anything from NVIDIA.

  For all intents and purposes this file is hand-written and hand-maintained, and we're on our own. Let's make this clear by removing this misleading comment.

  Suggested-by: Eric Anholt <[email protected]>
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Acked-by: Emil Velikov <[email protected]>
* nir/lower_tex: Add an assert() in nir_lower_txs_lod() | Boris Brezillon | 2019-06-20 | 1 | -0/+1
  We don't expect the output of a TXS instruction to be wider than a vec3. Add an assert() to make sure this never happens.

  Suggested-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Set job requirements during draw | Tomeu Vizoso | 2019-06-20 | 2 | -1/+2
  Right now we are doing it at a moment when we don't have all the information we need.

  Signed-off-by: Tomeu Vizoso <[email protected]>
  Suggested-by: Alyssa Rosenzweig <[email protected]>
  Acked-by: Rohan Garg <[email protected]>
  Cc: Rohan Garg <[email protected]>
  Fixes: bfca21b622df ("panfrost: Figure out job requirements in pan_job.c")
* panfrost/meson: Link with libpanfrost_shared | Alyssa Rosenzweig | 2019-06-20 | 1 | -1/+1
  Fixes: 035a07c0 ("panfrost: Switch to lima tiling")
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* freedreno/ir3: fix typo | Hyunjun Ko | 2019-06-20 | 1 | -1/+1
  Fixes: a9b556d3a04 ("freedreno/ir3: check the type of regs of absneg opcode in is_same_type_mov")
  Reviewed-by: Rob Clark <[email protected]>
* panfrost: Load from tiled images | Alyssa Rosenzweig | 2019-06-20 | 1 | -2/+15
  Now that we have lima tiling code available, use it to load from a tiled source.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Switch to lima tiling | Alyssa Rosenzweig | 2019-06-20 | 5 | -265/+11
  Lima and Panfrost both have implementations of software tiling (the Lima one was forked off the Panfrost one which was forked off the original Lima one...). Switch to the most recent Lima code, since it's more complete than ours at this point.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix tiled NPOT textures with bpp<4 | Alyssa Rosenzweig | 2019-06-20 | 1 | -3/+3
  Panfrost's tiling routines (incorrectly) ignored the source stride, masking this bug; lima's routines respect this stride, causing issues when tiling NPOT textures whose stride is not a multiple of 64 (for instance, NPOT textures with bpp=1).

  Signed-off-by: Alyssa Rosenzweig <[email protected]>