path: root/src/gallium/drivers/lima
Commit message | Author | Age | Files | Lines
* nir: Drop the ssbo_offset to atomic lowering. | Eric Anholt | 2020-01-21 | 1 | -1/+1
    The arguments passed in were:
    - prog->info.num_ssbos
    - prog->nir->info.num_ssbos
    - arbitrary values for standalone compilers

    The num_ssbos should match between the prog's info and prog->nir's info until this lowering happens.

    Reviewed-by: Marek Olšák <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
* panfrost: Rework linear<--->tiled conversions | Alyssa Rosenzweig | 2020-01-21 | 1 | -2/+2
    There's a lot going on here (it's a ton of commits squashed together since otherwise this would be impossible to review...)

    1. We have a fast path for linear->tiled for whole (aligned) tiles, but we have to use a slow path for unaligned accesses. We can get a pretty major win for partial updates by using this slow path simply on the borders of the update region, and then hit the fast path for the tile-aligned interior. This does require some shuffling.
    2. Mark the LUTs constant, which allows the compiler to inline them, which pairs well with loop unrolling (eliminating the memory accesses and just becoming some immediates.. which are not as immediate on aarch64 as I'd like..)
    3. Add fast path for bpp1/2/8/16. These use the same algorithm and we have native types for them, so may as well get the fast path.
    4. Drop generic path for bpp != 1/2/8/16, since these formats are generally awful and there's no way to tile them efficiently and honestly there's not a good reason to either. Lima doesn't support any of these formats; Panfrost can make the opinionated choice to make them linear.
    5. Specialize the unaligned routines. They don't have to be fully generic, they just can't assume alignment. So now they should be nearly as fast as the aligned versions (which get some extra tricks to be even faster but the difference might be negligible on some workloads).
    6. Specialize also for the size of the tile, to allow 4x4 tiling as well as 16x16 tiling. This allows compressed textures to be efficiently tiled with the same routines (so we add support for tiling ASTC/ETC textures while we're at it)

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Tested-by: Vasily Khoruzhick <[email protected]> #lima on Mali400
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
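Item 1 above (slow path on the borders, fast path on the tile-aligned interior) can be sketched as follows. This is only an illustration of the split, not the code from the commit: the `rect` type, the helper names and the fixed 16-pixel tile edge are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define TILE_DIM 16u /* assumed tile edge in pixels; the commit also allows 4x4 */

/* Hypothetical rectangle type for the update region. */
struct rect { unsigned x, y, w, h; };

static unsigned align_up(unsigned v)   { return (v + TILE_DIM - 1) & ~(TILE_DIM - 1); }
static unsigned align_down(unsigned v) { return v & ~(TILE_DIM - 1); }

/* Compute the largest tile-aligned rectangle inside `box`.
 * Returns false when the box contains no whole tile, in which case the
 * entire update would go through the unaligned (slow) path. */
static bool aligned_interior(struct rect box, struct rect *out)
{
    assert(out);
    unsigned x0 = align_up(box.x), x1 = align_down(box.x + box.w);
    unsigned y0 = align_up(box.y), y1 = align_down(box.y + box.h);

    if (x0 >= x1 || y0 >= y1)
        return false;

    out->x = x0;
    out->y = y0;
    out->w = x1 - x0;
    out->h = y1 - y0;
    return true;
}
```

Everything inside the returned rectangle can take the aligned fast path; the strips between it and the original box are the borders that still need the unaligned routine.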
* panfrost,lima: De-Galliumize tiling routines | Alyssa Rosenzweig | 2020-01-21 | 1 | -2/+4
    There's an implicit dependence on Gallium here that will add more complexity than needed when testing/optimizing out of driver as well as potentially Vulkanizing. We don't need a full pipe_box, just the x/y/w/h properties directly.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Tested-by: Vasily Khoruzhick <[email protected]> #lima on Mali400
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
* lima: Fix alpha blending | Andreas Baierl | 2020-01-16 | 1 | -23/+101
    Introduce separate helper functions to set the blendfactor bits. Lima uses bits 0-2 for the type, bit 3 sets the inverted function and bit 4 is set if alpha is used.

    alpha_src_factor and alpha_dst_factor don't need the alpha bit, so they are masked with 0xf. There is only room for 4 bits anyway.

    If alpha_src_factor is PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE, we need to change it to PIPE_BLENDFACTOR_ONE first.

    This is exactly what the blob does and we pass all dEQP-GLES2.functional.fragment_ops.blend.* tests now. Better than the blob btw...

    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
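The bit layout described above can be sketched as a pair of packing helpers. The function and macro names are hypothetical and the numeric `type` codes are placeholders; only the bit positions and the 0xf mask come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout as described in the commit message. */
#define BLEND_TYPE_MASK  0x07u /* bits 0-2: factor type */
#define BLEND_INVERT_BIT 0x08u /* bit 3: inverted function */
#define BLEND_ALPHA_BIT  0x10u /* bit 4: alpha is used */

/* Pack a colour blend factor. `type` is a placeholder hardware code. */
static uint8_t pack_color_factor(unsigned type, bool invert, bool alpha)
{
    assert(type <= BLEND_TYPE_MASK);
    return (uint8_t)(type |
                     (invert ? BLEND_INVERT_BIT : 0u) |
                     (alpha ? BLEND_ALPHA_BIT : 0u));
}

/* Alpha factors only have room for 4 bits, so the alpha bit is masked off. */
static uint8_t pack_alpha_factor(unsigned type, bool invert, bool alpha)
{
    return (uint8_t)(pack_color_factor(type, invert, alpha) & 0x0fu);
}
```

For example, type 2 with both flags set packs to 0x1a as a colour factor and 0x0a as an alpha factor.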
* lima: fix handling of reverse depth range | Vasily Khoruzhick | 2020-01-16 | 2 | -4/+16
    Looks like we need to handle cases when near > far and near == far. In the first case we just need to swap near and far, and in the second we need to subtract epsilon from near if it's not zero.

    Fixes 10 tests in dEQP-GLES2.functional.depth_range.*

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3400>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3400>
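A minimal sketch of the two cases described above. The function name and the epsilon value are assumptions; the commit message does not say which epsilon the driver uses.

```c
#include <assert.h>

#define DEPTH_EPSILON 1e-6f /* assumed value, not taken from the driver */

/* Normalise a depth range so that near <= far, and near < far whenever
 * near is non-zero. */
static void fixup_depth_range(float *near_z, float *far_z)
{
    assert(near_z && far_z);

    if (*near_z > *far_z) {
        /* Reversed range: swap the two ends. */
        float tmp = *near_z;
        *near_z = *far_z;
        *far_z = tmp;
    } else if (*near_z == *far_z && *near_z != 0.0f) {
        /* Degenerate range: nudge near below far. */
        *near_z -= DEPTH_EPSILON;
    }
}
```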
* lima/ppir: implement full liveness analysis for regalloc | Erico Nunes | 2020-01-15 | 6 | -166/+359
    The existing liveness analysis in ppir still ultimately relies on a single continuous live_in and live_out range per register and was observed to be the bottleneck for register allocation on complicated examples with several control flow blocks. The use of live_in and live_out ranges was fine before ppir got control flow, but now it ends up creating unnecessary interferences as live_in and live_out ranges may span across entire blocks after blocks get placed sequentially.

    This new liveness analysis implementation generates a set of live variables at each program point; before and after each instruction and beginning and end of each block. This is a global analysis and propagates the sets of live registers across blocks independently of their sequence. The resulting sets optimally represent all variables that cannot share a register at each program point, so can be directly translated as interferences to the register allocator.

    Special care has to be taken with non-ssa registers. In order to properly define their live range, their alive components also need to be tracked. Therefore ppir can't use simple bitsets to keep track of live registers.

    The algorithm uses an auxiliary set data structure to keep track of the live registers. The initial implementation used only trivial arrays, however regalloc execution time was then prohibitive (>1minute on Cortex-A53) on extreme benchmarks with hundreds of instructions, hundreds of registers and several spilling iterations, mostly due to the n^2 complexity to generate the interferences from the live sets. Since the set of live registers is only a very sparse subset of all registers at each instruction, iterating only over this subset allows it to run very fast again (a couple of seconds for the same benchmark).

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
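The global, order-independent propagation described above is the classic backward liveness dataflow. The sketch below is deliberately simplified and is not the ppir implementation: it works per block rather than per instruction, and it uses a plain 32-bit bitset, which the commit explicitly says ppir cannot do because non-ssa registers need per-component tracking. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SUCC 2

/* Hypothetical basic block: one bit per register, at most 32 registers. */
struct block {
    uint32_t use;       /* registers read before any write in the block */
    uint32_t def;       /* registers written in the block */
    int succ[MAX_SUCC]; /* successor block indices, -1 when unused */
    uint32_t live_in, live_out;
};

/* Iterate the backward dataflow equations to a fixed point:
 *   live_out(b) = union of live_in(s) over all successors s
 *   live_in(b)  = use(b) | (live_out(b) & ~def(b))
 */
static void compute_liveness(struct block *blocks, int count)
{
    assert(blocks && count > 0);

    for (int i = 0; i < count; i++)
        blocks[i].live_in = blocks[i].live_out = 0;

    bool changed = true;
    while (changed) {
        changed = false;
        for (int i = count - 1; i >= 0; i--) {
            struct block *b = &blocks[i];
            uint32_t out = 0;

            for (int s = 0; s < MAX_SUCC; s++)
                if (b->succ[s] >= 0)
                    out |= blocks[b->succ[s]].live_in;

            uint32_t in = b->use | (out & ~b->def);
            if (in != b->live_in || out != b->live_out) {
                b->live_in = in;
                b->live_out = out;
                changed = true;
            }
        }
    }
}

/* Two registers interfere at a program point when both are in its live set. */
static bool interferes(uint32_t live, int a, int b)
{
    return ((live >> a) & 1u) && ((live >> b) & 1u);
}
```

Because the sets are propagated along successor edges, the result does not depend on the order in which blocks are later placed, which is the property the commit relies on to avoid spurious interferences.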
* lima/ppir: remove orphan load node after cloning | Erico Nunes | 2020-01-15 | 3 | -1/+27
    There are some cases in shaders using control flow where the varying load is cloned to every block, and then the original node is left orphaned. This is not harmful for program execution, but it complicates analysis for register allocation as there is now a case of writing to a register that is never read.

    While ppir doesn't have a dead code elimination pass for its own optimizations and it is not hard to detect when we cloned the last load, let's remove it early.

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
* lima: add new findings to texture descriptor | Vasily Khoruzhick | 2020-01-13 | 4 | -12/+47
    The lower 8 bits of unknown_1_3 seem to be min_lod, the remaining 4 bits + miplevels are max_lod, and min_mipfilter seems to be lod bias. All are in fixed format with 4 bit integer and 4 bit fraction; lod_bias also has a sign bit.

    Blob also seems to do some magic with lod_bias if min filter is nearest -- it adds 0.5 to lod_bias in this case. Same story when all filters are nearest and mipmapping is enabled, but in this case it subtracts 1/16 from lod_bias.

    Fixes 134 dEQP tests in dEQP-GLES2.functional.texture.*

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>
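To make the "4 bit integer and 4 bit fraction" format concrete, here is a sketch of converting an unsigned LOD value such as min_lod. The function name is hypothetical, and the signed lod_bias encoding is left out because the message does not describe how the sign bit is stored.

```c
#include <assert.h>

/* Convert a LOD value to unsigned 4.4 fixed point (4 integer bits,
 * 4 fraction bits), clamping to the representable range 0 .. 15.9375. */
static unsigned lod_to_fixed_4_4(float lod)
{
    assert(lod == lod); /* reject NaN */

    if (lod < 0.0f)
        lod = 0.0f;
    if (lod > 15.9375f)
        lod = 15.9375f;

    /* One unit in the fixed-point value is 1/16 of a mip level. */
    return (unsigned)(lod * 16.0f);
}
```

In this format the 1/16 that the blob subtracts from lod_bias corresponds to exactly one unit in the last place.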
* lima: Add stencil support | Andreas Baierl | 2020-01-13 | 2 | -27/+64
    This re-enables and fixes support for the stencil buffer. It fixes 365 stencil-related dEQP tests. All tests that use INCR, INCR_WRAP, DECR and DECR_WRAP as a stencil op still fail, but they also fail with the blob, so we may ignore that for now.

    We still have dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked failing, which is strange because it's the only one out of the depth_stencil_clear.* set.

    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
* lima/parser: Make rsw alpha blend parsing more readable | Andreas Baierl | 2020-01-13 | 1 | -4/+5
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
* lima: fix PIPE_CAP_* to mark features that aren't supported yet | Vasily Khoruzhick | 2020-01-12 | 1 | -0/+6
    lima doesn't support alpha test, flat shading, two-sided color or clip planes. We can enable these caps when the corresponding hw features are implemented in the driver.

    Reviewed-by: Qiang Yu <[email protected]>
    Tested-by: Andreas Baierl <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: implement polygon offset | Vasily Khoruzhick | 2020-01-12 | 1 | -14/+9
    Fixes some of dEQP-GLES2.functional.polygon_offset.* tests and shadows in Q3A.

    Reviewed-by: Qiang Yu <[email protected]>
    Tested-by: Andreas Baierl <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: fix viewport clipping | Vasily Khoruzhick | 2020-01-12 | 1 | -5/+17
    Apparently Mali4x0 doesn't do viewport clipping, so anything rendered beyond viewport is still rendered. Looks like we need to use scissors to do clipping.

    Fixes most of dEQP-GLES2.functional.clipping.*, 6 out of 7 remaining failures fail on blob as well. Remaining [1] fails on many other gallium drivers.

    [1] dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z

    Suggested-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
    Tested-by: Andreas Baierl <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
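Using scissors to emulate viewport clipping amounts to intersecting the application's scissor rectangle with the viewport. The sketch below shows that intersection under stated assumptions: the `irect` type and function names are hypothetical, and rectangles are treated as half-open pixel ranges.

```c
#include <assert.h>

/* Hypothetical integer rectangle, half-open: [minx, maxx) x [miny, maxy). */
struct irect { int minx, miny, maxx, maxy; };

static int imax(int a, int b) { return a > b ? a : b; }
static int imin(int a, int b) { return a < b ? a : b; }

/* Intersect the scissor with the viewport so that nothing outside the
 * viewport is rasterised. */
static struct irect clip_scissor_to_viewport(struct irect scissor,
                                             struct irect viewport)
{
    assert(viewport.minx <= viewport.maxx && viewport.miny <= viewport.maxy);

    struct irect r;
    r.minx = imax(scissor.minx, viewport.minx);
    r.miny = imax(scissor.miny, viewport.miny);
    r.maxx = imin(scissor.maxx, viewport.maxx);
    r.maxy = imin(scissor.maxy, viewport.maxy);

    /* Collapse to an empty rectangle when the two do not overlap. */
    if (r.maxx < r.minx)
        r.maxx = r.minx;
    if (r.maxy < r.miny)
        r.maxy = r.miny;
    return r;
}
```

When the scissor test is disabled, the same helper can be called with the full framebuffer as the scissor, which leaves just the viewport rectangle.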
* lima: fix PLBU_CMD_PRIMITIVE_SETUP command | Vasily Khoruzhick | 2020-01-12 | 2 | -21/+16
    Apparently it doesn't depend on primitive type, the value only depends on whether we specify point size via PLBU command -- bit 12 is set in this case

    Reviewed-by: Qiang Yu <[email protected]>
    Tested-by: Andreas Baierl <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: drop support for R8G8B8 format | Vasily Khoruzhick | 2020-01-09 | 1 | -1/+0
    We can only sample from the 24-bit packed format and can't render into it, and it causes chromium-based browsers to fail when they create an FBO with GL_RGB format. Drop R8G8B8 altogether so mesa can promote it to RGBX format.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: add debug flag to disable tiling | Vasily Khoruzhick | 2020-01-10 | 3 | -1/+4
    Add debug flag to disable tiling. Note that it prevents lima from creating tiled buffers, but it's still able to import them if modifier is specified

    Reviewed-by: Andreas Baierl <[email protected]>
    Reviewed-by: Erico Nunes <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: use linear layout for shared buffers if modifier is not specified | Vasily Khoruzhick | 2020-01-10 | 1 | -1/+8
    Use linear layout for shared buffers if modifier is not specified and use linear layout when importing buffers with invalid modifier.

    Fixes: 01a451b04d2d ("lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle()")
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Erico Nunes <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: fix PP stream terminator size | Vasily Khoruzhick | 2020-01-05 | 1 | -1/+3
    PP stream terminator size seems to be 4 words. It worked with a full PP stream because we align the stream beginning to 32 bytes and the BO is initialized with zeroes. But with a partial PP stream it sometimes breaks if, for the new PP stream, we reuse a BO that has a non-zero value at this place.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: don't reload and redraw tiles that were not updated | Vasily Khoruzhick | 2020-01-05 | 3 | -7/+67
    We don't need to reload and redraw some tiles if the framebuffer was not cleared and the scissor test was enabled for some of the draws. This simple optimization fixes cursor lag in X11

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: postpone PP stream generation | Vasily Khoruzhick | 2020-01-05 | 1 | -11/+17
    This commit postpones PP stream generation till job is submitted. Doing that this late allows us to skip reloading and redrawing tiles that were not updated.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/parser: Fix VS cmd stream parser | Andreas Baierl | 2020-01-05 | 1 | -2/+2
    prefetch is int, not bool.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
* lima/parser: Fix rsw parser | Andreas Baierl | 2020-01-05 | 1 | -2/+0
    Drop assert as it is not necessary and was used incorrectly anyway.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
* lima: fix allocation of GP outputs storage for indexed draw | Vasily Khoruzhick | 2020-01-03 | 1 | -3/+4
    For an indexed draw, the number of VS invocations is (ctx->max_index - ctx->min_index + 1), so we have to use this number when calculating space for varyings, gl_Position and gl_PointSize.

    Fixes dEQP-GLES2.functional.buffer.write.use.index_array.array and dEQP-GLES2.functional.buffer.write.use.index_array.element_array

    Reviewed-by: Andreas Baierl <[email protected]>
    Reviewed-by: Erico Nunes <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
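The sizing rule above can be written out as a small helper. The function names and the `stride` parameter are hypothetical; only the `max_index - min_index + 1` expression comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Number of vertex shader invocations for an indexed draw: every index in
 * the inclusive range [min_index, max_index] is shaded once. */
static unsigned vs_invocations(unsigned min_index, unsigned max_index)
{
    assert(max_index >= min_index);
    return max_index - min_index + 1;
}

/* Bytes needed for one GP output stream (varyings, gl_Position or
 * gl_PointSize) with `stride` bytes per vertex. */
static size_t gp_output_size(unsigned min_index, unsigned max_index,
                             size_t stride)
{
    return (size_t)vs_invocations(min_index, max_index) * stride;
}
```

Note that the result depends on the index range, not on the number of indices in the draw: a draw with three indices 10, 19, 10 still needs room for ten vertices.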
* lima: set shader caps to optimize control flow | Erico Nunes | 2019-12-20 | 1 | -2/+16
    With these new caps, nir is able to unroll loops and optimize conditionals much more efficiently in both gpir and ppir. panfrost and vc4 were used as reference for the values.

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>
* lima/ppir: remove assert on ppir_emit_tex unsupported feature | Erico Nunes | 2019-12-20 | 1 | -1/+0
    This assert causes testing tools such as shaderdb to abort on some test cases. This is an unsupported feature and not a compiler bug. The compilation error is already propagated correctly, so we can remove the assert to allow testing tools to run to completion.

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>
* lima/ppir: fix lod bias src | Erico Nunes | 2019-12-20 | 5 | -11/+16
    ppir has some code that operates on all ppir_src variables, and for that uses ppir_node_get_src. lod bias support introduced a separate ppir_src that is inaccessible by that function, causing it to be missed by the compiler in some routines. Ultimately this caused, in some cases, a bug in const lowering:

    .../pp/lower.c:42: ppir_lower_const: Assertion `src != NULL' failed.

    This fix moves the ppir_srcs in ppir_load_texture_node together so they don't get missed.

    Fixes: 721d82cf061 lima/ppir: add lod-bias support
    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>
* lima: Fix dump file creation | Andreas Baierl | 2019-12-20 | 1 | -3/+5
    Otherwise lima_dump_file_next() always opens a new file and creates the dumps regardless of what the environment variables say.

    Fixes d71cd245d74 ('lima: Rotate dump files after each finished pp frame')
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3179>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3179>
* lima: Rotate dump files after each finished pp frame | Andreas Baierl | 2019-12-19 | 5 | -13/+48
    This rotates the dump files like the mali-syscall-tracker does. After each finished pp frame a new file is generated. They are numbered like lima.dump.0000, lima.dump.0001 ... The filename and path can be given with the new environment variable LIMA_DUMP_FILE.

    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3175>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3175>
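The numbering scheme above (lima.dump.0000, lima.dump.0001, ...) can be produced with a zero-padded format string. This is a sketch with a hypothetical helper name; how the driver actually builds the name, and how it reads LIMA_DUMP_FILE, is not shown in the commit message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "<base>.<4-digit frame number>" into buf.
 * Returns the snprintf() result, i.e. the length the full name needs. */
static int dump_file_name(char *buf, size_t size, const char *base,
                          unsigned frame)
{
    assert(buf && base && size > 0);
    return snprintf(buf, size, "%s.%04u", base, frame);
}
```

A caller would typically take `base` from the environment variable when it is set and fall back to a default such as "lima.dump" otherwise.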
* lima: drop suballocator | Vasily Khoruzhick | 2019-12-19 | 4 | -30/+14
    Since we're using a separate per-draw BO for GP outputs we don't need suballocator anymore.

    Reviewed-by: Erico Nunes <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>
* lima: use single BO for GP outputs | Vasily Khoruzhick | 2019-12-19 | 3 | -43/+43
    Varyings, gl_Position and gl_PointSize are all GP outputs, so we can use a single BO for them all. Also that allows us to get rid of suballocator.

    Reviewed-by: Erico Nunes <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>
* lima: split draw calls on 64k vertices | Erico Nunes | 2019-12-14 | 1 | -48/+97
    The Mali400 only supports draws with up to 64k vertices per command. To handle this, break the draw_vbo call into multiple commands. Indexed drawing is left to a separate code path. This implementation was ported from vc4_draw_vbo.

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Reviewed-by: Andreas Baierl <[email protected]>
    Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
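A rough sketch of the splitting idea for the simplest case, a non-indexed triangle list. It is not the code ported from vc4_draw_vbo: the names, the callback type and the exact limit of 65535 are assumptions, and strips, fans and loops would additionally need vertices repeated across chunk boundaries.

```c
#include <assert.h>

#define MAX_VERTS_PER_DRAW 65535u /* assumed reading of "64k vertices" */

/* Hypothetical callback standing in for emitting one hardware draw. */
typedef void (*emit_draw_fn)(unsigned start, unsigned count, void *data);

/* Example callback: accumulate the number of vertices submitted. */
static void count_vertices(unsigned start, unsigned count, void *data)
{
    (void)start;
    *(unsigned *)data += count;
}

/* Split a triangle-list draw into commands of at most MAX_VERTS_PER_DRAW
 * vertices. Returns the number of commands emitted. */
static unsigned split_triangle_list(unsigned start, unsigned count,
                                    emit_draw_fn emit, void *data)
{
    assert(emit);

    /* Largest multiple of 3 within the limit, so no triangle is cut. */
    const unsigned step = MAX_VERTS_PER_DRAW - (MAX_VERTS_PER_DRAW % 3u);
    unsigned draws = 0;

    while (count > 0) {
        unsigned n = count < step ? count : step;
        emit(start, n, data);
        start += n;
        count -= n;
        draws++;
    }
    return draws;
}
```

With this limit, a 150000-vertex draw becomes three commands of 65535, 65535 and 18930 vertices.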
* lima: refactor indexed draw indices upload | Erico Nunes | 2019-12-14 | 2 | -15/+19
    As of this commit this is just a refactor in preparation to enable support for more than 64k vertices. To support splitting the draw_vbo call, indices shouldn't be re-uploaded every time.

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Reviewed-by: Andreas Baierl <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
* lima: allocate separate bo to store varyings | Erico Nunes | 2019-12-14 | 2 | -9/+18
    The current strategy of using the suballocator with a fixed size doesn't scale and causes some programs with a large number of vertices (like some glmark2 scenes) to crash. Change it to dynamically allocate a separate bo to accommodate an arbitrary number of vertices.

    This also fixes the buffer read/write flags for gp.

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Reviewed-by: Andreas Baierl <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
* gallium/util: add alignment parameter to util_upload_index_buffer | Erico Nunes | 2019-12-14 | 1 | -1/+1
    At least on Mali Utgard, index buffers need to be aligned on 0x40. To avoid duplicating this, add an alignment parameter. Keep the previous default for the other existing users.

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
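For reference, aligning an upload offset to 0x40 is the usual power-of-two round-up. The helper below is a generic sketch with a hypothetical name; it is not the signature of util_upload_index_buffer.

```c
#include <assert.h>

/* Round `offset` up to the next multiple of `alignment`, which must be a
 * non-zero power of two (0x40 for Utgard index buffers). */
static unsigned align_offset(unsigned offset, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

Offsets that are already aligned are returned unchanged, so the helper is safe to apply unconditionally.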
* lima/parser: Add texture descriptor parser | Andreas Baierl | 2019-12-13 | 5 | -0/+131
    Signed-off-by: Andreas Baierl <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
* lima/parser: Add RSW parsing | Andreas Baierl | 2019-12-13 | 5 | -0/+185
    Signed-off-by: Andreas Baierl <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
* lima/parser: Some fixes and cleanups | Andreas Baierl | 2019-12-13 | 2 | -46/+36
    Signed-off-by: Andreas Baierl <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
* lima/ppir: enable lower_fdph | Erico Nunes | 2019-12-11 | 1 | -0/+1
    Otherwise we may lower some fdot to fdph which is not implemented in pp.

    Fixes #2126

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Vasily Khoruzhick <[email protected]>
* lima: expose tiled format modifier in query_dmabuf_modifiers() | Vasily Khoruzhick | 2019-12-09 | 1 | -0/+1
    Fixes: 8c12f4e5f24f ("lima: enable tiling")
    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle() | Vasily Khoruzhick | 2019-12-09 | 1 | -0/+4
    Assume that resource is tiled if we get DRM_FORMAT_MOD_INVALID in resource_from_handle() and we don't have RO.

    Fixes: 8c12f4e5f24f ("lima: enable tiling")
    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: enable tiling | Vasily Khoruzhick | 2019-12-04 | 1 | -11/+30
    Now that we have the tiled format modifier merged into linux we can enable tiling. That should improve overall performance and also work around broken mipmapping for linear textures, since now we prefer tiled textures.

    Reviewed-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Andreas Baierl <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/ppir: add lod-bias support | Arno Messiaen | 2019-11-20 | 5 | -5/+33
    Signed-off-by: Arno Messiaen <[email protected]>
    Reviewed-by: Erico Nunes <[email protected]>
* lima/streamparser: Add findings introduced with gl_PointSize | Andreas Baierl | 2019-11-20 | 1 | -2/+22
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
* lima/streamparser: Fix typo in vs semaphore parser | Andreas Baierl | 2019-11-20 | 1 | -1/+1
    Reviewed-by: Vasily Khoruzhick <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
* lima: Parse VS and PLBU command stream while making a dump | Andreas Baierl | 2019-11-17 | 7 | -0/+461
    This makes the streams more readable and comparable with the blob's parser as it parses the VS and PLBU stream and shows the currently known values.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>
* lima: Beautify stream dumps | Andreas Baierl | 2019-11-17 | 1 | -7/+11
    Change the dump so that the output looks more like the output of mali-syscall-tracker [1]. This is a preparation for a more detailed stream analysis.

    Reviewed-by: Qiang Yu <[email protected]>
    Signed-off-by: Andreas Baierl <[email protected]>

    [1]: https://gitlab.freedesktop.org/lima/mali-syscall-tracker
* util: Move gallium's PIPE_FORMAT utils to /util/format/ | Eric Anholt | 2019-11-14 | 2 | -2/+2
    To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to move their helpers out of gallium. Since u_format used util_copy_rect(), I moved that in there, too.

    I've put it in a separate directory in util/ because it's a big chunk of related code, and it's not clear to me whether we might want it as a separate library from libmesa_util at some point.

    Closes: #1905
    Acked-by: Marek Olšák <[email protected]>
    Reviewed-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
* lima: fix bo submit memory leak | Erico Nunes | 2019-11-07 | 1 | -0/+1
    Fix memory leak on allocation for lima submit, reported by valgrind.

    128 bytes in 1 blocks are definitely lost in loss record 38 of 84
       at 0x484A6E8: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
       by 0x58689C7: util_dynarray_ensure_cap (u_dynarray.h:91)
       by 0x5868BBB: util_dynarray_grow_bytes (u_dynarray.h:139)
       by 0x5868BBB: lima_submit_add_bo (lima_submit.c:113)
       by 0x585D7D3: lima_ctx_buff_va (lima_context.c:57)
       by 0x586378F: lima_pack_plbu_cmd (lima_draw.c:802)
       by 0x586378F: lima_draw_vbo (lima_draw.c:1351)
       by 0x5406A2F: u_vbuf_draw_vbo (u_vbuf.c:1184)
       by 0x55D0A57: st_draw_vbo (st_draw.c:268)
       by 0x55576CB: _mesa_draw_arrays (draw.c:374)
       by 0x55576CB: _mesa_draw_arrays (draw.c:351)
       by 0x43610B: Mesh::render_vbo() (mesh.cpp:583)
       by 0x415DBB: SceneBuild::draw() (scene-build.cpp:242)
       by 0x41131B: MainLoop::draw() (main-loop.cpp:133)
       by 0x411947: MainLoop::step() (main-loop.cpp:108)

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
* lima: fix nir shader memory leak | Erico Nunes | 2019-11-07 | 1 | -0/+2
    Fix memory leak on allocation for nir shader, reported by valgrind.

    3,502 (480 direct, 3,022 indirect) bytes in 1 blocks are definitely lost in loss record 77 of 84
       at 0x48483F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
       by 0x5750817: ralloc_size (ralloc.c:119)
       by 0x5750977: rzalloc_size (ralloc.c:151)
       by 0x575C173: nir_shader_create (nir.c:45)
       by 0x5763ACB: nir_shader_clone (nir_clone.c:728)
       by 0x55D5003: st_create_fp_variant (st_program.c:1242)
       by 0x55D789F: st_get_fp_variant (st_program.c:1522)
       by 0x55D789F: st_get_fp_variant (st_program.c:1507)
       by 0x56400C3: st_update_fp (st_atom_shader.c:163)
       by 0x563D333: st_validate_state (st_atom.c:261)
       by 0x55D07CB: prepare_draw (st_draw.c:132)
       by 0x55D08DF: st_draw_vbo (st_draw.c:184)
       by 0x55576CB: _mesa_draw_arrays (draw.c:374)
       by 0x55576CB: _mesa_draw_arrays (draw.c:351)

    Signed-off-by: Erico Nunes <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
* lima: add support for gl_PointSize | Vasily Khoruzhick | 2019-11-05 | 5 | -32/+90
    GP handles gl_PointSize similarly to gl_Position, i.e. it needs a separate buffer and it has a special type in varying descriptors. Also, for indexed draw we need to emit a special PLBU command to pass the address of the gl_PointSize buffer.

    Blob also clamps gl_PointSize to 1 .. 100 (as well as line width), so let's do the same.

    Reviewed-by: Andreas Baierl <[email protected]>
    Signed-off-by: Vasily Khoruzhick <[email protected]>
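The clamp mentioned above is a one-liner. The helper name and macros below are hypothetical; only the 1 .. 100 range comes from the commit message, which applies it to line width as well.

```c
#include <assert.h>

#define POINT_SIZE_MIN 1.0f   /* lower bound quoted in the commit message */
#define POINT_SIZE_MAX 100.0f /* upper bound quoted in the commit message */

/* Clamp a point size (or line width) to the range the blob uses. */
static float clamp_point_size(float size)
{
    assert(size == size); /* reject NaN */

    if (size < POINT_SIZE_MIN)
        return POINT_SIZE_MIN;
    if (size > POINT_SIZE_MAX)
        return POINT_SIZE_MAX;
    return size;
}
```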