| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
The arguments passed in were:
- prog->info.num_ssbos
- prog->nir->info.num_ssbos
- arbitrary values for standalone compilers
The num_ssbos should match between the prog's info and prog->nir's info
until this lowering happens.
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's a lot going on here (it's a ton of commits squashed together
since otherwise this would be impossible to review...)
1. We have a fast path for linear->tiled for whole (aligned) tiles, but we
have to use a slow path for unaligned accesses. We can get a pretty
major win for partial updates by using this slow path simply on the
borders of the update region, and then hit the fast path for the
tile-aligned interior. This does require some shuffling.
2. Mark the LUTs constant, which allows the compiler to inline them,
which pairs well with loop unrolling (eliminating the memory accesses
and just becoming some immediates.. which are not as immediate on
aarch64 as I'd like..)
3. Add fast path for bpp1/2/8/16. These use the same algorithm and we
have native types for them, so may as well get the fast path.
4. Drop generic path for bpp != 1/2/8/16, since these formats are
generally awful and there's no way to tile them efficienctly and
honestly there's not a good reason too either. Lima doesn't support any
of these formats; Panfrost can make the opinionated choice to make them
linear.
5. Specialize the unaligned routines. They don't have to be fully
generic, they just can't assume alignment. So now they should be nearly
as fast as the aligned versions (which get some extra tricks to be even
faster but the difference might be neglible on some workloads).
6. Specialize also for the size of the tile, to allow 4x4 tiling as well
as 16x16 tiling. This allows compressed textures to be efficiently tiled
with the same routines (so we add support for tiling ASTC/ETC textures
while we're at it)
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Vasily Khoruzhick <[email protected]> #lima on Mali400
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's an implicit dependence on Gallium here that will add more
complexity than needed when testing/optimizing out of driver as well as
potentially Vulkanizing. We don't need a full pipe_box, just the x/y/w/h
properties directly.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Vasily Khoruzhick <[email protected]> #lima on Mali400
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce separate helper functions to set the blendfactor bits.
Lima uses bits 0-2 for the type, bit 3 sets the inverted function
and bit 4 is set if alpha is used.
alpha_src_factor and alpha_dst_factor don't need the alpha bit, so
they are masked with 0xf. There is only place for 4 bits anyway.
If alpha_src_factor is PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE, we need
to change it to PIPE_BLENDFACTOR_ONE first.
This is exactly what the blob does and we pass all
dEQP-GLES2.functional.fragment_ops.blend.* tests now.
Better than the blob btw...
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Looks like we need to handle cases when near > far and near == far.
In first case we just need to swap near and far, and in second we
need subtract epsilon from near if it's not zero.
Fixes 10 tests in dEQP-GLES2.functional.depth_range.*
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3400>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3400>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing liveness analysis in ppir still ultimately relies on a
single continuous live_in and live_out range per register and was
observed to be the bottleneck for register allocation on complicated
examples with several control flow blocks.
The use of live_in and live_out ranges was fine before ppir got control
flow, but now it ends up creating unnecessary interferences as live_in
and live_out ranges may span across entire blocks after blocks get
placed sequentially.
This new liveness analysis implementation generates a set of live
variables at each program point; before and after each instruction and
beginning and end of each block.
This is a global analysis and propagates the sets of live registers
across blocks independently of their sequence.
The resulting sets optimally represent all variables that cannot share a
register at each program point, so can be directly translated as
interferences to the register allocator.
Special care has to be taken with non-ssa registers. In order to
properly define their live range, their alive components also need to be
tracked. Therefore ppir can't use simple bitsets to keep track of live
registers.
The algorithm uses an auxiliary set data structure to keep track of the
live registers. The initial implementation used only trivial arrays,
however regalloc execution time was then prohibitive (>1minute on
Cortex-A53) on extreme benchmarks with hundreds of instructions,
hundreds of registers and several spilling iterations, mostly due to the
n^2 complexity to generate the interferences from the live sets. Since
the live registers set are only a very sparse subset of all registers at
each instruction, iterating only over this subset allows it to run very
fast again (a couple of seconds for the same benchmark).
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are some cases in shades using control flow where the varying load
is cloned to every block, and then the original node is left orphan.
This is not harmful for program execution, but it complicates analysis
for register allocation as there is now a case of writing to a register
that is never read.
While ppir doesn't have a dead code elimination pass for its own
optimizations and it is not hard to detect when we cloned the last load,
let's remove it early.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lower 8 bits of unknown_1_3 seems to be min_lod,
rest of 4 bits + miplevels are max_lod and min_mipfilter seems to be
lod bias. All are in fixed format with 4 bit integer and 4 bit fraction,
lod_bias also has sign bit.
Blob also seems to do some magic with lod_bias if min filter is nearest --
it adds 0.5 to lod_bias in this case. Same story when all filters are
nearest and mipmapping is enabled, but in this case it subtracts 1/16
from lod_bias.
Fixes 134 dEQP tests in dEQP-GLES2.functional.texture.*
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3359>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This re-enables and fixes support for stencil buffer.
It fixes 365 stencil related deqp tests. All tests that use INCR, INCR_WRAR,
DECR and DECR_WRAP as a stencil op still fail, but they also fail with the
blob, so we may ignore that for now.
We still have dEQP-GLES2.functional.depth_stencil_clear.depth_stencil_masked
failing, which is strange because it's the only one out of the
depth_stencil_clear.* set.
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
lima doesn't support alpha test, flat shading, two-sided color nor
clip planes. We can enable these caps when corresponding hw features
are implemented in the driver.
Reviewed-by: Qiang Yu <[email protected]>
Tested-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes some of dEQP-GLES2.functional.polygon_offset.* tests and shadows in Q3A.
Reviewed-by: Qiang Yu <[email protected]>
Tested-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently Mali4x0 doesn't do viewport clipping, so anything rendered beyond viewport
is still rendered. Looks like we need to use scissors to do clipping.
Fixes most of dEQP-GLES2.functional.clipping.*, 6 out of 7 remaining failures
fail on blob as well. Remaining [1] fails on many other gallium drivers.
[1] dEQP-GLES2.functional.clipping.triangle_vertex.clip_three.clip_neg_x_neg_z_and_pos_x_pos_z_and_neg_x_neg_y_pos_z
Suggested-by: Ilia Mirkin <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
Tested-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Apparently it doesn't depend on primitive type, the value
only depends on whether we specify point size via PLBU command --
bit 12 is set in this case
Reviewed-by: Qiang Yu <[email protected]>
Tested-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We can only sample from 24-bit packed format and can't render into it and
it causes chromium-based browsers to fail when they create FBO with GL_RGB
format. Drop R8G8B8 alltogether so mesa can promote it to RGBX format.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add debug flag to disable tiling. Note that it prevents lima from creating
tiled buffers, but it's still able to import them if modifier is specified
Reviewed-by: Andreas Baierl <[email protected]>
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Use linear layout for shared buffers if modifier is not specified
and use linear layout when importing buffers with invalid modifier.
Fixes: 01a451b04d2d ("lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle()")
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
PP stream terminator size seems to be 4 words, it worked with full PP
stream because we align stream beginning to 32 bytes and BO is
initialized with zeroes. But with partial PP stream it sometimes break
if for new PP stream we reuse BO that has non-zero value at this place.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We don't need to reload and redraw some tiles if framebuffer was not
cleared and scissor test was enabled for some of draws. This simple
optimization fixes cursor lag in X11
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This commit postpones PP stream generation till job is submitted.
Doing that this late allows us to skip reloading and redrawing tiles
that were not updated.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
| |
prefetch is int, not bool.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
|
|
|
|
|
|
|
| |
Drop assert as it is not necessary and used wrong anyway.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For indexed draw number of VS invocations is (ctx->max_index - ctx->min_index + 1),
so we have to use this number when calculating space for varyings, gl_Position and
gl_PointSize.
Fixes dEQP-GLES2.functional.buffer.write.use.index_array.array and
dEQP-GLES2.functional.buffer.write.use.index_array.element_array
Reviewed-by: Andreas Baierl <[email protected]>
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
With these new caps, nir is able to unroll loops and optimize
conditionals much more efficiently in both gpit and ppir.
panfrost and vc4 were used as reference for the values.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>
|
|
|
|
|
|
|
|
|
|
|
| |
This assert causes testing tools such as shaderdb to abort on some test
cases. This is an unsupported feature and not a compiler bug. The
compilation error is already propagated correctly, so we can remove the
assert to allow testing tools to run to completion.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3176>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ppir has some code that operates on all ppir_src variables, and for that
uses ppir_node_get_src.
lod bias support introduced a separate ppir_src that is inaccessible by
that function, causing it to be missed by the compiler in some routines.
Ultimately this caused, in some cases, a bug in const lowering:
.../pp/lower.c:42: ppir_lower_const: Assertion `src != NULL' failed.
This fix moves the ppir_srcs in ppir_load_texture_node together so they
don't get missed.
Fixes: 721d82cf061 lima/ppir: add lod-bias support
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3185>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise lima_dump_file_next() always opens a new file and creates the
dumps regardless of what the environment variables say.
Fixes d71cd245d74 ('lima: Rotate dump files after each finished pp frame')
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3179>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3179>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This rotates the dump files like the mali-syscall-tracker does.
After each finished pp frame a new file is generated. They are
numbered like lima.dump.0000, lima.dump.0001 ...
The filename and path can be given with the new environment
variable LIMA_DUMP_FILE.
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3175>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3175>
|
|
|
|
|
|
|
|
|
|
| |
Since we're using a separate per-draw BO for GP outputs we don't
need suballocator anymore.
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>
|
|
|
|
|
|
|
|
|
|
| |
Varyings, gl_Position and gl_PointSize are all GP outputs, so we
can use a single BO for them all. Also that allows us to get rid
of suballocator.
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3158>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Mali400 only supports draws with up to 64k vertices per command.
To handle this, break the draw_vbo call into multiple commands.
Indexed drawing is left to a separate code path.
This implementation was ported from vc4_draw_vbo.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of this commit this is just a refactor in preparation to enable
support for more than 64k vertices.
To support splitting the draw_vbo call, indices shouldn't be re-uploaded
every time.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current strategy using the suballocator with fixed size doesn't
scale and causes some programs with large number of vertices (like some
glmark2 scenes) to crash.
Change it to dynamically allocate a separate bo to accomodate for
arbitrary number of vertices.
This also fixes the buffer read/write flags for gp.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
|
|
|
|
|
|
|
|
|
|
| |
At least on Mali Utgard, index buffers need to be aligned on 0x40.
To avoid duplicating this, add an alignment parameter.
Keep the previous default for the other existing users.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
|
|
|
|
|
| |
Signed-off-by: Andreas Baierl <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
|
|
|
|
|
| |
Signed-off-by: Andreas Baierl <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
|
|
|
|
|
| |
Signed-off-by: Andreas Baierl <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2980>
|
|
|
|
|
|
|
|
|
| |
Otherwise we may lower some fdot to fdph which is not implemented in pp.
Fixes #2126
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
| |
Fixes: 8c12f4e5f24f ("lima: enable tiling")
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Assume that resource is tiled if we get DRM_FORMAT_MOD_INVALID
in resource_from_handle() and we don't have RO.
Fixes: 8c12f4e5f24f ("lima: enable tiling")
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we have tiled format modifier merged into linux we can enable tiling.
That should improve overall performance and also workaround broken mipmapping
for linear textures since now we prefer tiled textures.
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Arno Messiaen <[email protected]>
Reviewed-by: Erico Nunes <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes the streams more readable and comparable with the blob's parser
as it parses the VS and PLBU stream and shows the currently known values.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Change the dump, that the output looks more like the output of
mali-syscall-tracker [1].
This is a preparation for a more detailed stream analysis.
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
[1]: https://gitlab.freedesktop.org/lima/mali-syscall-tracker
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to
move their helpers out of gallium. Since u_format used
util_copy_rect(), I moved that in there, too.
I've put it in a separate directory in util/ because it's a big chunk
of related code, and it's not clear to me whether we might want it as
a separate library from libmesa_util at some point.
Closes: #1905
Acked-by: Marek Olšák <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix memory leak on allocation for lima submit, reported by valgrind.
128 bytes in 1 blocks are definitely lost in loss record 38 of 84
at 0x484A6E8: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
by 0x58689C7: util_dynarray_ensure_cap (u_dynarray.h:91)
by 0x5868BBB: util_dynarray_grow_bytes (u_dynarray.h:139)
by 0x5868BBB: lima_submit_add_bo (lima_submit.c:113)
by 0x585D7D3: lima_ctx_buff_va (lima_context.c:57)
by 0x586378F: lima_pack_plbu_cmd (lima_draw.c:802)
by 0x586378F: lima_draw_vbo (lima_draw.c:1351)
by 0x5406A2F: u_vbuf_draw_vbo (u_vbuf.c:1184)
by 0x55D0A57: st_draw_vbo (st_draw.c:268)
by 0x55576CB: _mesa_draw_arrays (draw.c:374)
by 0x55576CB: _mesa_draw_arrays (draw.c:351)
by 0x43610B: Mesh::render_vbo() (mesh.cpp:583)
by 0x415DBB: SceneBuild::draw() (scene-build.cpp:242)
by 0x41131B: MainLoop::draw() (main-loop.cpp:133)
by 0x411947: MainLoop::step() (main-loop.cpp:108)
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix memory leak on allocation for nir shader, reported by valgrind.
3,502 (480 direct, 3,022 indirect) bytes in 1 blocks are definitely lost in loss record 77 of 84
at 0x48483F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
by 0x5750817: ralloc_size (ralloc.c:119)
by 0x5750977: rzalloc_size (ralloc.c:151)
by 0x575C173: nir_shader_create (nir.c:45)
by 0x5763ACB: nir_shader_clone (nir_clone.c:728)
by 0x55D5003: st_create_fp_variant (st_program.c:1242)
by 0x55D789F: st_get_fp_variant (st_program.c:1522)
by 0x55D789F: st_get_fp_variant (st_program.c:1507)
by 0x56400C3: st_update_fp (st_atom_shader.c:163)
by 0x563D333: st_validate_state (st_atom.c:261)
by 0x55D07CB: prepare_draw (st_draw.c:132)
by 0x55D08DF: st_draw_vbo (st_draw.c:184)
by 0x55576CB: _mesa_draw_arrays (draw.c:374)
by 0x55576CB: _mesa_draw_arrays (draw.c:351)
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GP handles gl_PointSize similar to gl_Position, i.e. it needs
separate buffer and it has special type in varying descriptors, also
for indexed draw we need to emit special PLBU command to pass
address of gl_PointSize buffer.
Blob also clamps gl_PointSize to 1 .. 100 (as well as line width),
so let's do the same.
Reviewed-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|