| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
When swr driver is in use it print detected architecture
message to std::err. It can be harmfull when swr is using
in multinodes environments.
It can be enabled setting env var SWR_PRINT_INFO to 1.
Reviewed-by: Jan Zielinski <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the following valgrind error:
Invalid read of size 16
at 0x28F458A1: si_set_sampler_view_desc (in radeonsi_drv_video.so)
by 0x28F4657E: si_set_sampler_views (in radeonsi_drv_video.so)
by 0x28D62BF5: util_compute_blit (in radeonsi_drv_video.so)
by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)
Address 0x18142a10 is 0 bytes inside a block of size 48 free'd
at 0x48369AB: free (vg_replace_malloc.c:540)
by 0x28D62D51: util_compute_blit (in radeonsi_drv_video.so)
by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)
Block was alloc'd at
at 0x4837B65: calloc (vg_replace_malloc.c:762)
by 0x28EFB2EC: si_create_sampler_state (in radeonsi_drv_video.so)
by 0x28D62C30: util_compute_blit (in radeonsi_drv_video.so)
by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)
Fixes: 69430d7e59e ("va: use a compute shader for the blit")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2321
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce separate helper functions to set the blendfactor bits.
Lima uses bits 0-2 for the type, bit 3 sets the inverted function
and bit 4 is set if alpha is used.
alpha_src_factor and alpha_dst_factor don't need the alpha bit, so
they are masked with 0xf. There is only place for 4 bits anyway.
If alpha_src_factor is PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE, we need
to change it to PIPE_BLENDFACTOR_ONE first.
This is exactly what the blob does and we pass all
dEQP-GLES2.functional.fragment_ops.blend.* tests now.
Better than the blob btw...
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
|
|
|
|
|
|
|
|
|
| |
This implements HW workaround #1409600907 for iris driver.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3378>
|
|
|
|
|
|
|
|
| |
v2: add comment (Ken)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3408>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Looks like we need to handle cases when near > far and near == far.
In first case we just need to swap near and far, and in second we
need subtract epsilon from near if it's not zero.
Fixes 10 tests in dEQP-GLES2.functional.depth_range.*
Reviewed-by: Qiang Yu <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3400>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3400>
|
|
|
|
|
|
|
|
|
|
| |
No stride / no attributes means that nothing is being written to the
buffer. However it might still prevent primitives from being written out
to the other buffers. Disabling it entirely seems to fix it.
Fixes GTF-GL45.gtf30.GL3Tests.transform_feedback.transform_feedback_overflow
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing liveness analysis in ppir still ultimately relies on a
single continuous live_in and live_out range per register and was
observed to be the bottleneck for register allocation on complicated
examples with several control flow blocks.
The use of live_in and live_out ranges was fine before ppir got control
flow, but now it ends up creating unnecessary interferences as live_in
and live_out ranges may span across entire blocks after blocks get
placed sequentially.
This new liveness analysis implementation generates a set of live
variables at each program point; before and after each instruction and
beginning and end of each block.
This is a global analysis and propagates the sets of live registers
across blocks independently of their sequence.
The resulting sets optimally represent all variables that cannot share a
register at each program point, so can be directly translated as
interferences to the register allocator.
Special care has to be taken with non-ssa registers. In order to
properly define their live range, their alive components also need to be
tracked. Therefore ppir can't use simple bitsets to keep track of live
registers.
The algorithm uses an auxiliary set data structure to keep track of the
live registers. The initial implementation used only trivial arrays,
however regalloc execution time was then prohibitive (>1minute on
Cortex-A53) on extreme benchmarks with hundreds of instructions,
hundreds of registers and several spilling iterations, mostly due to the
n^2 complexity to generate the interferences from the live sets. Since
the live registers set are only a very sparse subset of all registers at
each instruction, iterating only over this subset allows it to run very
fast again (a couple of seconds for the same benchmark).
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are some cases in shades using control flow where the varying load
is cloned to every block, and then the original node is left orphan.
This is not harmful for program execution, but it complicates analysis
for register allocation as there is now a case of writing to a register
that is never read.
While ppir doesn't have a dead code elimination pass for its own
optimizations and it is not hard to detect when we cloned the last load,
let's remove it early.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3358>
|
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Enables EGL_ANDROID_native_fence_sync.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After flushing batches, iris_fence_flush() asks the kernel whether
each batch's last_syncpt has already signalled or not. (The idea is
that either the compute or render batch may not have actually had any
work queued up, so last_syncpt there might have been signalled a long
time ago.) If it's already completed, we don't bother to record it.
A strange corner is the case of repeated flushes. For example, we
might flush for some reason, and hit a glFlush(), and hit SwapBuffers.
It's possible for all the batches to have been flushed previously, -and-
for them to have actually completed. In this case, we'll see that there
are no syncobj's to wait on, and record fence->count == 0.
This works fine internally - fence_finish can see count == 0 and realize
that it doesn't need to wait, for example. But when working with native
FDs, we may be asked to export a fence with count == 0. So we need an
actual synchronization primitive we can hand off. Because all of the
relevant batches had been signalled when creating the fence, we want the
new dummy fence to be signalled as well.
So we just make a signalled syncobj and export it.
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Robert Foss <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3399>
|
|
|
|
|
|
|
| |
Cc: 19.2 19.3 <[email protected]>
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3338>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3338>
|
|
|
|
|
|
|
| |
Fixes: 3b143369a55 "ac/nir, radv, radeonsi: Switch to using ac_shader_args"
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3338>
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
gs_invocation_id VGPR
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This change eases code reading ("fmask_is_identity = true" is clearer than
"fmask_is_not_identity = false").
Initialization is not changed so fmask_is_identity is false when a texture is
created.
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
|
|
|
|
|
|
|
| |
It's not used and avoid infinite recursion when used from si_compute_expand_fmask
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
|
|
|
|
|
|
|
| |
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2248
Fixes: 095a58204d9 ("radeonsi: expand FMASK before MSAA image stores are used")
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'coord' variable was using TGSI_WRITEMASK_XYZ so subsequent uses of
TGSI_WRITEMASK_W were dropped.
The result for a 2 samples program was:
0: UMAD TEMP[0].xy, SV[1].xyyy, IMM[0].xxxx, SV[0].xyyy
1: STORE IMAGE[0], TEMP[0], TEMP[1], RESTRICT, 2D_MSAA
2: STORE IMAGE[0], TEMP[0], TEMP[2], RESTRICT, 2D_MSAA
3: END
instead of the expected:
0: UMAD TEMP[0].xy, SV[1].xyyy, IMM[0].xxxx, SV[0].xyyy
1: MOV TEMP[0].w, IMM[0].yyyy
2: LOAD TEMP[1], IMAGE[0], TEMP[0], RESTRICT, 2D_MSAA
3: MOV TEMP[0].w, IMM[0].zzzz
4: LOAD TEMP[2], IMAGE[0], TEMP[0], RESTRICT, 2D_MSAA
5: MOV TEMP[0].w, IMM[0].yyyy
6: STORE IMAGE[0], TEMP[0], TEMP[1], RESTRICT, 2D_MSAA
7: MOV TEMP[0].w, IMM[0].zzzz
8: STORE IMAGE[0], TEMP[0], TEMP[2], RESTRICT, 2D_MSAA
9: END
This fixes half of https://gitlab.freedesktop.org/mesa/mesa/issues/2248
Fixes: 095a58204d9 ("radeonsi: expand FMASK before MSAA image stores are used")
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
|
|
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3383>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3383>
|
|
|
|
|
|
|
|
|
|
| |
This fixes YUYV support on etnaviv.
Fixes: 7404833c "gallium: add handling for YUV planar surfaces"
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1896>
|
|
|
|
|
|
|
| |
Will be used by EXT_EGL_image_storage later.
Acked-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3375>
|
|
|
|
|
|
|
|
| |
This is an attempt to clean up si_shader.c.
v2: don't move code that is not specific to LLVM
Reviewed-by: Timothy Arceri <[email protected]> (v1)
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
this was for compatibility with TGSI
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As pointed out by Boris, what we were calling PAN_LINEAR depth textures
was in fact u-interleaved tiled (!), but we never noticed since we
flipped the flag used for sampling, leading to all sorts of fun bugs
when attempting to directly acess depth textures from the CPU. Which
begs the question -- if what we called LINEAR was tiled, how do we
actually render linear depth textures? It turns out the flags for AFBC
form a mali_block_format 2-bit code just like their render-target
counterparts, so we can render to any of the above.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reported-by: Boris Brezillon <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3393>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3393>
|
|
|
|
|
| |
Fixes: 363b4027fcb - radeonsi: put up to 5 VBO descriptors into user SGPRs
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|