| Commit message (Collapse) | Author | Age | Files | Lines |
| |
If we wrote the data via the CPU, there's no point in doing a render
target flush. If using BLORP, we do want a render target flush so the
data lands.
| |
Instead of using the combined iris_flush_and_dirty_for_history, use
iris_flush_bits_for_history directly - we were already using the split
out iris_dirty_for_history. There's no need to dirty twice, and we can
avoid the looping altogether for non-buffers.
| |
My intention was to have iris_copy_region not do flushing, and leave
that up to the callers. iris_resource_copy_region needs to do this,
but iris_transfer_flush_region was already doing it. The net result
was that we were doing it twice for transfers.
So, move the flushing from iris_copy_region to iris_resource_copy_region
so that it only happens in the callers as I intended.
| |
When I split iris_flush_and_dirty_for_history into two helper functions,
I accidentally made it stop dirtying. Which was...sort of the point.
Fixes: 21688a306b2 iris: Split iris_flush_and_dirty_for_history into two helpers.
| |
Otherwise, tests which loop on glMemoryBarrier may run us out of
batch space with piles of flushing. (Ideally, we'd elide those bonus
PIPE_CONTROLs, but presumably this isn't that common of a case...)
Piglit's arb_pipeline_statistics_query-comp would hit this case after
some of the next patches remove other PIPE_CONTROLs with maybe_flushes.
| |
This prints a log of every PIPE_CONTROL flush we emit, noting which bits
were set, and also the reason for the flush. That way we can see which
are caused by hardware workarounds, render-to-texture, buffer updates,
and so on. It should make it easier to determine whether we're doing
too many flushes and why.
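A minimal sketch of what such a log line could look like, assuming a
made-up set of flag bits and a hypothetical describe_flush() helper (the
driver's real flag names and output format are not shown in this message):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag bits for illustration; the driver defines its own. */
enum flush_bits {
   FLUSH_RENDER_TARGET = 1u << 0,
   FLUSH_DEPTH_CACHE   = 1u << 1,
   FLUSH_CS_STALL      = 1u << 2,
};

/* Format "<reason>: <bits set>" into buf and return buf. */
static const char *
describe_flush(char *buf, size_t size, uint32_t bits, const char *reason)
{
   assert(buf != NULL && size > 0 && reason != NULL);

   snprintf(buf, size, "%s:%s%s%s", reason,
            (bits & FLUSH_RENDER_TARGET) ? " RT" : "",
            (bits & FLUSH_DEPTH_CACHE) ? " Depth" : "",
            (bits & FLUSH_CS_STALL) ? " CS" : "");
   return buf;
}
```

Passing the reason as a string at each call site is what lets the log
separate workaround flushes from render-to-texture or buffer-update ones.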
| |
Looking at the scissor, we can discard some tiles. We specifically don't
care about the scissor on the wallpaper, since that's a no-op if the
entire tile is culled.
v2: Clarify clear comment (not reviewed but trivial).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110709
Fixes: 22a9e00aab66d3dd6890 ("glx: Implement the libglvnd interface.")
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
This `gen_scrn_dispatch.pl` has never existed, in the sense that NVIDIA
never published it. There have been a number (6) of commits to fix
various things in there over the years, and never anything from NVIDIA.
For all intents and purposes this file is hand-written and
hand-maintained, and we're on our own.
Let's make this clear by removing this misleading comment.
Suggested-by: Eric Anholt <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Emil Velikov <[email protected]>
| |
We don't expect the output of a TXS instruction to be wider than a
vec3. Add an assert() to make sure this never happens.
Suggested-by: Jason Ekstrand <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
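A rough sketch of the kind of check described above, assuming a
hypothetical txs_writemask() helper (the name and the writemask use are
illustrative, not taken from the patch):

```c
#include <assert.h>

/* Hypothetical helper: writemask covering the first nr_components
 * channels of a texture-size (TXS) result. */
static unsigned
txs_writemask(unsigned nr_components)
{
   /* A texture size is never expected to be wider than a vec3. */
   assert(nr_components >= 1 && nr_components <= 3);

   return (1u << nr_components) - 1;
}
```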
| |
Right now we are doing it at a moment when we don't have all the
information we need.
Signed-off-by: Tomeu Vizoso <[email protected]>
Suggested-by: Alyssa Rosenzweig <[email protected]>
Acked-by: Rohan Garg <[email protected]>
Cc: Rohan Garg <[email protected]>
Fixes: bfca21b622df ("panfrost: Figure out job requirements in pan_job.c")
| |
Fixes: 035a07c0 ("panfrost: Switch to lima tiling")
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Fixes: a9b556d3a04 ("freedreno/ir3: check the type of regs of absneg opcode in is_same_type_mov")
Reviewed-by: Rob Clark <[email protected]>
| |
Now that we have lima tiling code available, use it to load from a tiled
source.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Lima and Panfrost both have implementations of software tiling
(the Lima one was forked off the Panfrost one which was forked off the
original Lima one...). Switch to the most recent Lima code, since it's
more complete than ours at this point.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Panfrost's tiling routines (incorrectly) ignored the source stride,
masking this bug; lima's routines respect this stride, causing issues
when tiling NPOT textures whose stride is not a multiple of 64
(for instance, NPOT textures with bpp=1).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
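To make the stride arithmetic concrete, here is a small sketch. It
assumes bpp counts bytes per pixel and uses hypothetical helper names; it
only illustrates when a tightly packed stride misses the 64-byte
multiple, not how the patch itself computes the stride:

```c
#include <assert.h>
#include <stdbool.h>

/* Tightly packed row stride in bytes (hypothetical helper). */
static unsigned
packed_stride(unsigned width, unsigned bytes_per_pixel)
{
   assert(width > 0 && bytes_per_pixel > 0);
   return width * bytes_per_pixel;
}

/* True when the stride is a multiple of 64 bytes. */
static bool
stride_is_64_aligned(unsigned stride)
{
   return (stride % 64) == 0;
}

/* Round a stride up to the next multiple of 64 bytes. */
static unsigned
align_stride_64(unsigned stride)
{
   return (stride + 63) & ~63u;
}
```

For a 100-pixel-wide texture with one byte per pixel, the packed stride
is 100 bytes, which is not a multiple of 64; rounding up gives 128.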
| |
This will allow both drivers to share this code. Both drivers
build-tested with meson. Android build not tested.
v2: Change naming from tiling->shared, in case Lima and Panfrost can
share more in the future. Fix Android build system.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-and-tested-by: Qiang Yu <[email protected]>
| |
We have them, may as well use them.
| |
We can rely on only one kind of synchronization object (drm-syncobj)
when it is available. This reduces the number of file descriptors we
use in our implementation.
This will be required later for the timeline semaphore implementation; at
this point we won't ever want to use anything other than syncobjs.
v2: Only use has_syncobj for semaphores (Jason)
v3: Only has_syncobj in assert on semaphores in QueueSubmit (Jason)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
These do more harm than good at this point.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
This is all zero for anything but fragment shaders.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Looking at internal evidence (later fields including a literal other
compute job inception-style, seeming memory corruption, no clear
function, and the field after this being a pointer to *itself*), it
looks like this is really a much smaller descriptor.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
In OpenGL, uniforms generally represent fp32 vec4s (at least in highp
mode). In OpenCL, they represent vec2s of 64-bit pointers.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Float is ambiguous.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Just as an aid.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
There is fundamentally not a framebuffer associated with a compute job.
Allocate a new structure for it so we don't mess up graphics when
decoding.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
| |
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
| |
These tests fail intermittently, so blacklist them for now:
dEQP-GLES2.functional.fbo.render.shared_colorbuffer_clear.tex2d_rgba
dEQP-GLES2.functional.fbo.render.shared_colorbuffer_clear.tex2d_rgb
dEQP-GLES2.functional.shaders.matrix.mul.dynamic_highp_mat4_vec4_vertex
Signed-off-by: Tomeu Vizoso <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
| |
Reviewed-by: Eric Engestrom <[email protected]>
| |
When DRI_CONF_GLES_EMULATE_BGRA was added for the virgl driver, it
missed a DRI_CONF_OPT_END.
This makes some drivers, like vc4/v3d, crash with the following
error:
Fatal error in __driConfigOptions line 99, column 2: mismatched tag.
Not sure why it doesn't fail with virgl.
Fixes: b79366344929c6e477c64a63f246c6db0766a71c
Reviewed-by: Eric Engestrom <[email protected]>
| |
GCC should be able to figure out that all the possible enum values are
exhausted in the switch() and all the branches return from the function,
but apparently it doesn't, so let's tell the compiler explicitly.
This gets rid of the following warnings in GCC 9:
[1/24] Compiling C object 'src/intel/isl/60d23f8@@isl@sta/isl.c.o'.
../src/intel/isl/isl.c: In function ‘isl_surf_init_s’:
../src/intel/isl/isl.c:1569:10: warning: ‘array_pitch_el_rows’ may be used uninitialized in this function [-Wmaybe-uninitialized]
1569 | *surf = (struct isl_surf) {
| ~~~~~~^~~~~~~~~~~~~~~~~~~~~
1570 | .dim = info->dim,
| ~~~~~~~~~~~~~~~~~
1571 | .dim_layout = dim_layout,
| ~~~~~~~~~~~~~~~~~~~~~~~~~
1572 | .msaa_layout = msaa_layout,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1573 | .tiling = tiling,
| ~~~~~~~~~~~~~~~~~
1574 | .format = info->format,
| ~~~~~~~~~~~~~~~~~~~~~~~
1575 |
|
1576 | .levels = info->levels,
| ~~~~~~~~~~~~~~~~~~~~~~~
1577 | .samples = info->samples,
| ~~~~~~~~~~~~~~~~~~~~~~~~~
1578 |
|
1579 | .image_alignment_el = image_align_el,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1580 | .logical_level0_px = logical_level0_px,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1581 | .phys_level0_sa = phys_level0_sa,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1582 |
|
1583 | .size_B = size_B,
| ~~~~~~~~~~~~~~~~~
1584 | .alignment_B = base_alignment_B,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1585 | .row_pitch_B = row_pitch_B,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1586 | .array_pitch_el_rows = array_pitch_el_rows,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1587 | .array_pitch_span = array_pitch_span,
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1588 |
|
1589 | .usage = info->usage,
| ~~~~~~~~~~~~~~~~~~~~~
1590 | };
| ~
../src/intel/isl/isl.c:1488:24: warning: ‘*((void *)&phys_total_el+4)’ may be used uninitialized in this function [-Wmaybe-uninitialized]
1488 | struct isl_extent2d phys_total_el;
| ^~~~~~~~~~~~~
../src/intel/isl/isl.c:1335:38: warning: ‘phys_total_el’ may be used uninitialized in this function [-Wmaybe-uninitialized]
1335 | isl_align_div(phys_total_el->w * tile_el_scale,
| ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
../src/intel/isl/isl.c:1488:24: note: ‘phys_total_el’ was declared here
1488 | struct isl_extent2d phys_total_el;
| ^~~~~~~~~~~~~
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
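As an illustration of the pattern, here is a sketch with a made-up enum
and function. It assumes GCC/Clang's __builtin_unreachable(); the patch
itself may use a project macro instead:

```c
#include <assert.h>

/* Hypothetical enum standing in for the real layout enum. */
enum dim_layout { LAYOUT_1D, LAYOUT_2D, LAYOUT_3D };

static unsigned
layout_rank(enum dim_layout layout)
{
   switch (layout) {
   case LAYOUT_1D: return 1;
   case LAYOUT_2D: return 2;
   case LAYOUT_3D: return 3;
   }

   /* Every enumerator returns above; tell the compiler explicitly that
    * control never reaches this point. */
   __builtin_unreachable();
}
```

Leaving out a default: label keeps the compiler's -Wswitch warning
useful if a new enumerator is added later.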
| |
It's tricky on GFX9, so only GFX8 for now.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| |
And fall back to slow color clears.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| |
For clearing only one level.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| |
This basically boils down to supporting persistent and coherent buffer
storage.
We chose to use coherent buffer storage for all persistent buffers
even if it's not explicitly specified, since using glMemoryBarrier to
obtain coherency would be particularly expensive in our driver stack,
and require a lot of additional bookkeeping.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
| |
For svga, the use of persistent / coherent maps is typically slightly
slower than without them. It's probably a bit case-dependent and
possible to tune, but for now, make sure we can disable those.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
| |
With SWTNL and index translation we're mapping buffers for reading. These
buffers are commonly upload_mgr buffers that might already be referenced
by another submitted or unsubmitted GPU command. A synchronous map will
then trigger a flush and sync, at least on Linux that doesn't distinguish
between read- and write referencing. So map these buffers async. If they
for some obscure reason happen to be dirty (stream-output, buffer-copy),
the resource_buffer code will read-back and sync anyway. For persistent /
coherent buffers a corresponding read-back and sync will happen in the
kernel fault handler.
Testing: Piglit quick. No regressions.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
| |
In the case of SWTNL and index translation we were uploading index buffers
and then reading out from them using the CPU. Furthermore, when translating
indices we often cached the results with an upload_mgr buffer, causing the
cached indexes to be immediately discarded on the next write to that
upload_mgr buffer.
Fix this by only uploading when we know the index buffer is going to be
used by hardware. If translating, only cache translated indices if the
original buffer was not a user buffer. In the latter case when we're not
caching, use an upload_mgr buffer for the hardware indices.
This means we can also remove the SWTNL hand-crafted index buffer upload
mechanism in favour of the upload_mgr.
Finally, avoid using util_upload_index_buffer(). It wastes index buffer
space by trying to make sure that the offset of the indices in the
upload_mgr buffer is greater than or equal to the position of the indices in
the source buffer. From what I can tell, the SVGA device does not
require that.
Testing done: Piglit quick. No regressions.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
| |
Add a flag in the surface cache key and a winsys usage flag to
specify coherent memory.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
| |
Previously, unsynchronized maps were assumed to also be persistent.
Now distinguish between persistent and unsynchronized maps and also support
PIPE_TRANSFER_PERSISTENT from ARB_buffer_storage.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
| |
This is useful for testing, also because with vtest the dri configuration
is not read.
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Gurchetan Singh <[email protected]>
| |
On GLES hosts GL_SAMPLES_PASSED is emulated by GL_ANY_SAMPLES_PASSED, which returns a boolean.
With this tweak the value that is returned if any sample passed can be set. This
may be of interest when an application decides whether some geometry is rendered based
on an amount of visibility and not just a binary decision. virglrenderer sets a default
of 1024 on the host.
v2: Remove reference from virgl and correct description (Emil)
v3: Send the tweak binary encoded instead of using strings (Gurchetan)
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Gurchetan Singh <[email protected]>
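The emulation described above can be sketched roughly as follows. The
function and macro names are hypothetical; only the 1024 default comes
from this message:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default reported when any sample passed (value from the message). */
#define DEFAULT_SAMPLES_PASSED_VALUE 1024u

/* Hypothetical helper: turn the boolean GL_ANY_SAMPLES_PASSED result
 * into a GL_SAMPLES_PASSED-style count using the configured tweak. */
static uint64_t
emulated_samples_passed(bool any_samples_passed, uint32_t tweak_value)
{
   return any_samples_passed ? tweak_value : 0;
}
```

An application that compares the count against a visibility threshold
then sees either 0 or the tweak value, never an actual sample count.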
| |
texture
With Qemu this final swizzle is not needed, but with vtest it is, i.e. it depends on
how a program using virglrenderer uses the surface that is rendered to, hence
a tweak is added.
v2: Update description and fix spelling (Emil)
v3: Send tweak as binary value instead of using strings (Gurchetan)
Reviewed-by: Gurchetan Singh <[email protected]>