| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VBO module maps a buffer with GL_MAP_FLUSH_EXPLICIT, then keeps
appending data and calling glFlushMappedBufferRange(). We were
invalidating the VF cache each time it flushed a new range, which
resulted in a ton of VF flushes.
If the contents of the destination in the target range are undefined
(never even possibly written), this patch makes us assume that it's
likely not in the cache, and so cache invalidations are not required.
If the destination range is defined, we continue cache flushing as we
may need to expunge stale data.
This eliminates 88% of the VF cache invalidates on Manhattan 3.0.
Improves performance in Manhattan 3.0 on my Icelake 8x8 with the GPU
frequency locked to 700MHz by 0.376724% +/- 0.0989183% (n=10).
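
As a rough illustration of the check described above, here is a minimal
sketch in C. It is not the driver's code: struct valid_range and every
function name below are hypothetical, and the sketch assumes the defined
contents of a buffer are tracked as one conservative [start, end) range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a buffer's "possibly written" range. */
struct valid_range {
   uint32_t start; /* inclusive */
   uint32_t end;   /* exclusive; start >= end means nothing is defined */
};

/* True if [start, end) overlaps data that may already have been written. */
static bool
range_intersects(const struct valid_range *r, uint32_t start, uint32_t end)
{
   assert(start <= end);
   return r->start < r->end && start < r->end && r->start < end;
}

/* Grow the tracked range so that it also covers [start, end). */
static void
range_add(struct valid_range *r, uint32_t start, uint32_t end)
{
   assert(start <= end);
   if (start == end)
      return;
   if (r->start >= r->end) {
      r->start = start;
      r->end = end;
      return;
   }
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* Called when a mapped range is flushed. Returns true only if the range
 * was already defined, i.e. stale data might sit in a cache and an
 * invalidate is still needed. The range is then recorded as defined.
 */
static bool
flush_range_needs_invalidate(struct valid_range *valid,
                             uint32_t start, uint32_t end)
{
   bool was_defined = range_intersects(valid, start, end);
   range_add(valid, start, end);
   return was_defined;
}
```

With an append-only pattern, each flushed range lies past everything
written so far, so the function keeps returning false and no invalidate
is needed; only a rewrite of an already defined range returns true.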
|
|
|
|
|
|
|
|
|
|
| |
We were always resolving the buffer as if we were accessing it via
CPU maps, which don't understand any auxiliary surfaces. But we often
copy to a temporary using BLORP, which understands compression just
fine. So we can avoid the resolve, and accelerate the copy as well.
Fixes: 9d1334d2a0f ("iris: Use copy_region and staging resources to avoid transfer stalls")
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Create a separate SURFACE_STATE for render target reads in order to
support non-coherent framebuffer fetch on Broadwell.
We also need to resolve the framebuffer in order to support CCS_D.
v2: Add outputs_read check (Kenneth Graunke)
v3: 1) Import Curro's comment from get_isl_surf
2) Rename get_isl_surf method
3) Clean up allocation in case of failure
Signed-off-by: Sagar Ghuge <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
All helper functions are ported from i965 driver.
Signed-off-by: Sagar Ghuge <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
v2: Add missing space (Caio)
Signed-off-by: Sagar Ghuge <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The DRI interface for modifiers with aux data treats the aux data as a
separate plane of the main surface.
When the DRI layer requests the plane associated with the aux data, we
save the required information into the DRI aux plane image.
Later, when the image is used, the DRI plane image will be available in
the pipe_resource structure's `next` field. Therefore, in iris, we
reconstruct the aux setup from this separate DRI plane image when the
image is used.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
We want to use this in the transfer code and possibly for fast clears.
|
|
|
|
|
| |
These are depth and stencil writes, not color writes, so there's no
need to flush the render target.
|
|
|
|
|
|
|
|
| |
This prints a log of every PIPE_CONTROL flush we emit, noting which bits
were set, and also the reason for the flush. That way we can see which
are caused by hardware workarounds, render-to-texture, buffer updates,
and so on. It should make it easier to determine whether we're doing
too many flushes and why.
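
A log line of the kind described above could be built roughly as in
this sketch. The flag names, the short labels, and the "pc: emit"
output format are all made up for illustration; they are not the
driver's actual bits or log format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flush bits; the real driver has its own, larger set. */
enum {
   FLUSH_RENDER_TARGET      = 1u << 0,
   FLUSH_DEPTH_CACHE        = 1u << 1,
   INVALIDATE_VF_CACHE      = 1u << 2,
   INVALIDATE_TEXTURE_CACHE = 1u << 3,
   CS_STALL                 = 1u << 4,
};

static const struct {
   uint32_t bit;
   const char *name;
} flush_bit_names[] = {
   { FLUSH_RENDER_TARGET,      "RT"  },
   { FLUSH_DEPTH_CACHE,        "Z"   },
   { INVALIDATE_VF_CACHE,      "VF"  },
   { INVALIDATE_TEXTURE_CACHE, "Tex" },
   { CS_STALL,                 "CS"  },
};

/* Format one log line naming every bit set in flags plus the reason.
 * Returns the number of characters written; output that does not fit
 * in the buffer is dropped.
 */
static int
describe_flush(char *buf, size_t size, uint32_t flags, const char *reason)
{
   assert(buf && size > 0 && reason);

   int len = snprintf(buf, size, "pc: emit [");
   size_t count = sizeof(flush_bit_names) / sizeof(flush_bit_names[0]);

   for (size_t i = 0; i < count; i++) {
      if ((flags & flush_bit_names[i].bit) && (size_t)len < size)
         len += snprintf(buf + len, size - len, "%s ",
                         flush_bit_names[i].name);
   }

   if ((size_t)len < size)
      len += snprintf(buf + len, size - len, "] reason: %s", reason);

   return (size_t)len < size ? len : (int)size - 1;
}
```

Passing the reason as a string at every call site is what makes it
possible to tell workaround flushes apart from render-to-texture or
buffer-update flushes when reading the log.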
|
|
|
|
|
|
|
|
|
| |
This adds support for imports where the image data begins at an offset
from the start of the buffer, as used in h/x264.
Fixes kwg/mesa#47
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We create two new helpers, iris_flush_bits_for_history, and
iris_dirty_for_history, then use them in the existing function.
The first accumulates flush bits based on res->bind_history, but doesn't
actually perform a flush. This allows us to accumulate flush bits by
looping over multiple resources, but ultimately emit a single flush for
all of them.
The latter flags dirty bits without flushing, which again allows us to
handle multiple resources, but also is more convenient when writing from
the CPU where we don't need a flush (as in commit 4d12236072).
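
The split between the two helpers might look roughly like the sketch
below. Only the idea comes from the description above: the bit names,
struct resource, and accumulate_flush_bits are hypothetical stand-ins,
and the mapping from bind history to cache bits is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of how a buffer has been bound in the past. */
enum {
   BOUND_AS_VERTEX_BUFFER   = 1u << 0,
   BOUND_AS_CONSTANT_BUFFER = 1u << 1,
   BOUND_AS_SAMPLER_VIEW    = 1u << 2,
};

/* Hypothetical cache-invalidate bits for a single flush. */
enum {
   INVALIDATE_VF_CACHE      = 1u << 0,
   INVALIDATE_CONST_CACHE   = 1u << 1,
   INVALIDATE_TEXTURE_CACHE = 1u << 2,
};

/* Hypothetical dirty bit that forces constants to be re-pushed. */
enum {
   DIRTY_PUSH_CONSTANTS = 1u << 0,
};

struct resource {
   uint32_t bind_history;
};

/* Compute which caches may hold stale data; does not flush anything. */
static uint32_t
flush_bits_for_history(const struct resource *res)
{
   assert(res);
   uint32_t flush = 0;

   if (res->bind_history & BOUND_AS_VERTEX_BUFFER)
      flush |= INVALIDATE_VF_CACHE;
   if (res->bind_history & BOUND_AS_CONSTANT_BUFFER)
      flush |= INVALIDATE_CONST_CACHE;
   if (res->bind_history & BOUND_AS_SAMPLER_VIEW)
      flush |= INVALIDATE_TEXTURE_CACHE;

   return flush;
}

/* Compute which state must be re-emitted; also does not flush. */
static uint32_t
dirty_for_history(const struct resource *res)
{
   assert(res);
   return (res->bind_history & BOUND_AS_CONSTANT_BUFFER)
          ? DIRTY_PUSH_CONSTANTS : 0;
}

/* OR together the bits for many resources so the caller can emit one
 * flush for all of them.
 */
static uint32_t
accumulate_flush_bits(const struct resource *resources, size_t count)
{
   uint32_t flush = 0;
   for (size_t i = 0; i < count; i++)
      flush |= flush_bits_for_history(&resources[i]);
   return flush;
}
```

A CPU write path would call only dirty_for_history(), while a GPU copy
touching several buffers would call accumulate_flush_bits() once and
emit a single flush with the result.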
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Applications frequently call glBufferSubData() to consecutive regions
of a VBO to append new vertex data. If no data exists there yet, we
can promote these to unsynchronized writes, even if the buffer is busy,
since the GPU can't be doing anything useful with undefined content.
This can avoid a bunch of unnecessary blitting on the GPU.
u_threaded_context would do this for us, and in fact prohibits us from
doing so (see TC_TRANSFER_MAP_NO_INFER_UNSYNCHRONIZED). But we haven't
hooked that up yet, and it may be useful to disable u_threaded_context
when debugging...at which point we'd still want this optimization. At
the very least, it would let us measure the benefit of threading
independently from this optimization. And it's not a lot of code.
Removes most stall avoidance blits in "Total War: WARHAMMER."
On my Skylake GT4e at 1920x1080, this appears to improve performance
in games by the following (but I did not do many runs for proper
statistics gathering):
----------------------------------------------
| DiRT Rally | +2% (avg) | + 2% (max) |
| Bioshock Infinite | +3% (avg) | + 9% (max) |
| Shadow of Mordor | +7% (avg) | +20% (max) |
----------------------------------------------
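
The promotion to unsynchronized writes described in this message can be
sketched as follows. The flag names are hypothetical stand-ins for the
real transfer flags (MAP_NO_INFER_UNSYNCHRONIZED plays the role of
TC_TRANSFER_MAP_NO_INFER_UNSYNCHRONIZED), and the sketch assumes the
buffer's defined contents are tracked as one [valid_start, valid_end)
range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical map flags, for illustration only. */
enum {
   MAP_WRITE                   = 1u << 0,
   MAP_UNSYNCHRONIZED          = 1u << 1,
   MAP_NO_INFER_UNSYNCHRONIZED = 1u << 2,
};

/* If a write targets a region that has never been defined, the GPU
 * cannot be using its contents, so the map does not have to wait for
 * the buffer to go idle. Returns the possibly upgraded flags.
 */
static uint32_t
infer_unsynchronized(uint32_t flags,
                     uint64_t valid_start, uint64_t valid_end,
                     uint64_t offset, uint64_t size)
{
   assert(valid_start <= valid_end);

   uint64_t end = offset + size;
   bool touches_valid = valid_start < valid_end &&
                        offset < valid_end && valid_start < end;

   if ((flags & MAP_WRITE) &&
       !(flags & MAP_NO_INFER_UNSYNCHRONIZED) &&
       !touches_valid)
      flags |= MAP_UNSYNCHRONIZED;

   return flags;
}
```

Appending 64 bytes at offset 256 to a buffer whose defined range is
[0, 256) gets the unsynchronized flag added; overwriting bytes inside
that range, or passing the no-infer flag, leaves the flags unchanged.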
|
|
|
|
| |
This will be useful when rebinding images.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to skip some types of aux usages (for instance,
ISL_AUX_USAGE_HIZ when the hardware doesn't support it, or when we have
multisampling) when sampling from the surface.
Instead of checking for those cases while filling the surface state and
leaving it blank, let's have a version of aux.possible_usages for
sampling. This way we can also avoid allocating surface state for the
cases we don't use.
Fixes: a8b5ea8ef015ed4a "iris: Add function to update clear color in surface state."
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
v2: Update tracked clear color when we update the surface state.
v3: Update all aux surface states when updating the clear color.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also store clear color in the iris_resource.
Always allocate clear color state buffer.
v2:
- Make clear_color_offset be 64 bits (Ken).
- Simplify the logic to decide when to memset the aux buffer (Ken).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Need to use it for fast clearing depth buffers.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is similar to intel_miptree_map_blit and intel_buffer_object.c's
temporary blits in i965.
Improves performance of DiRT Rally by 20-25% by eliminating stalls.
Breaks piglit's spec/arb_shader_image_load_store/host-mem-barrier,
by using the GPU to do uploads, exposing a st/mesa issue where it
doesn't give us memory_barrier() calls. This is a pre-existing issue
and will be fixed by a later patch (currently out for review).
|
|
|
|
|
|
| |
If we change the aux state for a given resource, we need to re-emit the
binding table pointers for any stage that has that resource bound. Since
we don't track that, flag IRIS_ALL_DIRTY_BINDINGS and emit all of them.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
But without fast clears or HiZ per-level tracking just yet.
|
| |
|
| |
|
|
|
|
|
|
|
| |
(cleaned up by Ken - make sure a bunch of things were more obviously
not using res->surf, do allow checking res->surf.tiling == LINEAR,
drop format cpp checks that aren't needed, drop memzone handling for
images, assume buffers / non-buffers in a few places...)
|
| |
|
|
|
|
|
|
|
|
|
|
| |
When we blit, transfer, or copy_resource to a buffer, we need to flush
to ensure any stale data for that buffer is invalidated in the caches.
bind_history will inform us which caches need to be flushed.
Also, for any push constant buffers, we need to flag those dirty so
that we re-emit 3DSTATE_CONSTANT_*, causing the data to be re-pushed.
|
| |
|
|
|
|
|
| |
This will let us know what caches to flush / state to dirty when
altering the contents of a buffer.
|
| |
|
| |
|
|
|
|
| |
We'll need this for resolve tracking. There's also no genxml stuff here
|
| |
|
|
|
|
| |
Fixes a bunch of RGB bugs.
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
1. Write the code
2. Add comments
3. PROFIT (or just avoid cost of explaining or relearning things...)
|
| |
|
| |
|
|
|
|
| |
apparently we need this for u_threaded_context
|
| |
|
| |
|
| |
|
|
|
|
| |
for wider use
|