| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This is ported from the fragment shader code.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This is mostly ported from the fragment shader code.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This just adds the CS invocations counter.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This adds the dispatch code. It creates a job for the number
of blocks in the grid, and dispatches them to the threadpool
implementation. The threadpool then calls the JIT code to
execute the coroutines.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This creates the coroutine execution environment and the
main compute shaders that get executed inside it.
Each compute shader block is executed in it's own coroutine
execution shader, which each "thread" being a coroutine executed
inside it in sequence.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
This doesn't actually build any of the shaders yet, but just
builds up the framework necessary to start building the shaders
and variants.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Compute doesn't share dirty state with the fragment pipeline
so create a separate path for it.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This is mostly a port of the fragment shader framework
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
This adds the jit interface for compute shaders, it's based
on the fragment shader one.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
These mirror the fragment shader structs, this is just a framework.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
The compute shader will need it's own context like the frag shader
has, this just introduces the framework struct and allocates/frees
for it in the right places.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
In order to efficiently run a number of compute blocks, use
a threadpool that just allows for jobs with unique sequential
ids to be dispatched.
|
|
|
|
|
|
|
|
| |
In order to share the texture/image/sampler code with compute
shaders we need to reorg them to be at the front of context
same as draw does for vs/gs sharing.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Until we have somewhere we can do better, just hit it with a hammer.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This adds the image type to the fragment shader jit context
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This mirrors the vs/gs keys, and will be needed when adding images
support.
The const changes also mirror how the draw code work (as is needed
when we add images)
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Also handle setting late for shaders that use stores
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
This lets us create an image structure with the same basic
types as the texture one.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
This just cleans the code up so the texture/sampler type
creation can be reused.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Unused as of last commit.
Signed-off-by: Eric Engestrom <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Tested-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This automates the include_directories and dependencies tracking so that
all users of libmesa_util don't need to add them manually.
Next commit will remove the ones that were only added for that reason.
Signed-off-by: Eric Engestrom <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Tested-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
| |
It has nothing to do with the PIPE_SUBSYSTEM_* stuff from gallium.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Suggested-by: Jason Ekstrand <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This is why we can't have nice things. I'm sure there's someway
to do this with {0} but I really don't have time for that.
Fixes: 2631fd3b0bf ("gallivm: rework lp_build_tgsi_soa to take a struct")
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
The parameters were getting messy and I have to add a few more
for compute shaders, so clean it up before proceeding.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a relatively minimal change to adjust all the gallium interfaces
to use bool instead of boolean. I tried to avoid making unrelated
changes inside of drivers to flip boolean -> bool to reduce the risk of
regressions (the compiler will much more easily allow "dirty" values
inside a char-based boolean than a C99 _Bool).
This has been build-tested on amd64 with:
Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d
vc4 i915 svga virgl swr panfrost iris lima kmsro
Gallium st: mesa xa xvmc xvmc vdpau va
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PIPE_CAP_SM3 has always been an odd one out of all our caps. While most
other caps are fine-grained and single-purpose, this cap encode several
features in one. And since OpenGL cares more about single features, it'd
be nice to get rid of this one.
As it turns, this is now relatively simple. We only really care about
three features using this cap, and those already got their own caps. So
we can remove it, and make sure all current drivers just give the same
response to all of them.
The only place we *really* care about SM3 is in nine, and there we can
instead just re-construct the information based on the finer-grained
caps. This avoids DX9 semantics from needlessly leaking into all of the
drivers, most of who doesn't care a whole lot about DX9 specifically.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
This add support for setting shader buffers and passing them
to draw or binding them to the fragment shader jit.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This just adds the ssbo ptrs to the jit fragment shader api.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Need to pass ssbo + ssbo size pointers just like constants.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
Not all drivers support TGSI_OPCODE_DIV, so we should have a cap to be able
to check this.
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
this isn't used outside this file.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Inline the ring buffer and signal logic into lp_scene_queue instead of
using a u_ringbuffer. The code ends up simpler since there's no need
to handle serializing data from / to packets.
This fixes a crash when compiling Mesa with LTO, that happened because
of util_ringbuffer_dequeue() was writing data after the "header
packet", as shown below
struct scene_packet {
struct util_packet header;
struct lp_scene *scene;
};
/* Snippet of old lp_scene_deque(). */
packet.scene = NULL;
ret = util_ringbuffer_dequeue(queue->ring,
&packet.header,
sizeof packet / 4,
return packet.scene;
but due to the way aliasing analysis work the compiler didn't
considered the "&packet->header" to alias with "packet->scene". With
the aggressive inlining done by LTO, this would end up always
returning NULL instead of the content read by
util_ringbuffer_dequeue().
Issue found by Marco Simental and iThiago Macieira.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110884
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TGSI's FBFETCH instruction currently only supports reading from a single
render target, but NIR intrinsics can support multiple render targets.
radeonsi can only support fetching from RT 0, but other drivers may be
able to support fetching from any render target.
To express this, this patch renames PIPE_CAP_TGSI_FS_FBFETCH to simply
PIPE_CAP_FBFETCH, and converts it from a boolean "is FBFETCH supported?"
to an integer number of render targets which can be fetched.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
The _LEVELS assumes that the max is always power of two. For V3D 4.2, we
can support up to 7680 non-power-of-two MSAA textures, which will let X11
support dual 4k displays on newer hardware.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently set this state in the draw-module twice on each draw, but
which trashes this state. So far that's not a problem, because we don't
really do much from that function.
But it turns out, we're going to have to do more; namely flush when the
state changes. This will incur a large performance penalty due to the
excessive setting.
Instead, let's rely on the CSO caching making sure that
llvmpipe_set_so_targets doesn't get called needlessly, and setup the
state directly there instead.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
One special case, `src/util/xmlpool/.gitignore` is not entirely deleted,
as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`).
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If there is no last fence, due to no rendering happening yet, just
create a new signaled fence and return it, to match the expectations of
the EGL sync fence API.
Fixes random "Could not create sync fence 0x3003" assertion failures from
Skia on Android, coming from the following code:
https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427
Reproducible especially with thread count >= 4.
One could make the driver always keep the reference to the last fence,
but:
- the driver seems to explicitly destroy the fence whenever a rendering
pass completes and changing that would require a significant functional
change to the code. (Specifically, in lp_scene_end_rasterization().)
- it still wouldn't solve the problem of an EGL sync fence being created
and waited on without any rendering happening at all, which is
also likely to happen with Android code pointed to in the commit.
Therefore, the simple approach of always creating a fence is taken,
similarly to other drivers, such as radeonsi.
Tested with piglit llvmpipe suite with no regressions and following
tests fixed:
egl_khr_fence_sync
conformance
eglclientwaitsynckhr_flag_sync_flush
eglclientwaitsynckhr_nonzero_timeout
eglclientwaitsynckhr_zero_timeout
eglcreatesynckhr_default_attributes
eglgetsyncattribkhr_invalid_attrib
eglgetsyncattribkhr_sync_status
v2:
- remove the useless lp_fence_reference() dance (Nicolai),
- explain why creating the dummy fence is the right approach.
Signed-off-by: Tomasz Figa <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|