| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some NVIDIA hardware can accept 128 fragment shader input components,
but only have up to 124 varying-interpolated input components. We add a
new cap to express this cleanly. For most drivers, this will have the
same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader.
Fixes KHR-GL45.limits.max_fragment_input_components
Signed-off-by: Karol Herbst <[email protected]>
[imirkin: rebased, improved docs/commit message]
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Rob Clark <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: 19.0 <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Iris would like to use compact arrays for tesslevels and clip/cull
distances. radeonsi will likely want to switch to these at some point,
since it'll be necessary for GL_ARB_gl_spirv support, but it's not ready
for them just yet.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new cap that indicates whether the drivers supports
enabling/disabling the conversion from linear space to sRGB
for a framebuffer attachment. In Driver terms that this CAP indicates
whether the driver can switcht between a linear and and a sRGB surface
format for draw destinations witout changing the sourface itself.
v2: rename CAP to DEST_SURFACE_SRGB_CONTROL to reflect its
purpouse better (pointed out by Ilia Mirkin)
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
The semantic was removed in e6d93893662d.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Thanks to Ilia for catching this.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Intel's blending hardware does not properly return 1.0 for destination
alpha for RGBX formats; it requires the factors to be overridden to
either zero or one. Broadcom vc4 and v3d also could use this override.
While overriding these factors is safe in general, Nouveau and Radeon
would prefer not to. Their blending hardware already returns correct
values for RGB/RGBX formats, and would like to avoid the resulting
per-buffer blending and independent blend factors (rgb != a) since it
can cause additional overhead.
I considered simply handling this in the driver, but it's not as nice.
pipe_blend_state doesn't have any format information, so we'd need the
hardware blend state to depend on both pipe_blend_state and
pipe_framebuffer_state. Furthermore, Intel GPUs don't have a native
RGBX_SNORM format, so I avoid exposing one, which makes Gallium fall
back to RGBA_SNORM. The pipe_surfaces we get in the driver have an RGBA
format, making it impossible to tell that there shouldn't be an alpha
channel. One could argue that st not handling it in that case is a bug.
To work around this, we'd have to expose RGBX pipe formats, mapped to
RGBA hardware formats, and add format swizzling special cases. All
doable, but it ends up being more code than I'd like.
st_atom_blend already has access to the right information and it's
trivial to accomplish there, so we just add a cap bit and do that.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gallium historically has treated pipeline statistics queries as a single
query, PIPE_QUERY_PIPELINE_STATISTICS, which returns a block of 11
values. This was originally patterned after the D3D1x API. Much later,
Brian introduced an OpenGL extension that exposed these counters - but
it exposes 11 separate queries, each of which returns a single value.
Today, st/mesa simply queries all 11 values, and returns a single value.
While pipeline statistics counters aren't typically performance
critical, this is still not a great fit. A D3D1x->GL translator might
request all 11 counters by creating 11 separate GL queries...which
Gallium would map to reads of all 11 values each time, resulting in a
total 121 counter reads. That's not ideal.
This patch adds a new cap, PIPE_CAP_QUERY_PIPELINE_STATISTICS_SINGLE,
and corresponding query type PIPE_QUERY_PIPELINE_STATISTICS_SINGLE.
When calling create_query(), q->index should be set to one of the
PIPE_STAT_QUERY_* enums to select a counter. Unlike the block query,
this returns the value in pipe_query_result::u64 (as it's a single
value) instead of the pipe_query_data_pipeline_statistics group.
We update st/mesa to expose ARB_pipeline_statistics_query if either
capability is set, preferring the new SINGLE variant when available.
Thanks to Roland, Ilia, and Marek for helping me sort this out.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Make sure that the next line starts with spaces so that bullets are
maintained throughout, add `` around a few more special tokens, and fix
SAMPLE_COUNT_TEXTURE -> SAMPLE_COUNT.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
ATOMFADD is a little special -- make drivers have to specify it
explicitly.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This is supported by at least NVIDIA hardware, and exposeable via GL
extensions.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This new pipe cap and the new nr_samples field in pipe_surface lets a
state tracker bind a render target with a different sample count than
the resource. This allows for implementing
EXT_multisampled_render_to_texture and
EXT_multisampled_render_to_texture2.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce a new capability for the maximum value of
pipe_vertex_element::src_offset. Initially just every driver
backend returns the value previously set from _mesa_init_constants.
So this shall end up in no functional change.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Mathias Fröhlich <[email protected]>
|
| |
|
| |
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves the evergreen-specific max-sizes out as a driver-cap, so
other drivers with less strict requirements also can use hw-atomics.
Remove ssbo_atomic as it's no longer needed.
We should now be able to use hw-atomics for some stages and not for
other, if needed.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Gurchetan Singh <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This gets rid of a r600 specific hack in the state-tracker, and prepares
for other drivers to be able to use hw-atomics.
While we're at it, clean up some indentation in the various drivers.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Gurchetan Singh <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some hardware can do PIPE_TEX_WRAP_MIRROR_REPEAT but not
PIPE_TEX_WRAP_MIRROR_CLAMP and PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER.
Drivers for such hardware would like to advertise support for
ARB_texture_mirror_clamp_to_edge but not EXT_texture_mirror_clamp.
This commit adds a new PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE bit,
changes the extension enable to be based on that, and enables it
in all upstream drivers which supported PIPE_CAP_TEXTURE_MIRROR_CLAMP
(so they continue supporting this mode).
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
| |
This commit does not add support for the opcodes in gallivm or tgsi_to_nir.c
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v1 -> v2:
- nv30 is _NOT_ scalar as suggested by Ilia Mirkin.
- Change from a screen cap to a shader cap as suggested
by Eric Anholt.
- radeonsi is scalar as suggested by Marek Olšák.
- Change missing ones to be scalar.
v2 -> v3:
- r600 prefers vec4 as suggested by Marek Olšák.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Brian Paul <[email protected]> (v2)
Reviewed-by: Marek Olšák <[email protected]> (v2)
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Required by radeonsi for optimal behavior.
|
|
|
|
|
|
|
|
|
|
| |
Nobody queries these and nobody sets them to anything useful,
the docs say TODO.
Drop them until a use appears.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
This has been unused since 100796c15c3a.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Calling this function will emit a fence signal operation into the
GPU's command stream.
v2: documentation typos
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Protects semaphore signaling functionality required by GL_EXT_semaphore.
v2: s/semaphore/fence
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Andres Rodriguez <[email protected]>
Reviewed-by: Wladimir J. van der Laan <[email protected]>
|
|
|
|
|
|
|
|
| |
The sample mask is used even if msaa is not explicity enabled when we
have a framebuffer with multisampled surfaces. That's DX behavior and
what the Radeon drivers do. Not sure about other drivers at this point.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This API binds atomic buffers for all bound shaders (as per the
GL semantics).
This is needed to support cross shader hw atomic counters.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-By: Gert Wollny <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for a hw atomic counters to TGSI.
A new register file for storing atomic counters is added,
along with a new atomic counter semantic, along with docs
for both.
v2: drop semantic, move hw counter to backend,
Ilia pointed out SSO would have busted my plan, and he
was right.
v3: drop BUFFER decls. (Marek)
v3.1: minor fixups for whitespace, set ureg error
if we overflow the hw atomic limits. (nha)
v3.2: fix some docs inconsistencies (Ilia)
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Tested-By: Gert Wollny <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This looks like an evergreen specific feature, but with atomic
counters AMD have hw specific counters they use instead of operating
on buffers directly. These are separate to the buffer atomics,
so require different limits and code paths.
I've left the CAP for atomic type extensible in case someone
else has a variant on this sort of thing (freedreno maybe?)
and needs to change it.
This adds all the CAPs required to add support for those atomic
counters, along with a related CAP for limiting the number of
output resources.
I'd like to land this and the st patch then I can start to
upstream the evergreen support for these and other GL4.x features.
v2: drop the ATOMIC_COUNTER_MODE cap, just use the return
from the HW counters. If 0 we use the current mode.
v3: fix some rebase errors (Gert Wollny)
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-By: Gert Wollny <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These bits are intended to be used by the ddebug hang detection and are
named in analogy to the Vulkan stage bits (and the corresponding Radeon
pipeline event).
Hang detection needs fences on the granularity of individual commands,
which nothing else really covers. The closest alternative would have
been PIPE_QUERY_GPU_FINISHED, but (a) queries are a per-context object
and we really want a per-screen object, (b) queries don't offer a
wait with timeout, and (c) in any case, PIPE_QUERY_GPU_FINISHED is
meant to imply that GPU caches are flushed, which the new bits
explicitly aren't.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Also document some subtleties of pipe_context::flush.
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
Some hw (evergreen) has a limit on how many combined (images/buffers/mrts)
a fragment shader can access.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because vc4 can control the order that tiles are rasterized in, we can use
it to implement overlapping blits using normal drawing and
GL_ARB_texture_barrier, as long as we can tell the kernel what order to
render the tiles in.
This commit introduces the core gallium support, vc4 changes will follow.
v2: Fix on the simulator.
v3: Add the cap (disabled) to other drivers, add rst docs for the cap.
v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
v5: Drop vc4 changes from this commit, for clarity.
Reviewed-by: Nicolai Hähnle <[email protected]> (v3)
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The operation performed is all the same as LODQ, but with the usual
differences between dx10 and GL texture opcodes, that is separate resource
and sampler indices (plus result swizzling, and setting z/w channels
to zero).
Reviewed-by: Jose Fonseca <[email protected]>
Acked-by: Nicolai Hähnle <[email protected]>
|