| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
We now get all the tcs metadata from shader_info.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function is added here to ease refactoring towards using the new shared
shader_info. Once refactoring is complete and values are set directly it
will be removed.
We call it from _mesa_copy_linked_program_data() rather than glsl_to_nir()
so that the values will be set for all drivers. In order to do this some
calls need to be moved around so that we make sure to call
do_set_program_inouts() before _mesa_copy_linked_program_data()
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This allows us to do some small tidy ups, but will also allow us to call
a new function that copies values to a shared shader info from here.
In order to make this change this function now requires
_mesa_reference_program() to have previously been called.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are three intended functional changes here:
1. OpenGL 4.5 clarifies that primitive restart should only apply with index
buffers, so make that change explicit in the indirect draw path.
2. Make PrimitiveRestartFixedIndex work with indirect draws.
3. The change where primitive_restart is only set when the restart index can
actually have an effect (based on the size of indices) is also applied for
indirect draws.
Cc: 13.0 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows the driver to signal that it can't handle random
interleaving of attributes across buffers. This is required for
ARB_transform_feedback3, and it's initialized to whatever the previous
value of PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME was except for nv50 where
it is disabled. Note that the proprietary drivers never expose
ARB_transform_feedback3 on any GT21x's (where nouveau previously did),
and after some effort I was unable to get it to work.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even when enabled, primitive restart has no effect when the restart index
is larger than the representable values in the index buffer.
Fixes GL45-CTS.gtf31.GL3Tests.primitive_restart.primitive_restart_upconvert
for radeonsi VI.
v2: add an explanatory comment
Cc: "12.0 13.0" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]> (v1)
|
|
|
|
|
|
|
|
| |
Fixes a regression introduced by commit 777dcf81b.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98307
Reviewed-by: Marek Olšák <[email protected]>
Cc: 13.0 <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Use a full writemask in this case. This is relevant e.g. when a function
has an inout argument which is an array of structs.
v2: use C-style comment (Timothy Arceri)
Reviewed-by: Marek Olšák <[email protected]> (v1)
Cc: 13.0 <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Set the type of the left-hand side to the same as the right-hand side,
so that when the base type is double, the writemask of the MOV instruction
is properly fixed up.
Reviewed-by: Marek Olšák <[email protected]>
Cc: 13.0 <[email protected]>
|
|
|
|
|
|
| |
v2: rebased
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
it's always true
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
I don't know what this was supposed to do, but all TGSI labels were
always 0.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
Never used. The GLSL compiler doesn't even look at EmitNoFunctions.
v2: add back "return" support in "main"
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
sizeof(glsl_to_tgsi_instruction): 384 -> 264
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
sizeof(glsl_to_tgsi_instruction): 416 -> 384
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
sizeof(glsl_to_tgsi_instruction): 464 -> 416
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I noticed that glsl_to_tgsi_instruction is too huge.
sizeof(glsl_to_tgsi_instruction): 752 -> 464 (-38%)
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
The corresponding opcodes for integers need to be treated the same as F2D.
Fixes GL45-CTS.gpu_shader_fp64.conversions.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When more than one atomic counter buffer is in use, UniformStorage[n].opaque
is set up to contain indices that are contiguous across all used buffers.
This appears to be used by i965 via NIR, but for TGSI we do not treat atomic
counter buffers as opaque, so using the data in the opaque array is incorrect.
Fixes GL45-CTS.compute_shader.resource-atomic-counter.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
See the comment in the code for an explanation. This fixes
GL45-CTS.buffer_storage.map_persistent_draw.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Properly handle the case where there is a gap in the assigned output locations,
e.g. a fragment shader writes to color buffer 2 but not to color buffers 0 & 1.
Fixes GL45-CTS.gtf33.GL3Tests.explicit_attrib_location.explicit_attrib_location_pipeline.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DEQP's clear tests like to give us x + w < 0 or y + h < 0. Since we
were comparing to an unsigned, it would get promoted to unsigned and come
out as bignum >= width or height and we would clear the whole fb instead
of none of the fb.
Fixes 10 tests under deqp-gles2/functional/color_clear.
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
None of the drivers which implement this hook do anything with the
texture parameter value. Drivers just look at the pname and set a
dirty flag if needed.
We were doing some ugly casting and type conversion to setup the
argument so that all goes away.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For performance tuning in drivers. It filters out window system
framebuffers and OpenGL renderbuffers.
radeonsi will use this to guess whether a depth buffer will be read
by a shader. There is no guarantee about what will actually happen.
This is a departure from PIPE_BIND flags which are defined to be strict
but they are useless in practice.
Acked-by: Roland Scheidegger <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Whether one or two slots are taken up by one API array depends on the
vertex shader, not on how the array is configured. When an array is
set up with fewer components than the shader expects, the high components
are undefined.
Fixes GL45-CTS.vertex_attrib_binding.basic-inputL-case1.
Cc: [email protected]
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Cc: [email protected]
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bug with offsets from uniforms which seems to have only been
noticed as a crash in piglit's
arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag
on radeonsi.
Cc: [email protected]
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gallium is completely oblivious to whether the fbo is flipped or not.
Only flip the stipple pattern when the fbo is flipped as well. Otherwise
the driver has no idea when to unflip the pattern.
Fixes bin/gl-2.1-polygon-stipple-fs -fbo
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
v2: mark llvmpipe & softpipe properly as well (Jason Wood)
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to be able to emit overlapping input and output array
declarations, we flip the logic of emitting those declarations on its
head: rather than iterating over slots and emitting the corresponding
declarations, we iterate over the declarations from GLSL and emit those.
v2: fix some regressions related to structs
v3: fix a regression in geometry and tessellation shader array handling
Acked-by: Edward O'Callaghan <[email protected]> (v2)
Reviewed-by: Dave Airlie <[email protected]> (v2)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases, a shader may have an input/output array but not use some
entries in the middle. This happens with eON games, for example.
We emit declarations that cover the entire array range even if there are
some unused gaps. This patch now reflects that in the InputsRead etc.
fields to ensure the various input/outputMapping arrays are actually
correct, which will be important when we re-jiggle the way declarations
are emitted.
v2: fix a typo (Edward O'Callaghan)
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This optimization is incorrect with 64-bit operations, because the
channel-splitting logic in emit_asm ends up being applied twice to
the source operands.
A lucky coincidence of how the writemask test works resulted in this
optimization basically never being applied anyway. As far as I can tell,
the only case where it would (incorrectly) have been applied is something
like
dvec2 d;
float x = (float)d.y;
which nobody seems to have ever done. But the moral equivalent does occur
in one of the component layout piglit test.
Cc: [email protected]
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Empty writemasks mean "copy everything", so we can always just use the number
of vector elements (which uses the GLSL meaning here, i.e. each double is a
single element/writemask bit).
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extension is only exposed if the underlying driver supports
ARB_compute_shader and if PIPE_COMPUTE_MAX_VARIABLE_THREADS_PER_BLOCK
is set.
v3: - initialize max_variable_threads_per_block to 0
v2: - expose the ext based on that new cap
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
gl_LocalGroupSizeARB can be translated into TGSI_SEMANTIC_BLOCK_SIZE
which represents the block size in threads.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the sampler view code was scattered across several different
files.
Note, the previous REALLOC(), FREE() for st_texture_object::sampler_views
are replaced by realloc(), free() to avoid conflicting macros in Mesa vs.
Gallium.
Reviewed-by: Edward O'Callaghan <[email protected]>
Acked-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, st_get_texture_sampler_view_from_stobj() did a lot of work to
check if the texture parameters matched the sampler view (format,
swizzle, min/max lod, first/last layer, etc). We did this every time
we validated the texture state.
Now, we use a ctx->Driver.TexParameter() callback and a couple other
checks to proactively release texture views when we know that
view-related parameters have changed. Then, the validation step is
simplified:
- Search the texture's list of sampler views (just match the context).
- If found, we're done.
- Else, create a new sampler view.
There will never be old, out-of-date sampler views attached to texture
objects that we have to test.
Most apps create textures and set the texture parameters once. This
make sampler view validation much cheaper for that case.
Note that the old texture/sampler comparison code has been converted
into a set of assertions to verify that the sampler view is in fact
consistent with the texture parameters. This should help to spot any
potential regressions.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, we had code to compute the sampler view's format spread across two
different functions: in update_single_texture() and
st_get_texture_sampler_view_from_stobj(). Now it's all in one new function.
Also, use _mesa_texture_base_format() to simplify the code.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
And minor code reformatting.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
There's no need to cast to st_texture_image. Just use gl_texture_image.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|