| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Found by Coverity
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Albert Freeman <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Found by Coverity
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Broken in commit abdab88b30ab when adding arrays of arrays support
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Albert Freeman <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With static vertex counts, the final EOT write doesn't actually write
any data - it's just there to end the thread. Typically, the last
thing before ending the thread will be an EmitVertex() call, resulting
in a URB write. We can just set EOT on that.
Note that this isn't always possible - there might be an intervening
SSBO write/image store, or the URB write may have been in a loop.
shader-db statistics for geometry shaders only:
total instructions in shared programs: 3173 -> 3149 (-0.76%)
instructions in affected programs: 176 -> 152 (-13.64%)
helped: 8
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GS_OPCODE_SET_WRITE_OFFSET is a MUL with a constant src[1] and special
strides. We can easily make the generator handle constant src[0]
arguments by instead generating a MOV with the product of both operands.
This isn't necessarily a win in and of itself - instead of a MUL, we
generate a MOV, which should be basically the same cost. However, we
can probably avoid the earlier MOV to put src[0] into a register.
shader-db statistics for geometry shaders only:
total instructions in shared programs: 3207 -> 3173 (-1.06%)
instructions in affected programs: 3207 -> 3173 (-1.06%)
helped: 11
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Broadwell's 3DSTATE_GS contains new "Static Output" and "Static Vertex
Count" fields, which control a new optimization. Normally, geometry
shaders can output arbitrary numbers of vertices, which means that
resource allocation has to be done on the fly. However, if the number
of vertices is statically known, the hardware can pre-allocate resources
up front, which is more efficient.
Thanks to the new NIR GS intrinsics, this is easy. We just call the
function introduced in the previous commit to get the vertex count.
If it obtains a count, we stop emitting the extra 32-bit "Vertex Count"
field in the VUE, and instead fill out the 3DSTATE_GS fields.
Improves performance of Gl32GSCloth by 5.16347% +/- 0.12611% (n=91)
on my Lenovo X250 laptop (Broadwell GT2) at 1024x768.
shader-db statistics for geometry shaders only:
total instructions in shared programs: 3227 -> 3207 (-0.62%)
instructions in affected programs: 242 -> 222 (-8.26%)
helped: 10
v2: Don't break non-NIR paths (just skip this optimization).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The visitor was setting a mlen that was wrong for Broadwell, but the
generator was ignoring it and doing the right thing regardless. We may
as well move the logic fully into the visitor. This will be useful in
the next commit as well.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Some hardware (such as Broadwell) can run geometry shaders more
efficiently when the number of vertices emitted is statically known.
This pass provides a way to obtain the constant vertex count, or
-1 indicating that the vertex count is unknown/non-constant.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old code was disasterously complex - spread across multiple atoms
which may not even run, inspecting the dirty bits to try and decide
whether it was necessary to do checks...storing VS information in
brw_context...extra flagging...
This code tripped me and Carl up very badly when working on the
shader cache code. It's very fragile and hard to maintain.
Now that geometry shaders only depend on their inputs and don't have
to worry about the VS VUE map, we can dramatically simplify this:
just compute the VUE map coming out of the geometry shader stage
in brw_upload_programs. If it changes, flag it. Done.
v2: Also check vue_map.separable.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because we only support geometry shaders in core profile, we can safely
ignore any driver-extending of VS outputs.
Those are:
- Legacy userclipping (doesn't exist in core profile)
- Edgeflag copying (Gen4-5 only, no GS support)
- Point coord replacement (Gen4-5 only, no GS support)
- front/back color hacks (Gen4-5 only, no GS support)
v2: Rebase; leave a comment about why SSO works.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, our VUE map code always assigned slots to varyings
sequentially, in one contiguous block.
This was a bad fit for separate shaders - the GS input layout depended
or the VS output layout, so if we swapped out vertex shaders, we might
have to recompile the GS on the fly - which rather defeats the point of
using separate shader objects. (Tessellation would suffer from this
as well - we could have to recompile the HS, DS, and GS.)
Instead, this patch makes the VUE map for separate shaders use a fixed
layout, based on the input/output variable's location field. (This is
either specified by layout(location = ...) or assigned by the linker.)
Corresponding inputs/outputs will match up by location; if there's a
mismatch, we're allowed to have undefined behavior.
This may be less efficient - depending what locations were chosen, we
may have empty padding slots in the VUE. But applications presumably
use small consecutive integers for locations, so it hopefully won't be
much worse in practice.
3% of Dota 2 Reborn shaders are hurt, but only by 2 instructions.
This seems like a small price to pay for avoiding recompiles.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our plan of assigning consecutive slots doesn't work properly for
separate shader objects - at least, if we want to avoid recompiling them
whenever the interface changes.
As a first step, make assign_vue_map take an explicit slot parameter,
rather than implicitly incrementing it.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Nothing actually relies on unused slots being initialized to
BRW_VARYING_SLOT_COUNT. Soon, we're going to have VUE maps with holes
in them, at which point pre-filling with BRW_VARYING_SLOT_PAD make a lot
more sense.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't just break for padding slots. Instead, treat them like
unwritten output variables, so we handle flushing and incrementing
urb_offset correctly.
Paul introduced the concept of padding slots back in 2011, but we've
never actually used them for anything. So it's unsurprising that the
scalar VS backend didn't handle them quite right.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 7f1a77ae664cca29208edc32ff82dc7ff4faa02b)
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit bcb9e1d26ba4198359300b50e5c188977cef932e)
|
|
|
|
|
|
|
|
| |
disabled
Signed-off-by: Timothy Arceri <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
V2: return 0 if not array rather than -1
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
These changes are also needed to allow linking of
struct and interface arrays of arrays.
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
if app pass 0 as frame_rate_num, it should not be encoded to the VUI.
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
Cc: "10.6 11.0" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Passing NULL to C11 threads functions isn't safe, so there's no need for
our implementation to handle it. Cuts about 1k of .text.
text data bss dec hex filename
5009514 198440 26328 5234282 4fde6a i965_dri.so before
5008346 198440 26328 5233114 4fd9da i965_dri.so after
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
fixes Piglit test:
arb_program_interface_query/linker/query-varyings.shader_test
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140' blocks
or block members) considered as active 'shared' and 'std140' uniform
blocks and uniform block arrays, but did not include the block array
elements. Because of that, it was possible to have an active uniform
block array without any elements marked as used, making the assertion
((b->num_array_elements > 0) == b->type->is_array())
in link_uniform_blocks() fail.
Fixes the following 5 dEQP tests:
* dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.18
* dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.24
* dEQP-GLES3.functional.ubo.random.nested_structs_arrays_instance_arrays.19
* dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49
* dEQP-GLES3.functional.ubo.random.all_shared_buffer.36
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83508
Tested-by: Tapani Pälli <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
| |
v2:
- Mark it too for GLES 3.1
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2:
- Add tessellation shader constants support
v3:
- Add GLES 3.1 support.
v4:
- Move the getters to the proper place
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARB_program_interface_query
Including TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE queries.
v2:
- Use std430_array_stride() to get top level array stride following std430's rules.
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The error location won't be right, but fixing that would require to check
for this as we process each type of AST node that can involve a variable
read.
v2:
- Limit the check to buffer variables, image variables have different
semantics involved.
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2:
- Merge the error check for the readonly qualifier with the already
existing check for variables flagged as readonly (Timothy).
- Limit the check to buffer variables, image variables have different
semantics involved (Curro).
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
- Memory qualifiers on shader storage buffer objects do not come in the form
of layout qualifiers, they are block-level qualifiers.
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2:
- Save memory qualifier info in the top level members of a shader
storage block.
- Add a checks to record_compare() which is used when comparing
shader storage buffer declarations in different shaders.
- Always report an error for incompatible readonly/writeonly
definitions, whether they are present at block or field level.
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
object is bound
According to ARB_uniform_buffer_object spec:
"If the parameter (starting offset or size) was not specified when the
buffer object was bound (e.g. if bound with BindBufferBase), or if no
buffer object is bound to <index>, zero is returned."
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
| |
These handle querying the buffer name attached to a giving binding point
as well as the start offset and size of that buffer.
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
| |
Defined in ARB_shader_storage_buffer_object extension.
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2:
- Add ssbo_in the names of the static functions so it is clear that this
is specific to SSBO atomics.
v3:
- Move the check after the loop (Kristian Høgsberg)
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original GLSL IR intrinsics have been lowered to an internal
version that accepts a block index and an offset instead of a
SSBO reference.
v2 (Connor):
- Document the sources used by the atomic intrinsics.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|