| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
| |
v2:
- Set the value to 16 and drop the comment. (Kristian)
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes sure that user is still able to query properties about
variables that have gotten packed by lower_packed_varyings pass.
Fixes following OpenGL ES 3.1 test:
ES31-CTS.program_interface_query.separate-programs-vertex
v2: fix 'name included in packed list' check (Ilia Mirkin)
v3: iterate over instances of name using strtok_r (Ilia Mirkin)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Marta Lofstedt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is required to store information about packed varyings, currently
these variables get lost and cannot be retrieved later in sensible way
for program interface queries. List will be utilized by next patch.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Marta Lofstedt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
* Use _mesa_has_compute_shaders (Ilia)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Move API validation to _mesa_validate_DispatchCompute in
api_validate.c.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the immutable compressed texture didn't have the full mip pyramid,
this didn't work, because it tried to generate mip levels for non-existing
levels. _mesa_prepare_mipmap_level() would correctly handle this by returning
FALSE if the mip level didn't exist, however we actually created the
non-existing mip level right before that because we used _mesa_get_tex_image()
before calling _mesa_prepare_mipmap_level(). It would then proceed to crash
(we allocated the mip level, which is a bad idea on an immutable texture,
but didn't initialize the values, leading to assertion failures or segfaults).
Fix this by using _mesa_select_tex_image() instead and call it after
_mesa_prepare_mipmap_level(), as that function will allocate missing mip levels
for non-immutable textures already.
This fixes a (2 year old) crash with astromenace which was hack-fixed in ubuntu
packages instead: http://bugs.debian.org/718680 (I guess most apps do full mip
chains - I believe this app not doing it is actually unintentional, always one
level less than full mip chain...).
Cc: "10.6 11.0" <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... with only ARB_shader_atomic_counters.
I expected to see interactions with ARB_tessellation_shader in the
ARB_shader_atomic_counters spec, but they do not exist. It seems that we
should unconditionally expose these variables in the presence of
ARB_shader_atomic_counters:
gl_MaxTessControlAtomicCounters
gl_MaxTessEvaluationAtomicCounters
This partially reverts commit da7adb99e8. The commit also affected
gl_MaxTessControlImageUniforms and gl_MaxTessEvaluationImageUniforms
similarly but the ARB_shader_image_load_store spec does list an
interaction with ARB_tessellation_shader.
Cc: "11.0" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92095
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without this commit, copy propagation is discarded if it involves
a uniform with an instruction that has 3 sources. But 3 sourced
instructions can access scalar values.
For example, this is what vec4_visitor::fix_3src_operand() is already
doing:
if (src.file == UNIFORM && brw_is_single_value_swizzle(src.swizzle))
return src;
Shader-db results (unfiltered) on NIR:
total instructions in shared programs: 6259650 -> 6241985 (-0.28%)
instructions in affected programs: 812755 -> 795090 (-2.17%)
helped: 7930
HURT: 0
Shader-db results (unfiltered) on IR:
total instructions in shared programs: 6445822 -> 6441788 (-0.06%)
instructions in affected programs: 296630 -> 292596 (-1.36%)
helped: 2533
HURT: 0
v2:
- Updated commit message, using Matt Turner suggestions
- Move the check after we've created the final value, as Jason
Ekstrand suggested
- Clean up the condition
v3:
- Move the check back to the original place, to keep things
tidy, as suggested by Jason Ekstrand
v4:
- Fixed missing is_single_value_swizzle() as pointed by Jason Ekstrand
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
The HUD adds '%' if max == 100.
Signed-off-by: Benjamin Bellec <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Cc: 11.0 <[email protected]>
Acked-by: Christian König <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
| |
Broken by: d082c5324914212f76e45be497229c7a0681f706
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92072
Cc: 10.6 11.0 <[email protected]>
Tested-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we assign hw regs to attributes, we don't incorporate the stride
and subreg_offset from the fs_reg. It's rarely used, but the integer
multiplication lowering uses unusual stride and subreg_offset
combination breaks when one source is an attribute.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91970
Cc: "10.6 11.0" <[email protected]>
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, core Mesa's _mesa_CopyImageSubData() created temporary textures
to wrap renderbuffer sources/destinations. This caused a bit of a mess in
the Mesa/gallium state tracker because we had to basically undo that
wrapping.
Instead, change ctx->Driver.CopyImageSubData() to take both gl_renderbuffer
and gl_texture_image src/dst pointers (one being null, the other non-null)
so the driver can handle renderbuffer vs. texture as needed.
For the i965 driver, we basically moved the code that wrapped textures
around renderbuffers from copyimage.c down into the met and driver code.
The old code in copyimage.c also made some questionable calls to
_mesa_BindTexture(), etc. which weren't undone at the end.
v2 (Jason Ekstrand): Rework the intel bits
v3 (Brian Paul): Update the temporary st_CopyImageSubData() function.
Reviewed-by: Topi Pohjolainen <[email protected]>
Tested-by: Kai Wasserbäch <[email protected]>
Tested-by: Nick Sarnie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Check for PIPE_FORMAT_R8_UNORM when setting up the copy shader.
Also re-enable the dest alpha blending with A8 destination that
actually turned out to be correct.
Verified using rendercheck that the composite operators
overreverse, in, out, atop, atopreverse and xor seem to work fine
with a8 destiation.
v2: Fix a copy-paste error.
Reported-by: Jose Fonseca <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It doesn't matter whether a write is saturated or not, in another
implementation it might even have been a separate opcode. This code was
most likely copied from the copy-propagation pass (where one does have
to distinguish saturation).
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Previously the code was trying to get the packing type from the array not the
interface.
Cc: Ian Romanick <[email protected]>
Cc: Antia Puentes <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
I left a bunch of code indented a level in the previous patch to make
the diff easier to read. But now we should fix that.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By performing the vertex counting in NIR, we're able to elide a ton of
useless safety checks around every EmitVertex() call:
total instructions in shared programs: 3952 -> 3720 (-5.87%)
instructions in affected programs: 3491 -> 3259 (-6.65%)
helped: 11
HURT: 0
Improves performance in Gl32GSCloth by 0.671742% +/- 0.142202% (n=621)
on Haswell GT3e at 1024x768.
This should also make it easier to implement Broadwell's "Static Vertex
Count" feature someday.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch also introduces a lowering pass to convert the simple GS
intrinsics to the new ones. See the comments above that for the
rationale behind the new intrinsics.
This should be useful for i965; it's a generic enough mechanism that I
could see other drivers potentially using it as well, so I don't feel
too bad about putting it in the generic code.
v2:
- Use nir_after_block_before_jump for the cursor (caught by Jason
Ekstrand - I'd mistakenly used nir_after_block when rebasing this
code onto the new NIR control flow API).
- Remove the old emit_vertex intrinsic at the end, rather than in
the middle (requested by Jason).
- Use state->... directly rather than locals (requested by Jason).
- Report progress from nir_lower_gs_intrinsics() (requested by me).
- Remove "Authors:" section from file comment (requested by
Michael Schellenberger Costa).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NIR control flow modification API churns the block structure,
splitting blocks, stitching them back together, and so on. Preserving
information about block dominance is hard (and probably not worthwhile).
This patch makes nir_cf_extract() throw away all metadata, like we do
when adding/removing jumps.
We then make the dead control flow pass compute dominance information
right before it uses it. This is necessary because earlier work by the
pass may have invalidated it.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Calling unlink_blocks(block, block->successors[0]) will successfully
unlink the first successor, but then will shift block->successors[1]
down to block->successor[0]. So the successors[1] != NULL check will
always fail.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider the case of "while (...) { break }". Or in NIR:
block block_0 (0x7ab640):
...
/* succs: block_1 */
loop {
block block_1:
/* preds: block_0 */
break
/* succs: block_2 */
}
block block_2:
Calling nir_handle_remove_jump(block_1, nir_jump_break) will remove the break.
Unfortunately, it would mangle the predecessors and successors.
Here, block_2->predecessors->entries == 1, so we would create a fake
link, setting block_1->successors[1] = block_2, and adding block_1 to
block_2's predecessor set. This is illegal: a block cannot specify the
same successor twice. In particular, adding the predecessor would have
no effect, as it was already present in the set.
We'd then call unlink_block_successors(), which would delete the fake
link and remove block_1 from block_2's predecessor set. It would then
delete successors[0], and attempt to remove block_1 from block_2's
predecessor set a second time...except that it wouldn't be present,
triggering an assertion failure.
The fix appears to be simple: simply unlink the block's successors and
recreate them to point at the correct blocks first. Then, add the fake
link. In the above example, removing the break would cause block_1 to
have itself as a successor (as it becomes an infinite loop), so adding
the fake link won't cause a duplicate successor.
v2: Add comments (requested by Connor Abbott) and fix commit message.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a bug where we mess up predecessors/successors due to the
ordering of unlinking/recreating edges/adding fake edges. In order to
fix that, I need everything in one routine.
However, calling block_add_normal_succs() isn't safe from
cleanup_cf_node() - it would crash trying to insert phi undefs.
So unfortunately I need to add a parameter.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider the following NIR:
block block_0;
/* succs: block_1 block_2 */
if (...) {
block block_1;
...
} else {
block block_2;
}
Calling split_block_beginning() on block_1 would break block_0's
successors: link_block() sets both successors of a block, so calling
link_block(block_0, new_block, NULL) would throw away the second
successor, leaving only /* succ: new_block */. This is invalid: the
block before an if statement must have two successors.
Changing the call to link_block(pred, new_block, pred->successors[0])
would correctly leave both successors in place, but because unlink_block
may shift successor[1] to successor[0], it may not preserve the original
order. NIR maintains a convention that successor[0] must point to the
"then" block, while successor[1] points to the "else" block, so we need
to take care to preserve this ordering.
This patch creates a new function that swaps out one successor for
another, preserving the ordering. It then uses this to fix the issue.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I need to do this in a second place, and I'd rather make a helper
function than cut and paste the code.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is invalid, and causes disasters if we try to unlink successors:
removing the first will work, but removing the second copy will fail
because the block isn't in the successor's predecessor set any longer.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's possible that, if a vecN operation is involved in a phi node, that we
could end up moving from a register to itself. If swizzling is involved,
we need to emit the move but. However, if there is no swizzling, then the
mov is a no-op and we might as well not bother emitting it.
Shader-db results on Haswell:
total instructions in shared programs: 6262536 -> 6259558 (-0.05%)
instructions in affected programs: 184780 -> 181802 (-1.61%)
helped: 838
HURT: 0
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I don't know of any piglit tests that are currently broken. However, there
is nothing stopping a vecN instruction from getting source modifiers and
lower_vec_to_movs is run after we lower to source modifiers.
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|