| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Now that the elements version handles both cases, remove the
non-elements version.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Keep information in acp_entry whether the entry is full or not, and
use the ACP in more nodes when visiting the instructions:
- add_copy: write whole variables to the ACP state (regardless the
type).
- visit(ir_dereference_variable *): perform the propagation here if we have a
full candidate. Element-wise here doesn't apply because the mask
isn't available at this point.
- visit_leave(ir_assignment *): process beyond scalar and vector, as
the full variables might have other types.
Also import an improvement from opt_copy_propagation.cpp: if ir_call
is an intrinsic, we know the variables affected, so keep going.
v2: (all from Eric Anholt)
Describe how acp_entry attributes are used.
Don't do book-keeping to avoid adding repeated element to
the dsts in write_elements().
v3: Use _mesa_set_remove_key. (Thomas Helland)
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now, we had separate passes for lowering gl_PatchVerticesIn to
a statically known constant (for TES inputs when linked against a TCS),
and a uniform in the other cases. Annoyingly, one had to be run before
nir_lower_system_values, and the other afterward. This simplified the
passes, but made life painful for the callers.
This patch combines both into a single pass. If you give it a non-zero
static count, it uses that. If you give it Mesa state slots, it turns
it back into a built-in uniform. Otherwise, it does nothing.
This also moves the i965 uniform lowering out to shared code.
v2: Make token arrays const.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This is controlled by a new nir_shader_compiler_options flag, and fixes
dEQP-GLES3.functional.shaders.builtin_variable.pointcoord on V3D.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This breaks printing input/output variables with more than
4 components like mat4.
Fixes: 1beef89ad8 ("nir: prepare for bumping up max components to 16")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Spotted in a shader in Batman: Arkham City.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
nir_sweep assumes that constants area always allocated off the variable
to which they belong. Violating this assumption causes them to get
freed early and leads to use-after-free bugs.
Fixes: 120da00975541 "nir: add serialization and deserialization"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107366
Reviewed-by: Lionel Landwerlin <[email protected]>
Tested-by: Mark Janes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
we need rounding modes on other conversions involving floats and it is easier
to rename f2f16_undef than renaming all the other ones.
v2: rebased on master
Reviewed-by: Jason Ekstrand <[email protected]>
Acked-by: Rob Clark <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
also move some of the GLSL builtins over we will need for implementing
some OpenCL builtins
v2: replace NIR_IMM_FP by nir_imm_floatN_t in ported code
fix up changes caused by swizzle rework
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Lightly edited to be valid 'C' code.
Is there a bug open to fix this upstream?
Acked-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Python 2 has a range() function which returns a list, and an xrange()
one which returns an iterator.
Python 3 lost the function returning a list, and renamed the function
returning an iterator as range().
As a result, using range() makes the scripts compatible with both Python
versions 2 and 3.
Signed-off-by: Mathieu Bridon <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Python 2, iterators had a .next() method.
In Python 3, instead they have a .__next__() method, which is
automatically called by the next() builtin.
In addition, it is better to use the iter() builtin to create an
iterator, rather than calling its __iter__() method.
These were also introduced in Python 2.6, so using it makes the script
compatible with Python 2 and 3.
Signed-off-by: Mathieu Bridon <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Python 2, dictionaries have 2 sets of methods to iterate over their
keys and values: keys()/values()/items() and iterkeys()/itervalues()/iteritems().
The former return lists while the latter return iterators.
Python 3 dropped the method which return lists, and renamed the methods
returning iterators to keys()/values()/items().
Using those names makes the scripts compatible with both Python 2 and 3.
Signed-off-by: Mathieu Bridon <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Spotted in a shader in Batman: Arkham City.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Delegating constructors is a C++11 feature, so this was breaking when
compiling with C++98. Change the copy_propagation_state() calls that
used the convenience constructor to use a static member function
instead.
Since copy_propagation_state is expected to be heap allocated, this
change is a good fit.
Tested-by: Vinson Lee <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107305
|
|
|
|
|
|
|
|
|
| |
Allow the capability to be exposed, and convert the new execution mode
into fs state.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The commit 76dfed8ae2d5 changed nir_intrinsics.h to be a generated
header, but the corresponding dependency was not updated for Android.
It causes the error:
[ 0% 19/4336] target C: libmesa_pipe_radeonsi <= external/mesa/src/gallium/drivers/radeonsi/si_debug.c
...
In file included from external/mesa/src/gallium/drivers/radeonsi/si_debug.c:25:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_pipe.h:28:
In file included from external/mesa/src/gallium/drivers/radeonsi/si_shader.h:140:
In file included from external/mesa/src/amd/common/ac_llvm_build.h:30:
external/mesa/src/compiler/nir/nir.h:966:10: fatal error: 'nir_intrinsics.h' file not found
^~~~~~~~~~~~~~~~~~
1 error generated.
Fixes: 76dfed8ae2d5 ("nir: mako all the intrinsics")
Signed-off-by: Chih-Wei Huang <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Mauro Rossi <[email protected]>
|
|
|
|
|
|
|
|
|
| |
There always is a continue block, so let us just do unreachable.
Reviewed-by: Jason Ekstrand <[email protected]>
Fixes: 8cacf38f527 "nir: Do not use continue block after removing it."
CC: 18.1 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107312
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reinserting code directly before a jump means the block gets split
and merged, removing the original block and replacing it in the
process.
Hence keeping a pointer to the continue block over a reinsert
causes issues.
This code changes nir_opt_if to simply look for the new continue
block.
Reviewed-by: Jason Ekstrand <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107275
CC: 18.1 <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
else-branch
When handling 'if' in copy propagation elements, if a certain variable
was killed when processing the first branch of the 'if', then the
second would get any propagation from previous nodes.
x = y;
if (...) {
z = x; // This would turn into z = y.
x = 22; // x gets killed.
} else {
w = x; // This would NOT turn into w = y.
}
With the change, we let copy propagation happen independently in the
two branches and only then apply the killed values for the subsequent
code.
One example in shader-db part of shaders/unity/8.shader_test:
(assign (xyz) (var_ref col_1) (var_ref tmpvar_8) )
(if (expression bool < (swiz y (var_ref xlv_TEXCOORD0) )(constant float (0.000000)) ) (
(assign (xyz) (var_ref col_1) (expression vec3 + (var_ref tmpvar_8) ... ) ... )
)
(
(assign (xyz) (var_ref col_1) (expression vec3 lrp (var_ref col_1) ... ) ... )
))
The variable col_1 was replaced by tmpvar_8 in the then-part but not
in the else-part.
NIR deals well with copy propagation, so it already covered for the
missing ones that this patch fixes.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of keeping multiple acp_entries in lists, have a single
acp_entry per variable. With this, the implementation of clone is more
convenient and now fully implemented. In the previous code, clone was
only partial.
Before this patch, each acp_entry struct represented a write to a
variable including LHS, RHS and a mask of what channels were written
to. There were two main hash tables, the first (lhs_ht) stored a list
of acp_entries per LHS variable, with the values available to copy for
that variable; the second (rhs_ht) was a "reverse index" for the first
hash table, so stored acp_entries per RHS variable.
After the patch, there's a single acp_entry struct per LHS variable,
it contains an array with references to the RHS variables per
channel. There now is a single hash table, from LHS variable to the
corresponding entry. The "reverse index" is stored in the ACP entry,
in the form of a set of variables that copy from the LHS. To make the
clone operation cheaper, the ACP entries are created on demand.
This should not change the result of copy propagation, a later patch
will take advantage of the clone operation.
v2: Add note clarifying how the hashtable is destroyed.
v3: (all from Eric Anholt)
Add remove_unused_var_from_dsts() function for reuse.
Remove from dsts as we go instead of clearing at the end.
Add clarifying comment to erase().
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Separate higher level logic of visiting instructions and chosing when
to store and use new copy data from the datastructure holding the copy
propagation information. This will also make easier later patches that
change the structure.
v2: Remove empty destructor and clarify how hash tables are destroyed.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "__inst" will contain the name used for the variable of type
"__type *". Parenthesis is not necessary as the name itself shouldn't
be an expression.
Fixes warning:
In file included from ../../src/mesa/main/mtypes.h:49,
from ../../src/intel/compiler/brw_compiler.h:30,
from ../../src/intel/compiler/brw_shader.h:29,
from ../../src/intel/compiler/brw_fs.h:31,
from ../../src/intel/compiler/brw_fs_cse.cpp:24:
../../src/intel/compiler/brw_fs_cse.cpp: In member function ‘bool fs_visitor::opt_cse_local(bblock_t*)’:
../../src/compiler/glsl/list.h:675:12: warning: unnecessary parentheses in declaration of ‘entry’ [-Wparentheses]
__type *(__inst); \
^
../../src/intel/compiler/brw_fs_cse.cpp:257:10: note: in expansion of macro ‘foreach_in_list_use_after’
foreach_in_list_use_after(aeb_entry, entry, &aeb) {
^~~~~~~~~~~~~~~~~~~~~~~~~
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes warning:
../../src/compiler/spirv/vtn_variables.c: In function ‘var_decoration_cb’:
../../src/compiler/spirv/vtn_variables.c:1400:12: warning: ‘is_vertex_input’ may be used uninitialized in this function [-Wmaybe-uninitialized]
bool is_vertex_input;
^~~~~~~~~~~~~~~
The code used to set is_vertex_input in all possible codepaths, but
after 23edc5b1ef3 "spirv: translate default-block uniforms" the
compiler isn't sure all codepaths will initialize the variable.
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: reword comment about lower_helper_invocations to be more clear
that it might not work on all hardware
v3: add special variant of load_sample_id which does not imply per-
sample shading
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Now the deref is the first src.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
| |
One of these was seen in a Deus Ex shader.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
OpenCL knows vector of size 8 and 16.
v2: rebased on master (nir_swizzle rework)
rework more declarations with nir_component_mask_t
adjust print_var_decl
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When handling loops in constant propagation, implement the "FINISHME"
comment like copy propagation: perform a first pass to find values
that can't be propagated, then perform a second pass with the ACP
containing still valid values.
Certain values are killed because the loop may run more than one
iteration, so we can't copy propagate them as they would be invalid in
the later iterations.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When handling 'if' in constant propagation, if a certain variable was
killed when processing the first branch of the 'if', then the second
would get any propagation from previous nodes. This is similar to the
change done for copy propagation code.
x = 1;
if (...) {
z = x; // This would turn into z = 1.
x = 22; // x gets killed.
} else {
w = x; // This would NOT turn into w = 1.
}
With the change, we let constant propagation happen independently in
the two branches and only then apply the killed values for the
subsequent code.
The new code use a single hash table for keeping the kills of both
branches (the branches only write to it), and it gets deleted after we
use -- instead of waiting for mem_ctx to collect it.
NIR deals well with constant propagation, so it already covered for
the missing ones that this patch fixes.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Empty initializer braces aren't valid c (it's a gnu extension, and
it's valid in c++).
Hopefully fixes appveyor / msvc build...
Fixes a3150c1d06ae7766c3d3fe3b33432e55c3c7527e
|
|
|
|
|
|
|
|
|
| |
This makes the arguments match the (thing, container) pattern used in
other nir_foreach macros and also renames it to make that a bit more
clear.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For one thing, the NIR opcodes for image load/store always take and
return a vec4 value regardless of the image type. We need to fix up
both the source and destination to handle it. For another thing, we
weren't actually setting up a destination in the OpAtomicLoad case.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
| |
Fixes: 2f181c8c183cc8b4d0450789bb20c2be48d32db3
"glsl_types: vec8/vec16 support"
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
| |
A while ago, we added a bunch of format conversion helpers; we should
use them instead of hand-rolling sRGB conversions.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
There are no fixed sized array arguments in C, those are simply pointers
to unsized arrays and as the size is passed in anyway, just rely on that.
where possible calls are replaced by nir_channel and nir_channels.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
| |
It wasn't causing problems since there's nothing to delete, but better
be consistent with the rest of existing codebase.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The only value in kill_entry is the writemask, which can be stored in
the data pointer of the hash table entry.
Suggested by Eric Anholt.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 4654439fdd7 "glsl: Use hash tables for
opt_constant_propagation() kill sets." uses a hash_table for storing
kill_entries, so the structs can be simplified.
Remove the exec_node from kill_entry since it is not used in an
exec_list anymore.
Remove the 'var' from kill_entry since it is now redundant with the
key of the hash table.
Suggested by Eric Anholt.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
| |
we already have this code duplicated and we will need it for the global
group size as well
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
also reorder to match the gl_system_value enum.
It is weird that the STATIC_ASSERT doesn't trigger though.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Karol Herbst <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise nir_validate may complain about 8 bit floats, which do not exist.
Reviewed-by: Karol Herbst <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|