| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Fixes make check permission error.
../../bin/test-driver: line 107: ./nir/tests/algebraic_parser_test.sh: Permission denied
FAIL nir/tests/algebraic_parser_test.sh (exit status: 126)
Fixes: a0ae12ca91a4 ("nir/algebraic: Add unit tests for bitsize validation")
Signed-off-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one if 8, 16, 32, or 64. This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
| |
Suffixes are dropped from a bunch of conversion opcodes when it makes
sense to do so. Others are kept if we really do want the bit-size
restriction.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
All conversion opcodes require a destination size but this makes
constructing certain algebraic expressions rather cumbersome. This
commit adds support to nir_search and nir_algebraic for writing
conversion opcodes without a size. These meta-opcodes match any
conversion of that type regardless of destination size and the size gets
inferred from the sizes of the things being matched or from other
opcodes in the expression.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
Instead of using an OrderedDict, just have a (necessarily sorted) array
of transforms and a set of opcodes.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
Both of these things are already handled in the Value base class so we
don't need to handle them explicitly in Constant.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
While we're at it, we rework them a bit to all use regular expressions
and assert more.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The non-failure path can be tested by just compiling mesa and then
testing it, but the failure paths won't be hit unless you make a mistake,
so it's best to test them with some unit tests.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this commit, there were two copies of the algorithm: one in C,
that we would use to figure out what bit-size to give the replacement
expression, and one in Python, that emulated the C one and tried to
prove that the C algorithm would never fail to correctly assign
bit-sizes. That seemed pretty fragile, and likely to fall over if we
make any changes. Furthermore, the C code was really just recomputing
more-or-less the same thing as the Python code every time. Instead, we
can just store the results of the Python algorithm in the C
datastructure, and consult it to compute the bitsize of each value,
moving the "brains" entirely into Python. Since the Python algorithm no
longer has to match C, it's also a lot easier to change it to something
more closely approximating an actual type-inference algorithm. The
algorithm used is based on Hindley-Milner, although deliberately
weakened a little. It's a few more lines than the old one, judging by
the diffstat, but I think it's easier to verify that it's correct while
being as general as possible.
We could split this up into two changes, first making the C code use the
results of the Python code and then rewriting the Python algorithm, but
since the old algorithm never tracked which variable each equivalence
class, it would mean we'd have to add some non-trivial code which would
then get thrown away. I think it's better to see the final state all at
once, although I could also try splitting it up.
v2:
- Replace instances of "== None" and "!= None" with "is None" and
"is not None".
- Rename first_src to first_unsized_src
- Only merge the destination with the first unsized source, since the
sources have already been merged.
- Add a comment explaining what nir_search_value::bit_size now means.
v3:
- Fix one last instance to use "is not" instead of !=
- Don't try to be so clever when choosing which error message to print
based on whether we're in the search or replace expression.
- Fix trailing whitespace.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
| |
Required for VK_KHR_shader_atomic_int64.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
because the closed driver exposes it. Tested by piglit.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extension is not properly tested (testing for
GL_ARB_fragment_shader_interlock is not sufficient), and since this was
noted in review on August 28th no tests have been sent.
Revert "i965: Add INTEL_fragment_shader_ordering support."
Revert "mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering"
This reverts commit 03ecec9ed2099f6e2b62994b33dc948dc731e7b8.
This reverts commit 119435c8778dd26cb7c8bcde9f04b3982239fe60.
Cc: [email protected]
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Józef Kucia <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
This makes some of the code more clear.
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
| |
We normally call with stderr which is unbuffered, so this won't affect
that, but it does let me call nir_print_shader(nir, fopen("log", "w+"))
from gdb and actually get the whole shader in my file.
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
lowers ceil(x) as -floor(-x)
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a shader program is de-serialized the gl_shader_program passed in
may actually still hold memory allocations for the transform feedback
varyings. If that is the case, free the varying names and reallocate
the new storage for the names array.
This fixes a memory leak:
Direct leak of 48 byte(s) in 6 object(s) allocated from:
in malloc (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0xdb880)
in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:875
in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985
...
Indirect leak of 42 byte(s) in 6 object(s) allocated from:
in __interceptor_strdup (/usr/lib64/gcc/x86_64-pc-linux-gnu/7.3.0/libasan.so+0x761c8)
in transform_feedback_varyings ../../samba/mesa/src/mesa/main/transformfeedback.c:887
in _mesa_TransformFeedbackVaryings ../../samba/mesa/src/mesa/main/transformfeedback.c:985
Fixes: ab2643e4b06f63c93a57624003679903442634a8
glsl: serialize data from glTransformFeedbackVaryings
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 1f29f4db1e867357a119c0c7c34fb54dc27fb682.
For this to work the compiler must ensure that it never puts
the values that arrive to this helper into unsigned variables
at any point in its processing, since that would not apply sign
extension to the value and it would break the expectations here.
Unfortunately, we use uint64_t extensively to pass and copy
things around, so some times we get to this helper with values
that are not properly sign extended to 64-bit. Here is an example
for an 8-bit value that comes from a switch case:
(gdb) p /x x
$1 = 0xffffffd6
The value seems to have been sign extended to 32-bit at some point
getting proper sign extension, but then copied into a uint64_t
which wont' apply sign extension, breaking the expectations of
the assertion.
Reviewed-by: Juan A. Suarez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Meson test has a concepts of suites, which allow tests to be grouped
together. This allows for a subtest of tests to be run only (say only
the tests for nir). A test can be added to more than one suite, but for
the most part I've only added a test to a single suite, though I've
added a compiler group that includes nir, glsl, and glcpp tests.
To use this you'll need to invoke meson test directly, instead of ninja
test (which always runs all targets). it can be invoked as:
`meson test -C builddir --suite $suitename` (meson test has addition
options that are pretty useful).
Tested-By: Gert Wollny <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way nir_lower_clip_vs() works with store_output intrinsics makes a
ton of assumptions about the driver_location field.
In i965 and iris, I'd rather do this lowering early and work with
variables. v3d may want to switch to that as well, and ir3 could too,
but I'm not sure exactly what would need updating. For now, handle
both methods.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
I'll want the variables in the next patch.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
It's now called exactly once, and there's not really any distinction.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Check if the base ends up with no variable, and continue
if we see that case outside the loop.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
I posted a load of hacks before to do this, Jason suggested this,
just check the deref mode, not the variable mode and delay getting
the variable until we know the type.
avoids crashes when derefing shared memory pointers.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
It's not at all intel-specific; the formula is dictated by OpenGL and
Vulkan. The only intel-specific thing is that we need the lowering. As
a nice side-effect, the new version is variable-group-size ready.
Reviewed-by: Plamena Manolova <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This also changes spirv_to_nir and glsl_to_nir to set them. The one
place that doesn't set them is shared memory access lowering in
nir_lower_io. That will have to be updated before any consumers of it
can effectively use these new alignments.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Acked-by: Karol Herbst <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The new helpers can generate any pack/unpack operation including those
for which we do not have specific opcodes and they express a bitcast in
terms of these pack/unpack operations. In particular, the new helpers
properly handle 8-bit types.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The pattern of adding or multiplying an integer by an immediate is
fairly common especially in deref chain handling. This adds a helper
for it and uses it a few places. The advantage to the helper is that
it automatically handles bit sizes for you.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
| |
This assert won't catch all mistakes with this helper but it will at
least ensure that the top bits are all zero or all one which should help
catch bugs.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
| |
It messes up when trying to lower.
Cc: [email protected]
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Section 3.7 (Identifiers) of the GLSL spec says:
However, as noted in the specification, there are some cases where
previously declared variables can be redeclared to change or add
some property, and predeclared "gl_" names are allowed to be
redeclared in a shader only for these specific purposes. More
generally, it is an error to redeclare a variable, including those
starting "gl_".
This patch should fix piglit tests:
clip-distance-redeclare-without-inout.frag
clip-distance-redeclare-without-inout.vert
However, this causes a regression in
clip-distance-out-values.shader_test. A fix for that test has been sent
to the piglit list for review:
https://patchwork.freedesktop.org/patch/255201/
As far as I understood following mailing thread:
https://lists.freedesktop.org/archives/piglit/2013-October/007935.html
looks like we have accepted to remove an ability to change qualifiers
but have not done it yet. Unless I missed something)
v2 (idr): Move 'earlier->data.mode != var->data.mode' test much earlier
in the function. Add special handling for gl_LastFragData.
Signed-off-by: Andrii Simiklit <[email protected]>
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. nir/nir_lower_vars_to_ssa.c:691:21: warning:
unused variable ‘var’
nir_variable *var = path->path[0]->var;
v2: Changes for some part of 'may be used uninitialized'
warnings were removed, seems like it is a compiler issue.
( Eric Engestrom <[email protected]> )
Possible like this one:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46684
This issue is flagged as duplicate but an
original one is not closed yet.
Signed-off-by: Andrii Simiklit <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Some hardware supports source mods only for float operations. Make it
possible to skip lowering to source mods in these cases.
v2: use option flags instead of a boolean (Jason Ekstrand)
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: fix for specialization constants as well
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: [email protected]
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
this helps reduce the overall code changes when a bit_size parameter is
added to nir_load_system_value
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
this allows to replace some nir_load_system_value calls with the specific
system value constructor
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
If the local work group size is variable it won't be available
at compile time so we can't lower it in nir_lower_system_values().
Signed-off-by: Plamena Manolova <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example the following type of thing is seen in TCS from
a number of Vulkan and DXVK games:
vec1 32 ssa_557 = deref_var &oPatch (shader_out float)
vec1 32 ssa_558 = intrinsic load_deref (ssa_557) ()
vec1 32 ssa_559 = deref_var &oPatch@42 (shader_out float)
vec1 32 ssa_560 = intrinsic load_deref (ssa_559) ()
vec1 32 ssa_561 = deref_var &oPatch@43 (shader_out float)
vec1 32 ssa_562 = intrinsic load_deref (ssa_561) ()
intrinsic store_deref (ssa_557, ssa_558) (1) /* wrmask=x */
intrinsic store_deref (ssa_559, ssa_560) (1) /* wrmask=x */
intrinsic store_deref (ssa_561, ssa_562) (1) /* wrmask=x */
No shader-db changes on i965 (SKL).
vkpipeline-db results RADV (VEGA):
Totals from affected shaders:
SGPRS: 7832 -> 7728 (-1.33 %)
VGPRS: 6476 -> 6740 (4.08 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 469572 -> 456596 (-2.76 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 989 -> 960 (-2.93 %)
Wait states: 0 -> 0 (0.00 %)
The Max Waves and VGPRS changes here are misleading. What is
happening is a bunch of TCS outputs are being optimised away as
they are now recognised as unused. This results in more varyings
being compacted via nir_compact_varyings() which can result in
more register pressure when they are not packed in an optimal way.
This is an existing problem independent of this patch. I've run
some benchmarks and haven't noticed any performance regressions
in affected games.
Reviewed-by: Jason Ekstrand <[email protected]>
|