| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
"Result is the same as computing the sum of the absolute values of
OpDPdx and OpDPdy on P."
We were doing sum of absolute values of OpDPdx of P and OpDPdx of NULL.
|
|
|
|
|
|
| |
These were originally added to reduce compiler warnings but aren't really
needed. Getting rid of them reduces the diff between the Vulkan branch and
master, so we might as well.
|
| |
|
| |
|
|\
| |
| |
| |
| | |
This also reverts commit 1d65abfa582a371558113f699ffbf16d60b64c90 because
now NIR handles texture offsets in a much more sane way.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When NIR was originally drafted, there was no easy way to determine if
something was constant or not. The result was that we had lots of
special-casing for constant values such as this. Now that load_const
instructions are SSA-only, it's really easy to find constants and this
isn't really needed anymore.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This fixes two issues. First, we had a use-after-free in the case where
the instruction got deleted and we tried to return mov->dest.write_mask.
Second, in the case where we are doing a self-mov of a register, we delete
those channels that are moved to themselves from the write-mask. This
means that those channels aren't reported as being handled even though they
are. We now stash off the write-mask before remove unneeded channels so
that they still get reported as handled.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94073
Reviewed-by: Matt Turner <[email protected]>
Cc: "11.0 11.1" <[email protected]>
|
| |
| |
| |
| | |
We were pulling the member index from the wrong dword
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The sub-determinate implementation pattern fixed by
6a7e2904e0a2a6f8efbf739a1b3cad7e1e4ab42d has a second instance in the
same file.
With the previous algorithm, when row and j are both 3, the index
overruns the array. This only impacts the stack on 32 bit builds.
Reviewed-by: Jason Ekstrand <[email protected]>
|
| | |
|
|\|
| |
| |
| | |
This pulls in the separate texture/sampler stuff from upstream
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This commit adds the capability to NIR to support separate textures and
samplers. As it currently stands, glsl_to_nir only sets the texture deref
and leaves the sampler deref alone as it did before and nir_lower_samplers
assumes this. Backends can still assume that they are combined and only
look at only at the texture index. Or, if they wish, they can assume that
they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir
all set both texture and sampler index whenever a sampler is required (the
two indices are the same in this case).
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We're about to separate the two concepts. When we do, the sampler will
become optional. Doing a rename first makes the separation a bit more
safe because drivers that depend on GLSL or TGSI behaviour will be fine to
just use the texture index all the time.
Reviewed-by: Kenneth Graunke <[email protected]>
|
| | |
|
| | |
|
|\|
| |
| |
| | |
This pulls in Rob Clark's const_index changes for NIR
|
| |
| |
| |
| |
| | |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Direct access to intr->const_index[n], where different slots have
different meanings, is somewhat confusing.
Instead, let's put some extra info in nir_intrinsic_infos[] about which
slots map to what, and add some get/set helpers. The helpers validate
that the field being accessed (base/writemask/etc) is applicable for the
intrinsic opc, for some extra safety. And nir_print can use this to
dump out decoded const_index fields.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
| |
| |
| |
| |
| |
| | |
These are used in GLSL IR to removed unused varyings and match
transform feedback variables. There is no need to use these in NIR.
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Helps 11 shaders in UnrealEngine4 demos.
I seriously hope they would have given us bitfieldReverse() if we
exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that
anyway?).
instructions in affected programs: 4875 -> 4633 (-4.96%)
cycles in affected programs: 270516 -> 244516 (-9.61%)
I suspect there's a *lot* of room to improve nir_search/opt_algebraic's
handling of this. We'd actually like to match, e.g., step2 by matching
step1 once and then doing a pointer comparison for the second instance
of step1, but unfortunately we generate an enormous tuple for instead.
The .text size increases by 6.5% and the .data by 17.5%.
text data bss dec hex filename
22957 45224 0 68181 10a55 nir_libnir_la-nir_opt_algebraic.o
24461 53160 0 77621 12f35 nir_libnir_la-nir_opt_algebraic.o
I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is
in a GL 4.0 context once we expose GL 4.0.
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The next patch adds an algebraic rule that uses the constant 0xff00ff00.
Without this change, the build fails with
return hex(struct.unpack('I', struct.pack('i', self.value))[0])
struct.error: 'i' format requires -2147483648 <= number <= 2147483647
The hex() function handles integers of any size, and assigning a
negative value to an unsigned does what we want in C. The pack/unpack is
unnecessary (and as we see, buggy).
Reviewed-by: Dylan Baker <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Walking the SSA definitions in order means that we consider the smallest
algebraic optimizations before larger optimizations. So if a smaller
rule is part of a larger rule, the smaller one will happen first,
preventing the larger one from happening.
instructions in affected programs: 32721 -> 32611 (-0.34%)
helped: 106
In programs whose nir_optimize loop count changes (129 of them):
before: 1164 optimization loops
after: 1071 optimization loops
Of the 129 affected, 16 programs' optimization loop counts increased.
Prevents regressions and annoyances in the next commits.
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
| |
| |
| |
| |
| | |
Prevents regressions in the next commit.
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
instructions in affected programs: 668 -> 664 (-0.60%)
helped: 4
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
Previously, in order to get things working, we just always shadowed
variables. Now, we rewrite derefs whenever it's safe to do so and only
shadow if we have an in or out variable that we write or read to
respectively.
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| | |
Parameterize build_asin() on the fit coefficients so the
implementation can be shared while still using different polynomials
for asin and acos. Also switch back to implementing acos in terms of
asin -- The improvement obtained from cancelling out the pi/2 terms
was negligible compared to the approximation error.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
vtn_handle_type() creates a signed type regardless of the value of the
signedness flag, which usually doesn't make much of a difference
except when the type is used as base sampled type of an image type,
what will cause the base type of the NIR image variable to be
inconsistent with its format and cause an assertion failure in the
back-end (most likely only reproducible on Gen7), and may change the
semantics of the image intrinsic subtly (e.g. UMIN may become IMIN).
|
|\| |
|
| |
| |
| |
| | |
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
| |
| |
| |
| |
| |
| | |
The uint versions zero extend while the int versions sign extend.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
| |
| |
| |
| |
| |
| | |
i965/fs was the only consumer, and we're now doing the lowering in NIR.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
| |
| |
| |
| |
| |
| | |
Strangely the return and parameter types were reversed.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|/
|
|
| |
This pulls in the patches that move all of the compiler stuff around
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|
|
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|