| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
load_front_face and load_helper_invocation produce booleans.
On Broadwell:
total instructions in shared programs: 11638956 -> 11638011 (-0.01%)
instructions in affected programs: 115093 -> 114148 (-0.82%)
helped: 628
HURT: 14
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
I don't want src_is_bool() and src_is_type(x, nir_type_bool) to behave
differently. Having the logic spread out over three functions makes it
harder to decide where to put new logic, as well.
So, combine them all. It's a bit simpler because there's now only one
recursive function rather than a pair of mutually recursive functions.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, 'a@type' can only match if 'a' is produced by an ALU
instruction. This is rather limited - there are other cases we
can easily detect which we should handle.
Extending the code in-place would be fairly messy, so we introduce
a new src_is_type() helper.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some optimizations, like converting integer multiply/divide into left/
right shifts, have additional constraints on the search expression.
Like requiring that a variable is a constant power of two. Support
these cases by allowing a fxn name to be appended to the search var
expression (ie. "a#32(is_power_of_two)").
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
There seemed to be missing one break in nested switchcases.
Signed-off-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: Antia Puentes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
%ld and %lu aren't the right format specifiers for int64_t and uint64_t
on 32-bit (x86) systems. They're %zu on Linux and %Iu on Windows.
Use the standard C99 macros in hopes that they work everywhere.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the first pass of implementing exact handling, I made a mistake with
search-and-replace. In particular, we only reallly handled exact/inexact
on the root of the tree. Instead, we need to check every node in the tree
for an exact/inexact match. As an example of this, consider the following
GLSL code
precise float a = b + c;
if (a < 0) {
do_stuff();
}
In that case, only the add will be declared "exact" and an expression that
looks for "b + c < 0" will still match and replace it with "b < -c" which
may yield different results. The solution is to simply bail if any of the
values are exact when matching an inexact expression.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In many places, the convention is to pass an existing ssadef name ptr
when construction/initializing a new nir_ssa_def. But that goes badly
(as noticed by garbage in nir_print output) when the original string
gets freed.
Just use ralloc_strdup() instead, and add ralloc_free() in the two
places that would care (not that the strings wouldn't eventually get
freed anyways).
Also fixup the nir_search code which was directly setting ssadef->name
to use the parent instruction as memctx.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we replace an expresion we have to compute bitsize information for the
replacement. We do this in two passes to validate that bitsize information
is consistent and correct: first we propagate bitsize from child nodes to
parent, then we do it the other way around, starting from the original's
instruction destination bitsize.
v2 (Iago):
- Always use nir_type_bool32 instead of nir_type_bool when generating
algebraic optimizations. Before we used nir_type_bool32 with constants
and nir_type_bool with variables.
- Fix bool comparisons in nir_search.c to account for bitsized types.
v3 (Sam):
- Unpack the double constant value as unsigned long long (8 bytes) in
nir_algrebraic.py.
v4 (Sam):
- Use helpers to get type size and base type from nir_alu_type.
Signed-off-by: Iago Toral Quiroga <[email protected]>
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Squash multiple commits addressing the new parameter in different
files so we don't break the build (Iago)
v3: Fix tgsi (Samuel)
v4: Fix nir_clone.c (Samuel)
v5: Fix vc4 and freedreno (Iago)
v6 (Sam)
- Fix build errors in nir_lower_indirect_derefs
- Use helper to get type size from nir_alu_type.
Signed-off-by: Iago Toral Quiroga <[email protected]>
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Tested-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|