| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
| |
It used to be in nir_gather_info.c until I moved it out to nir.h so it
could be re-used with some linking code that never got merged. We'll move
it back out if and when we have real code to share it with.
|
|\ |
|
| |
| |
| |
| |
| |
| | |
This correlates directly to the SPIR-V opcode OpQuantizeToF16
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Oddly, this did not affect the shader where I first noticed the pattern.
That particular shader doesn't get its if-statement converted to a bcsel
because there are two assignments in the else-statement. This led to me
submitting https://bugs.freedesktop.org/show_bug.cgi?id=94747.
shader-db results:
Sandy Bridge
total instructions in shared programs: 8467384 -> 8467069 (-0.00%)
instructions in affected programs: 36594 -> 36279 (-0.86%)
helped: 46
HURT: 0
total cycles in shared programs: 117573448 -> 117568518 (-0.00%)
cycles in affected programs: 339114 -> 334184 (-1.45%)
helped: 46
HURT: 0
Ivy Bridge / Haswell / Broadwell / Skylake:
total instructions in shared programs: 7774258 -> 7773999 (-0.00%)
instructions in affected programs: 30874 -> 30615 (-0.84%)
helped: 46
HURT: 0
total cycles in shared programs: 65739190 -> 65734530 (-0.01%)
cycles in affected programs: 180380 -> 175720 (-2.58%)
helped: 45
HURT: 1
No change on G45 or Ironlake.
I also tried these expressions, but none of them affected any shaders in
shader-db:
(('bcsel', a, 'a@bool', 'b@bool'), ('ior', a, b)),
(('bcsel', a, 'b@bool', False), ('iand', a, b)),
(('bcsel', a, 'b@bool', 'a@bool'), ('iand', a, b)),
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
total instructions in shared programs: 7112159 -> 7088092 (-0.34%)
instructions in affected programs: 1374915 -> 1350848 (-1.75%)
helped: 7392
HURT: 621
GAINED: 2
LOST: 2
|
| |
| |
| |
| | |
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Previously, the pass assumed that the entrypoint would be whatever function
happened to have the name "main". We really shouldn't trust in the
function names.
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| |
| |
| |
| | |
They are no longer in the list of local variables so we need to explicitly
sweep them.
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Rob Clark <[email protected]>
|
| |
| |
| |
| |
| |
| | |
The SPIR-V spec says geometry shaders are supposed to have one invocation
by default. The execution mode is only required if there are multiple
invocations.
|
| |
| |
| |
| |
| | |
NIR now just handles this for us by not fusing if the multiply is marked as
exact.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In the first pass of implementing exact handling, I made a mistake with
search-and-replace. In particular, we only reallly handled exact/inexact
on the root of the tree. Instead, we need to check every node in the tree
for an exact/inexact match. As an example of this, consider the following
GLSL code
precise float a = b + c;
if (a < 0) {
do_stuff();
}
In that case, only the add will be declared "exact" and an expression that
looks for "b + c < 0" will still match and replace it with "b < -c" which
may yield different results. The solution is to simply bail if any of the
values are exact when matching an inexact expression.
|
| | |
|
| |
| |
| |
| | |
This is similar to the way that regular ALU operations are handled.
|
| |
| |
| |
| |
| | |
This was useful once upon a time but now that we have a real Vulkan driver
to run our SPIR-V binaries through, there's really no point.
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
| | |
|
|\| |
|
| |
| |
| |
| |
| |
| |
| | |
This commit adds a new NIR pass that lowers all function calls away by
inlining the functions.
Reviewed-by: Jordan Justen <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Jordan Justen <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This commit adds a NIR pass for lowering away returns in functions. If the
return is in a loop, it is lowered to a break. If it is not in a loop,
it's lowered away by moving/deleting code as needed.
Reviewed-by: Jordan Justen <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Jordan Justen <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Jordan Justen <[email protected]>
|
| |
| |
| |
| |
| | |
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Jordan Justen <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This can happen if a function ends in a return instruction and you remove
the return.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Jordan Justen <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The efficiency should be approximately the same. We do a little more work
per phi node because we have to sort the predecessors. However, we no
longer have to walk the blocks a second time to pop things off the stack.
The bigger advantage, however, is that we can now re-use the phi placement
and per-block SSA value tracking in other passes.
As a side-benifit, the phi builder actually handles unreachable blocks
correctly. The original vars_to_ssa code, because of the way it iterated
the blocks and added phi sources, didn't add sources corresponding to
predecessors of unreachable blocks. The new strategy employed by the phi
builder creates a phi source for each predecessor and should correctly
handle unreachable blocks by setting those sources to SSA undefs.
Reviewed-by: Jordan Justen <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Previously, nir_dominance.c didn't properly handle unreachable blocks.
This can happen if, for instance, you have something like this:
loop {
if (...) {
break;
} else {
break;
}
}
In this case, the block right after the if statement will be unreachable.
This commit makes two changes to handle this. First, it removes an assert
and allows block->imm_dom to be null if the block is unreachable. Second,
it properly skips unreachable blocks in calc_dom_frontier_cb.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Right now, we have phi placement code in two places and there are other
places where it would be nice to be able to do this analysis. Instead of
repeating it all over the place, this commit adds a helper for placing all
of the needed phi nodes for a value.
v2: Add better documentation
Reviewed-by: Jordan Justen <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In many places, the convention is to pass an existing ssadef name ptr
when construction/initializing a new nir_ssa_def. But that goes badly
(as noticed by garbage in nir_print output) when the original string
gets freed.
Just use ralloc_strdup() instead, and add ralloc_free() in the two
places that would care (not that the strings wouldn't eventually get
freed anyways).
Also fixup the nir_search code which was directly setting ssadef->name
to use the parent instruction as memctx.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Many of our optimizations, while great for cutting shaders down to size,
aren't really precision-safe. This commit tries to flag all of the
inexact floating-point optimizations so they don't get run on values that
are flagged "exact". It's a bit conservative and maybe flags some safe
optimizations as unsafe but that's better than missing one.
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
| |
| |
| |
| |
| |
| | |
The previous transformation got the arguments to fmin backwards. When NaNs
are involved, the GLSL min/max aren't commutative so it matters.
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
| |
| |
| |
| |
| |
| | |
The fxor opcode is required to return 1.0f or 0.0f but the input variable
may not be 1.0f or 0.0f.
Reviewed-by: Francisco Jerez <[email protected]>
|