| Commit message | Author | Age | Files | Lines |
| |
mir_foreach_instr_in_block_safe() is based on list_for_each_entry_safe(),
which is designed to protect against removal of the current entry, but
removing the entry placed just after the current one leads to a
use-after-free.
Luckily, the midgard_pair_load_store() logic guarantees that the
instruction being removed (if any) is never placed just after ins, which
in turn guarantees that the hidden __next variable always points to a
valid object.
It took me a bit of time to realize that this code was safe, so I
suggest getting rid of the inner mir_foreach_instr_in_block_from()
loop and reworking the code so that the removed instruction is always
the current one (which is what the list_for_each_entry_safe() API was
designed for in the first place).
While at it, we also get rid of the unnecessary insert(ins)/remove(ins)
dance by simply moving the instruction around.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
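For illustration, a minimal sketch of the hazard (helper names such as
mir_next_op() and mir_remove_instruction() are quoted from memory of the
Midgard compiler and may differ):

    /* The safe iterator caches ins's successor in a hidden __next
     * variable before the loop body runs. */
    mir_foreach_instr_in_block_safe(block, ins) {
            mir_remove_instruction(ins);            /* fine */
    }

    mir_foreach_instr_in_block_safe(block, ins) {
            /* use-after-free: the cached __next is what we just freed */
            mir_remove_instruction(mir_next_op(ins));
    }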
| |
list_for_each_entry() does not allow modifying the current item pointer.
Let's rework the skip-instructions logic in schedule_block() to not
break this rule.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
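A sketch of the rule using Mesa's util/list.h macro (should_skip() and
next_interesting() are hypothetical): the macro derives the next element
from the cursor on every iteration, so reassigning the cursor corrupts
the walk; track the skip target in a separate variable instead.

    /* Broken: reassigning the cursor inside the loop. */
    list_for_each_entry(midgard_instruction, ins, &block->instructions, link) {
            if (should_skip(ins))
                    ins = next_interesting(ins);  /* undefined iteration */
    }

    /* OK: remember where to resume and skip with continue. */
    midgard_instruction *resume_at = NULL;
    list_for_each_entry(midgard_instruction, ins, &block->instructions, link) {
            if (resume_at && ins != resume_at)
                    continue;
            resume_at = NULL;
            /* ... process ins, possibly setting resume_at ... */
    }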
| |
A set of opcodes doesn't provide enough flexibility in certain cases.
E.g. Utgard PP has a vector conditional select operation, but the
condition is always scalar. Lowering all vector selects to scalar
increases the instruction count, so we need a way to filter only those
ops that can't be handled in hardware.
Reviewed-by: Qiang Yu <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
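For illustration, a hedged sketch of such a filter callback (the
signature matches NIR's nir_instr_filter_cb as I recall it; the specific
op choices are hypothetical, not lima's actual policy):

    static bool
    pp_scalarize_filter(const nir_instr *instr, const void *data)
    {
            if (instr->type != nir_instr_type_alu)
                    return false;

            nir_alu_instr *alu = nir_instr_as_alu((nir_instr *)instr);

            /* Vector csel stays vector: the PP supports it natively
             * (the condition is scalar anyway). Only ops the hardware
             * can't execute in vector form get scalarized. */
            return alu->op == nir_op_fddx || alu->op == nir_op_fddy;
    }

    /* ... */
    nir_lower_alu_to_scalar(shader, pp_scalarize_filter, NULL);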
| |
These helpers are unused, as flagged by cppcheck.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
| |
In practice, the new post-schedule print is just as useful.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
| |
It has not been used in a long time; I forgot this file even existed.
Flagged by cppcheck.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
| |
Miscellaneous minor issues flagged by cppcheck.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
| |
Flagged by cppcheck.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
| |
We already had a perfectly cromulent pass for this, but one landed in
common NIR code so let's switch and lighten our tree.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
This optimization depended on RA running before scheduling. It therefore
no longer applies and is now unused.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
This is a tradeoff.
Scheduling before RA means we don't do RA on what-will-become pipeline
registers. Importantly, it means the scheduler is able to reorder
instructions, as registers have not been decided yet.
Unfortunately, it also complicates register spilling, since the spills
themselves won't get bundled optimally and we can only spill twice per
ALU bundle (only one spill per bundle allowed here). It also prevents us
from eliminating dead moves introduced by register allocation, as they
are not dead before RA. The shader-db regressions are from poor spilling
choices introduced by the new bundling requirements. These could be
solved by the combination of a post-scheduler (to combine adjacent
spills into bundles) with a VLIW-aware spill cost calculation.
Nevertheless, the change is small enough that I feel it's worth it to
eat a tiny shader-db regression for the sake of flexibility.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Rather than using a pile of hacks and awkward constructs in MIR to
ensure the writeout parameter gets written into r0, let's add a
dedicated shadow register class for writeout (interfering with work
register r0) so we can express the writeout condition succinctly and
directly.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
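A rough sketch of the shadow-register idea using Mesa's
util/register_allocate.h (circa-2019 API where classes are unsigned
indices; mem_ctx, work_reg_count, and the register numbering are
illustrative):

    /* One extra allocable register that physically aliases r0. */
    unsigned shadow_r0 = work_reg_count;
    struct ra_regs *regs =
            ra_alloc_reg_set(mem_ctx, work_reg_count + 1, true);

    /* Anything in r0 conflicts with the writeout value and vice
     * versa, so the allocator keeps them apart automatically. */
    ra_add_reg_conflict(regs, 0 /* r0 */, shadow_r0);

    unsigned writeout_class = ra_alloc_reg_class(regs);
    ra_class_add_reg(regs, writeout_class, shadow_r0);
    ra_set_finalize(regs, NULL);

Writeout nodes are then simply assigned writeout_class instead of being
pinned to r0 by hand.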
| |
There's no slot for it; you'll end up writing into the void and
clobbering stuff. Don't do it.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
When running the register allocator after scheduling, the MIR looks a
little different, so we need to extend the RA to handle a few of these
extra cases correctly.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
After scheduling, we still have valid MIR, but with additional bundling
annotations which we would like to keep for debugging, so print these.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Rather than a vague "br.??" line, annotate the branch with its target
type (useful for disambiguating discards) and whether it was inverted.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
This is deadcode.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
I'm not sure if this is strictly necessary, but it makes debugging
easier and minimizes the diff with the experimental scheduler.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Scheduling occurs on a per-block basis, strongly assuming that a given
block contains at most a single branch. This does not always map to the
source NIR control flow, particularly when discard intrinsics are
involved. The solution is to allow scheduling barriers, which will
terminate a block early in code generation and open a new block.
To facilitate this, we need to move some post-block processing to a new
pass, rather than relying hackily on the current_block pointer.
This allows us to clean up some logic analyzing branches in other parts
of the driver as well, now that the MIR is much more well-formed.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
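Illustrative sketch of the barrier (create_empty_block() and
midgard_block_add_successor() are Midgard helpers quoted from memory;
is_scheduling_barrier() and emit_instr() are hypothetical):

    nir_foreach_instr(instr, nblock) {
            emit_instr(ctx, instr);

            /* A discard must end the MIR block: close it so it holds
             * at most one branch, then continue into a fresh block. */
            if (is_scheduling_barrier(instr)) {
                    midgard_block *next = create_empty_block(ctx);
                    midgard_block_add_successor(ctx->current_block, next);
                    ctx->current_block = next;
            }
    }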
| |
This allows multiblock blend shaders to compute constant colour offsets
correctly.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
It's sometimes convenient to call this with no instruction specified. By
definition, a missing instruction cannot reference any argument, so
let's check for NULL and short-circuit to false.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
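i.e., roughly (field names approximate to that era's MIR):

    bool
    mir_has_arg(midgard_instruction *ins, unsigned arg)
    {
            /* A missing instruction references no arguments. */
            if (!ins)
                    return false;

            for (unsigned i = 0; i < ARRAY_SIZE(ins->src); ++i) {
                    if (ins->src[i] == arg)
                            return true;
            }

            return false;
    }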
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
The branch has the writeout specified in its source list, making this
special even if it's not explicitly part of r0.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
In order to run register allocation after scheduling, it is sometimes
necessary to be able to insert instructions into an already-scheduled
program. This is suboptimal, since it forces us to do a worst-case
scheduling, but it is nevertheless required for correct handling of
spills/fills. Let's add helpers to insert instructions as standalone
bundles for use in spilling code.
These helpers are minimal -- they *only* work on load/store ops or
moves. They should not be used for anything but register spilling; any
other instructions should be added prior to the schedule.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
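A toy model of the idea, not the driver's actual data structures:
scheduled code as a singly linked list of bundles, with a spill move
spliced in as a worst-case bundle of one.

    #include <stdlib.h>

    struct instruction;                 /* opaque here */

    struct bundle {
            struct bundle *next;
            int instruction_count;
            struct instruction *instructions[1];
    };

    /* Insert `ins` as a standalone bundle through the link that
     * points at the bundle it must precede. */
    static struct bundle *
    insert_standalone_bundle(struct bundle **link, struct instruction *ins)
    {
            struct bundle *b = calloc(1, sizeof(*b));
            b->instruction_count = 1;
            b->instructions[0] = ins;
            b->next = *link;
            *link = b;
            return b;
    }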
| |
While it doesn't matter with an unconditional move to the conditional
register (r31), when we try to elide that move we'll need to track the
swizzle explicitly, and there is no slot for that yet since ALU ops are
normally binary.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
This ensures the block has exactly one branch, which makes scheduling
happy.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Oh boy. Midgard scheduling is crazy... These are all just the
requirements, not even the algorithm yet.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
This will allow us to reference the condition while scheduling.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
It doesn't really matter but... meh.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
...to distinguish it from scalar csel.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
The scheduler would like to use these.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
The scheduler shouldn't need to worry about this.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
This helper doesn't need to be in the giant loop.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
This does not affect shaders in any way. Rather, it makes the shader-db
instruction count recorded in the compiler accurate with the in-order
scheduler, matching up with what we calculate from pandecode.
Though shaders are the same, instruction counts cannot be compared
across this commit for this reason.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Instructions attached to blocks are never explicitly freed. Let's
use ralloc() to attach those objects to the compiler context so that
they are automatically freed when the ctx object is freed.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
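Sketch of the pattern (ralloc is Mesa's hierarchical allocator from
ralloc.h; ralloc(ctx, type) parents the allocation to ctx):

    /* Before: heap allocation that is never freed. */
    midgard_instruction *ins = malloc(sizeof(*ins));

    /* After: parented to the compiler context, so ralloc_free(ctx)
     * frees every attached instruction as well. */
    midgard_instruction *ins = ralloc(ctx, midgard_instruction);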
| |
midgard_compile_shader_nir()"
This reverts commit 5882e0def97a47aff050f5a3f412b97a7f440e27.
This commit causes a segfault.
Signed-off-by: Boris Brezillon <[email protected]>
| |
Add an assert() in schedule_bundle() to make sure all instruction
pointers in bundle.instructions[] are valid.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
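i.e., something along the lines of (sketch):

    for (unsigned i = 0; i < bundle.instruction_count; ++i)
            assert(bundle.instructions[i]);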
| |
midgard_compile_shader_nir()
Right now we're leaking all block and instruction objects allocated by
the compiler. Let's clean things up before leaving
midgard_compile_shader_nir().
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
| |
To avoid memory leaks.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
| |
The invert wasn't applying (correctly) due to the issues addressed here.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
This is just a bit of refactoring to simplify MIR.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
When moving constants, if switching to a floating-point representation
doesn't break anything, we'd rather have an fmov than an imov,
permitting inlining the constant in many circumstances.
total quadwords in shared programs: 3408 -> 3366 (-1.23%)
quadwords in affected programs: 1188 -> 1146 (-3.54%)
helped: 41
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 9.65% x̃: 11.11%
95% mean confidence interval for quadwords value: -1.07 -0.98
95% mean confidence interval for quadwords %-change: -11.38% -7.93%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
| |
Storing constants as floats doesn't make sense when we have integer
instructions; better to store them natively as integers and coerce
to/from float as needed, rather than the opposite.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
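The usual mechanism is a bit-level cast rather than a value conversion;
Mesa has fui()/uif() helpers for this in util/u_math.h. A self-contained
sketch:

    #include <stdint.h>
    #include <string.h>

    static inline uint32_t
    float_to_bits(float f)
    {
            uint32_t u;
            memcpy(&u, &f, sizeof(u));   /* bit-exact reinterpretation */
            return u;
    }

    /* Constants live natively as uint32_t; only the float boundary
     * coerces. */
    uint32_t c = float_to_bits(1.0f);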
| |
fdph instructions show up in some desktop GL shaders.
Signed-off-by: Alyssa Rosenzweig <[email protected]>