| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
This allow multiblock blend shaders to compute constant colour offsets
correctly.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Oh boy. Midgard scheduling is crazy... These are all just the
requirements, not even the algorithm yet.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
The scheduler shouldn't need to worry about this.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
This helper doesn't need to be in the giant loop.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Instructions attached to blocks are never explicitly freed. Let's
use ralloc() to attach those objects to the compiler context so that
they are automatically freed when the ctx object is freed.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Add an assert() in schedule_bundle() to make sure all instruction
pointers in bundle.instructions[] are valid.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
This is just a bit of refactoring to simplify MIR.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
shader-db regression in the scheduler.
Fixes: dff4986b1aa ("pan/midgard: Emit store_output branch just-in-time")
total bundles in shared programs: 2055 -> 2019 (-1.75%)
bundles in affected programs: 1055 -> 1019 (-3.41%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 20.00% x̄: 6.71% x̃: 5.16%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -8.45% -4.97%
Bundles are helped.
total quadwords in shared programs: 3444 -> 3408 (-1.05%)
quadwords in affected programs: 1897 -> 1861 (-1.90%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 14.29% x̄: 3.97% x̃: 2.99%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -5.08% -2.86%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This allows nodes to be unsigned and prevents a class of weird
signedness bugs identified by Coverity.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
It's small; this way we don't leak memory.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Multiple spill moves share a single spill slot. Issue found in Krita.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This allows us to have multiple spill moves, whereas otherwise for N
spill moves, the first N-1 would be clobbered. Issue found in Krita.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This used a delicate hack to try to find indirect inputs and skip them
as candidates for pairing. Let's use a better criterion -- no sources --
and pair based on that.
We could do better, but that would require more complex data flow
analysis than we're interested in doing here.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Hint for the RA to avoid infinite spilling loops.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
This is a corner case that happens a lot with SSBOs. Basically, if we
only read a few components of a uniform, we need to only spill a few
components or otherwise we try to spill what we spilled and RA hangs.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
No glmark changes, but this seems like a good idea.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that register spilling is in place, this is reasonable. It turns out
for some shaders, it's actually better to cap at 8 work registers and
extra >8 uniform reigsters and tolerate the spilling, since the extra
resulting threads make up for the spillage. So incidentally, the shader
that spills here is in -bterrain, which jumps from 19fps to 21fps as a
result of this change.
total instructions in shared programs: 3513 -> 3448 (-1.85%)
instructions in affected programs: 776 -> 711 (-8.38%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 3.25 x̃: 2
helped stats (rel) min: 3.57% max: 16.00% x̄: 8.37% x̃: 7.19%
95% mean confidence interval for instructions value: -4.28 -2.22
95% mean confidence interval for instructions %-change: -10.02% -6.73%
Instructions are helped.
total bundles in shared programs: 2067 -> 2024 (-2.08%)
bundles in affected programs: 515 -> 472 (-8.35%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 6 x̄: 2.37 x̃: 2
helped stats (rel) min: 2.13% max: 17.86% x̄: 10.19% x̃: 11.11%
HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel) min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
95% mean confidence interval for bundles value: -3.01 -1.29
95% mean confidence interval for bundles %-change: -12.13% -6.91%
Bundles are helped.
total quadwords in shared programs: 3468 -> 3426 (-1.21%)
quadwords in affected programs: 764 -> 722 (-5.50%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 5 x̄: 2.26 x̃: 2
helped stats (rel) min: 1.41% max: 12.50% x̄: 6.76% x̃: 7.14%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 1.08% max: 1.08% x̄: 1.08% x̃: 1.08%
95% mean confidence interval for quadwords value: -2.83 -1.37
95% mean confidence interval for quadwords %-change: -8.08% -4.65%
Quadwords are helped.
total registers in shared programs: 383 -> 360 (-6.01%)
registers in affected programs: 112 -> 89 (-20.54%)
helped: 19
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 1.21 x̃: 1
helped stats (rel) min: 12.50% max: 27.27% x̄: 20.63% x̃: 20.00%
95% mean confidence interval for registers value: -1.47 -0.95
95% mean confidence interval for registers %-change: -22.39% -18.87%
Registers are helped.
total threads in shared programs: 432 -> 451 (4.40%)
threads in affected programs: 19 -> 38 (100.00%)
helped: 11
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.73 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.41 2.04
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are [helped].
total loops in shared programs: 4 -> 4 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total spills in shared programs: 0 -> 4
spills in affected programs: 0 -> 4
helped: 0
HURT: 2
total fills in shared programs: 0 -> 7
fills in affected programs: 0 -> 7
helped: 0
HURT: 2
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
No functional changes, just breaks out a megamonster function and fixes
the indentation.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We need three independent sources to support indirect SSBO writes (as
well as textures with both LOD/bias and offsets). Now is a good time to
make sources just an array so we don't have to rewrite a ton of code if
we ever needed a fourth source for some reason.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The 16-bit field can be decomposed to two independent 8-bit fields, each
representing a single (additional) argument to the load/store op,
generally used for encoding registers. Addressable registers here are
substantially limited compared to the main register in a load/store op.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Another constant source of bugs. Ain't that special.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
It's not that special.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Rather than putting registers after SSA in the MIR indexing, put them
side-by-side, shifted 1, using the bottom bit as the SSA/reg select.
This will allow us to generate SSA temps in the compiler.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make scalar scheduling onto vector units more aggressive (it can only
help while we schedule strictly in order). Also, allow imov on VLUT.
total bundles in shared programs: 2176 -> 2117 (-2.71%)
bundles in affected programs: 901 -> 842 (-6.55%)
helped: 24
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 2.46 x̃: 2
helped stats (rel) min: 2.08% max: 20.00% x̄: 8.68% x̃: 5.94%
95% mean confidence interval for bundles value: -3.93 -0.99
95% mean confidence interval for bundles %-change: -10.92% -6.45%
Bundles are helped.
total quadwords in shared programs: 3605 -> 3566 (-1.08%)
quadwords in affected programs: 1984 -> 1945 (-1.97%)
helped: 28
HURT: 5
helped stats (abs) min: 1 max: 3 x̄: 1.68 x̃: 2
helped stats (rel) min: 1.02% max: 14.29% x̄: 5.12% x̃: 2.94%
HURT stats (abs) min: 1 max: 3 x̄: 1.60 x̃: 1
HURT stats (rel) min: 0.57% max: 9.09% x̄: 6.40% x̃: 9.09%
95% mean confidence interval for quadwords value: -1.67 -0.69
95% mean confidence interval for quadwords %-change: -5.37% -1.37%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We loosen the requirement of "no dependencies" to simply be "no
non-pipelined dependencies", so we check for what could be pipelined.
total bundles in shared programs: 2176 -> 2156 (-0.92%)
bundles in affected programs: 779 -> 759 (-2.57%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.33% max: 20.00% x̄: 6.47% x̃: 2.78%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -9.44% -3.50%
Bundles are helped.
total quadwords in shared programs: 3605 -> 3585 (-0.55%)
quadwords in affected programs: 1391 -> 1371 (-1.44%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 14.29% x̄: 3.84% x̃: 1.64%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -5.73% -1.94%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than always emitting an extra move for fragments, check the
actual criteria and emit accordingly. (This was lost during the RA
improvements at the end of May).
total bundles in shared programs: 2210 -> 2176 (-1.54%)
bundles in affected programs: 501 -> 467 (-6.79%)
helped: 34
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.59% max: 33.33% x̄: 13.13% x̃: 12.50%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -16.06% -10.21%
Bundles are helped.
total quadwords in shared programs: 3639 -> 3605 (-0.93%)
quadwords in affected programs: 795 -> 761 (-4.28%)
helped: 34
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.96% max: 33.33% x̄: 11.22% x̃: 8.33%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -14.31% -8.13%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Think of this pass as register coalescing part 2. After RA runs, but
before scheduling, we scan for code of the form:
mov rN, rN
and delete the move, since it's totally redundant. This pass helps
already, but it'd of course be much more effective paired with
register coalescing to encourage moves in general to end up in this
form. Nevertheless, even by itself:
total instructions in shared programs: 3665 -> 3613 (-1.42%)
instructions in affected programs: 2046 -> 1994 (-2.54%)
helped: 52
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 8.02% x̃: 4.00%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -10.26% -5.79%
Instructions are helped.
total bundles in shared programs: 2256 -> 2213 (-1.91%)
bundles in affected programs: 1154 -> 1111 (-3.73%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.33% max: 25.00% x̄: 9.10% x̃: 5.56%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -11.60% -6.60%
Bundles are helped.
total quadwords in shared programs: 3689 -> 3642 (-1.27%)
quadwords in affected programs: 2025 -> 1978 (-2.32%)
helped: 47
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 7.86% x̃: 3.85%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -10.30% -5.42%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 3916 -> 3665 (-6.41%)
instructions in affected programs: 1405 -> 1154 (-17.86%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 21 x̄: 7.17 x̃: 3
helped stats (rel) min: 3.00% max: 28.57% x̄: 20.11% x̃: 21.74%
95% mean confidence interval for instructions value: -9.35 -4.99
95% mean confidence interval for instructions %-change: -22.75% -17.46%
Instructions are helped.
total bundles in shared programs: 2472 -> 2256 (-8.74%)
bundles in affected programs: 906 -> 690 (-23.84%)
helped: 32
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 6.75 x̃: 3
helped stats (rel) min: 5.56% max: 32.26% x̄: 20.83% x̃: 16.67%
95% mean confidence interval for bundles value: -9.09 -4.41
95% mean confidence interval for bundles %-change: -23.77% -17.89%
Bundles are helped.
total quadwords in shared programs: 3965 -> 3689 (-6.96%)
quadwords in affected programs: 1568 -> 1292 (-17.60%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 21 x̄: 7.89 x̃: 3
helped stats (rel) min: 2.08% max: 28.57% x̄: 19.87% x̃: 20.00%
95% mean confidence interval for quadwords value: -10.38 -5.39
95% mean confidence interval for quadwords %-change: -22.57% -17.17%
Quadwords are helped.
total registers in shared programs: 411 -> 392 (-4.62%)
registers in affected programs: 76 -> 57 (-25.00%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.27 x̃: 1
helped stats (rel) min: 9.09% max: 50.00% x̄: 30.97% x̃: 33.33%
95% mean confidence interval for registers value: -1.52 -1.01
95% mean confidence interval for registers %-change: -39.12% -22.82%
Registers are helped.
total threads in shared programs: 426 -> 432 (1.41%)
threads in affected programs: 6 -> 12 (100.00%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We shouldn't try to schedule onto a vmul if the last unit was a smul;
that would force a break ("traveling back in time").
total bundles in shared programs: 2519 -> 2472 (-1.87%)
bundles in affected programs: 791 -> 744 (-5.94%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 2.35 x̃: 1
helped stats (rel) min: 1.52% max: 11.76% x̄: 7.94% x̃: 7.69%
95% mean confidence interval for bundles value: -3.47 -1.23
95% mean confidence interval for bundles %-change: -9.36% -6.51%
Bundles are helped.
total quadwords in shared programs: 4028 -> 3965 (-1.56%)
quadwords in affected programs: 1223 -> 1160 (-5.15%)
helped: 17
HURT: 0
helped stats (abs) min: 1 max: 17 x̄: 3.71 x̃: 2
helped stats (rel) min: 2.97% max: 10.64% x̄: 6.97% x̃: 7.14%
95% mean confidence interval for quadwords value: -5.71 -1.70
95% mean confidence interval for quadwords %-change: -8.03% -5.91%
Quadwords are helped.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Given the constraints on special registers, we add a helper for lowering
these by inserting moves (copies) where needed to satsify the ISA
constraints.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
We reuse the same register spilling mechanism as for work->memory to
spill special->work registers, e.g. to allow writing out more than 2
vec4 varyings (without better scheduling anyway).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Route this info through so we can track how we're doing on register
spilling.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This was disabled to permit regression-free RA work. Now that the spill
code is in place, we can reenable, with some caveats about efficacy.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Pipe through the number of bytes of spilled memory used from the
compiler into the main driver, where it will be used to allocate the
Thread Local Storage buffer.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we run RA in a loop, before each iteration after a failed
allocation we choose a spill node and spill it to Thread Local Storage
using st_int4/ld_int4 instructions (for spills and fills respectively).
This allows us to compile complex shaders that normally would not fit
within the 16 work register limits, although it comes at a fairly steep
performance penalty.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
These are used to load/store from Thread Local Storage, which is memory
allocated per-thread (corresponding to ctx->scratchpad in the command
stream) and used for register spilling.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Rather than creating either a load or a uniform register read with a
fixed beginning offset, we always create a load and then promote to a
uniform register later. This will allow us to promote in a register
pressure aware manner.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will allow us to insert instructions as a result of register
allocation, permitting spilling to be implemented. As a side effect,
with the assert commented out this would fix a bunch of glamor crashes
(due to RA failures) so MATE becomes useable.
Ideally we'll have scheduling or RA actually sorted out before the
branch point but if not this gives us a one-line out to get X working...
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts commit 812ce2ce9e5655613eae740926176509897122fa.
We massively regress with the reverted patch. So in the meantime, take
it out.
Signed-off-by: Tomeu Vizoso <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Fixes this assertion:
../mesa/src/panfrost/midgard/midgard_schedule.c:507:schedule_block: Assertion `ins == __next && "use _safe iterator"' failed.
Trace/breakpoint trap
Signed-off-by: Tomeu Vizoso <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
In preparation for a Panfrost-based non-Gallium driver (maybe
Vulkan...?), hoist everything except for the Gallium driver into a
shared src/panfrost. Practically, that means the compilers, the headers,
and pandecode.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|