path: root/src/panfrost/midgard/midgard_ra.c
Commit message (Author, Age, Files, Lines)
* pan/midgard: Remove util/ra support (Alyssa Rosenzweig, 2019-11-13, 1 file, -243/+22)
  It's now unused, in favour of LCRA.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Integrate LCRA (Alyssa Rosenzweig, 2019-11-13, 1 file, -57/+67)
  Pretty routine, though we do have a hack to force swizzle alignment for !32-bit until we implement !32-bit the right way.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend default_phys_reg to !32-bit (Alyssa Rosenzweig, 2019-11-04, 1 file, -5/+5)
  We can pass through a size.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend offset_swizzle to non-32-bit (Alyssa Rosenzweig, 2019-11-04, 1 file, -3/+4)
  We take a size parameter; use it.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: offset_swizzle doesn't need dstsize (Alyssa Rosenzweig, 2019-11-04, 1 file, -9/+9)
  This argument should be omitted.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add bizarre corner case (Alyssa Rosenzweig, 2019-11-04, 1 file, -1/+8)
  Someone really needs to look into this.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Compute bundle interference (Alyssa Rosenzweig, 2019-11-04, 1 file, -0/+57)
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Eliminate blank_alu_src (Alyssa Rosenzweig, 2019-11-01, 1 file, -3/+2)
  We don't need it in practice, so this is some more cleanup.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Refactor swizzles (Alyssa Rosenzweig, 2019-11-01, 1 file, -22/+16)
  Rather than having hw-specific swizzles encoded directly in the instructions, have a unified swizzle array so we can manipulate swizzles generically.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
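One generic manipulation that a unified swizzle array makes easy is composition. The following is a minimal sketch only, assuming a swizzle is a plain array of component indices; `TOY_COMPONENTS` and `toy_swizzle_compose` are hypothetical names and do not come from midgard_ra.c.

```c
#include <stdint.h>

/* Hypothetical: four components, as in a 32-bit vec4 */
#define TOY_COMPONENTS 4

/* Compose two swizzles: reading through `out` is equivalent to applying
 * `outer` to a value that was already swizzled by `inner`.
 * Example: (v.yzwx).xxzw == v.yywx */
static void
toy_swizzle_compose(const uint8_t inner[TOY_COMPONENTS],
                    const uint8_t outer[TOY_COMPONENTS],
                    uint8_t out[TOY_COMPONENTS])
{
        for (unsigned c = 0; c < TOY_COMPONENTS; ++c)
                out[c] = inner[outer[c]];
}
```

Because the array holds plain indices, the same loop works regardless of how the hardware eventually packs the swizzle bits.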
* pan/midgard: Add a dummy source for loads (Alyssa Rosenzweig, 2019-11-01, 1 file, -7/+4)
  We want symmetry between loads and stores, so we add a dummy source. So we get, e.g.

      st_int4 _, val, arg_1, arg_2
      ld_int4 dest, _, arg_1, arg_2

  Semantically, this dummy source represents the data itself, as if the load is simply a move. That means it has a swizzle that acts as a source.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Express allocated registers as offsets (Alyssa Rosenzweig, 2019-10-25, 1 file, -104/+62)
  Rather than supplying a mask/swizzle to compose with the original, just supply the offset of the allocated register so we can directly offset the mask/swizzle, without resorting to composition. This is simpler, cleaner, and will generalize to non-32-bit.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
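As a rough illustration of offsetting instead of composing, here is a minimal sketch for the 32-bit vec4 case. The helper names `toy_offset_mask` and `toy_offset_swizzle` are hypothetical, and the real offset_swizzle in this file takes additional size parameters that are not modelled here.

```c
#include <stdint.h>

/* Shift a 4-bit write mask so it starts at the allocated component.
 * Assumes the shifted mask still fits in the register. */
static unsigned
toy_offset_mask(unsigned mask, unsigned offset)
{
        return (mask << offset) & 0xF;
}

/* Add the allocated offset to every swizzle component, so reads start
 * at the same component the value was written to.
 * Assumes swizzle[i] + offset stays below 4 for every i. */
static void
toy_offset_swizzle(uint8_t *swizzle, unsigned count, unsigned offset)
{
        for (unsigned i = 0; i < count; ++i)
                swizzle[i] = (uint8_t)(swizzle[i] + offset);
}
```

For a vec2 value allocated at component 2, a write mask of 0x3 becomes 0xC and a read swizzle of xyxx becomes zwzz; no second swizzle needs to be constructed or composed.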
* pan/midgard: Handle nontrivial masks in texture RA (Alyssa Rosenzweig, 2019-10-20, 1 file, -1/+1)
  The texture instruction has a mask we need to take into account.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Report byte masks for read components (Alyssa Rosenzweig, 2019-10-20, 1 file, -1/+1)
  Read component masks don't have a particular type associated, since the type of the ALU operation may not match the type of the operands in question. So let's generate byte masks instead, and update the rest of the compiler to use byte masks when analyzing reads. Preparation for mixed types.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
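To make the idea of a type-independent byte mask concrete, here is a minimal sketch that expands a per-component mask into a per-byte mask over a 16-byte register. `toy_bytemask` is a hypothetical helper written for this illustration; it is not the function the compiler uses.

```c
#include <stdint.h>

/* Expand a component mask into a byte mask over a 16-byte register.
 * bytes_per_component is assumed to be 1, 2, 4 or 8. */
static uint16_t
toy_bytemask(unsigned comp_mask, unsigned bytes_per_component)
{
        unsigned comps = 16 / bytes_per_component;
        uint16_t out = 0;

        for (unsigned c = 0; c < comps; ++c) {
                if (comp_mask & (1u << c)) {
                        /* One bit per byte covered by this component */
                        unsigned bytes = (1u << bytes_per_component) - 1;
                        out |= (uint16_t)(bytes << (c * bytes_per_component));
                }
        }

        return out;
}
```

With 32-bit components, reading xy yields 0x00FF; with 16-bit components, reading only component 1 yields 0x000C. Both results describe bytes of the same register, so reads of differently typed operands can be compared directly.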
* pan/midgard: Use 16-bit liveness masks (Alyssa Rosenzweig, 2019-10-16, 1 file, -1/+1)
  We'll want liveness per-byte, so we need to accommodate up to 16 bytes.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Begin tracking liveness metadata (Alyssa Rosenzweig, 2019-10-03, 1 file, -5/+0)
  This will allow us to explicitly invalidate liveness analysis results so we can cache liveness results.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Don't try to OR live_in of successors (Alyssa Rosenzweig, 2019-10-03, 1 file, -6/+2)
  By definition, once liveness analysis has occurred:

      live_out = OR {succ} succ->live_in

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Move RA's liveness analysis into midgard_liveness.c (Alyssa Rosenzweig, 2019-10-03, 1 file, -122/+5)
  There are unfortunately two distinct liveness analysis passes in the compiler right now -- one good (but complex) pass used by RA based on solving data flow equations, and one awful (but simple) pass used for dead code elimination and bundling based on an abstract walk of the AST. Let's move RA's pass into shared code so we can work on unifying.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Handle fragment writeout in RA (Alyssa Rosenzweig, 2019-08-30, 1 file, -8/+33)
  Rather than using a pile of hacks and awkward constructs in MIR to ensure the writeout parameter gets written into r0, let's add a dedicated shadow register class for writeout (interfering with work register r0) so we can express the writeout condition succinctly and directly.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix misc. RA issues (Alyssa Rosenzweig, 2019-08-30, 1 file, -10/+15)
  When running the register allocator after scheduling, the MIR looks a little different, so we need to extend the RA to handle a few of these extra cases correctly.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix corner case in RA (Alyssa Rosenzweig, 2019-08-30, 1 file, -1/+1)
  It doesn't really matter but... meh.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use ralloc() to allocate instructions to avoid leaking those objs (Boris Brezillon, 2019-08-28, 1 file, -2/+2)
  Instructions attached to blocks are never explicitly freed. Let's use ralloc() to attach those objects to the compiler context so that they are automatically freed when the ctx object is freed.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fold ssa_args into midgard_instruction (Alyssa Rosenzweig, 2019-08-26, 1 file, -46/+44)
  This is just a bit of refactoring to simplify MIR.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Simplify contradictory check. (Alyssa Rosenzweig, 2019-08-21, 1 file, -4/+1)
  Coverity.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Represent unused nodes by ~0 (Alyssa Rosenzweig, 2019-08-21, 1 file, -13/+14)
  This allows nodes to be unsigned and prevents a class of weird signedness bugs identified by Coverity.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
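A minimal sketch of why an all-ones sentinel pairs well with unsigned node indices follows. `TOY_NODE_UNUSED` and `toy_node_is_valid` are hypothetical names chosen for this illustration, not identifiers from the compiler.

```c
#include <stdbool.h>

/* Hypothetical sentinel: all bits set means "no node" */
#define TOY_NODE_UNUSED (~0u)

/* With unsigned indices, a single bounds check also rejects the
 * sentinel, because ~0u is larger than any realistic node count.
 * No separate "node < 0" test (and no signed/unsigned comparison)
 * is needed. */
static bool
toy_node_is_valid(unsigned node, unsigned node_count)
{
        return node < node_count;
}
```

With a signed -1 sentinel, the same comparison against an unsigned count would implicitly convert -1 to a large unsigned value, which is the kind of mixed-sign comparison static analyzers tend to flag.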
* pan/midgard: Free liveness info (Alyssa Rosenzweig, 2019-08-21, 1 file, -0/+2)
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Compute liveness per-block (Alyssa Rosenzweig, 2019-08-19, 1 file, -70/+161)
  Rather than using a regalloc based on live intervals, computed hastily with repeated invocations of a forward-analysis pass, we switch to compute liveness information on a per-block basis.

  Within a given basic block, we compute liveness backwards with a linear-time algorithm; for common shaders, this may help RA terminate quicker.

  Across blocks, we use a work list (really a work set) and check if we're making progress. This isn't terribly efficient, but it gets the job done. Point is, we get the live_in/live_out for each block. From there, it's simple to rerun the linear-time update algorithm to compute the interference graph.

  The benefit of this technique is the ability to ignore "gaps" in liveness across intermediate blocks that are never executed. On simple shaders like the loops in glmark, this results in a minor reduction in register pressure. The motivation was a complex shader in Krita that failed register allocation due to an unfortunate interaction between texture pipeline registers and control flow. This shader now compiles successfully.

  total instructions in shared programs: 3439 -> 3438 (-0.03%)
  instructions in affected programs: 22 -> 21 (-4.55%)
  helped: 1
  HURT: 0

  total bundles in shared programs: 2077 -> 2076 (-0.05%)
  bundles in affected programs: 12 -> 11 (-8.33%)
  helped: 1
  HURT: 0

  total quadwords in shared programs: 3457 -> 3456 (-0.03%)
  quadwords in affected programs: 20 -> 19 (-5.00%)
  helped: 1
  HURT: 0

  total registers in shared programs: 341 -> 338 (-0.88%)
  registers in affected programs: 9 -> 6 (-33.33%)
  helped: 3
  HURT: 0
  helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
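The per-block scheme described in this commit message can be sketched on a toy IR. This is an illustration under stated assumptions, not the code in midgard_ra.c: the `toy_*` types and functions are hypothetical, liveness is tracked per node in a 32-bit set rather than per component, and a simple repeated sweep over all blocks stands in for the work list.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_SUCCS 2

/* Hypothetical toy instruction: bit n set = node n defined/used */
struct toy_ins {
        uint32_t def, use;
};

struct toy_block {
        const struct toy_ins *ins;
        unsigned nr_ins;
        int succs[TOY_MAX_SUCCS]; /* successor indices, -1 if unused */
        uint32_t live_in, live_out;
};

/* Linear-time backward walk over one block: a definition kills
 * liveness, a use creates it. Returns the block's live_in. */
static uint32_t
toy_block_live_in(const struct toy_block *blk, uint32_t live)
{
        for (unsigned i = blk->nr_ins; i-- > 0; ) {
                live &= ~blk->ins[i].def;
                live |= blk->ins[i].use;
        }

        return live;
}

/* Sweep all blocks until live_in/live_out stop changing */
static void
toy_compute_liveness(struct toy_block *blocks, unsigned nr_blocks)
{
        bool progress;

        for (unsigned b = 0; b < nr_blocks; ++b)
                blocks[b].live_in = blocks[b].live_out = 0;

        do {
                progress = false;

                for (unsigned b = nr_blocks; b-- > 0; ) {
                        struct toy_block *blk = &blocks[b];
                        uint32_t out = 0;

                        /* live_out is the union of successor live_in */
                        for (unsigned s = 0; s < TOY_MAX_SUCCS; ++s) {
                                if (blk->succs[s] >= 0)
                                        out |= blocks[blk->succs[s]].live_in;
                        }

                        uint32_t in = toy_block_live_in(blk, out);

                        if (in != blk->live_in || out != blk->live_out)
                                progress = true;

                        blk->live_in = in;
                        blk->live_out = out;
                }
        } while (progress);
}
```

Once live_out is known for each block, the same backward walk can be rerun with interference recorded at each definition, which is the "rerun the linear-time update" step the message mentions.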
* pan/midgard: Treat cubemaps "stores" as loads (Alyssa Rosenzweig, 2019-08-19, 1 file, -3/+1)
  It's always been ambiguous which they are, but their primary register is their output, not their input; therefore, they are loads.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Set mask for lowered read-hazard moves (Alyssa Rosenzweig, 2019-08-19, 1 file, -0/+1)
  If we need to lower a move for a read from a vec2 texture coordinate, we shouldn't write zw, even incidentally.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix texw lowering with complex control flow (Alyssa Rosenzweig, 2019-08-19, 1 file, -1/+1)
  Fixes shaders with control flow like:

      out = 0;
      if (A) {
          if (B)
              out = texture(A, ...)
      } else {
          out = texture(B, ...)
      }

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Allocate separate spill indices for lowered moves (Alyssa Rosenzweig, 2019-08-12, 1 file, -6/+4)
  This helps RA be slightly more reasonable.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend liveness analysis to trinary ops (Alyssa Rosenzweig, 2019-08-12, 1 file, -6/+2)
  Fixes RA fails with multiple indirect SSBO writes.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement SSBO access (Alyssa Rosenzweig, 2019-08-12, 1 file, -6/+2)
  Just laying the groundwork. Reads and writes should be supported (both direct and indirect, either int or float, vec1/2/3/4), but no bounds checking is done at the moment.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Account for swizzle/mask in st_vary (Alyssa Rosenzweig, 2019-08-09, 1 file, -2/+14)
  Register allocation for varying stores is a bit different, since the instructions ignore the writemask (varyings are normalized, packed/vectorized).
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Switch sources to an array for trinary sources (Alyssa Rosenzweig, 2019-08-02, 1 file, -27/+34)
  We need three independent sources to support indirect SSBO writes (as well as textures with both LOD/bias and offsets). Now is a good time to make sources just an array so we don't have to rewrite a ton of code if we ever needed a fourth source for some reason.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove "r27-only" register class (Alyssa Rosenzweig, 2019-08-02, 1 file, -46/+55)
  As far as I know, there's no such thing as a load/store op that only takes its argument in r27. We just need to set the appropriate arg_1 field in the RA to specify other registers if we want them.

  To facilitate this, various RA-related changes are needed across the compiler; this should also fix indirect offsets which were implicitly interpreted as "r27-only" despite not even passing through RA yet.

  One ripple effect change is switching the move insertion point and adjusting the liveness analysis accordingly, so while this was intended as a purely functional change, there are some shader-db changes:

  total instructions in shared programs: 3511 -> 3498 (-0.37%)
  instructions in affected programs: 563 -> 550 (-2.31%)
  helped: 12
  HURT: 0
  helped stats (abs) min: 1 max: 2 x̄: 1.08 x̃: 1
  helped stats (rel) min: 0.93% max: 5.00% x̄: 2.58% x̃: 2.33%
  95% mean confidence interval for instructions value: -1.27 -0.90
  95% mean confidence interval for instructions %-change: -3.23% -1.93%
  Instructions are helped.

  total bundles in shared programs: 2067 -> 2067 (0.00%)
  bundles in affected programs: 398 -> 398 (0.00%)
  helped: 7
  HURT: 4
  helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  helped stats (rel) min: 1.54% max: 10.00% x̄: 5.04% x̃: 5.56%
  HURT stats (abs) min: 1 max: 2 x̄: 1.75 x̃: 2
  HURT stats (rel) min: 2.13% max: 4.26% x̄: 3.72% x̃: 4.26%
  95% mean confidence interval for bundles value: -0.95 0.95
  95% mean confidence interval for bundles %-change: -5.21% 1.50%
  Inconclusive result (value mean confidence interval includes 0).

  total quadwords in shared programs: 3464 -> 3454 (-0.29%)
  quadwords in affected programs: 1199 -> 1189 (-0.83%)
  helped: 18
  HURT: 4
  helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  helped stats (rel) min: 1.03% max: 5.26% x̄: 2.44% x̃: 1.79%
  HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
  HURT stats (rel) min: 2.56% max: 2.82% x̄: 2.63% x̃: 2.56%
  95% mean confidence interval for quadwords value: -0.98 0.07
  Inconclusive result (value mean confidence interval includes 0).

  total registers in shared programs: 383 -> 373 (-2.61%)
  registers in affected programs: 56 -> 46 (-17.86%)
  helped: 12
  HURT: 2
  helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  helped stats (rel) min: 9.09% max: 33.33% x̄: 29.58% x̃: 33.33%
  HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  HURT stats (rel) min: 20.00% max: 50.00% x̄: 35.00% x̃: 35.00%
  95% mean confidence interval for registers value: -1.13 -0.29
  95% mean confidence interval for registers %-change: -35.07% -5.63%
  Registers are helped.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Don't special case inline_constant (Alyssa Rosenzweig, 2019-07-31, 1 file, -11/+3)
  Another constant source of bugs. Ain't that special.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: De-special-case branching (Alyssa Rosenzweig, 2019-07-31, 1 file, -11/+0)
  It's not that special.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Lower texr/texw mixed registers (Alyssa Rosenzweig, 2019-07-30, 1 file, -2/+2)
  Conceptually, r28-r29 (as used for reading) and r28-r29 (as used for writing) aren't registers at all, merely push/pull arrangements. So you can't feed a texture result back into itself without explicitly moving in the middle.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Compose original texture swizzle in RA (Alyssa Rosenzweig, 2019-07-30, 1 file, -2/+4)
  Used for lowering derivatives.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement texture RA (Alyssa Rosenzweig, 2019-07-26, 1 file, -45/+162)
  total instructions in shared programs: 3916 -> 3665 (-6.41%)
  instructions in affected programs: 1405 -> 1154 (-17.86%)
  helped: 35
  HURT: 0
  helped stats (abs) min: 1 max: 21 x̄: 7.17 x̃: 3
  helped stats (rel) min: 3.00% max: 28.57% x̄: 20.11% x̃: 21.74%
  95% mean confidence interval for instructions value: -9.35 -4.99
  95% mean confidence interval for instructions %-change: -22.75% -17.46%
  Instructions are helped.

  total bundles in shared programs: 2472 -> 2256 (-8.74%)
  bundles in affected programs: 906 -> 690 (-23.84%)
  helped: 32
  HURT: 0
  helped stats (abs) min: 1 max: 18 x̄: 6.75 x̃: 3
  helped stats (rel) min: 5.56% max: 32.26% x̄: 20.83% x̃: 16.67%
  95% mean confidence interval for bundles value: -9.09 -4.41
  95% mean confidence interval for bundles %-change: -23.77% -17.89%
  Bundles are helped.

  total quadwords in shared programs: 3965 -> 3689 (-6.96%)
  quadwords in affected programs: 1568 -> 1292 (-17.60%)
  helped: 35
  HURT: 0
  helped stats (abs) min: 1 max: 21 x̄: 7.89 x̃: 3
  helped stats (rel) min: 2.08% max: 28.57% x̄: 19.87% x̃: 20.00%
  95% mean confidence interval for quadwords value: -10.38 -5.39
  95% mean confidence interval for quadwords %-change: -22.57% -17.17%
  Quadwords are helped.

  total registers in shared programs: 411 -> 392 (-4.62%)
  registers in affected programs: 76 -> 57 (-25.00%)
  helped: 15
  HURT: 0
  helped stats (abs) min: 1 max: 2 x̄: 1.27 x̃: 1
  helped stats (rel) min: 9.09% max: 50.00% x̄: 30.97% x̃: 33.33%
  95% mean confidence interval for registers value: -1.52 -1.01
  95% mean confidence interval for registers %-change: -39.12% -22.82%
  Registers are helped.

  total threads in shared programs: 426 -> 432 (1.41%)
  threads in affected programs: 6 -> 12 (100.00%)
  helped: 3
  HURT: 0
  helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
  helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Update RA for cubemap coords (Alyssa Rosenzweig, 2019-07-26, 1 file, -0/+2)
  Following the RA work, we apply the same technique to eliminate the move to r27 when loading cubemaps.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Force perspective ops to use vec4 (Alyssa Rosenzweig, 2019-07-25, 1 file, -0/+16)
  It doesn't make sense to use them with anything less.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add R27-only op handling (Alyssa Rosenzweig, 2019-07-25, 1 file, -8/+44)
  We use a special conflicting register class.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove check for `class` (Alyssa Rosenzweig, 2019-07-25, 1 file, -1/+0)
  Fixes classes defaulting to vec4 in some cases.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Emit st_vary registers in install_registers (Alyssa Rosenzweig, 2019-07-25, 1 file, -3/+11)
  Now that we have its registers handled normally like the rest of the IR.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_lower_special_reads helper (Alyssa Rosenzweig, 2019-07-25, 1 file, -0/+112)
  Given the constraints on special registers, we add a helper for lowering these by inserting moves (copies) where needed to satisfy the ISA constraints.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add class check (Alyssa Rosenzweig, 2019-07-25, 1 file, -1/+30)
  This ensures the rules for accessing special register classes are satisfied. This is asserted as a prepass should have lowered offending uses to something satisfying these rules.

  Special register classes are *not* work registers and cannot be used for RMW operations; they are essentially 1-way pipes straight into/from fixed-function logic in the shader cores.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend liveness analysis to st_vary (Alyssa Rosenzweig, 2019-07-25, 1 file, -8/+1)
  These can consume sources now.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement load/store register classing (Alyssa Rosenzweig, 2019-07-25, 1 file, -17/+70)
  This does not yet support special->work spilling, nor does it support multiclass breakup. These corner cases will be handled in succeeding commits.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Allocate special register classes (Alyssa Rosenzweig, 2019-07-25, 1 file, -37/+44)
  We'll want to also handle load/store and texture registers in our RA loop.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>