path: root/src/panfrost
Commit message  Author  Age  Files  Lines
* pan/midgard: Allow scheduling conditions with constantsAlyssa Rosenzweig2019-09-301-4/+10
| | | | | | | | Now that we have constant adjustment logic abstracted, we can do this safely. Along with the csel inversion patch, this allows many more common csel ops to inline their condition in the bundle. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add csel invert optimizationAlyssa Rosenzweig2019-09-303-0/+27
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_flip helperAlyssa Rosenzweig2019-09-303-10/+21
| | | | | | | Useful for various operations on both commutative and anticommutative ops. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Tightly pack 32-bit constantsAlyssa Rosenzweig2019-09-301-16/+113
| | | | | | | If we can reuse constant slots from other instructions, we would like to do so to include more instructions per bundle. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Allow writeout to see into the futureAlyssa Rosenzweig2019-09-301-1/+40
| | | | | | | | If an instruction could be scheduled to vmul to satisfy the writeout conditions, let's do that and save an instruction+cycle per fragment shader. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Allow 6 instructions per bundleAlyssa Rosenzweig2019-09-301-2/+3
| | | | | | We never had a scheduler good enough to hit this case before! :) Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Only one conditional per bundle allowedAlyssa Rosenzweig2019-09-301-0/+16
| | | | | | There's no r32 to save ya after you use up r31 :) Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Schedule to smul/saddAlyssa Rosenzweig2019-09-301-0/+5
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend choose_instruction for scalar unitsAlyssa Rosenzweig2019-09-301-0/+4
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Don't double check SCALAR unitsAlyssa Rosenzweig2019-09-301-4/+0
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Use new schedulerAlyssa Rosenzweig2019-09-303-678/+130
| | | | | | | | We still emit in-order but we switch to using the bundles created from the new scheduler, which will allow greater flexibility and room for out-of-order optimization. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add distance metric to choose_instructionAlyssa Rosenzweig2019-09-301-0/+14
| | | | | | | | | | | | | | We require chosen instructions to be "close", to avoid ballooning register pressure. This is a kludge that will go away once we have proper liveness tracking in the scheduler, but for now it prevents a lot of needless spilling. v2: Lower threshold to 6 (from 8). Schedule is hurt, but a few shaders that spilled excessively are fixed. Signed-off-by: Alyssa Rosenzweig <[email protected]> Derp
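One plausible reading of the distance rule is sketched below: measure each ready instruction's distance from the latest ready instruction and skip anything at or beyond the threshold. The names `choose_close_instruction`, `fits_fn`, and `is_even`, and the choice to measure distance in instruction indices, are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_CANDIDATE ((size_t)-1)

/* Hypothetical filter: true if instruction `index` fits the slot being
 * filled. */
typedef bool (*fits_fn)(size_t index, const void *data);

/* Pick the latest ready instruction that passes the filter and lies
 * within max_distance of the latest ready instruction overall. */
static size_t
choose_close_instruction(const bool *worklist, size_t count,
                         size_t max_distance, fits_fn fits, const void *data)
{
    assert(worklist != NULL && fits != NULL);

    /* Find the latest ready instruction; distances are measured from it. */
    size_t latest = NO_CANDIDATE;
    for (size_t i = 0; i < count; i++)
        if (worklist[i])
            latest = i;

    if (latest == NO_CANDIDATE)
        return NO_CANDIDATE;

    /* Walk backwards from the latest and stop once we are too far away. */
    for (size_t i = latest + 1; i-- > 0; ) {
        if (latest - i >= max_distance)
            break;
        if (worklist[i] && fits(i, data))
            return i;
    }
    return NO_CANDIDATE;
}

/* Toy filter used only to exercise the function. */
static bool
is_even(size_t index, const void *data)
{
    (void)data;
    return (index % 2) == 0;
}
```

With a threshold of 6, a ready instruction at index 0 is ignored when the latest ready instruction sits at index 11, even if nothing closer fits. That is the trade the commit message describes: some scheduling freedom is given up to keep live ranges short.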
* pan/midgard: Add mir_choose_alu helperAlyssa Rosenzweig2019-09-301-0/+24
| | | | | | Based on a given unit. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement load/store pairingAlyssa Rosenzweig2019-09-301-55/+12
| | | | | | | We can bundle two load/store together. This eliminates the need for explicit load/store pairing in a prepass, as well. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend csel_swizzle to branchesAlyssa Rosenzweig2019-09-303-5/+10
| | | | | | | | Conditions for branches don't have a swizzle explicitly in the emitted binary, but they do implicitly get swizzled in whatever instruction wrote r31, so we need to handle that. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add helpers for scheduling conditionalsAlyssa Rosenzweig2019-09-301-0/+146
| | | | | | | | | | | Conditional instructions (csel and conditional branches) require their condition to be written to a special condition pipeline register (r31.w for scalar, r31.xyzw for vector). However, pipeline registers are live only for the duration of a single bundle. As such, the logic to schedule conditionals correctly is surprisingly complex. Essentially, we see if we could stuff the conditional within the same bundle as the csel/branch without breaking anything; if we can, we do that. If we can't, we add a dummy move to make room. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement predicate->unitAlyssa Rosenzweig2019-09-301-0/+9
| | | | | | This allows ALUs to select for each unit of the bundle separately. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add predicate->excludeAlyssa Rosenzweig2019-09-301-4/+14
| | | | | | | | | | | | | | A bit of a kludge, but this allows setting an implicit dependency of synthetic conditional moves on the actual condition, fixing generated code like:
    vmul.feq r0, ..
    sadd.imov r31, .., r0
    vadd.fcsel [...]
The imov runs simultaneously with the feq, so it gets garbage results, but it's too late to add an actual dependency practically speaking, since the new synthetic imov doesn't have a node associated. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add constant intersection filtersAlyssa Rosenzweig2019-09-301-0/+55
| | | | | | | | | | In the future, we will want to keep track of which components of constants of various sizes correspond to which parts of the bundle constants, like in the old scheduler. For now, let's just stub it out for a simple rule of one instruction with embedded constants per bundle. We can eventually do better, of course. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove csel constant unit forceAlyssa Rosenzweig2019-09-301-3/+0
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_schedule_texture/ldst/alu helpersAlyssa Rosenzweig2019-09-301-0/+190
| | | | | | | We don't actually do any scheduling here yet, but add per-tag helpers to consume an instruction, print it, pop it off the worklist. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_choose_bundle helperAlyssa Rosenzweig2019-09-301-0/+25
| | | | | | | | | It's not always obvious what the optimal bundle type should be. Let's break out the logic to decide. Currently set for purely in-order operation. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_update_worklist helperAlyssa Rosenzweig2019-09-301-0/+39
| | | | | | | | After we've chosen an instruction, popped it off, and processed it, it's time to update the worklist, removing that instruction from the dependency graph to allow its dependents to be put onto the worklist. Signed-off-by: Alyssa Rosenzweig <[email protected]>
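The worklist bookkeeping described here (and its initialization from the dependency graph) can be sketched with a small fixed-size graph. Everything in this example is hypothetical: `struct dep_graph`, the adjacency-matrix representation, and the function names are chosen for clarity and are not taken from the Midgard scheduler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_NODES 8

/* Hypothetical dependency graph: dependents[i][j] is true when node j
 * must wait for node i; pending[j] counts j's unscheduled dependencies. */
struct dep_graph {
    size_t count;
    bool dependents[MAX_NODES][MAX_NODES];
    unsigned pending[MAX_NODES];
};

static void
add_dependency(struct dep_graph *g, size_t before, size_t after)
{
    assert(before < g->count && after < g->count);
    if (!g->dependents[before][after]) {
        g->dependents[before][after] = true;
        g->pending[after]++;
    }
}

/* Initial worklist: every node with no pending dependencies. */
static void
init_worklist(const struct dep_graph *g, bool *worklist)
{
    for (size_t i = 0; i < g->count; i++)
        worklist[i] = (g->pending[i] == 0);
}

/* After node `done` is scheduled: drop it from the worklist and release
 * any dependent whose last outstanding dependency it was. */
static void
update_worklist(struct dep_graph *g, bool *worklist, size_t done)
{
    assert(done < g->count && worklist[done]);
    worklist[done] = false;

    for (size_t j = 0; j < g->count; j++) {
        if (!g->dependents[done][j])
            continue;
        g->dependents[done][j] = false;
        if (--g->pending[j] == 0)
            worklist[j] = true;
    }
}
```

A node enters the worklist only when its pending count reaches zero, so whatever policy picks the next instruction can choose any worklist entry without violating a dependency.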
* pan/midgard: Add mir_choose_instruction stubAlyssa Rosenzweig2019-09-301-0/+55
| | | | | | | | | | | | | In the future, this routine will implement the core scheduling logic to decide which instruction out of the worklist will be scheduled next, in a way that minimizes cycle count and register pressure. In the present, we are more interested in replicating in-order scheduling with the much-more-powerful out-of-order model. So rather than discriminating by a register pressure estimate, we simply choose the latest possible instruction in the worklist. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Initialize worklistAlyssa Rosenzweig2019-09-301-0/+17
| | | | | | This flows naturally from the dependency graph Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Calculate dependency graphAlyssa Rosenzweig2019-09-302-0/+131
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add flatten_mir helperAlyssa Rosenzweig2019-09-301-0/+22
| | | | | | | We would like to flatten a linked list of midgard_instructions into an array of midgard_instruction pointers on the heap. Signed-off-by: Alyssa Rosenzweig <[email protected]>
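Flattening a linked list into a heap array of pointers might look like the sketch below. `struct instr` is a hypothetical stand-in for `midgard_instruction`, and a plain `next` pointer is assumed in place of Mesa's list helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for midgard_instruction: an id and a link. */
struct instr {
    int id;
    struct instr *next;
};

/* Copy the list's node pointers into a heap array, preserving order.
 * Stores the element count in *count_out. Returns NULL for an empty list
 * or on allocation failure. The caller frees the array, not the nodes,
 * which the list still owns. */
static struct instr **
flatten_list(struct instr *head, size_t *count_out)
{
    assert(count_out != NULL);

    /* First pass: count the nodes so the array can be sized exactly. */
    size_t count = 0;
    for (struct instr *i = head; i != NULL; i = i->next)
        count++;

    *count_out = count;
    if (count == 0)
        return NULL;

    struct instr **flat = malloc(count * sizeof(*flat));
    if (flat == NULL) {
        *count_out = 0;
        return NULL;
    }

    /* Second pass: record each node's address in list order. */
    size_t n = 0;
    for (struct instr *i = head; i != NULL; i = i->next)
        flat[n++] = i;

    return flat;
}
```

The array gives constant-time access by index, which is convenient when a dependency graph and worklist refer to instructions by position rather than by list node.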
* pan/midgard: Squeeze indices before schedulingAlyssa Rosenzweig2019-09-301-0/+1
| | | | | | This allows node_count to be correct while scheduling. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix component count handling for ldstAlyssa Rosenzweig2019-09-302-37/+37
| | | | | | | It's not based on the writemask and it can't be inferred; it's just intrinsic to the op itself. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add missing parens in SWIZZLE definitionAlyssa Rosenzweig2019-09-301-1/+1
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
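The commit message does not show the macro, so the following is only an illustration of the general hazard, assuming a swizzle packs four 2-bit component selectors into a byte. `SWIZZLE_UNSAFE`, `SWIZZLE`, and `swizzle_broadcast` are hypothetical names for this sketch.

```c
#include <assert.h>

/* Arguments are not parenthesized: an argument such as `x & 3` expands
 * to `x & 3 << 6`, and since << binds tighter than &, that evaluates as
 * `x & (3 << 6)`. */
#define SWIZZLE_UNSAFE(A, B, C, D) \
    ((D << 6) | (C << 4) | (B << 2) | (A << 0))

/* Each argument is parenthesized, so it is evaluated as a unit before
 * being shifted into place. */
#define SWIZZLE(A, B, C, D) \
    (((D) << 6) | ((C) << 4) | ((B) << 2) | ((A) << 0))

/* Replicate one component (0..3) into all four lanes. */
static unsigned
swizzle_broadcast(unsigned component)
{
    assert(component < 4);
    return SWIZZLE(component, component, component, component);
}
```

For plain literal arguments both macros agree, which is how such a bug can stay hidden until a caller passes an expression.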
* Revert "panfrost: Rework midgard_pair_load_store() to kill the nested foreach loop"Boris Brezillon2019-09-191-29/+34
| | | | | | | | | | There's a missing prev_ldst = NULL; assignment in the new logic, but even with this fixed it seems to regress some applications, so let's revert the change until we find the real problem. This reverts commit c9bebae2877e55cdcd94f9f9f3f6805238caeb28.
* panfrost: Rework midgard_pair_load_store() to kill the nested foreach loopBoris Brezillon2019-09-131-34/+29
| | | | | | | | | | | | | | | | | | | | | | | mir_foreach_instr_in_block_safe() is based on list_for_each_entry_safe() which is designed to protect against removal of the current entry, but removing the entry placed just after the current one will lead to a use-after-free situation. Luckily, the midgard_pair_load_store() logic guarantees that the instruction being removed (if any) is never placed just after ins which in turn guarantees that the hidden __next variable always points to a valid object. Took me a bit of time to realize that this code was safe, so I'm suggesting to get rid of the inner mir_foreach_instr_in_block_from() loop and rework the code so that the removed instruction is always the current one (which is what the list_for_each_entry_safe() API was initially designed for). While at it, we also get rid of the unnecessary insert(ins)/remove(ins) dance by simply moving the instruction around. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
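The guarantee a "safe" iterator gives can be shown with a hand-rolled singly linked list. This sketch does not use Mesa's list macros; `struct node`, `remove_odd`, and `push_front` are hypothetical and exist only to show why removing the current entry is fine while removing the cached successor is not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Remove and free every node whose value is odd. The successor is saved
 * before the current node may be freed, which is what a "safe" iterator
 * provides. Freeing the saved successor inside the loop body instead
 * would leave `next` dangling, which is the use-after-free hazard. */
static struct node *
remove_odd(struct node *head)
{
    struct node **link = &head;
    struct node *cur = head;

    while (cur != NULL) {
        struct node *next = cur->next; /* cached before any removal */

        if (cur->value % 2 != 0) {
            *link = next;              /* unlink the current node only */
            free(cur);
        } else {
            link = &cur->next;
        }
        cur = next;
    }
    return head;
}

/* Allocate a node and put it at the front of the list. */
static struct node *
push_front(struct node *head, int value)
{
    struct node *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->value = value;
    n->next = head;
    return n;
}
```

Restricting removal to the current entry, as the reworked pairing code aims to do, means the loop never has to reason about which other entries might have disappeared.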
* panfrost: Fix a list_assert() in schedule_block()Boris Brezillon2019-09-131-4/+6
| | | | | | | | | list_for_each_entry() does not allow modifying the current item pointer. Let's rework the skip-instructions logic in schedule_block() to not break this rule. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* nir: allow specifying filter callback in lower_alu_to_scalarVasily Khoruzhick2019-09-062-2/+2
| | | | | | | | | | | | | Set of opcodes doesn't have enough flexibility in certain cases. E.g. Utgard PP has vector conditional select operation, but condition is always scalar. Lowering all the vector selects to scalar increases instruction number, so we need a way to filter only those ops that can't be handled in hardware. Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
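The difference between an opcode set and a filter callback can be sketched without NIR. The types below are deliberately simplified and hypothetical (`struct alu_instr`, `lower_filter_fn`, `count_to_scalarize`); they are not the NIR data structures or the signature of the real pass, and treating a NULL filter as "lower everything" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified ALU instruction; not the NIR representation. */
enum alu_op { OP_FADD, OP_CSEL };

struct alu_instr {
    enum alu_op op;
    unsigned result_components;
    unsigned condition_components; /* only meaningful for OP_CSEL */
};

/* Hypothetical callback type: true if the instruction should be lowered. */
typedef bool (*lower_filter_fn)(const struct alu_instr *instr,
                                const void *data);

/* Count how many instructions the filter selects for scalarization.
 * In this sketch a NULL filter means "lower everything". */
static size_t
count_to_scalarize(const struct alu_instr *instrs, size_t count,
                   lower_filter_fn filter, const void *data)
{
    assert(instrs != NULL || count == 0);

    size_t selected = 0;
    for (size_t i = 0; i < count; i++) {
        if (filter == NULL || filter(&instrs[i], data))
            selected++;
    }
    return selected;
}

/* Example filter for a target whose vector select takes a scalar
 * condition: only selects with a vector condition need lowering. */
static bool
needs_scalar_condition(const struct alu_instr *instr, const void *data)
{
    (void)data;
    return instr->op == OP_CSEL && instr->condition_components > 1;
}
```

An opcode set could only say "lower all selects" or "lower none"; the callback can inspect the instruction's operands and keep the selects the hardware handles natively.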
* pan/midgard: Remove mir_rewrite_index_*_tagAlyssa Rosenzweig2019-09-032-29/+0
| | | | | | | These helpers are unused, as flagged by cppcheck. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Remove mir_print_bundleAlyssa Rosenzweig2019-09-031-13/+0
| | | | | | | In practice, the new post-schedule print is just as useful. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Remove cppwrap.cppAlyssa Rosenzweig2019-09-032-10/+0
| | | | | | | | It has not been used in a long time; I forgot this file even existed. Flagged by cppcheck. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Fix cppcheck issuesAlyssa Rosenzweig2019-09-035-22/+27
| | | | | | | Miscellaneous minor issues flagged by cppcheck. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Correct issues in disassemble.cAlyssa Rosenzweig2019-09-031-23/+22
| | | | | | | cppcheck. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/decode: Add missing format specifierAlyssa Rosenzweig2019-09-031-1/+1
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/decode: Use portable format specifier for 64-bitAlyssa Rosenzweig2019-09-031-1/+1
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/decode: Use %zu instead of %dAlyssa Rosenzweig2019-09-031-3/+3
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
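The format-specifier fixes in this group of commits follow standard C rules, illustrated below. `describe_buffer` and its parameters are hypothetical and do not come from the decoder.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Format a size and a 64-bit address portably. %zu is the conversion for
 * size_t, so it stays correct where size_t and int differ in width.
 * PRIx64 expands to the right conversion for uint64_t, whereas %lx is
 * only correct where long is 64 bits wide. */
static int
describe_buffer(char *out, size_t out_size, size_t length, uint64_t gpu_va)
{
    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size, "%zu bytes at 0x%" PRIx64,
                    length, gpu_va);
}
```

Passing a `size_t` to `%d`, or a `uint64_t` to `%lx` on a platform with 32-bit `long`, is undefined behaviour, which is why static analysers such as cppcheck flag it.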
* pan/decode: Fix uninitialized variablesAlyssa Rosenzweig2019-09-031-2/+5
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Use shared psiz clamp passAlyssa Rosenzweig2019-08-302-6/+1
| | | | | | | We already had a perfectly cromulent pass for this, but one landed in common NIR code so let's switch and lighten our tree. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove mir_opt_post_move_eliminateAlyssa Rosenzweig2019-08-302-49/+0
| | | | | | | This optimization depended on RA running before scheduling. It therefore no longer applies and is now unused. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Schedule before RAAlyssa Rosenzweig2019-08-301-27/+29
| | | | | | | | | | | | | | | | | | | | | This is a tradeoff. Scheduling before RA means we don't do RA on what-will-become pipeline registers. Importantly, it means the scheduler is able to reorder instructions, as registers have not been decided yet. Unfortunately, it also complicates register spilling, since the spills themselves won't get bundled optimally and we can only spill twice per ALU bundle (only one spill per bundle allowed here). It also prevents us from eliminating dead moves introduced by register allocation, as they are not dead before RA. The shader-db regressions are from poor spilling choices introduced by the new bundling requirements. These could be solved by the combination of a post-scheduler (to combine adjacent spills into bundles) with a VLIW-aware spill cost calculation. Nevertheless, the change is small enough that I feel it's worth it to eat a tiny shader-db regression for the sake of flexibility. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Handle fragment writeout in RAAlyssa Rosenzweig2019-08-306-24/+49
| | | | | | | | Rather than using a pile of hacks and awkward constructs in MIR to ensure the writeout parameter gets written into r0, let's add a dedicated shadow register class for writeout (interfering with work register r0) so we can express the writeout condition succinctly and directly. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Do not propagate swizzles into writeoutAlyssa Rosenzweig2019-08-301-3/+5
| | | | | | | There's no slot for it; you'll end up writing into the void and clobbering stuff. Don't do it. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix misc. RA issuesAlyssa Rosenzweig2019-08-301-10/+15
| | | | | | | | When running the register allocator after scheduling, the MIR looks a little different, so we need to extend the RA to handle a few of these extra cases correctly. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Print MIR by the bundleAlyssa Rosenzweig2019-08-301-2/+11
| | | | | | | After scheduling, we still have valid MIR, but we have additional bundling annotations which we would like to keep for debugging, so print these. Signed-off-by: Alyssa Rosenzweig <[email protected]>