path: root/src/gallium/drivers/lima/ir/gp/instr.c
Commit message | Author | Age | Files | Lines
* lima: gpir: enforce instruction limit earlier | Vasily Khoruzhick | 2020-03-06 | 1 | -0/+6

  Enforce the limit of 512 instructions earlier. This is a workaround for infinite loops in the gpir compiler and allows us to pinpoint the tests that are affected.

  Reviewed-by: Andreas Baierl <[email protected]>
  Signed-off-by: Vasily Khoruzhick <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4055>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4055>
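  The idea of enforcing a limit at creation time rather than at the end of compilation can be sketched as follows. This is only an illustration: `sketch_prog`, `sketch_instr_create`, and `SKETCH_MAX_INSTRS` are hypothetical names, not the identifiers used in instr.c; only the value 512 comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Limit quoted in the commit message; the macro name is hypothetical. */
#define SKETCH_MAX_INSTRS 512

/* Hypothetical stand-in for the compiler's per-program state. */
struct sketch_prog {
   size_t num_instrs;
};

/* Try to append one instruction. Returns false as soon as the limit is
 * reached, so a runaway scheduler loop fails here instead of spinning. */
static bool sketch_instr_create(struct sketch_prog *prog)
{
   assert(prog != NULL);
   if (prog->num_instrs >= SKETCH_MAX_INSTRS)
      return false;
   prog->num_instrs++;
   return true;
}
```

  A caller that sees `false` can abort compilation of that shader, which is what makes the affected tests easy to identify.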
* lima/gpir: Support branch instructions | Connor Abbott | 2019-09-24 | 1 | -1/+0

  Because branch conditions have to be in the pass slot, there is no unconditional branch, and realistically the pass slot has to contain a move when branching (there's nothing it does that would be useful for operating on booleans, so we can't use it for anything when computing the branch condition), we put the branch instruction in the pass slot and at codegen time turn it into a move of the branch condition. This means that it doesn't have to be special-cased like store instructions are in the scheduler. Because of this decision we can remove the half-implemented BRANCH codegen slot.

  Finally, we (ab)use the existing schedule_first mechanism to make sure that branches are always last in the basic block.

  Reviewed-by: Vasily Khoruzhick <[email protected]>
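  The codegen-time rewrite described above, where a branch sitting in the pass slot is emitted as a move of its condition, might look roughly like this. The enum and function names are hypothetical and only illustrate the mapping; they are not the driver's opcodes.

```c
#include <assert.h>

/* Hypothetical opcodes for nodes that may occupy the pass slot. */
enum sketch_op {
   SKETCH_OP_MOV,
   SKETCH_OP_BRANCH_COND,
   SKETCH_OP_COUNT
};

/* Pick the opcode actually emitted for the pass slot: a conditional
 * branch is emitted as a move of its condition, everything else is
 * passed through unchanged. */
static enum sketch_op sketch_pass_slot_emit_op(enum sketch_op op)
{
   assert(op < SKETCH_OP_COUNT);
   return op == SKETCH_OP_BRANCH_COND ? SKETCH_OP_MOV : op;
}
```

  Because the scheduler only ever sees an ordinary pass-slot node, it needs no branch-specific path, which is the point the commit message makes.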
* lima/gpir: Always schedule complex2 and *_impl right after complex1 | Connor Abbott | 2019-07-30 | 1 | -4/+11

  See https://gitlab.freedesktop.org/lima/mesa/issues/94 for the gory details of why this is needed. For *_impl this is easy, since it never increases register pressure and it goes in the complex slot hence it never counts against max nodes. It's a bit more challenging for complex2, since it does count against max nodes, so we need to change the reservation logic to reserve an extra slot for complex2 when scheduling complex1. This second part isn't strictly necessary yet, but it will be for exp2.

  Signed-off-by: Connor Abbott <[email protected]>
  Acked-by: Qiang Yu <[email protected]>
* lima/gp: Fix problem with complex moves | Connor Abbott | 2019-07-18 | 1 | -2/+37

  When writing the scheduler, we forgot that you can't read the complex unit in certain sources because it gets overwritten to 0 or 1. Fixing this turned out to be possible without giving up and reducing GPIR_VALUE_REG_NUM to 10, although it was difficult in a way I didn't expect.

  There can be at most 4 next-max nodes that can't have moves scheduled in the complex slot, so it actually isn't a problem for getting the number of next-max nodes at 5 or lower. However, it is a problem for stores. If a given node is a next-max node whose move cannot go in the complex slot *and* is used by a store that we decide to schedule, we have to reserve one of the non-complex slots for a move instead of all the slots, or we can wind up in a situation where only the complex slot is free and we fail the move. This means that we have to add another term to the reservation logic, for stores whose children cannot be in the complex slot.

  Acked-by: Qiang Yu <[email protected]>
* lima/gpir: Rework the scheduler | Connor Abbott | 2019-07-18 | 1 | -21/+59

  Now, we do scheduling at the same time as value register allocation. The ready list now acts similarly to the array of registers in value_regalloc, keeping us from running out of slots. Before this, the value register allocator wasn't aware of the scheduling constraints of the actual machine, which meant that it sometimes chose the wrong false dependencies to insert. Now, we assign value registers at the same time as we actually schedule instructions, making its choices reflect reality much better. It was also conservative in some cases where the new scheme doesn't have to be. For example, in something like:

    1 = ld_att
    2 = ld_uni
    3 = add 1, 2

  It's possible that one of 1 and 2 can't be scheduled in the same instruction as 3, meaning that a move needs to be inserted, so the value register allocator needs to assume that this sequence requires two registers. But when actually scheduling, we could discover that 1, 2, and 3 can all be scheduled together, so that they only require one register. The new scheduler speculatively inserts the instruction under consideration, as well as all of its child load instructions, and then counts the number of live value registers after all is said and done. This lets us be more aggressive with scheduling when we're close to the limit.

  With the new scheduler, the kmscube vertex shader is now scheduled in 40 instructions, versus 66 before.

  Acked-by: Qiang Yu <[email protected]>
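  The register counts in the ld_att/ld_uni/add example can be reproduced with a toy counting model. It assumes, purely for illustration, that the consumer's result always takes one value register and that every child load which cannot share the consumer's instruction costs one more (for the inserted move). `sketch_regs_needed` is a hypothetical helper and does not reflect how the actual scheduler tracks live values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: count value registers for one consumer node and its child
 * loads. load_shares_instr[i] is true when load i was placed in the
 * same instruction as the consumer during the speculative insertion. */
static unsigned sketch_regs_needed(const bool *load_shares_instr,
                                   size_t num_loads)
{
   assert(load_shares_instr != NULL || num_loads == 0);

   unsigned regs = 1; /* the consumer's own result */
   for (size_t i = 0; i < num_loads; i++) {
      if (!load_shares_instr[i])
         regs++; /* assumed cost of the move that carries this load */
   }
   return regs;
}
```

  With both loads co-scheduled the model yields one register; with one load left out it yields two, which matches the conservative figure the old allocator had to assume up front.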
* lima/gpir: Fix some bugs in instruction handling | Connor Abbott | 2019-07-18 | 1 | -0/+12

  Reviewed-by: Qiang Yu <[email protected]>
* lima/gpir: fix alu check miss last store slot | Qiang Yu | 2019-04-14 | 1 | -2/+2

  Fixes: 92d7ca4b1cd "gallium: add lima driver"
  Signed-off-by: Qiang Yu <[email protected]>
  Reviewed-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: fix compile fail when two slot node | Qiang Yu | 2019-04-14 | 1 | -1/+1

  Comes from the glmark2-es2 jellyfish test.

  Fixes: 92d7ca4b1cd "gallium: add lima driver"
  Signed-off-by: Qiang Yu <[email protected]>
  Reviewed-by: Vasily Khoruzhick <[email protected]>
* gallium: add lima driver | Qiang Yu | 2019-04-11 | 1 | -0/+488

  v2:
  - use renamed util_dynarray_grow_cap
  - use DEBUG_GET_ONCE_FLAGS_OPTION for debug flags
  - remove DRM_FORMAT_MOD_ARM_AGTB_MODE0 usage
  - compute min/max index in driver

  v3:
  - fix plbu framebuffer state calculation
  - fix color_16pc assemble
  - use nir_lower_all_source_mods for lowering neg/abs/sat
  - use float array for static GPU data
  - add disassemble comment for static shader code
  - use drm_find_modifier

  v4:
  - use lima_nir_lower_uniform_to_scalar

  v5:
  - remove nir_opt_global_to_local when rebase

  Cc: Rob Clark <[email protected]>
  Cc: Alyssa Rosenzweig <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Signed-off-by: Andreas Baierl <[email protected]>
  Signed-off-by: Arno Messiaen <[email protected]>
  Signed-off-by: Connor Abbott <[email protected]>
  Signed-off-by: Erico Nunes <[email protected]>
  Signed-off-by: Heiko Stuebner <[email protected]>
  Signed-off-by: Koen Kooi <[email protected]>
  Signed-off-by: Marek Vasut <[email protected]>
  Signed-off-by: marmeladema <[email protected]>
  Signed-off-by: Paweł Chmiel <[email protected]>
  Signed-off-by: Rob Herring <[email protected]>
  Signed-off-by: Rohan Garg <[email protected]>
  Signed-off-by: Vasily Khoruzhick <[email protected]>
  Signed-off-by: Qiang Yu <[email protected]>