path: root/src/freedreno/ir3/ir3_sched.c
Commit message (author, date, files changed, lines -/+)
* freedreno/sched: reset delay counters at start of block (Rob Clark, 2020-06-16, 1 file, -0/+2)
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/ir3: make foreach_ssa_src declare cursor ptr (Rob Clark, 2020-05-19, 1 file, -6/+1)
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
* freedreno/ir3/deps: report progress (Rob Clark, 2020-05-19, 1 file, -12/+11)
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
* freedreno/ir3/sched: try to avoid syncs (Rob Clark, 2020-05-13, 1 file, -13/+99)
  Similar to what we do in postsched. It is useful for pre-RA sched to be a bit aware of things that would cause syncs. In particular for the tex fetches, since the vecN src/dst tends to limit postsched's ability to re-order them.
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/sched: avoid scheduling outputs (Rob Clark, 2020-05-13, 1 file, -8/+87)
  If an instruction's only use is as an output, and it increases register pressure, then try to avoid scheduling it until there are no other options.
  A semi-common pattern is `fragcolN.a = 1.0`; this pushes all these immed loads to the end of the shader.
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/sched: awareness of partial liveness (Rob Clark, 2020-04-13, 1 file, -1/+44)
  Realize that certain instructions make a vecN live, and account for this, in hopes of scheduling the remaining components of the vecN sooner.
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
* freedreno/ir3: new pre-RA scheduler (Rob Clark, 2020-04-13, 1 file, -376/+426)
  This replaces the depth-first search scheduler with a more traditional ready-list scheduler. It primarily tries to reduce register pressure (number of live values), with the exception of trying to schedule kills as early as possible. (Earlier iterations of this scheduler had a tendency to push kills later, and in particular moving texture fetches which may not be necessary ahead of kills.)
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
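The ready-list idea described in this commit message can be illustrated with a toy model. This is only a sketch under stated assumptions, not the ir3 implementation: the struct `toy_instr`, its fields, and the functions `toy_is_ready`, `toy_pick`, and `toy_schedule` are all hypothetical names invented for illustration, and the selection rule (a ready kill first, otherwise the smallest net change in live values) is a simplification of what the message describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_DEPS 4

/* Hypothetical, simplified instruction record; not the ir3 data structure. */
struct toy_instr {
	int deps[TOY_MAX_DEPS]; /* indices of instructions this one depends on */
	int num_deps;
	int live_delta;         /* assumed net change in live values if scheduled */
	bool is_kill;
	bool scheduled;
};

/* An instruction is ready once every dependency has been scheduled. */
static bool toy_is_ready(const struct toy_instr *instrs, int i)
{
	if (instrs[i].scheduled)
		return false;
	for (int d = 0; d < instrs[i].num_deps; d++)
		if (!instrs[instrs[i].deps[d]].scheduled)
			return false;
	return true;
}

/* Pick the next instruction: a ready kill wins outright, otherwise the ready
 * instruction with the smallest live_delta (lowest index breaks ties).
 * Returns -1 if nothing is ready. */
static int toy_pick(const struct toy_instr *instrs, int count)
{
	int best = -1;
	for (int i = 0; i < count; i++) {
		if (!toy_is_ready(instrs, i))
			continue;
		if (instrs[i].is_kill)
			return i;
		if (best < 0 || instrs[i].live_delta < instrs[best].live_delta)
			best = i;
	}
	return best;
}

/* Schedule the whole block, writing the chosen order into 'order'.
 * Returns the number of instructions scheduled. */
static int toy_schedule(struct toy_instr *instrs, int count, int *order)
{
	int n = 0;
	int next;
	while ((next = toy_pick(instrs, count)) >= 0) {
		instrs[next].scheduled = true;
		order[n++] = next;
	}
	return n;
}
```

In this toy model a kill jumps the queue as soon as its dependencies are met, even if a pressure-reducing instruction is also ready, which mirrors the stated exception to the register-pressure heuristic.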
* ir3: Plumb through support for a1.x (Connor Abbott, 2020-04-09, 1 file, -26/+59)
  This will need to be used in some cases for the upcoming bindless support, plus ldc.k instructions which push data from a UBO to const registers.
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
* freedreno/ir3: small cleanup and comments (Rob Clark, 2020-03-27, 1 file, -8/+8)
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
* freedreno/ir3: split out has_latency_to_hide() (Rob Clark, 2020-03-10, 1 file, -25/+1)
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
* freedreno/ir3: remove extra nops inserted in scheduler (Rob Clark, 2020-03-10, 1 file, -12/+0)
  They were inserting a nop between back-to-back SFU instructions. But that doesn't actually appear to be required. And they get stripped out later anyway, before legalize.
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>
* freedreno/ir3: track half-precision live values (Rob Clark, 2020-02-28, 1 file, -14/+31)
  In schedule live value tracking, differentiate between half vs full precision. Half-precision live values are less costly than full precision.
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3989>
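One way to picture separate half/full tracking is a weighted pressure count. The following is a hedged sketch only: the struct `toy_live`, the function `toy_pressure`, and the 2:1 cost ratio are assumptions made for illustration. The commit message says only that half-precision values are less costly, not by how much.

```c
#include <assert.h>

/* Hypothetical counters; the commit message only says the two kinds of
 * live value are differentiated. */
struct toy_live {
	int full; /* live full-precision values */
	int half; /* live half-precision values */
};

/* Assumed cost model: a half-precision value costs half as much as a
 * full-precision one. The result is in half-register units so the
 * arithmetic stays in integers. */
static int toy_pressure(const struct toy_live *live)
{
	return 2 * live->full + live->half;
}
```

Under this assumed model, four live half-precision values weigh the same as two full-precision ones, so a scheduler comparing against a pressure limit could tolerate more live values when they are half precision.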
* freedreno/ir3: don't hide latency when there is none to hide (Rob Clark, 2020-02-28, 1 file, -5/+52)
  Current scheduler thresholds try to ensure there are warps available to switch to when hiding texture fetch latency. But if there is none to hide, we should allow scheduler to use more registers to reduce nops.
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3989>
* freedreno/ir3: move block-scheduling into legalize (Rob Clark, 2020-02-01, 1 file, -42/+0)
  We want to do this only once. If we have post-RA sched pass, then we don't want to do it pre-RA. Since legalize is where we resolve the branch/jumps, we might as well move this into legalize.
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
* freedreno/ir3: move nop padding to legalize (Rob Clark, 2020-02-01, 1 file, -52/+0)
  This way we can deal with it in one place, *after* all the blocks have been scheduled. Which will simplify life for a post-RA sched pass.
  This has the benefit of already taking into account nop's that legalize has to insert for non-delay related reasons.
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
* freedreno/ir3: split out delay helpers (Rob Clark, 2020-02-01, 1 file, -115/+4)
  We're going to want these also for a post-RA sched pass. And also to split nop stuffing out into its own pass.
  Signed-off-by: Rob Clark <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
* freedreno/ir3: add iterator macros (Rob Clark, 2019-12-13, 1 file, -16/+16)
  So many open coded list iterators were getting annoying.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add scheduler traces (Rob Clark, 2019-12-13, 1 file, -0/+19)
  Add some infrastructure to trace scheduler decisions. The next patch will add some more traces, just splitting this out to reduce clutter.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix gpu hang with pre-fs-tex-fetch (Rob Clark, 2019-11-12, 1 file, -10/+20)
  For pre-fs-dispatch texture fetch, we need to assign bary_ij to r0.x, even if it is not used in the shader (ie. only varying use is for tex coords). But if, for example, gl_FragCoord is used, it could get assigned on top of bary_ij, resulting in a GPU hang.

  The solution to this is two-fold: (1) the inputs/outputs rework has the benefit of making RA realize bary_ij is a vec2, even if there are no split/collect instructions (due to no varying fetches in the shader itself). And (2) extend the live ranges of meta:input instructions to the first non-input, to prevent RA from assigning the same register to multiple inputs.

  Backport note: because of (1) above, a better solution for 19.3 would be to revert f30c256ec05.

  Fixes: f30c256ec05 ("freedreno/ir3: enable pre-fs texture fetch for a6xx")
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: add input/output iterators (Rob Clark, 2019-11-12, 1 file, -7/+2)
  We can at least get rid of the if-not-NULL check in a bunch of places.
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: rename fanin/fanout to collect/split (Rob Clark, 2019-11-12, 1 file, -8/+8)
  If I'm going to refactor a bit to use these meta instructions to also handle input/output, then might as well cleanup the names first.
  Nouveau also uses collect/split for names of these meta instructions, and I like those names better.
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* util: rename list_empty() to list_is_empty() (Timothy Arceri, 2019-10-28, 1 file, -1/+1)
  This makes it clear that it's a boolean test and not an action (eg. "empty the list").
  Reviewed-by: Eric Engestrom <[email protected]>
* freedreno/ir3: add meta instruction for pre-fs texture fetch (Rob Clark, 2019-10-18, 1 file, -1/+2)
  Add a placeholder instruction to track texture fetches made prior to FS shader dispatch. These, like meta:input instructions are scheduled before any real instructions, so that RA realizes their result values are live before the first real instruction. And to give legalize a way to track usage of fetched sample requiring (sy) sync flags.

  There is some related special handling for varying texcoord inputs used for pre-fs-fetch, so that they are not DCE'd and remain in linkage between FS and previous stage.

  Note that we could almost avoid this special handling by giving meta:tex_prefetch real src arguments, except that in the FS stage, inputs are actual bary.f/ldlv instructions.

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/ir3: assert that only single address (Rob Clark, 2019-09-06, 1 file, -0/+1)
  An instruction can reference only a single address register value. Add an assert to catch bugs.
  Also, address value should also be local to the same block as the instruction.
  (The one spot where changing the instruction address is actually legit needs to clear the address first.)
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: fix addr/pred spilling (Rob Clark, 2019-09-06, 1 file, -7/+42)
  The live_values and use_count were not being properly updated. This starts triggering problems with the next patch, where we allow copy propagation for RELATIV access.
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: convert block->predecessors to set (Rob Clark, 2019-08-28, 1 file, -4/+6)
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: immediately schedule meta instructions (Rob Clark, 2019-06-03, 1 file, -0/+3)
  They aren't real instructions, and don't change # of live values, so no point in them competing with real instructions.
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: scheduler improvements (Rob Clark, 2019-06-03, 1 file, -13/+110)
  For instructions that increase the # of live values, apply a threshold to avoid scheduling them too early. And factor the net change of # of live values that would result from scheduling an instruction, to prioritize instructions that reduce number of live values as the number of live values increases.

  For manhattan:

    total instructions in shared programs: 27869 -> 28413 (1.95%)
    instructions in affected programs: 26756 -> 27300 (2.03%)
    helped: 102
    HURT: 87
    total full in shared programs: 1903 -> 1719 (-9.67%)
    full in affected programs: 1390 -> 1206 (-13.24%)
    helped: 124
    HURT: 9

  The reduction in register usage nets ~20% gain in manhattan. (So getting mediump support should be a huge win for gles gfxbench.) Also significantly helps some of the more complex shadertoy shaders, like IQ's Piano (32 to 18 regs, doubles fps).

  The effect is less pronounced on smaller shaders.

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
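The two heuristics in this commit message (a threshold on instructions that add live values, and a priority that weighs the net change in live values by current pressure) might be sketched as follows. Everything here is an assumption for illustration: the constant `TOY_LIVE_THRESHOLD`, the functions `toy_should_defer` and `toy_rank`, and the exact formulas are hypothetical, since the message does not give the real thresholds or weighting.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical threshold; the commit message does not state the real value. */
#define TOY_LIVE_THRESHOLD 16

/* Defer an instruction that would add live values once the number of live
 * values has already reached the threshold. */
static bool toy_should_defer(int live_values, int live_delta)
{
	return live_delta > 0 && live_values >= TOY_LIVE_THRESHOLD;
}

/* Lower rank is scheduled first. The net change in live values is weighted
 * by the current number of live values (an assumed formula), so reducing
 * pressure matters more the higher the pressure already is. */
static int toy_rank(int base_rank, int live_values, int live_delta)
{
	return base_rank + live_delta * live_values;
}
```

With this assumed weighting, an instruction that frees one value ranks much earlier at 20 live values than at 4, while an instruction that adds values is penalized in proportion to the pressure it would make worse.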
* freedreno/ir3: sched should mark outputs used (Rob Clark, 2019-06-03, 1 file, -19/+35)
  Account for shader outputs and values live in any direct/indirect successor block.
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: reads/writes to unrelated arrays are not dependent (Rob Clark, 2019-03-28, 1 file, -1/+30)
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: sched fix (Rob Clark, 2019-03-28, 1 file, -1/+1)
  Not sure why new-style frag inputs start triggering this. But we probably shouldn't consider src's from other blocks.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: track register pressure in sched (Rob Clark, 2019-03-03, 1 file, -8/+89)
  Not a perfect solution, and the "pressure" target is hard-coded. But it doesn't really seem to hurt much in the common case, and avoids exploding register usage in dEQP ssbo tests.

  So this should serve as a stop-gap solution until I have time to rewrite the scheduler.

  Hurts slightly in instruction count, but slightly reduces register usage in shader-db. Fixes ~150 dEQP-GLES31.functional.ssbo.* that were failing due to RA fail.

  Signed-off-by: Rob Clark <[email protected]>
* freedreno: move ir3 to common location (Rob Clark, 2018-11-27, 1 file, -0/+818)
  Move (most of) the ir3 compiler to src/freedreno/ir3 so that it can be re-used by some future vulkan driver. The parts that are gallium specific have been refactored out and remain in the gallium driver.

  Getting the move done now so that it can happen before further refactoring to support a6xx specific instructions.

  NOTE also removes ir3_cmdline compiler tool from autotools build since that was easier than fixing it and I normally use meson build. Waiting patiently for the day that we can remove *everything* from the autotools build.

  Signed-off-by: Rob Clark <[email protected]>