| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Since DCE is the only remaining function of this pass, after the pre-RA
scheduler rewrite.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do a better job of pruning when removing unused instructions, including
cleaning up dangling false-deps.
This introduces a new ssa src pointer iterator, which makes it easy to
clear links without having to think about whether it is a normal ssa
src or a false-dep.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This replaces the depth-first search scheduler with a more traditional
ready-list scheduler. It primarily tries to reduce register pressure
(number of live values), with the exception of trying to schedule kills
as early as possible. (Earlier iterations of this scheduler had a
tendency to push kills later, and in particular moving texture fetches
which may not be necessary ahead of kills.)
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>
|
|
|
|
|
|
|
|
| |
This will need to be used in some cases for the upcoming bindless
support, plus ldc.k instructions which push data from a UBO to const
registers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we are re-assigning the scalars anyways in the second pass, assign
them to the highest free reg in the first pass (rather than lowest) to
allow packing vecN regs as low as possible.
Note this required some changes specifically for tex instructions with a
single component writemask that is not necessarily .x, as previously
these would get assigned in the first RA pass, and since they are still
scalar, we'd end up w/ some r47.* and other similarly way-to-high
assignments after the 2nd pass.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>
|
|
|
|
|
|
|
|
|
|
| |
1) emperically, 10 seems like a more accurate # than 4
2) push "soft" delay handling into ir3_delayslots(), as
we should also be using it to calculate the costs
that the schedulers use
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3989>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some of the aspects of tex prefetch are in common with normal tex
instructions, such as having a wrmask to control which components
are written. Add a helper for this.
This should result in actually using the prefetch wrmask to avoid
fetching unneeded components.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
|
|
|
|
|
|
|
|
| |
We're going to want these also for a post-RA sched pass. And also to
split nop stuffing out into it's own pass.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>
|
|
|
|
|
|
|
|
| |
We need the LINEAR versions for AMD_shader_explicit_vertex_parameter.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3578>
|
|
|
|
|
|
| |
So many open coded list iterators were getting annoying.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
At the ir3 level, we would assume that we could use wrmask to mask
off other components of an instruction returning a vecN when they are
not used. Which would let RA use components not written for other live
values. But this is only true for tex instructions.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
We can at least get rid of the if-not-NULL check in a bunch of places.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If I'm going to refactor a bit to use these meta instructions to also
handle input/output, then might as well cleanup the names first.
Nouveau also uses collect/split for names of these meta instructions,
and I like those names better.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When we enable pre-dispatch texture fetch, we could have a scenario
where the barycentric i/j coord sysval is not used in the shader, but
only used for the varying fetch for the pre-dispatch texture fetch.
In this case we need to take care not to DCE this sysval.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
Just add the constructors for now and special case similar to END so
we don't remove them.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Not a perfect solution, and the "pressure" target is hard-coded. But it
doesn't really seem to much in the common case, and avoids exploding
register usage in dEQP ssbo tests.
So this should serve as a stop-gap solution until I have time to re-
write the scheduler.
Hurts slightly in instruction count, but gains (reduces) slightly the
register usage in shader-db. Fixes ~150 dEQP-GLES31.functional.ssbo.*
that were failing due to RA fail.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Fixes: 0d240c22141 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Fixes a crash in dEQP-GLES3.functional.shaders.fragdepth.compare.fragcoord_z
Fixes: 0d240c22141 freedreno/ir3: don't fetch unused tex components
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Detect when a component of an (for example) texture fetch is unused and
propagate the updated wrmask back to the parent instruction.
Signed-off-by: Rob Clark <[email protected]>
|
|
Move (most of) the ir3 compiler to src/freedreno/ir3 so that it can be
re-used by some future vulkan driver. The parts that are gallium
specific have been refactored out and remain in the gallium driver.
Getting the move done now so that it can happen before further
refactoring to support a6xx specific instructions.
NOTE also removes ir3_cmdline compiler tool from autotools build since
that was easier than fixing it and I normally use meson build. Waiting
patiently for the day that we can remove *everything* from the autotools
build.
Signed-off-by: Rob Clark <[email protected]>
|