| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Promote as much as we feasibly can while keeping it Midgard/Bifrost
agnostic.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4150>
|
|
|
|
|
|
|
|
|
|
| |
Fixes
dEQP-GLES2.functional.shaders.indexing.varying_array.vec2_dynamic_loop_write_
static_read with register pressure forced down.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3950>
|
|
|
|
|
|
|
|
| |
We unify disparate metadata about tags into a single structure to ensure
information is not left out.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Barriers execute on the texture pipeline on Midgard, so let's
tentatively handle barrier() as conservatively as possible (forcing
memory barriers of both buffers and shared memory). Implementation isn't
quite there yet -- it doesn't look at interactions of adjacent barriers
like it's supposed to -- but the core is there.
Fixes dEQP-GLES31.functional.compute.basic.ssbo_local_barrier_single_invocation
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>
|
|
|
|
|
|
|
|
|
| |
Fixes RA failure in
dEQP-GLES31.functional.shaders.builtin_functions.common.modf.* (which
uses multiple indirect SSBO writes)
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3835>
|
|
|
|
|
|
|
|
|
| |
Fixes dEQP-GLES3.functional.shaders.fragdepth.write.dynamic_conditional_write
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Signed-off-by: Boris Brezillon <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3697>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3697>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ZS fragment stores are done like color fragment stores, except it's
using a different RT id (0xFF), the depth and stencil values are stored
in r1.x and r1.y.
Signed-off-by: Boris Brezillon <[email protected]>
[Fix the scheduling part]
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3697>
|
|
|
|
|
|
|
|
| |
Allocate those instructions with ralloc() instead of using mem_dup.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3676>
|
|
|
|
|
|
|
|
|
|
|
| |
mir_schedule_alu()
There's a writeout bool storing the result of this test. Use it instead
of duplicating the test.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3676>
|
|
|
|
|
|
|
|
|
|
| |
Lot of churn but mostly just specializes types per source instead of per
instruction.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Boris Brezillon <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3653>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3653>
|
|
|
|
|
|
| |
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3496>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3496>
|
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
|
|
|
|
|
|
|
|
|
|
| |
Right now, constant combining is not supported in 16 bit mode, and 64
bit mode is simply ignored. Let's rework the function to make it
type/bit-size agnostic.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each instruction bundle can contain up to 16 constant bytes. The meaning
of those byte is instruction dependent: it depends on the instruction
native type (int, uint or float) and the instruction reg_mode (8, 16, 32
or 64 bit). Those different layouts can be exposed as a union to
facilitate constants manipulation.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Corner case causing invalid scheduling on shaders with nested csels,
i.e. GLSL code resembling:
(foo ? bool1 : bool2) ? x : y
By explicitly disallowing csels this is fixed.
Fixes INSTR_INVALID_ENC on a glamor shader (noticeable with slowdown and
visual corruption when scrolling "too far" on GTK apps).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3463>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3463>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the schedule_program implementation being used is picked
at compile time, which on the Android platform means that the
bifrost compiler & scheduler is used for all targets, including
midgard based hardware.
This commit disambiguates between the two schedule_program functions.
Signed-off-by: Robert Foss <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
It's not clear yet what the distinction is.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
They need a very particular form; the naive way we did before is not
sufficient in practice, it doesn't look like. So let's follow the rough
structure of the blob's writeout since this is fixed code anyway.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
There are more ALU tags, let's do some cleanup while we're at it.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
It's a long story... but we'd try to insert constants that weren't there
and end up clobbering fields in the bundle following the constant
array...
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
It's an ugly hack that's no longer used.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Uniform/work registers are partitioned on a shader-by-shader basis as
determined by the compiler. We add a simple heuristic here running
before scheduling that prioritizes mitigating spilling at all costs.
A more sophisticated heuristic should run *after* scheduling, doing a
dry run of the register allocator itself to determine spilling. Fitting
this into our current scheduling model is difficult, so while this
heuristic does hurt some shaders, overall the results are acceptable:
total instructions in shared programs: 50065 -> 38747 (-22.61%)
instructions in affected programs: 37187 -> 25869 (-30.44%)
helped: 59
HURT: 77
helped stats (abs) min: 1 max: 757 x̄: 198.46 x̃: 151
helped stats (rel) min: 0.48% max: 62.89% x̄: 32.95% x̃: 36.27%
HURT stats (abs) min: 1 max: 9 x̄: 5.08 x̃: 6
HURT stats (rel) min: 0.92% max: 14.29% x̄: 6.71% x̃: 4.60%
95% mean confidence interval for instructions value: -111.15 -55.29
95% mean confidence interval for instructions %-change: -14.33% -6.67%
Instructions are helped.
total bundles in shared programs: 30606 -> 19157 (-37.41%)
bundles in affected programs: 23907 -> 12458 (-47.89%)
helped: 58
HURT: 74
helped stats (abs) min: 6 max: 757 x̄: 203.09 x̃: 152
helped stats (rel) min: 5.19% max: 77.00% x̄: 49.38% x̃: 53.79%
HURT stats (abs) min: 1 max: 9 x̄: 4.46 x̃: 5
HURT stats (rel) min: 1.85% max: 26.32% x̄: 11.70% x̃: 9.57%
95% mean confidence interval for bundles value: -115.46 -58.01
95% mean confidence interval for bundles %-change: -20.87% -9.41%
Bundles are helped.
total quadwords in shared programs: 31305 -> 32027 (2.31%)
quadwords in affected programs: 20471 -> 21193 (3.53%)
helped: 0
HURT: 133
HURT stats (abs) min: 1 max: 9 x̄: 5.43 x̃: 5
HURT stats (rel) min: 0.76% max: 15.15% x̄: 5.47% x̃: 4.65%
95% mean confidence interval for quadwords value: 5.00 5.86
95% mean confidence interval for quadwords %-change: 4.85% 6.08%
Quadwords are HURT.
total registers in shared programs: 2256 -> 2545 (12.81%)
registers in affected programs: 708 -> 997 (40.82%)
helped: 0
HURT: 95
HURT stats (abs) min: 1 max: 8 x̄: 3.04 x̃: 3
HURT stats (rel) min: 12.50% max: 100.00% x̄: 39.41% x̃: 37.50%
95% mean confidence interval for registers value: 2.64 3.45
95% mean confidence interval for registers %-change: 34.62% 44.19%
Registers are HURT.
total threads in shared programs: 1776 -> 1709 (-3.77%)
threads in affected programs: 134 -> 67 (-50.00%)
helped: 0
HURT: 67
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: -1.00 -1.00
95% mean confidence interval for threads %-change: -50.00% -50.00%
Threads are HURT.
total spills in shared programs: 3868 -> 2 (-99.95%)
spills in affected programs: 3868 -> 2 (-99.95%)
helped: 60
HURT: 0
total fills in shared programs: 6456 -> 4 (-99.94%)
fills in affected programs: 6456 -> 4 (-99.94%)
helped: 60
HURT: 0
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3150>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3150>
|
|
|
|
|
|
|
|
|
| |
I'm honestly unsure what this is for, but it's needed on MFBD systems
for unknown reasons, at least when MRT is actually in use and then
sometimes without MRT (it fixes a blend shader issue on T760?)
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Visoso <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The flow is considerably more complicated. Instead of one writeout loop
like usual, we have a separate write loop for each render target. This
requires some scheduling shenanigans to get right.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Visoso <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We move it to the register allocator itself. It doesn't belong in
midgard_schedule.c!
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Urja Rannikko <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Urja Rannikko <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Urja Rannikko <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Urja Rannikko <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This simplifies manipulation of the offsets dramatically, fixing some
UBO access related bugs.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Eventually, we will want to combine constants across types, but for now
let's not break the world.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
On newer GPUs, this is a no-op. On older GPUs, this prevents needless
spilling since texture registers are shared with a subset of work
registers.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
Tested-by: Andre Heider <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
It's now unused, in favour of LCRA.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Pretty routine, we do have a hack to force swizzle alignment for !32-bit
for until we implement !32-bit the right way.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
Spilling can mess with this considerably.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
We don't need it in practice, so this is some more cleanup.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Rather than having hw-specific swizzles encoded directly in the
instructions, have a unified swizzle arary so we can manipulate swizzles
generically.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This allows for vec16 dependencies in the scheduler, not that we have
any yet (thankfully).
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Read component masks don't have a particular type associated, since the
type of the ALU operation may not match the type of the operands in
question. So let's generate byte masks instead, and update the rest of
the compiler to use byte masks when analyzing reads.
Preparation for mixed types.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
It doesn't make sense. You already spilled it once, and it didn't help.
Don't try again, or you'll end up in a loop.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This will allow us to explicitly invalidate liveness analysis results so
we can cache liveness results.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that we have constant adjustment logic abstracted, we can do this
safely. Along with the csel inversion patch, this allows many more
common csel ops to inline their condition in the bundle.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
If we can reuse constant slots from other instructions, we would like to
do so to include more instructions per bundle.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
If an instruction could be scheduled to vmul to satisfy the writeout
conditions, let's do that and save an instruction+cycle per fragment
shader.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
| |
There's no r32 to save ya after you use up r31 :)
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|