| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Properly handle the difference between split and merged register file
when determining where arrays can fit without conflicting with other
arrays or pre-colored instructions.
1) if not mergedregs, only consider other things with same precision
as potentially conflicting
2) if mergedregs, calculate everything in therms of half-regs and
convert back to fullregs in the end
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We shouldn't divide-by-two for half-reg arrays. We set the proper node
interference class, based on `arr->half`.
Fixes a RA fail with 16b arrays:
src/freedreno/ir3/ir3_ra.c:633: name_to_array: Assertion `!"invalid array name"' failed.
Caused by use/def iterators returning `arr->length` vreg namess, but
only assigning the array half that many names.
Also, since we are assigning unique vreg names to each array element,
there is no need to try and convert from half-reg to it's conflicting
full reg when pre-coloring the array elements. Getting us closer to
having half-arrays work sanely with split-register-file (a5xx and
earlier).
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
|
|
|
|
|
|
|
|
| |
Print out the assigned vreg names earlier. Also print the few special
nodes.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5957>
|
|
|
|
|
|
|
| |
Since I was going back to look at fine derivs again, add some tests of
instruction encoding.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5699>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
legalize_block() can get run multiple times, which I didn't notice when
adding fine derivs support. Other instruction clones change things such
that the legalization won't trigger again, but that didn't apply to the
DS.PP legalization. To keep someone else from tripping over this, split
the one-shot legalization out of the iterative sync flag application.
Fixes failures in dEQP-VK.glsl.derivate.dfdxfine.*
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3198
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5699>
|
|
|
|
|
|
|
|
|
|
| |
This is needed for handling drivers that use an input for loading the
face, for example Panfrost with Midgard GPUs.
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Tested-by: Urja Rannikko <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5915>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5936>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Letting unused arrays stick around confuses RA, which assigns vreg names
to the unused arrays, but then does not precolor them (because they are
unused). This leads to an assert in ra_select_reg_merged():
skqp: ../src/freedreno/ir3/ir3_ra.c:589: name_to_instr: Assertion '!name_is_array(ctx, name)' failed.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3262
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5907>
|
|
|
|
|
|
|
|
|
| |
It doesn't happen much. But it's annoying when we hit an impossible
condition deep in RA 90% thru a long test run. Add some ra_assert()/
ra_unreachable() helper macros so we can bail cleanly and fail RA.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5907>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5907>
|
|
|
|
|
|
|
| |
It's a decent bit of analysis to see that the initialization will always
happen, and my compiler isn't doing so in at least one configuration.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5834>
|
|
|
|
|
|
|
|
| |
GL driver was relying on this being done by gallium, but there might be
new loops to unroll during optimizations and turnip needs it.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5818>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With turnip we can have sparse input variables like:
decl_var shader_in INTERP_MODE_NONE float @1 (VERT_ATTRIB_GENERIC1.x, 1, 0)
decl_var shader_in INTERP_MODE_NONE float @2 (VERT_ATTRIB_GENERIC1.y, 1, 0)
decl_var shader_in INTERP_MODE_NONE float @3 (VERT_ATTRIB_GENERIC1.w, 1, 0)
Example of a test fixed:
dEQP-VK.glsl.440.linkage.varying.component.vert_in.vec2.as_float_float_unused
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5818>
|
|
|
|
|
|
|
|
| |
A650 uses LDL/STL, and the "local_primitive_id" in tess ctrl shader comes
from bits 16-21 in the header instead of 0-5.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5764>
|
|
|
|
| |
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5732>
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example only some UCPs may be used by the shader, triggering asserts
that too many consts are being uploaded.
While we're at it, also fix the const size when loading UCPs, since
otherwise it doesn't correspond to what the shader is actually using.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5752>
|
|
|
|
|
|
|
|
|
|
| |
Gallium drivers should never see nir_var_uniform because gallium lowers
regular uniforms to a UBO. No GL driver should ever see either
nir_var_mem_shared because that's lowered in GLSL IR.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5418>
|
|
|
|
|
|
|
|
| |
Both vertex and fragment shaders need to have the lowering.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5751>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous version assumes tess level outputs will only be written once
in the shader, however its not possible to guarantee that.
It also assumes all invocations will write all the levels, which is also
not guaranteed.
This is required to fix the "tesselation" and "terraintessellation" demos
with turnip.
The comment about nir_lower_io_to_temporaries in lower_tess_ctrl_block is
removed because nir_lower_io_to_temporaries specifically skips TESS_CTRL
shaders so the comment doesn't make sense.
The split load for tess levels workaround is removed, the new version only
has scalar access unless if ever gets vectorized.
This sets NIR_COMPACT_ARRAYS cap to avoid the glsl tess vec lowering with
gallium. It seems this will also disable "LowerCombinedClipCullDistance",
which I'm not sure was needed or not.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5744>
|
|
|
|
|
|
|
|
| |
Check the interp mode and use SYSTEM_VALUE_BARYCENTRIC_LINEAR_* instead
when it is INTERP_MODE_NOPERSPECTIVE.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5582>
|
|
|
|
|
|
|
| |
This will be useful to support the missing barycentric sysvals.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5582>
|
|
|
|
|
|
|
|
|
|
|
| |
resinfo always writes 3 components, which was not being taken into account
Fixes these tests:
dEQP-VK.renderpass.suballocation.attachment_sparse_filling.input_attachment_3
dEQP-VK.renderpass.suballocation.attachment_sparse_filling.input_attachment_7
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5674>
|
|
|
|
|
|
|
|
|
| |
In cases where every variant is a shader-cache-hit, we never need the
post-finalize round of nir opt/lowering passes. So defer this until
the first shader-cache-miss to avoid doing pointless work.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a shader disk-cache for ir3 shader variants. Note that builds with
`-Dshader-cache=false` have no-op stubs with `disk_cache_create()` that
returns NULL.
Binning pass variants are serialized together with their draw-pass
counterparts, due to shared const-state.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
|
|
| |
For shader-cache, we are going to want to serialize them together.
Which is awkward if the two related variants are not compiled together.
This also decouples allocation and compile, which will simplify adding
shader-cache (which still needs to allocate, but can skip compile).
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we always do sysmem if there is tess. And for GS, the binning
pass VS ends up identical to the draw pass VS, so no point in compiling
it twice. (For GS what we should do someday is generate a binning pass
GS, and possibly if we can do cross-stage linking opts, an optimized
binning pass VS, but the required outputs would somehow have to end up
in the shader variant key.)
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
| |
Just to group together the parts that will get serialized when we have
shader disk-cache.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
| |
Use ir3_compiler_destroy() rather than open-coding ralloc_free(). This
will give us a place to add more compiler related cleanup code in the
following patches.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
|
|
|
|
| |
The next step is to hook this into pscreen->finalize_nir() so it can
come before the state tracker's shader-caching.
Unfortunately we still need to do lower_io after mesa/st, so that is
split out into a post-finalize pass.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
|
|
|
|
|
|
|
| |
This provides the policy for how to handle reducing constlen for some
stages.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5607>
|
|
|
|
|
|
|
|
| |
This provides the mechanism for compiling variants with a reduced
constlen. The next patch provides the policy for choosing which to
reduce.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5607>
|
|
|
|
|
|
|
| |
I wanted to access the ir3_compiler from a small helper inside
ir3_shader.h, which currently isn't possible.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5607>
|
|
|
|
|
|
|
| |
Prevents problems when calculating whether we overflow the shared limit.
Note that on a6xx, the macros handle the assert for us.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5607>
|
|
|
|
|
|
|
|
|
|
| |
This is part of adding VK_KHR_shader_draw_parameters for turnip.
IR3_DP_VTXID_BASE/IR3_DP_VTXCNT_MAX offsets are changed to match what
CP_DRAW_INDIRECT_MULTI requires.
Signed-off-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5635>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes a case where you have something like:
aVecOutput.z = aScalarInput;
In particular, skipping over things that are not the first component is
wrong.. in the above case the input we need to precolor is the 3rd
component. But we need to adjust the target register according to the
offset.
Fixes android.hardware.nativehardware.cts.AHardwareBufferNativeTests
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5601>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't really need the varying remapping, and it seems to somehow
happen twice when shader-cache comes into the picture. But we can
just choose not to have this problem.
Now that everything is using the ir3_point_sprite() helper, we can
flip this pipe cap without it being a massive flag-day.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5595>
|
|
|
|
|
|
|
|
|
| |
This will simplify a bit the logic for setting up vinterp/vprepl in the
driver backend, and also avoid it being a flag-day when we switch the
texcoord pipe cap.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5595>
|
|
|
|
|
|
|
|
|
|
|
| |
As per discussion on !5059, we don't see any particular reason as to
why MERGEDREGS should be disabled on HS/DS/GS, and none of the dEQP
tests (both VK and GL) fail when MERGEDREGS is enabled. In fact, some
of the VK dEQP tests fail when MERGEDREGS is disabled (e.g. tests
with shaders that employ a0.x). As a result, let's just enable
MERGEDREGS unconditionally on a6xx.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5059>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lower_tess_ctrl_block assumes that the gl_TessLevel*
intrinsic_store_outputs have already been collapsed into a single
instruction before the tess lowering step:
store_output ... /* base=0 */ /* wrmask=xyzw */ /* component=0 */
store_output ... /* base=1 */ /* wrmask=xy */ /* component=0 */
While this is true in fd because of st_nir_vectorize_io, we don't do
the same lowering in turnip so each tess level component still has
its own store instruction:
store_output ... /* base=0 */ /* wrmask=x */ /* component=0 */
store_output ... /* base=0 */ /* wrmask=x */ /* component=1 */
store_output ... /* base=0 */ /* wrmask=x */ /* component=2 */
store_output ... /* base=0 */ /* wrmask=x */ /* component=3 */
store_output ... /* base=1 */ /* wrmask=x */ /* component=0 */
store_output ... /* base=1 */ /* wrmask=x */ /* component=1 */
This commit adds a component offset to the tess control lowering. An
alternative is to also perform nir_lower_io_to_vector in turnip, but
ir3 seems to generate the same assembly either way and it's nice to
not have a lowering prereq before tess lowering.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5059>
|
|
|
|
|
|
|
|
|
|
| |
Since binning pass variants share the same const_state with their
draw-pass counterpart, we should re-use the draw-pass variant's ubo
range analysis. So split the two functions of the existing pass
into two parts.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5526>
|
|
|
|
|
|
|
|
|
|
| |
This serves two purposes, one during ubo range analysis, where we want
to create new ranges, and another during the actual ubo lowering. Split
these in two, with read-only ubo analysis state in the second case, to
prepare to split this pass in two.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5526>
|
|
|
|
|
|
|
| |
Split out the description of the ubo from the ubo-range.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5526>
|
|
|
|
|
|
|
|
| |
This moves the last bit of important state to be serialized from
ir3_shader to ir3_shader_variant.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For shader-cache, we want to not have anything important in `ir3_shader`.
And to have shader variants with lower const size limits (to properly
handle cross-stage limits), we also want variants to be able to have
their own const_state.
But we still need binning pass shaders to align with their draw pass
counterpart so that the same const emit can be used for both passes.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
|
|
|
|
|
|
|
|
|
|
|
| |
Make it an rzalloc'd ptr instead of embedded struct, so it can serve as
the mem ctx for immediates. This gets rid of needing to explicitly free
the immediates, so one less thing to deal with when moving const_state.
(Also, after we move const_state to the shader variant, we won't need
one for binning pass variants)
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
|
|
|
|
|
|
|
|
|
| |
When we move const_state to the variant, this will need to stay in the
shader, as it applies to all variants (and we need to store it somewhere
before we have any variants)
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
|