| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just make it be all SSBOs then all storage images. The remapping table
was there to make it so that the big gap present from gallium's atomic
lowering would get cleaned up, but that's no longer case. The table has
made it very hard to support Vulkan storage images, so it's time for it to
go.
This does mean that an SSBO/IBO that is only loaded (or size-queried) will
now occupy a slot in the table where it wouldn't before. This seems like
a minor cost compared to being able to drop this much logic.
With the remapping table gone, SSBO array handling for turnip just falls
out.
Fixes many array cases of
dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.*
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]> (turnip)
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The arguments passed in were:
- prog->info.num_ssbos
- prog->nir->info.num_ssbos
- arbitrary values for standalone compilers
The num_ssbos should match between the prog's info and prog->nir's info
until this lowering happens.
Reviewed-by: Marek Olšák <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
|
|
|
|
|
|
|
|
|
|
| |
For the handful of registers that depend on the union of program/
framebuffer/rasterizer state.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
|
|
|
| |
Move the program state which we can't pre-bake to a streaming state
object, rather than emitting directly in the draw cmdstream.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This lets us move PC_PRIMITIVE_CNTL into the rasterizr stateobj, rather
than unconditionally emitting it directly in the cmdstream on every
draw.
This also starts adding some tracking about previous draw state, so that
following patches can limit some of the register writes we currently
emit on every draw.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
|
|
|
|
|
|
| |
All but one of the reg values is only used in the stateobj, so we can
inline the register value setup and stateobj construction. While we
are at it, switch over to the new register builders.
Prep work for next patch.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The overhead does seem to matter when you have a high enough # of draw
calls that effect few bins/pixels, because these writes would happen
unconditionally (ie. not part of a state-group).
Possibly we could keep these if we moved them into a state-group so the
register writes would be no-ops on bins with no geometry. OTOH I
usually end up adding in a WFI when using them scratch reg values to
track down a crash. (So add a WFI to mitigate the annoyance of needing
to use a debug build to get scratch regs to locate the position of a
crash/hang in the cmdstream.)
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>
|
|
|
|
|
|
|
|
|
| |
We had an extra factor of num_samples in the stride.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
|
|
| |
Similar to how we handle zs blits.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
|
|
| |
If we move this function up, we don't have to forward declare it.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
|
|
|
| |
We have a SAMPLES_AVERAGE bit that does what we need for resolving
multisample buffers - let's use it.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
|
|
|
|
| |
The linear levels in a tiled resource are stored in the canonical
swap, WZYX. We need to pick the swap based on whether or not the
resource is tiled, not whether the the level in question is tiled.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
|
|
| |
It doesn't matter whether or not the level in question is linear.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
|
|
| |
Use new macro, DEBUG_BLIT, for dumping all blits.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
|
|
|
|
|
|
|
|
|
|
|
| |
These actually mean something completely different, at least on A5xx
and A6xx. The only other usage of the old flags on something older than
A6xx was a typo, so I don't know if it was always this way, but at the
same time it means that we don't have to worry too much about that.
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
|
|
|
|
|
|
|
|
|
|
| |
Similar to the existing usage for CP_COND_WRITE5, this makes it clear
what each of the magic parameters are for.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
And add fields uncovered by looking at the firmware. I think this covers
all the memory, register, and scratch manipulation opcodes that exist on
A6xx, plus one additional nice find for Vulkan and describing a
previously unknown opcode and documenting CP_WAIT_REG_MEM.
Note that the bits for the CP_REG_TO_MEM count, as well as the formula
for computing the actual count for both CP_REG_TO_MEM and CP_MEM_TO_REG,
are changed because the A630 SQE firmware actually does something
different. I haven't investigated older microcodes to see whether this
extends back to A5xx and A4xx, but the only non-A6xx uses of this
field result in the same bit-pattern when using the A6xx bit range and
formula, so it should be safe to change the definition universally.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
|
|
|
|
|
|
|
|
|
|
| |
There are bits for binning, gmem and sysmem.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131>
|
|
|
|
|
|
|
|
|
|
| |
BEGIN_RING() could decide we can't fit the next packet in the current
cmdstream segment, and grow a new segment. So we need to grab ring->cur
*after* BEGIN_RING(), otherwise we are writing cmdstream past the end of
the previous segment.
Fixes: bdd98b892f3 ("freedreno: New struct packing macros")
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
At least on Mali Utgard, index buffers need to be aligned on 0x40.
To avoid duplicating this, add an alignment parameter.
Keep the previous default for the other existing users.
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
|
|
|
|
|
|
|
|
|
| |
Sometimes sched changes that are a win in terms of instruction count
and/or register pressure, are worse in real life, due to keeping varying
storage locked for too long. Add a shader-db stat to give this more
visibility.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Including non-functional changes to get the value from the fd_reg_pair
in places.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
This gets us shared non-UBWC layout code between gallium and turnip.
Until I fix up the rest of gallium to handle UBWC mipmapping, we do the
single-level UBWC setup in gallium as a fixup after layout.
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
We pass in all the parameters for setting up the layout, though freedreno
still sets a few of them up early (since it uses layout helpers in making
some decisions about the layout setup parameters that will be cleaned up
once krh's blitter work lands).
|
|
|
|
|
|
| |
This is a little refactor in preparation for UBWC mipmapping support.
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
It's the same logic for each of these being emitted, and I was about to
change the rsc->layout.* for UBWC.
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
We can just bake the UBWC-goes-first delta into the slices at setup time.
I did have to fix up the resource shadowing swap path to swap the slice
fields, as it was missing and regressed the format reinterpets otherwise.
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Now that dEQP should be happy, lets flip the switch.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In particular, we need to invalidate the LRZ state when we cannot be
confident in what the Z state would be during rendering:
1) depth test modes not supported by LRZ
2) stencil test, which would require full rasterization and stencil
test in the binning pass (whereas LRZ normally just needs to
determine the min and max z value in an 8x8 quad)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Seems to be a bit different for a6xx, so let's split this out.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We were iterating over the entire 32-entry array each time, when we
can just use a bitset to know that we're only uploading from the first
entry normally.
Knocks ir3_emit_user_consts down from ~.5% of CPU to .1% on WebGL
fishtank.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The default is to not throw GL errors when drawing with mapped
buffers, but we were forcing it on for unclear reasons. Internally we
keep all our buffers mapped anyway, so it should be a no-op other than
reducing CPU overhead (.23% in a perf report for WebGL fishtank)
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Fixes oom-killer during streaming-texture-upload, which I found while
trying to enable piglit in CI.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the addition of the planar formats helper, the
planar formats no longer have a valid block.bits field.
Calling util_format_get_blocksize therefore asserts.
Reorder the check to see if the format is supported
before doing the query to get the blocksize.
Fixes: 20f132e5eff2d ("gallium/util: add planar format layouts and helpers")
Signed-off-by: Fritz Koenig <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|