| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
fmask implies that cmask is present too.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
these names were misleading.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
These just say whether libdrm can assume that the latest radeon_surface
definition is used by Mesa.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This removes input-only parameters from the radeon_surf structure.
Some of the translation logic from pipe_resource to radeon_surf is moved to
winsys/radeon.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
npix_y will be removed. level[0].npix_y will be removed too. nblk_y should
be the same as npix_y if the block height == 1. However, nblk_y is aligned
to the tile size, so it can be greater than npix_y.
If that's a problem, we'll have to save the input height of surface_init
and use that.
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
SDMA might be fixed by:
"winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures"
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Maybe this is why SDMA has been broken for many amdgpu users?
SDMA is the only block which is used with imported textures and relies
on this variable. DB also uses it, but it doesn't get imported textures,
so it's unaffected.
I do get SDMA failures on Tonga before this patch if R600_DEBUG=testdma
is changed to use imported textures.
Cc: 11.2 12.0 13.0 <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
This should fix random GPU hangs on Hawaii and Fiji.
Cc: 11.2 12.0 13.0 <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
Oh my god, I wonder what catastrophic issues this was causing on SI.
Cc: 13.0 <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When restoring something from shader cache we won't have and don't
want to create a nir_shader this change detaches the two.
There are other advantages such as being able to reuse the
shader info populated by GLSL IR.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
total instructions in shared programs : 3499888 -> 3499445 (-0.01%)
total gprs used in shared programs : 453866 -> 453803 (-0.01%)
total local used in shared programs : 21621 -> 21621 (0.00%)
total bytes used in shared programs : 32078952 -> 32074936 (-0.01%)
local gpr inst bytes
helped 0 39 119 119
hurt 0 0 0 0
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "12.0 13.0" <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Shared memory is local to CTA, thus we should only wait for
prior memory writes which are visible to other threads in
the same CTA, and not at global level. This should speedup
compute shaders which use shared memory.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Only one face of Cubetextures was locked when in DEFAULT Pool.
Fixes:
https://github.com/iXit/Mesa-3D/issues/129
CC: "12.0 13.0" <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In the format fallback path,
the height was used instead of the depth.
CC: "12.0 13.0" <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We are not sure exactly what needs to be 0 initialized,
but we are missing some cases. 0 initialize all our current
aligned allocation.
Fixes Tree of Savior visual issues.
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Add implementation for align_calloc,
which is align_malloc + memset.
v2: add if (ptr) before memset.
Fix indentation.
Signed-off-by: Axel Davy <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Leak introduced by:
a83dce01284f220b1bf932774730e13fca6cdd20
The patch also moves the part to
release changed.vs_const_i and changed.vs_const_b
before the if (!cb.buffer_size) check,
to avoid reuploading every draw call if
integer or boolean constants are dirty, but the shaders
use no constants.
Signed-off-by: Axel Davy <[email protected]>
CC: "13.0" <[email protected]>
|
|
|
|
|
|
| |
This seems important considering how much we depend on some of the flags.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
the next commit will need this
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
nvdisasm does not print a .S even though the bit is set.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
radeonsi also does the same thing. I suspect that this is likely to be a
no-op in reality, but it brings nouveau code closer to what the blob
produces. Plus it makes sense to not try to do auto-derivatives on this.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows the driver to signal that it can't handle random
interleaving of attributes across buffers. This is required for
ARB_transform_feedback3, and it's initialized to whatever the previous
value of PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME was except for nv50 where
it is disabled. Note that the proprietary drivers never expose
ARB_transform_feedback3 on any GT21x's (where nouveau previously did),
and after some effort I was unable to get it to work.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Getting stores to NIR regs to not generate new MOVs is tricky, since the
result we're trying to store into the NIR reg may have been from a
conditional update of a temp, or a series of packed writes. The easiest
solution seems to be to require that nir_store_dest()'s arg comes from an
SSA temp.
This causes us to put in a few more temporary MOVs in the NIR SSA dest
case, but copy propagation successfully cleans those up.
The shader-db change is modest:
total instructions in shared programs: 93774 -> 93598 (-0.19%)
instructions in affected programs: 14760 -> 14584 (-1.19%)
total estimated cycles in shared programs: 212135 -> 211946 (-0.09%)
estimated cycles in affected programs: 27005 -> 26816 (-0.70%)
but I was seeing patterns in some register-allocation failures in DEQP
tests that looked like the extra MOVs would increase maximum register
pressure in loops. Some debug code indicates that that's not the case,
though I'm still a bit confused by that result.
|
| |
|
|
|
|
|
| |
One tiny hack is left in vc4_bufmgr.c for what kind of mapping we got so
that we can free it.
|
|
|
|
|
|
| |
Now we aren't limited to 256MB total allocated across a driver instance,
just 256MB at one time. We're still copying in and out, which should get
fixed.
|
|
|
|
|
| |
I would like to put a couple more things in here, so it's time to package
it up.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rather than having simulator mode changes scattered around vc4_bufmgr.c
and vc4_screen.c, make vc4_bufmgr.c just call a vc4_simulator_ioctl, which
then dispatches to a corresponding implementation.
This will give the simulator support a centralized place to do tricks like
storing most BOs directly in simulator memory rather than copying in and
out.
This leaves special casing of mmaping BOs and execution, because of the
winsys mapping.
|