| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Grigori Goronzy <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
and clamp it right before emitting. This is a prerequisite for computing
the guard band.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Grigori Goronzy <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is in preparation of raising the number of exposed sampler views to 32
bits, which will raise the total number of sampler views to 33 for the
polygon stipple texture. That texture should never be compressed (and it's
certainly not a depth texture), but this approach seems cleaner to me than
special-casing the last slot in all affected code paths.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Whether DCC is disabled depends on the access flags with which the image
is bound: image_load supports DCC, but store and atomic don't.
v2: remove an unnecessary masking of images->desc.enabled_mask
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Since it is all about calling into blitter functions, it makes more
sense here. This change also reduces the size of the interfaces between
.c files.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
view->resource is redundant with view->base.texture, so get rid of it.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is currently not needed but will be necessary when we have
features that do not work with DCC enabled, such as image stores
and sharing non-scanout surfaces.
v2: Marek: rebase, remove decompression from si_flush_resource (not needed)
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
To allow sharing textures with DCC enabled.
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
v2: handle _mesa_hash_table_insert failure
other cosmetic changes
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It only exports the primitive ID.
Also used by TES when it's compiled as VS.
The VS input location of the primitive ID input is v2.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
This is disabled with use_monolithic_shaders = true.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The texture slot is expanded to 16 dwords containing 2 descriptors.
Those can be:
- Image and fmask, or
- Image and sampler state
By carefully choosing the locations, we can put all three into one slot,
with the fmask and sampler state being mutually exclusive.
This improves shaders in 2 ways:
- 2 user SGPRs are unused, shaders can use them as temporary registers now
- each pair of descriptors is always on the same cache line
v2: cosmetic changes: add back v8i32, don't load a sampler state & fmask
at the same time
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It was partly a state and partly emulated by shader code, but since we want
to do this in a fragment shader prolog, we need to put it into the shader
key, which will be used to generate the prolog.
This also removes the spi_ps_input states and moves the registers
to the PS state.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
and rename a variable in the function.
SX_PS_DOWNCONVERT will be emitted here.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
because not using SPI_SHADER_32_ABGR doubles fill rate.
We should also get optimal performance if alpha isn't needed or blending
isn't enabled.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This does change the behavior slightly:
If a shader writes COLOR[i] and that color buffer isn't bound,
the shader will export MRT_NULL instead and discard the IR tree that
calculates the output. The only exception is alpha-to-coverage, which
requires an alpha export.
v2: - update a comment about 16BPC
- account for MRTZ when when fixing alpha-test/kill
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This option allows replacing a single shader by a pre-compiled ELF object
as generated by LLVM's llc, for example. This can be useful for debugging a
deterministically occuring error in shaders (and has in fact helped find
the causes of https://bugs.freedesktop.org/show_bug.cgi?id=93264).
v2: drop the debug flag, use DEBUG_GET_ONCE_OPTION instead
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Expose most of the performance counter groups that are exposed by Catalyst.
Ideally, the driver will work with GPUPerfStudio at some point, but we are not
quite there yet. In any case, this is the reason for grouping multiple
instances of hardware blocks in the way it is implemented.
The counters can also be shown using the Gallium HUD. If one is interested to
see how work is distributed across multiple shader engines, one can set the
environment variable RADEON_PC_SEPARATE_SE=1 to obtain finer-grained performance
counter groups.
Part of the implementation is in radeon because an implementation for
older hardware would largely follow along the same lines, but exposing
a different set of blocks which are programmed slightly differently.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I discovered that increasing the ESGS ring size fixes GS hangs on Tonga,
so let's do it properly.
There is now a separate init_config_gs_rings state that is not immutable,
because GS rings are resized when needed.
This also saves some memory. Most apps won't need more than 1MB
per ring per shader engine.
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
The recursion can only occur if you modify need_cs_space to always flush.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
KCACHE, TC L1 and TC L2 are renamed to:
- SMEM L1
- VMEM L1
- GLOBAL L2
You can easily tell what they are used for now.
Shaders must deal with coherency issues between both L1s manually,
e.g. by setting GLC=1 or by using s_dcache_*.
BOTH_ICACHE_KCACHE was an unused definition.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "current" shader pointer is moved from the CSO to the context, so that
the CSO is mostly immutable.
The only drawback is that the "current" pointer isn't saved when unbinding
a shader and it must be looked up when the shader is bound again.
This is also a prerequisite for multithreaded shader compilation.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
Broken by one of the cleanups: 0d46c3bc9d09b376d74f7399e1a2d1b0a923640b
Not applicable to stable.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
This will be a derived state used for changing center->sample and
centroid->sample at runtime.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
instead of always doing both.
Usually, only depth is needed, so stencil decompression is useless.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The border colors are uploaded only once when the state is created.
This brings truly immutable sampler descriptors, because they don't have
to be updated every time a sampler state is re-bound.
It also moves the TA_BC_BASE_ADDR registers to init_config, removing one
more state. The catch is there is now a limit: only 4096 border colors can
be used by one context. I don't think that will be a problem.
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
| |
Since we don't put any resource descriptors in IBs, the space used by draw
calls is quite small.
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
|
|
| |
Reducing calloc overhead.
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The main idea is to avoid setting CB_COLORi_INFO = 0 for i>0 repeatedly
when those colorbuffers aren't used. This is mainly for glamor.
Same for DB. Z_INFO and STENCIL_INFO need to be cleared only once.
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|