| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
If a mipfilter is not set, it's legal to have an incomplete mipmap; we
should handle this accordingly. An "easy way out" is to rig the LOD
clamps.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sampler state prefetching is broken on Gen11, and WA_160668216 says
to disable it. Apparently sampler state prefetching also has basically
zero impact on performance, so we don't need to worry there.
i965, anv, and iris already handle this correctly, but we missed BLORP.
Ideally the kernel should globally disable this by writing SARCHKMD, at
which point we wouldn't have to worry about it. But let's be defensive
and handle it ourselves too.
v2: separate out from BTP workaround in case we change that eventually
Reviewed-by: Anuj Phogat <[email protected]> [v1]
|
|
|
|
|
|
|
|
|
| |
When immutable samplers are set we call write_image_view with a NULL
image view. This causes issues on IVB where we have to fake texture
swizzling.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110999
Fixes: d2aa65eb18 "anv: Emulate texture swizzle in the shader when..."
|
|
|
|
|
|
|
|
| |
When set, do as requested and skip any transfer optimization.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-By: Gert Wollny <[email protected]>
Reviewed-By: Alexandros Frantzis <[email protected]>
|
|
|
|
|
|
|
|
| |
When set, wait after every each flush.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-By: Gert Wollny <[email protected]>
Reviewed-By: Alexandros Frantzis <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
VIRGL_DEBUG_BGRA_DEST_SWIZZLE should use bit 3. Make some cosmetic
changes as well.
Fixes: a478e56fbd33fa23503b63d41265a1c2f3253ed2
virgl: Add debug flag to bypass driconf to enable the BGRA tweaks
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-By: Gert Wollny <[email protected]>
Reviewed-By: Alexandros Frantzis <[email protected]>
|
|
|
|
|
|
|
| |
SMEM and VMEM caches are L0 on gfx10. Ported from RadeonSI.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Ported from RadeonSI, will be emitted for GFX10 too.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This reduces the size of fill operations needed to clear CMASK
for layered color textures.
GFX9 unsupported for now.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This reduces the size of fill operations needed to clear FMASK
for layered color textures.
GFX9 unsupported for now.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This fixes a rendering issue with RoTR/DXVK.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In case of any enabled VS members from: uses_firstvertex,
uses_baseinstance, uses_drawid, uses_is_indexed_draw
leaks may happens.
Call gen6_upload_push_constants allocates
stage_stat->push_const_bo. It than takes pointer from
push_const_bo to draw_params_bo (in the call
brw_prepare_shader_draw_parameters by brw_upload_data)
and do reference which finally haven't got unreferenced.
Fixes leak:
136 bytes in 1 blocks are definitely lost in loss record 6 of 13
at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0xC2B64B7: bo_alloc_internal (brw_bufmgr.c:596)
by 0xC2B6748: brw_bo_alloc (brw_bufmgr.c:672)
by 0xC314BB3: brw_upload_space (intel_upload.c:88)
by 0xC2EBBC5: gen6_upload_push_constants (gen6_constant_state.c:155)
by 0xC9E4FA6: gen9_upload_vs_push_constants (genX_state_upload.c:3300)
by 0xC2E0EDA: check_and_emit_atom (brw_state_upload.c:540)
by 0xC2E0EDA: brw_upload_pipeline_state (brw_state_upload.c:659)
by 0xC2E0FF1: brw_upload_render_state (brw_state_upload.c:681)
by 0xC2C5D2D: brw_draw_single_prim (brw_draw.c:1052)
by 0xC2C62CB: brw_draw_prims (brw_draw.c:1175)
by 0xC488AD1: vbo_exec_vtx_flush (vbo_exec_draw.c:386)
by 0xC485270: vbo_exec_FlushVertices_internal (vbo_exec_api.c:652)
Reviewed-by: Lionel Landwerlin <[email protected]>
Reported-by: Yevhenii Kolesnikov <[email protected]>
Signed-off-by: Sergii Romantsov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
st/egl used to support eglCreatePbufferFromClientBuffer, but now that
it's gone, any call to it would segfault.
Let's return a nice error instead.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Nobody ever uses these, so let's just hard code them instead.
If an EGL driver ever comes around that needs them they're trivial to
re-add.
Suggested-by: Emil Velikov <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
| |
Fixes: 3d198926a48 freedreno: use fd_bc_alloc_batch instead of fd_batch_create.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
The driver doesn't use these values and ac_rtld has assertions
expecting the value of 0.
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
SMEM and VMEM caches are L0 on gfx10.
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
| |
otherwise the behavior is undefined
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We'll have to extend this at some point, and using a bitfield union in
this way makes it easier to get the right index without excessive
branching.
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The initial prototype used a processor-specific symbol type, but
feedback suggests that an approach using processor-specific section
name that encodes the alignment analogous to SHN_COMMON symbols is
preferred.
This patch keeps both variants around for now to reduce problems
with LLVM compatibility as we switch branches around.
This also cleans up the error reporting in this function.
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Incrementing the iteration count was intended to fix an off-by-one error
when the first terminator was superseded by a later terminator. If
there is no first terminator or later terminator, there is no off-by-one
error. Incrementing the loop count creates one. This can be seen in
loops like:
do {
if (something) {
// No breaks or continues here.
}
} while (false);
Reviewed-by: Timothy Arceri <[email protected]>
Tested-by: Abel Briggs <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953
Fixes: 646621c66da ("glsl: make loop unrolling more like the nir unrolling path")
|
|
|
|
|
|
|
|
|
|
| |
We were pessimistically uploading all of it in case of indirection,
but we can just bump that when we encounter indirection.
total constlen in shared programs: 2529623 -> 2485933 (-1.73%)
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
ir3_nir_analyze_ubo_ranges() has already told us how much of cb0 we
need to upload (all of it, since it will lower indirect UBO 0 accesses
from load_ubo back to indirection on the constant buffer).
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the NIR-level analysis decided to move UBO loads to the constant
file, but the backend decided not to load those constants, we could
upload past the end of constlen. This is particularly relevant for
pre-a6xx, where we emit a different constlen between bin and render
variants.
(Fix by Rob, commit message by anholt)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
This is the hardware max, as far as I can tell.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Just a little spring cleanup, extending UBOs to vertex shaders in the
process.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
UBOs and uniforms now use a common code path with an explicit `index`
argument passed, enabling UBO reads.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Prevents an assert(0) later in this (not so edge) case. We still have to
have a dummy there.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We've known about this for a while, but it was never formally in the
machine header files / decoder, so let's add them in.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Now that all the counting is sorted, it's a matter of passing along a
GPU address and going.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We already uploaded UBOs, but only a fixed number (1) for uniforms;
let's upload as many as we compute we need.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
We look at the highest set bit in the UBO enable mask to work out the
maximum indexable UBO, i.e. the UBO count as we need to report to the
hardware.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We refactor panfrost_constant_buffer to mirror v3d's constant buffer
handling, to enable UBOs as well as a single set of uniforms.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This doesn't handle Y-flipping, but it's good enough to render the stars
in Neverball.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
In preparation for lowering point sprites, track them like we track
alpha testing state.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Those either depend on information filled by the NIR linking steps OR
are restricted by those:
- gl_nir_lower_samplers: depends on UniformStorage being set by the
linker.
- brw_nir_lower_image_load_store: After 6981069fc80 "i965: Ignore
uniform storage for samplers or images, use binding info" we want
this pass to happen after gl_nir_lower_samplers.
- gl_nir_lower_buffers: depends on UniformBlocks and
SharedStorageBlocks being set by the linker.
For the regular GLSL code path, those datastructures are filled
earlier. For NIR linking code path we need to generate the nir_shader
first then process it -- and currently the processing works with all
shaders together. So move the passes out of brw_create_nir into its
own function, called by the brwProgramStringNotify and
brw_link_shader().
This patch prepares ground for ARB_gl_spirv, that will make use of NIR
linker.
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The iterator `i` already walks the right amount now that is
incremented by `dmul`, so no need to `* 2`. Fixes invalid memory
access in upcoming ARB_gl_spirv tests.
Failure bisected by Arcady Goldmints-Orlov.
Fixes: b019fe8a5b6 "glsl/nir: Fix handling of 64-bit values in uniform storage"
Reviewed-by: Jason Ekstrand <[email protected]>
|