| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
The state tracker should never ask us to create a texture with invalid
dimensions / mipmap levels. Do some assertions to check that.
No Piglit regressions.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
Commit 8aba778fa2cd98a0b5a7429d3c5057778a0c808c "st/mesa: don't set
sampler states for TBOs" changed how texture buffer objects are handled.
Document the new convention.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
For now this is a no-op on the output, but it makes it clear that we've
had weird things going on with things like
V3D21_CLIPPER_Z_SCALE_AND_OFFSET.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
This gets our vc4_emit.c size back down a bit:
before:
1020 0 0 1020 3fc src/gallium/drivers/vc4/.libs/vc4_emit.o
after:
968 0 0 968 3c8 src/gallium/drivers/vc4/.libs/vc4_emit.o
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Take the CL pointer in, which will be useful for enabling relocs.
However, our code expands a bit more:
before:
4449 0 0 4449 1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
988 0 0 988 3dc src/gallium/drivers/vc4/.libs/vc4_emit.o
after:
4481 0 0 4481 1181 src/gallium/drivers/vc4/.libs/vc4_draw.o
1020 0 0 1020 3fc src/gallium/drivers/vc4/.libs/vc4_emit.o
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This slightly inflates the size of the generated code, in exchange for
getting us some convenient tools.
before:
4389 0 0 4389 1125 src/gallium/drivers/vc4/.libs/vc4_draw.o
808 0 0 808 328 src/gallium/drivers/vc4/.libs/vc4_emit.o
after:
4449 0 0 4449 1161 src/gallium/drivers/vc4/.libs/vc4_draw.o
988 0 0 988 3dc src/gallium/drivers/vc4/.libs/vc4_emit.o
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I really liked this idea, as it should help with management of packet
parsing tools like the CL dump. The python script is forked off of theirs
because our packets are byte-based instead of dwords, and the changes to
do so while avoiding performance regressions due to unaligned accesses
were quite invasive.
v2: Fix Android.mk paths, drop shebang for python script, fix overlap
detection.
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
Tested-by: Rob Herring <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
In swr_update_derived, for consistency, index buffer validation should
be using the p_draw_info copy "info" rather than referencing
p_draw_info.
No functional change.
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Tag pStat field in swr_draw_context structure so gen_llvm_types.py
can deal with the actual structure type instead of using void.
Code cleanup, no functional change.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Increases performance of some large workloads on KNL by ~30%.
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
|
|
| |
Fixes render target read access from pixel shaders.
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
|
|
|
|
| |
Switch from a macro-based simd intrinsics layer to a more C++
implementation, which also adds AVX512 optimizations to 128-bit
and 256-bit SIMD.
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
|
|
|
| |
Hardcode split to four files currently. Decreases swr build
time on KNL by over 50%.
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each shader stage state (VS, TS, GS, SO, BE/CLIP) now has a
vertexAttribOffset to specify the offset to the start of the
general attribute section of the incoming verts for that stage.
It is up to the driver to set this up correctly based on the
active stages. All the shader stages use this value instead of
VERTEX_ATTRIB_START_SLOT to offset to the incoming attributes.
Only the vertex shader stage supports dynamic layout output
currently. The other stages continue to expect the output to be
the fixed layout slots as before. Will be enabling GS next.
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
|
|
|
| |
Hardcode split to four files currently. Decreases swr build
time on a quad-core by ~10%.
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <bruce.cherniak at intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is typo in the mkdir command path,
the correct one is $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH)
The other issue is in 32bit builds, because lib64 does not exist there,
we can use TARGET_IS_64_BIT to refine the post install command.
Fixes: a3d98ca62f ("Android: use symlinks for driver loading")
Signed-off-by: Rob Herring <[email protected]>
|
|
|
|
|
|
| |
To sync with in-house changes.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
| |
Add mksstats for surface view emulation and also tighten the stat
CreateBackedView for the actual creation of backed view.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In general, the functions which emit commands to the command buffer check
for failure and return a PIPE_ERROR_x code. It's up to the caller to
flush the buffer and retry the command.
But svga_set_stream_output() did its own flushing and the callers never
checked the return value (though, it would always be PIPE_OK) in practice.
This patch changes svga_set_stream_output() so that it does not call
svga_context_flush() when the buffer is full. And we update the callers
to check the return value as we do for other functions, like
svga_set_shader().
No Piglit regressions. Also tested w/ Nature demo.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This patch fixes the total surface size in surface cache
to include array size as well.
Tested with MTT glretrace.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
piglit test ext_texture_array-gen-mipmap is fixed with this patch.
Tested with mtt piglit, glretrace, viewperf and conform. No regression.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch validates those sampler views with backing copy
of texture whose original copy has been updated since the
view is last validated.
This is done here at draw time because the texture binding might not
have modified, hence validation is not triggered at state update time,
and yet the texture might have been updated in another context, so
we need to re-validate the sampler view in order to update the backing
copy of the updated texture.
This fixes a rendering flickering issue with Photoshop running in
Linux VM with HWversion 11. The problem is Photoshop renders to texture A
in context X, and then bind texture A to context Y. The first time
when texture A is bound to context Y, cso calls pipe->set_sampler_views().
Validation of sampler views is done, rendering is fine.
But when texture A is rendered to again in context X, and rebound in
context Y, cso skips pipe->set_sampler_views() because texture A is already
bound in context Y. SVGA driver is not given a chance to re-validate
the texture binding, the backing copy of the texture is not updated,
and hence causes black image.
Tested with Photoshop, MTT glretrace, piglit.
Fixes VMware bug 1769103.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The function getprogname() is available on Android, since it reuses
various BSD solutions C runtime.
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Robert Foss <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
Deferred deletion (via "fence_work") has obsoleted the need to allocate
all client vertex buffer scratch space in a single chunk. Scratch
allocations are now valid until the referenced fence is complete.
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Vertex buffer state doesn't need to be validated on every call,
only on dirty _NEW_VERTEX or indexed draws.
Unconditional validation was introduced as part of patch 330d0607ed6,
"remove pipe_index_buffer and set_index_buffer", with the expectation
we'd optimize later.
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
| |
Reduces the memory footprint of the frontend processing by packing
vertices.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Instead of having special driver loading logic for Android, create
symlinks to gallium_dri.so so we can use the standard loading logic.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Rob Herring <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The resource struct is already allocated at this point and should be
freed properly.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Reviewed-by: Wladimir J. van der Laan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The layer stride information is used in various parts of the driver,
so it needs to be present regardless if the driver allocated the
buffer itself or merely imported it from an external source.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Reviewed-by: Wladimir J. van der Laan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The labels array may change its virtual address on a reallocation, so
it is invalid to cache pointers into the array. Rather than using the
pointer directly, remember the array index.
Fixes miscompilation of shaders in glmark2 ideas, leading to GPU hangs.
Fixes: c9e8b49b (etnaviv: gallium driver for Vivante GPUs)
Cc: [email protected]
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader key size: 107 -> 47
Divisors of 0 and 1 are encoded in the shader key. Greater instance divisors
are loaded from a constant buffer.
The shader code doing the division is huge. Is it something we need to
worry about? Does any app use instance divisors >= 2?
VS prolog disassembly:
s_load_dwordx4 s[12:15], s[0:1], 0x80 ; C00A0300 00000080
s_nop 0 ; BF800000
s_waitcnt lgkmcnt(0) ; BF8C007F
s_buffer_load_dword s14, s[12:15], 0x4 ; C0220386 00000004
s_waitcnt lgkmcnt(0) ; BF8C007F
v_cvt_f32_u32_e32 v4, s14 ; 7E080C0E
v_rcp_iflag_f32_e32 v4, v4 ; 7E084704
v_mul_f32_e32 v4, 0x4f800000, v4 ; 0A0808FF 4F800000
v_cvt_u32_f32_e32 v4, v4 ; 7E080F04
v_mul_hi_u32 v5, v4, s14 ; D2860005 00001D04
v_mul_lo_i32 v6, v4, s14 ; D2850006 00001D04
v_cmp_eq_u32_e64 s[12:13], 0, v5 ; D0CA000C 00020A80
v_sub_i32_e32 v5, vcc, 0, v6 ; 340A0C80
v_cndmask_b32_e64 v5, v6, v5, s[12:13] ; D1000005 00320B06
v_mul_hi_u32 v5, v5, v4 ; D2860005 00020905
v_add_i32_e32 v6, vcc, v5, v4 ; 320C0905
v_subrev_i32_e32 v4, vcc, v5, v4 ; 36080905
v_cndmask_b32_e64 v4, v4, v6, s[12:13] ; D1000004 00320D04
v_mul_hi_u32 v5, v4, v1 ; D2860005 00020304
v_add_i32_e32 v4, vcc, s8, v0 ; 32080008
v_mul_lo_i32 v6, v5, s14 ; D2850006 00001D05
v_add_i32_e32 v7, vcc, 1, v5 ; 320E0A81
v_cmp_ge_u32_e64 s[12:13], v1, v6 ; D0CE000C 00020D01
v_sub_i32_e32 v6, vcc, v1, v6 ; 340C0D01
v_cmp_le_u32_e32 vcc, s14, v6 ; 7D960C0E
v_cndmask_b32_e64 v8, 0, -1, s[12:13] ; D1000008 00318280
v_cndmask_b32_e64 v6, 0, -1, vcc ; D1000006 01A98280
v_and_b32_e32 v6, v8, v6 ; 260C0D08
v_cmp_eq_u32_e32 vcc, 0, v6 ; 7D940C80
v_cndmask_b32_e32 v6, v7, v5, vcc ; 000C0B07
v_add_i32_e32 v5, vcc, -1, v5 ; 320A0AC1
v_cmp_eq_u32_e32 vcc, 0, v8 ; 7D941080
v_cndmask_b32_e32 v5, v6, v5, vcc ; 000A0B06
v_add_i32_e32 v5, vcc, s9, v5 ; 320A0A09
v2: set prefer_mono for fetched instance divisors
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
sizeof(struct si_shader_key):
Before reverting the 2 commits: 120 bytes
After reverting the 2 commits: 128 bytes
With #pragma pack: 107 bytes
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
This reverts commit 7b2240ac9ce3ba9bd86f4ae8aac53af8878c0b10.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
ff_tcs_inputs_to_copy"
This reverts commit 6b6fed3a3c81c2b0d319ef121df20a0dc914705f.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
It's enabled through message buffer for UVD
Signed-off-by: Leo Liu <[email protected]>
Acked-by: Christian König <[email protected]>
|