| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
We have tessellation state now.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
At least the gallium blitter helper will call us to draw with
tessellation shaders set but a non-patch primitive.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It seems like tiling could work in the Adreno architecture, but we've
only ever seen bypass rendering with tessellation. For now, let's do
that too.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Assemble the information the stages need and emit the constants.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Tessellation needs a couple of buffers that should hold the entire
output from a full VS+TCS draw call.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We need to select the right primitive type, set a bit to turn on
tessellation and or in the TES output primitive type.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The tessellation stages need size and stride or the patch layout as
well as locations of attributes in the patch. The tesselation stages
also use two system memory BOs and need the iovas of those.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Similar to GS, the registers are shared and not reinitialized betewen
VS and TCS, so we need to make sure to allocate the same registers for
the system values between stages.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
In tessellation mode, the TES is (probably) the binning shader.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Similar to GS, some inputs are reused when the chsh from VS to TCS or
TES to GS, so we need to make sure we setup the right inputs and make
the shared system values outputs so they don't get clobbered.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We add two new IR3 specific nir intrinsics that map to the new condend
and endpatch instructions.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Our lowering pass made the z component unused by replacing its uses
by 1 - x - y. The intrinsic implementation then just need to return
the x and y components.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
When we have both TES and GS, the TES needs to chain to the VS with
chmask and chsh GS just like the VS does to either TCS or GS.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two new opcodes in use in tesselation control shaders:
category 0, opcodes 13 and 15. unk13 is a kill type of instruction
that terminates threads where !p0.x and it used to narrow down a patch
wavefront to just thread 0. Then, once thread 0 has written the tess
levels, it issues unk15, which might signal the TE that another patch
has been fully written.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
VS and TCS pass varyings the same way as VS and GS does. TCS then
writes entire patch to a system memory BO and TES eventually reads
back from the BO once the TE starts generating vertices. TES outputs
vertices the same way as VS and GS, except when there's a GS as well,
in which case TES passes varyings to GS same way the VS would.
In addition, the TCS needs a little bit of control flow massaging so
that it only runs for valid invocations needs a couple of unknown
instructions to synchronize with the TE.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Whether we're tessellating and which primitives the TES outputs
affects the entire pipeline so let's add a field to the key to track
that.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
With the imul24 opcode in place, we can now use it for computing local
offsets (ie for ldlw/stlw).
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These provide the iovas for system memory buffers used for
tessellation as well as a new HW specific system value.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The gallium helper doesn't like patches and we can't determine how
many primitives it gets tessellated into anyway. On gens where we
have tessellation, we get the prim count from a HW counter so just
skip counting on the CPU.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These intrinsics take a ivec2 for the 64 bit base address and a
integer offset.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Stages that load inputs with ldlw (TCS, GS) need byte offsets, stages
that load with ldg (TES) need dwords offsets.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These instructions take a 64 bit iova as two conescutive registers and
a immediate offset. This patch adds support for the offset to be a
single register, which is added to the 64 bit iova.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
What we call eRB6_Z24_UNORM_S8_UINT now is actually
RB6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 and RB6_X8Z24_UNORM is actually
RB6_Z24_UNORM_S8_UINT.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
2D array textures and 3D textures are different enum values after all.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We use one mechanism for (REG_A6XX_RBBM_PRIMCTR_8_LO)
PIPE_QUERY_PRIMITIVES_GENERATED, which counts all primitives that exit
the geometry pipeline, whether or not xfb is on. Then for
PIPE_QUERY_PRIMITIVES_EMITTED, we use the CP_EVENT_WRITE subfunction
that writes out per-stream counts for generated and emitted, but only
when xfb is enabled.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Adding comments about best guess at what the counters count.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Move these two to be in order with the other VS regs.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
No pipeline-db changes.
v2: use early-exit for VOP3
Reviewed-by: Daniel Schürmann <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
| |
In particular, increase the cost of 64-bit integer division.
Fixes huge shaders with dEQP-VK.spirv_assembly.type.scalar.i64.mod_geom
, with ACO used for GS this creates shaders requiring a branch with
>32767 dword offset.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix memory leak on allocation for lima submit, reported by valgrind.
128 bytes in 1 blocks are definitely lost in loss record 38 of 84
at 0x484A6E8: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
by 0x58689C7: util_dynarray_ensure_cap (u_dynarray.h:91)
by 0x5868BBB: util_dynarray_grow_bytes (u_dynarray.h:139)
by 0x5868BBB: lima_submit_add_bo (lima_submit.c:113)
by 0x585D7D3: lima_ctx_buff_va (lima_context.c:57)
by 0x586378F: lima_pack_plbu_cmd (lima_draw.c:802)
by 0x586378F: lima_draw_vbo (lima_draw.c:1351)
by 0x5406A2F: u_vbuf_draw_vbo (u_vbuf.c:1184)
by 0x55D0A57: st_draw_vbo (st_draw.c:268)
by 0x55576CB: _mesa_draw_arrays (draw.c:374)
by 0x55576CB: _mesa_draw_arrays (draw.c:351)
by 0x43610B: Mesh::render_vbo() (mesh.cpp:583)
by 0x415DBB: SceneBuild::draw() (scene-build.cpp:242)
by 0x41131B: MainLoop::draw() (main-loop.cpp:133)
by 0x411947: MainLoop::step() (main-loop.cpp:108)
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix memory leak on allocation for nir shader, reported by valgrind.
3,502 (480 direct, 3,022 indirect) bytes in 1 blocks are definitely lost in loss record 77 of 84
at 0x48483F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
by 0x5750817: ralloc_size (ralloc.c:119)
by 0x5750977: rzalloc_size (ralloc.c:151)
by 0x575C173: nir_shader_create (nir.c:45)
by 0x5763ACB: nir_shader_clone (nir_clone.c:728)
by 0x55D5003: st_create_fp_variant (st_program.c:1242)
by 0x55D789F: st_get_fp_variant (st_program.c:1522)
by 0x55D789F: st_get_fp_variant (st_program.c:1507)
by 0x56400C3: st_update_fp (st_atom_shader.c:163)
by 0x563D333: st_validate_state (st_atom.c:261)
by 0x55D07CB: prepare_draw (st_draw.c:132)
by 0x55D08DF: st_draw_vbo (st_draw.c:184)
by 0x55576CB: _mesa_draw_arrays (draw.c:374)
by 0x55576CB: _mesa_draw_arrays (draw.c:351)
Signed-off-by: Erico Nunes <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Also remove version sufix from osmesa swrast on Windows.
v2: Make sure we don't remove lib prefix on *nix platforms.
Signed-off-by: Prodea Alexandru-Liviu <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Cc: "19.3" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We would like to have GL 4.6 Compatibility too.
The extensions don't support compatibility features, so no other changes
are needed.
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
| |
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
All callers other than the unit test just wanted to convert back from
a known-mesa-equivalent format, which is now a no-op.
v2: Fix assertion failure in iris GL startup with BGR565 by continuing
to return MESA_FORMAT_NONE for non-Mesa formats.
Reviewed-by: Marek Olšák <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
| |
Now that MESA_FORMAT_x is just a PIPE_FORMAT_x define, we can strip
this function down to just the compression fallbacks.
v2: Restore the SRGB format for ASTC SRGB fallback case.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are various places in Mesa where we would like to be able to
have a shared format enum between Mesa and gallium (NIR compiler's
image formats, for example, or mapping from gallium's formats to
mesa's and vice versa in st_format.c). Rewriting all MESA_FORMAT to
PIPE_FORMAT would be disruptive and possibly more work than it's worth
(And I actually prefer MESA_FORMAT's name scheme), so for now just
make it so that there's one shared set of enum values.
The #defines here were generated by printing out from the
tests/st_format.c round-tripping loop, with the exception of 8888
formats where I hand-edited the #defines to point at the corresponding
gallium packed format define.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
To redefine MESA_FORMAT in terms of PIPE_FORMAT enums, we need to fix
places where we iterated up to MESA_FORMAT_COUNT. I use
_mesa_get_format_name(f) == NULL as the signal that it's not an enum
value with a MESA_FORMAT.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We checked round-tripping of formats without fallbacks, but weren't
setting the compression support flags in the mock context and thus
needed to skip testing those. Just set all the flags and assert that
no fallbacks are triggered, so we get full test coverage.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We have packed formats for RGBA and ABGR already, so we can just
pack/unpack code.
v2: Rebase on endianness macro rename
Reviewed-by: Marek Olšák <[email protected]> (v1)
|
|
|
|
|
|
|
|
| |
These are the last formats that MESA_FORMAT had and PIPE_FORMAT
didn't. The .csv entries channel sizes and swizzles all came from the
corresponding UNORM format.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This is the last unorm format that MESA_FORMAT had and PIPE_FORMAT
didn't. Note that it's an array format on gallium's side as well,
since it's a NPOT pixel size.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
This covers everything that MESA_FORMAT had for packed unorm.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This texture compression is exposed by 830 and 915, and to make
MESA_FORMAT match PIPE_FORMAT defines I need a corresponding
PIPE_FORMAT.
v2: Set is_hand_written so we don't try to generate pack/unpack code.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The primitive indices have to be swapped to follow the drawing
order.
This fixes corruption with Overwatch when NGG GS is force enabled.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|