Commit message  Author  Age  Files  Lines
* freedreno/blitter: Save tessellation state  Kristian H. Kristensen  2019-11-07  1  -0/+2
| | | | | | | | We have tessellation state now. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Only set emit.hs/ds when we're drawing patches  Kristian H. Kristensen  2019-11-07  1  -2/+3
| | | | | | | | | At least the gallium blitter helper will call us to draw with tessellation shaders set but a non-patch primitive. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: Use bypass rendering for tessellation  Kristian H. Kristensen  2019-11-07  1  -0/+8
| | | | | | | | | | It seems like tiling could work in the Adreno architecture, but we've only ever seen bypass rendering with tessellation. For now, let's do that too. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Program state for tessellation stages  Kristian H. Kristensen  2019-11-07  4  -34/+162
| | | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Emit constant parameters for tessellation stages  Kristian H. Kristensen  2019-11-07  1  -10/+84
| | | | | | | | Assemble the information the stages need and emit the constants. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Allocate and program tessellation buffer  Kristian H. Kristensen  2019-11-07  3  -0/+44
| | | | | | | | | Tessellation needs a couple of buffers that should hold the entire output from a full VS+TCS draw call. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Build the right draw command for tessellation  Kristian H. Kristensen  2019-11-07  3  -4/+52
| | | | | | | | | We need to select the right primitive type, set a bit to turn on tessellation, and OR in the TES output primitive type. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Allocate const space for tessellation parameters  Kristian H. Kristensen  2019-11-07  1  -0/+7
| | | | | | | | | | The tessellation stages need the size and stride of the patch layout as well as the locations of attributes in the patch. The tessellation stages also use two system memory BOs and need the iovas of those. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Pre-color TCS header and primitive ID inputs  Kristian H. Kristensen  2019-11-07  1  -2/+12
| | | | | | | | | | Similar to GS, the registers are shared and not reinitialized between VS and TCS, so we need to make sure to allocate the same registers for the system values between stages. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Don't assume binning shader is always VS  Kristian H. Kristensen  2019-11-07  1  -2/+2
| | | | | | | | In tessellation mode, the TES is (probably) the binning shader. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Setup inputs and outputs for tessellation stages  Kristian H. Kristensen  2019-11-07  1  -7/+52
| | | | | | | | | | Similar to GS, some inputs are reused when we chsh from VS to TCS or TES to GS, so we need to make sure we set up the right inputs and make the shared system values outputs so they don't get clobbered. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Implement TCS synchronization intrinsics  Kristian H. Kristensen  2019-11-07  2  -0/+41
| | | | | | | | | We add two new IR3 specific nir intrinsics that map to the new condend and endpatch instructions. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Implement tess coord intrinsic  Kristian H. Kristensen  2019-11-07  1  -0/+12
| | | | | | | | | | Our lowering pass made the z component unused by replacing its uses by 1 - x - y. The intrinsic implementation then just needs to return the x and y components. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
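The substitution mentioned in this commit message can be illustrated with a minimal C sketch. It assumes the triangle tessellation domain, where the three coordinates are barycentric and sum to 1. The helper name `demo_tess_coord_z` is hypothetical and is not code from the patch.

```c
/* Hypothetical helper, not taken from the patch: in the triangle domain
 * the tess coordinates are barycentric (x + y + z == 1), so the third
 * component can be rebuilt from the other two. */
static float
demo_tess_coord_z(float x, float y)
{
   return 1.0f - x - y;
}
```

With this identity, a stage that only receives x and y can still recover z wherever the shader uses it.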
* freedreno/ir3: End TES with chsh when using GS  Kristian H. Kristensen  2019-11-07  1  -1/+3
| | | | | | | | | When we have both TES and GS, the TES needs to chain to the GS with chmask and chsh, just like the VS does to either TCS or GS. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add new synchronization opcodes  Kristian H. Kristensen  2019-11-07  5  -1/+15
| | | | | | | | | | | | | There are two new opcodes in use in tessellation control shaders: category 0, opcodes 13 and 15. unk13 is a kill type of instruction that terminates threads where !p0.x, and it is used to narrow down a patch wavefront to just thread 0. Then, once thread 0 has written the tess levels, it issues unk15, which might signal the TE that another patch has been fully written. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Extend geometry lowering pass to handle tessellation  Kristian H. Kristensen  2019-11-07  3  -8/+520
| | | | | | | | | | | | | | | | | | VS and TCS pass varyings the same way as VS and GS do. TCS then writes the entire patch to a system memory BO and TES eventually reads back from the BO once the TE starts generating vertices. TES outputs vertices the same way as VS and GS, except when there's a GS as well, in which case TES passes varyings to GS the same way the VS would. In addition, the TCS needs a little bit of control flow massaging so that it only runs for valid invocations, and it needs a couple of unknown instructions to synchronize with the TE. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add tessellation field to shader key  Kristian H. Kristensen  2019-11-07  3  -1/+51
| | | | | | | | | | Whether we're tessellating and which primitives the TES outputs affects the entire pipeline so let's add a field to the key to track that. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Use imul24 in offset calculations  Kristian H. Kristensen  2019-11-07  1  -2/+2
| | | | | | | | | With the imul24 opcode in place, we can now use it for computing local offsets (ie for ldlw/stlw). Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add ir3 intrinsics for tessellation  Kristian H. Kristensen  2019-11-07  7  -3/+37
| | | | | | | | | These provide the iovas for system memory buffers used for tessellation as well as a new HW specific system value. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: Don't count primitives for patches  Kristian H. Kristensen  2019-11-07  1  -1/+8
| | | | | | | | | | | The gallium helper doesn't like patches and we can't determine how many primitives it gets tessellated into anyway. On gens where we have tessellation, we get the prim count from a HW counter so just skip counting on the CPU. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add load and store intrinsics for global io  Kristian H. Kristensen  2019-11-07  2  -0/+60
| | | | | | | | | These intrinsics take an ivec2 for the 64 bit base address and an integer offset. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
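The addressing described in this commit message can be sketched in C as follows. The helper name `demo_global_address`, the choice of the first component as the low 32 bits, and the treatment of the offset as a plain addend without a unit are all assumptions made for illustration, not details taken from the patch.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: combines a 64-bit base
 * address carried as two 32-bit words with an integer offset.
 * Assumption: lo is the first ivec2 component, hi is the second. */
static uint64_t
demo_global_address(uint32_t lo, uint32_t hi, uint32_t offset)
{
   uint64_t base = ((uint64_t)hi << 32) | (uint64_t)lo;
   return base + (uint64_t)offset;
}
```

Keeping the base as two 32-bit words matches hardware and IR that only have 32-bit registers, while the final sum still spans the full 64-bit address space.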
* freedreno/ir3: Emit link map as byte or dwords offsets as needed  Kristian H. Kristensen  2019-11-07  1  -2/+16
| | | | | | | | | Stages that load inputs with ldlw (TCS, GS) need byte offsets, stages that load with ldg (TES) need dwords offsets. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
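The unit choice described in this commit message amounts to a multiplication by four, since a dword is 4 bytes. The following C sketch shows the idea under that single fact. The helper name `demo_link_offset` and its boolean parameter are hypothetical and do not come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: converts a location
 * expressed in dwords into the unit a consuming stage expects. */
static uint32_t
demo_link_offset(uint32_t dword_offset, bool wants_bytes)
{
   /* One dword is 4 bytes. */
   return wants_bytes ? dword_offset * 4u : dword_offset;
}
```

A caller would pass `wants_bytes = true` for stages that need byte offsets and `false` for stages that need dword offsets.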
* freedreno/a6xx: Add register offset for STG/LDG  Kristian H. Kristensen  2019-11-07  5  -9/+64
| | | | | | | | | | These instructions take a 64 bit iova as two consecutive registers and an immediate offset. This patch adds support for the offset to be a single register, which is added to the 64 bit iova. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6x: Rename z/s formats  Kristian H. Kristensen  2019-11-07  7  -20/+20
| | | | | | | | | | What we call RB6_Z24_UNORM_S8_UINT now is actually RB6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 and RB6_X8Z24_UNORM is actually RB6_Z24_UNORM_S8_UINT. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Fix layered texture type enum  Kristian H. Kristensen  2019-11-07  2  -7/+8
| | | | | | | | 2D array textures and 3D textures are different enum values after all. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: Add nogmem debug option to force bypass rendering  Kristian H. Kristensen  2019-11-07  3  -1/+5
| | | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Clear sysmem with CP_BLIT  Kristian H. Kristensen  2019-11-07  5  -15/+171
| | | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Fix primitive counters again  Kristian H. Kristensen  2019-11-07  1  -47/+104
| | | | | | | | | | | | | We use one mechanism (REG_A6XX_RBBM_PRIMCTR_8_LO) for PIPE_QUERY_PRIMITIVES_GENERATED, which counts all primitives that exit the geometry pipeline, whether or not xfb is on. Then for PIPE_QUERY_PRIMITIVES_EMITTED, we use the CP_EVENT_WRITE subfunction that writes out per-stream counts for generated and emitted, but only when xfb is enabled. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/registers: Add comments about primitive counters  Kristian H. Kristensen  2019-11-07  1  -12/+10
| | | | | | | | Adding comments about best guess at what the counters count. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/registers: Move SP_PRIMITIVE_CNTL and SP_VS_VPC_DST  Kristian H. Kristensen  2019-11-07  1  -28/+28
| | | | | | | | Move these two to be in order with the other VS regs. Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/registers: Fix typo  Kristian H. Kristensen  2019-11-07  1  -1/+1
| | | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* aco: add Instruction::usesModifiers() and add more checks in the optimizer  Rhys Perry  2019-11-08  2  -7/+23
| | | | | | | | No pipeline-db changes. v2: use early-exit for VOP3 Reviewed-by: Daniel Schürmann <[email protected]> (v1)
* radv: adjust loop unrolling heuristics for int64  Rhys Perry  2019-11-07  2  -7/+16
| | | | | | | | | | | | | In particular, increase the cost of 64-bit integer division. Fixes huge shaders with dEQP-VK.spirv_assembly.type.scalar.i64.mod_geom; with ACO used for GS, this creates shaders requiring a branch with a >32767 dword offset. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* lima: fix bo submit memory leak  Erico Nunes  2019-11-07  1  -0/+1
| | | | | | | | | | | | | | | | | | | | | | | | Fix memory leak on allocation for lima submit, reported by valgrind. 128 bytes in 1 blocks are definitely lost in loss record 38 of 84 at 0x484A6E8: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so) by 0x58689C7: util_dynarray_ensure_cap (u_dynarray.h:91) by 0x5868BBB: util_dynarray_grow_bytes (u_dynarray.h:139) by 0x5868BBB: lima_submit_add_bo (lima_submit.c:113) by 0x585D7D3: lima_ctx_buff_va (lima_context.c:57) by 0x586378F: lima_pack_plbu_cmd (lima_draw.c:802) by 0x586378F: lima_draw_vbo (lima_draw.c:1351) by 0x5406A2F: u_vbuf_draw_vbo (u_vbuf.c:1184) by 0x55D0A57: st_draw_vbo (st_draw.c:268) by 0x55576CB: _mesa_draw_arrays (draw.c:374) by 0x55576CB: _mesa_draw_arrays (draw.c:351) by 0x43610B: Mesh::render_vbo() (mesh.cpp:583) by 0x415DBB: SceneBuild::draw() (scene-build.cpp:242) by 0x41131B: MainLoop::draw() (main-loop.cpp:133) by 0x411947: MainLoop::step() (main-loop.cpp:108) Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima: fix nir shader memory leak  Erico Nunes  2019-11-07  1  -0/+2
| | | | | | | | | | | | | | | | | | | | | | | Fix memory leak on allocation for nir shader, reported by valgrind. 3,502 (480 direct, 3,022 indirect) bytes in 1 blocks are definitely lost in loss record 77 of 84 at 0x48483F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so) by 0x5750817: ralloc_size (ralloc.c:119) by 0x5750977: rzalloc_size (ralloc.c:151) by 0x575C173: nir_shader_create (nir.c:45) by 0x5763ACB: nir_shader_clone (nir_clone.c:728) by 0x55D5003: st_create_fp_variant (st_program.c:1242) by 0x55D789F: st_get_fp_variant (st_program.c:1522) by 0x55D789F: st_get_fp_variant (st_program.c:1507) by 0x56400C3: st_update_fp (st_atom_shader.c:163) by 0x563D333: st_validate_state (st_atom.c:261) by 0x55D07CB: prepare_draw (st_draw.c:132) by 0x55D08DF: st_draw_vbo (st_draw.c:184) by 0x55576CB: _mesa_draw_arrays (draw.c:374) by 0x55576CB: _mesa_draw_arrays (draw.c:351) Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* Meson: Remove lib prefix from graw and osmesa when building with Mingw.  Prodea Alexandru-Liviu  2019-11-07  4  -0/+5
| | | | | | | | | | | | | Also remove version suffix from osmesa swrast on Windows. v2: Make sure we don't remove lib prefix on *nix platforms. Signed-off-by: Prodea Alexandru-Liviu <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Cc: "19.3" <[email protected]>
* mesa: expose SPIR-V extensions in the Compatibility profile too  Marek Olšák  2019-11-07  1  -2/+2
| | | | | | | | | We would like to have GL 4.6 Compatibility too. The extensions don't support compatibility features, so no other changes are needed. Reviewed-by: Alejandro Piñeiro <[email protected]>
* st_get_external_sampler_key: improve error message  Drew DeVault  2019-11-07  1  -1/+2
| | | | Signed-off-by: Marek Olšák <[email protected]>
* mesa/st: Make st_pipe_format_to_mesa_format an effective no-op.  Eric Anholt  2019-11-07  2  -588/+3
| | | | | | | | | | All callers other than the unit test just wanted to convert back from a known-mesa-equivalent format, which is now a no-op. v2: Fix assertion failure in iris GL startup with BGR565 by continuing to return MESA_FORMAT_NONE for non-Mesa formats. Reviewed-by: Marek Olšák <[email protected]> (v1)
* mesa/st: Gut most of st_mesa_format_to_pipe_format().  Eric Anholt  2019-11-07  1  -629/+40
| | | | | | | | | Now that MESA_FORMAT_x is just a PIPE_FORMAT_x define, we can strip this function down to just the compression fallbacks. v2: Restore the SRGB format for ASTC SRGB fallback case. Reviewed-by: Marek Olšák <[email protected]>
* mesa: Redefine MESA_FORMAT_* in terms of PIPE_FORMAT_*.  Eric Anholt  2019-11-07  1  -329/+271
| | | | | | | | | | | | | | | | | There are various places in Mesa where we would like to be able to have a shared format enum between Mesa and gallium (NIR compiler's image formats, for example, or mapping from gallium's formats to mesa's and vice versa in st_format.c). Rewriting all MESA_FORMAT to PIPE_FORMAT would be disruptive and possibly more work than it's worth (And I actually prefer MESA_FORMAT's name scheme), so for now just make it so that there's one shared set of enum values. The #defines here were generated by printing out from the tests/st_format.c round-tripping loop, with the exception of 8888 formats where I hand-edited the #defines to point at the corresponding gallium packed format define. Reviewed-by: Marek Olšák <[email protected]>
* mesa: Prepare for the MESA_FORMAT_* enum to be sparse.  Eric Anholt  2019-11-07  6  -4/+29
| | | | | | | | | To redefine MESA_FORMAT in terms of PIPE_FORMAT enums, we need to fix places where we iterated up to MESA_FORMAT_COUNT. I use _mesa_get_format_name(f) == NULL as the signal that it's not an enum value with a MESA_FORMAT. Reviewed-by: Marek Olšák <[email protected]>
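The iteration pattern described in this commit message can be sketched in C with a stand-in enum. Everything named `demo_*` or `DEMO_*` below is hypothetical and only mirrors the idea of treating a NULL name as a hole; it is not the Mesa enum or the real `_mesa_get_format_name` implementation.

```c
#include <stddef.h>

/* Hypothetical sparse enum: values 2, 3 and 5 are holes. */
enum demo_format {
   DEMO_FORMAT_NONE = 0,
   DEMO_FORMAT_A = 1,
   DEMO_FORMAT_B = 4,
   DEMO_FORMAT_COUNT = 6
};

/* Returns the name of a defined enum value, or NULL for a hole. */
static const char *
demo_get_format_name(unsigned f)
{
   switch (f) {
   case DEMO_FORMAT_NONE: return "DEMO_FORMAT_NONE";
   case DEMO_FORMAT_A:    return "DEMO_FORMAT_A";
   case DEMO_FORMAT_B:    return "DEMO_FORMAT_B";
   default:               return NULL;
   }
}

/* Walks the whole range and skips values that have no name. */
static unsigned
demo_count_valid_formats(void)
{
   unsigned count = 0;
   for (unsigned f = 0; f < DEMO_FORMAT_COUNT; f++) {
      if (demo_get_format_name(f) == NULL)
         continue; /* hole in the sparse enum */
      count++;
   }
   return count;
}
```

The loop bound stays the same as for a dense enum; only the NULL check is added, which keeps existing iteration sites easy to adapt.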
* mesa/st: Test round-tripping of all compressed formats.  Eric Anholt  2019-11-07  1  -2/+4
| | | | | | | | | We checked round-tripping of formats without fallbacks, but weren't setting the compression support flags in the mock context and thus needed to skip testing those. Just set all the flags and assert that no fallbacks are triggered, so we get full test coverage. Reviewed-by: Marek Olšák <[email protected]>
* mesa: Stop defining a full separate format for RGBA_UINT8.  Eric Anholt  2019-11-07  6  -12/+11
| | | | | | | | | We have packed formats for RGBA and ABGR already, so we can just pack/unpack code. v2: Rebase on endianness macro rename Reviewed-by: Marek Olšák <[email protected]> (v1)
* gallium: Add equivalents of packed MESA_FORMAT_*UINT formats.  Eric Anholt  2019-11-07  3  -0/+111
| | | | | | | | These are the last formats that MESA_FORMAT had and PIPE_FORMAT didn't. The .csv entries' channel sizes and swizzles all came from the corresponding UNORM format. Reviewed-by: Marek Olšák <[email protected]>
* gallium: Add an equivalent of MESA_FORMAT_BGR_UNORM8.  Eric Anholt  2019-11-07  3  -0/+6
| | | | | | | | This is the last unorm format that MESA_FORMAT had and PIPE_FORMAT didn't. Note that it's an array format on gallium's side as well, since it's a NPOT pixel size. Reviewed-by: Marek Olšák <[email protected]>
* gallium: Add some more channel orderings of packed formats.  Eric Anholt  2019-11-07  3  -0/+48
| | | | | | This covers everything that MESA_FORMAT had for packed unorm. Reviewed-by: Marek Olšák <[email protected]>
* gallium: Add defines for FXT1 texture compression.  Eric Anholt  2019-11-07  6  -2/+22
| | | | | | | | | | This texture compression is exposed by 830 and 915, and to make MESA_FORMAT match PIPE_FORMAT defines I need a corresponding PIPE_FORMAT. v2: Set is_hand_written so we don't try to generate pack/unpack code. Reviewed-by: Marek Olšák <[email protected]>
* mesa/st: Add mapping of MESA_FORMAT_RGB_SNORM16 to gallium.  Eric Anholt  2019-11-07  1  -0/+4
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radv/gfx10: fix primitive indices orientation for NGG GS  Samuel Pitoiset  2019-11-07  2  -9/+45
| | | | | | | | | | The primitive indices have to be swapped to follow the drawing order. This fixes corruption with Overwatch when NGG GS is force enabled. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>