path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* freedreno: add Adreno 640 ID | Jonathan Marek | 2019-11-11 | 2 | -0/+10
  A640 seems to work without any other changes (glmark and vkcube).
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* st/mesa: remove unused TGSI-only debug printing functions | Marek Olšák | 2019-11-11 | 1 | -4/+0
  Reviewed-by: Timothy Arceri <[email protected]>
* panfrost: Select format-specific blending intrinsics | Alyssa Rosenzweig | 2019-11-11 | 3 | -9/+41
  If we have an accelerated path for a particular framebuffer format, let's use it to save a bunch of instructions in a blend shader.
  [Tomeu: Only use the faster intrinsic on >T760]
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Set depth and stencil for SFBD based on the format | Tomeu Vizoso | 2019-11-11 | 4 | -21/+36
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* zink: correct depth-stencil format | Erik Faye-Lund | 2019-11-11 | 1 | -1/+1
  When using packed Vulkan formats on little-endian systems, we need to swap the components for the gallium formats. And since Zink isn't big-endian safe yet, little-endian is the only endianness we care about right now.
  This fixes a bunch of piglit tests, among others:
  - spec@arb_depth_texture@depth-level-clamp
  - spec@arb_depth_texture@depthstencil-render-miplevels * d=z24
  - spec@arb_depth_texture@fbo-depth-gl_depth_component24-blit
  - spec@arb_depth_texture@fbo-depth-gl_depth_component24-copypixels
  - spec@arb_depth_texture@fbo-depth-gl_depth_component24-drawpixels
  - spec@arb_depth_texture@fbo-depth-gl_depth_component24-readpixels
  Signed-off-by: Erik Faye-Lund <[email protected]>
  Fixes: 8d46e35d16e ("zink: introduce opengl over vulkan")
* zink/spirv: add support for nir_op_flrp | Erik Faye-Lund | 2019-11-11 | 1 | -0/+15
  This fixes the following piglit: spec@ati_fragment_shader@ati_fragment_shader-render-fog
  Signed-off-by: Erik Faye-Lund <[email protected]>
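For readers unfamiliar with the opcode: NIR's `flrp` is a linear interpolation, `src0 * (1 - src2) + src1 * src2`. The sketch below only illustrates that arithmetic in plain C; it is not the SPIR-V emission code from the commit, and the helper name `flrp_vec` is made up for this example.

```c
#include <assert.h>

/* Linear interpolation with the operand order NIR uses for flrp:
 * the result is a when t == 0 and b when t == 1. */
static float flrp(float a, float b, float t)
{
    return a * (1.0f - t) + b * t;
}

/* Hypothetical helper: apply flrp per component to vectors of 1..4 floats. */
static void flrp_vec(const float *a, const float *b, const float *t,
                     float *out, unsigned n)
{
    assert(n >= 1 && n <= 4);
    for (unsigned i = 0; i < n; i++)
        out[i] = flrp(a[i], b[i], t[i]);
}
```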
* freedreno/ir3: also track # of nops for shader-db | Rob Clark | 2019-11-09 | 1 | -1/+3
  The instruction count is (mostly) a measure of what optimization passes can do, while # of nops is more an indication of how effectively the scheduler is balancing register pressure vs instruction count. So track these independently.
  (There could be opportunities to rematerialize values to reduce register pressure, swapping some nops with other alu instructions, so nothing is truly independent, but it is still useful to break these stats out.)
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fix SP_FS_MRT_REG.HALF_PRECISION | Rob Clark | 2019-11-09 | 1 | -1/+1
  Set flag based on actual output reg type.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix SP_FS_MRT_REG.HALF_PRECISION | Rob Clark | 2019-11-09 | 1 | -1/+1
  We should really be setting this based on the actual output register type.
  Signed-off-by: Rob Clark <[email protected]>
* radeonsi/nir: fix compute shader crash due to nir_binary == NULL | Marek Olšák | 2019-11-08 | 1 | -2/+12
  This partially reverts 8b30114dda8.
  Fixes: 8b30114dda8 "radeonsi/nir: call nir_serialize only once per shader"
* radeonsi/nir: call nir_serialize only once per shader | Marek Olšák | 2019-11-08 | 1 | -21/+21
  We were calling it twice. First serialize it, then use it to compute the cache key.
  Reviewed-by: Timothy Arceri <[email protected]>
* virgl: support emulating planar image sampling | David Stevens | 2019-11-08 | 1 | -1/+6
  Mesa emulates planar format sampling with per-plane samplers. Virgl now supports this by allowing the plane index to be passed when creating a sampler view from a planar image. With this change, mesa now passes that information to virgl.
  Signed-off-by: David Stevens <[email protected]>
  Reviewed-by: Lepton Wu <[email protected]>
* gallium/swr: Enable some ARB_gpu_shader5 extensions | Krzysztof Raszkowski | 2019-11-08 | 1 | -0/+1
  Enable / add to features.txt:
  - Enhanced textureGather.
  - Geometry shader instancing.
  - Geometry shader multiple streams.
  Reviewed-by: Jan Zielinski <[email protected]>
* gallium/swr: Fix GS invocation issues | Krzysztof Raszkowski | 2019-11-08 | 1 | -2/+7
  - Fixed setting of gl_InvocationID.
  - Fixed GS vertices output memory overflow.
  Reviewed-by: Jan Zielinski <[email protected]>
* panfrost: Try to evict unused BOs from the cache | Boris Brezillon | 2019-11-08 | 4 | -6/+61
  The panfrost BO cache can only grow since all newly allocated BOs are returned to the cache (unless they've been exported).
  With the MADVISE ioctl that's not a big issue because the kernel can come and reclaim this memory, but MADVISE will only be available on 5.4 kernels. This means an app can currently allocate a lot of memory without ever releasing it, leading to some situations where the OOM-killer kicks in and kills the app (or even worse, kills another process consuming more memory than the GL app) to get some of this memory back.
  Let's try to limit the number of BOs we keep in the cache by evicting entries that have not been used for more than one second (if the app stopped allocating BOs of this size, it's unlikely to allocate similar BOs in the near future).
  This solution is based on the VC4/V3D implementation.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
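The eviction policy described above can be sketched roughly as follows. This is a minimal illustration, not the panfrost code: the struct and function names (`cached_bo`, `bo_cache`, `cache_put`, `cache_evict_stale`) are hypothetical, timestamps are plain seconds, and freeing the evicted buffers is left to the caller. Because entries are appended in the order they are returned to the cache, the list head is always the oldest entry and eviction can stop at the first entry that is still fresh.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical cache entry; the real driver structs differ. */
struct cached_bo {
    struct cached_bo *next;  /* next entry in LRU order, oldest first */
    time_t last_used;        /* when the BO was returned to the cache */
};

struct bo_cache {
    struct cached_bo *lru_head;  /* least recently used entry */
    struct cached_bo *lru_tail;  /* most recently used entry */
};

/* Append a BO to the LRU tail when it is returned to the cache. */
static void cache_put(struct bo_cache *cache, struct cached_bo *bo, time_t now)
{
    assert(cache != NULL && bo != NULL);
    bo->next = NULL;
    bo->last_used = now;
    if (cache->lru_tail)
        cache->lru_tail->next = bo;
    else
        cache->lru_head = bo;
    cache->lru_tail = bo;
}

/* Unlink every entry idle for more than max_idle seconds.
 * Returns the number of entries unlinked; the caller frees them. */
static unsigned cache_evict_stale(struct bo_cache *cache, time_t now,
                                  time_t max_idle)
{
    unsigned evicted = 0;

    assert(cache != NULL);
    while (cache->lru_head && now - cache->lru_head->last_used > max_idle) {
        struct cached_bo *bo = cache->lru_head;

        cache->lru_head = bo->next;
        if (!cache->lru_head)
            cache->lru_tail = NULL;
        bo->next = NULL;
        evicted++;
    }
    return evicted;
}
```

Calling `cache_evict_stale(&cache, now, 1)` each time a BO is returned to the cache would approximate the one-second threshold mentioned in the commit message.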
* panfrost: Move BO cache related fields to a sub-struct | Boris Brezillon | 2019-11-08 | 3 | -18/+21
  We will soon introduce an LRU list to evict BOs that have been unused for more than 1 second. Let's first move all BO cache fields to a sub-struct to clarify which fields are used by the BO caching logic.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* freedreno/a6xx: Turn on tessellation shaders | Kristian H. Kristensen | 2019-11-07 | 1 | -1/+13
  Wow. Very triangle. So shader.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Only use merged regs and four quads for VS+FS | Kristian H. Kristensen | 2019-11-07 | 1 | -5/+15
  When other geometry stages are present, we choose two quads and no merged regs.
  Acked-by: Eric Anholt <[email protected]>
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/blitter: Save tessellation state | Kristian H. Kristensen | 2019-11-07 | 1 | -0/+2
  We have tessellation state now.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Only set emit.hs/ds when we're drawing patches | Kristian H. Kristensen | 2019-11-07 | 1 | -2/+3
  At least the gallium blitter helper will call us to draw with tessellation shaders set but a non-patch primitive.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno: Use bypass rendering for tessellation | Kristian H. Kristensen | 2019-11-07 | 1 | -0/+8
  It seems like tiling could work in the Adreno architecture, but we've only ever seen bypass rendering with tessellation. For now, let's do that too.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Program state for tessellation stages | Kristian H. Kristensen | 2019-11-07 | 3 | -34/+157
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Emit constant parameters for tessellation stages | Kristian H. Kristensen | 2019-11-07 | 1 | -10/+84
  Assemble the information the stages need and emit the constants.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Allocate and program tessellation buffer | Kristian H. Kristensen | 2019-11-07 | 3 | -0/+44
  Tessellation needs a couple of buffers that should hold the entire output from a full VS+TCS draw call.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Build the right draw command for tessellation | Kristian H. Kristensen | 2019-11-07 | 3 | -4/+52
  We need to select the right primitive type, set a bit to turn on tessellation, and OR in the TES output primitive type.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add tessellation field to shader key | Kristian H. Kristensen | 2019-11-07 | 1 | -0/+17
  Whether we're tessellating and which primitives the TES outputs affect the entire pipeline, so let's add a field to the key to track that.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno: Don't count primitives for patches | Kristian H. Kristensen | 2019-11-07 | 1 | -1/+8
  The gallium helper doesn't like patches and we can't determine how many primitives a patch gets tessellated into anyway. On gens where we have tessellation, we get the prim count from a HW counter so just skip counting on the CPU.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Emit link map as byte or dwords offsets as needed | Kristian H. Kristensen | 2019-11-07 | 1 | -2/+16
  Stages that load inputs with ldlw (TCS, GS) need byte offsets, stages that load with ldg (TES) need dword offsets.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
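A rough sketch of the unit selection this describes, assuming the link map is built in dwords (4 bytes each) and converted on emission. The enum and both function names are invented for illustration and are not the ir3 types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stage enum for illustration; not the driver's own type. */
enum stage { STAGE_TCS, STAGE_TES, STAGE_GS };

/* Per the commit message, TCS and GS load inputs with ldlw (byte offsets),
 * while TES loads with ldg (dword offsets). */
static bool stage_wants_byte_offsets(enum stage s)
{
    return s == STAGE_TCS || s == STAGE_GS;
}

/* Convert an offset counted in dwords into the unit the stage expects. */
static uint32_t link_map_offset(enum stage s, uint32_t dword_offset)
{
    /* Guard against overflow when scaling to bytes. */
    assert(dword_offset <= UINT32_MAX / 4u);
    return stage_wants_byte_offsets(s) ? dword_offset * 4u : dword_offset;
}
```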
* freedreno/a6x: Rename z/s formats | Kristian H. Kristensen | 2019-11-07 | 4 | -10/+10
  What we call eRB6_Z24_UNORM_S8_UINT now is actually RB6_Z24_UNORM_S8_UINT_AS_R8G8B8A8 and RB6_X8Z24_UNORM is actually RB6_Z24_UNORM_S8_UINT.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Fix layered texture type enum | Kristian H. Kristensen | 2019-11-07 | 1 | -4/+4
  2D array textures and 3D textures are different enum values after all.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno: Add nogmem debug option to force bypass rendering | Kristian H. Kristensen | 2019-11-07 | 3 | -1/+5
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Clear sysmem with CP_BLIT | Kristian H. Kristensen | 2019-11-07 | 4 | -15/+167
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Fix primitive counters again | Kristian H. Kristensen | 2019-11-07 | 1 | -47/+104
  We use one mechanism (REG_A6XX_RBBM_PRIMCTR_8_LO) for PIPE_QUERY_PRIMITIVES_GENERATED, which counts all primitives that exit the geometry pipeline, whether or not xfb is on. Then for PIPE_QUERY_PRIMITIVES_EMITTED, we use the CP_EVENT_WRITE subfunction that writes out per-stream counts for generated and emitted, but only when xfb is enabled.
  Signed-off-by: Kristian H. Kristensen <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
* lima: fix bo submit memory leak | Erico Nunes | 2019-11-07 | 1 | -0/+1
  Fix memory leak on allocation for lima submit, reported by valgrind.
  128 bytes in 1 blocks are definitely lost in loss record 38 of 84
     at 0x484A6E8: realloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
     by 0x58689C7: util_dynarray_ensure_cap (u_dynarray.h:91)
     by 0x5868BBB: util_dynarray_grow_bytes (u_dynarray.h:139)
     by 0x5868BBB: lima_submit_add_bo (lima_submit.c:113)
     by 0x585D7D3: lima_ctx_buff_va (lima_context.c:57)
     by 0x586378F: lima_pack_plbu_cmd (lima_draw.c:802)
     by 0x586378F: lima_draw_vbo (lima_draw.c:1351)
     by 0x5406A2F: u_vbuf_draw_vbo (u_vbuf.c:1184)
     by 0x55D0A57: st_draw_vbo (st_draw.c:268)
     by 0x55576CB: _mesa_draw_arrays (draw.c:374)
     by 0x55576CB: _mesa_draw_arrays (draw.c:351)
     by 0x43610B: Mesh::render_vbo() (mesh.cpp:583)
     by 0x415DBB: SceneBuild::draw() (scene-build.cpp:242)
     by 0x41131B: MainLoop::draw() (main-loop.cpp:133)
     by 0x411947: MainLoop::step() (main-loop.cpp:108)
  Signed-off-by: Erico Nunes <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
* lima: fix nir shader memory leak | Erico Nunes | 2019-11-07 | 1 | -0/+2
  Fix memory leak on allocation for nir shader, reported by valgrind.
  3,502 (480 direct, 3,022 indirect) bytes in 1 blocks are definitely lost in loss record 77 of 84
     at 0x48483F8: malloc (in /usr/lib/valgrind/vgpreload_memcheck-arm64-linux.so)
     by 0x5750817: ralloc_size (ralloc.c:119)
     by 0x5750977: rzalloc_size (ralloc.c:151)
     by 0x575C173: nir_shader_create (nir.c:45)
     by 0x5763ACB: nir_shader_clone (nir_clone.c:728)
     by 0x55D5003: st_create_fp_variant (st_program.c:1242)
     by 0x55D789F: st_get_fp_variant (st_program.c:1522)
     by 0x55D789F: st_get_fp_variant (st_program.c:1507)
     by 0x56400C3: st_update_fp (st_atom_shader.c:163)
     by 0x563D333: st_validate_state (st_atom.c:261)
     by 0x55D07CB: prepare_draw (st_draw.c:132)
     by 0x55D08DF: st_draw_vbo (st_draw.c:184)
     by 0x55576CB: _mesa_draw_arrays (draw.c:374)
     by 0x55576CB: _mesa_draw_arrays (draw.c:351)
  Signed-off-by: Erico Nunes <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
* gallium: Add defines for FXT1 texture compression. | Eric Anholt | 2019-11-07 | 1 | -1/+2
  This texture compression is exposed by 830 and 915, and to make MESA_FORMAT match PIPE_FORMAT defines I need a corresponding PIPE_FORMAT.
  v2: Set is_hand_written so we don't try to generate pack/unpack code.
  Reviewed-by: Marek Olšák <[email protected]>
* panfrost: Pipe the GPU ID into compiler and disassembler | Tomeu Vizoso | 2019-11-07 | 3 | -3/+4
  Signed-off-by: Tomeu Vizoso <[email protected]>
* panfrost: Release the ctx->pipe_framebuffer ref | Boris Brezillon | 2019-11-07 | 1 | -0/+1
  ctx->pipe_framebuffer contains the last bound FB state, let's release the resources pointed to by this FB state when the context is destroyed.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Destroy the upload manager allocated in panfrost_create_context() | Boris Brezillon | 2019-11-07 | 1 | -0/+2
  pipe->stream_uploader has been allocated with u_upload_create_default() in panfrost_create_context(), let's destroy it in the context destroy path.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Generate polygon list manually for SFBD | Tomeu Vizoso | 2019-11-06 | 2 | -1/+18
  On clears without draws, the SFBD GPUs need userspace to generate the trivial polygon list.
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Decode blend shaders for SFBD | Tomeu Vizoso | 2019-11-06 | 1 | -1/+3
  Also set MALI_HAS_BLEND_SHADER as needed.
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Take into account texture layers in SFBD | Tomeu Vizoso | 2019-11-06 | 1 | -5/+6
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Rework format encoding on SFBD | Tomeu Vizoso | 2019-11-06 | 6 | -46/+104
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Set 0x10 bit on mali_shader_meta.unknown2_4 on T720 | Tomeu Vizoso | 2019-11-06 | 2 | -7/+3
  Testing shows that it's needed.
  Also remove ctx->is_t6xx as it was the last use of it.
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add checksum fields to SFBD descriptor | Tomeu Vizoso | 2019-11-06 | 1 | -0/+12
  During tests on T720, these fields were discovered.
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* zink: do advertize integer support in shaders | Erik Faye-Lund | 2019-11-06 | 1 | -1/+3
  This is supported, so let's correct this.
  Signed-off-by: Erik Faye-Lund <[email protected]>
* zink/spirv: implement ball_fequal[2-4] | Erik Faye-Lund | 2019-11-06 | 1 | -0/+12
* zink/spirv: implement ball_iequal[2-4] | Erik Faye-Lund | 2019-11-06 | 1 | -0/+12
* zink/spirv: implement bany_inequal[2-4] | Erik Faye-Lund | 2019-11-06 | 1 | -0/+12
* zink/spirv: implement bany_fnequal[2-4] | Erik Faye-Lund | 2019-11-06 | 1 | -0/+12
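The four zink/spirv commits above carry no description, so as a reminder of what these NIR opcodes compute: `ball_*equalN` is true only when all N component pairs are equal, and `bany_*nequalN` is true when at least one pair differs. The plain-C sketch below shows only those reductions under that reading; the function signatures are made up for the example and say nothing about how the commits emit SPIR-V.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ball_fequalN: every float component pair compares equal (n is 2..4). */
static bool ball_fequal(const float *a, const float *b, unsigned n)
{
    assert(n >= 2 && n <= 4);
    for (unsigned i = 0; i < n; i++)
        if (a[i] != b[i])
            return false;
    return true;
}

/* bany_fnequalN: at least one float component pair differs. */
static bool bany_fnequal(const float *a, const float *b, unsigned n)
{
    assert(n >= 2 && n <= 4);
    for (unsigned i = 0; i < n; i++)
        if (a[i] != b[i])
            return true;
    return false;
}

/* ball_iequalN: every integer component pair compares equal. */
static bool ball_iequal(const int32_t *a, const int32_t *b, unsigned n)
{
    assert(n >= 2 && n <= 4);
    for (unsigned i = 0; i < n; i++)
        if (a[i] != b[i])
            return false;
    return true;
}

/* bany_inequalN: at least one integer component pair differs. */
static bool bany_inequal(const int32_t *a, const int32_t *b, unsigned n)
{
    assert(n >= 2 && n <= 4);
    for (unsigned i = 0; i < n; i++)
        if (a[i] != b[i])
            return true;
    return false;
}
```

Note that with floats a NaN component makes `ball_fequal` false and `bany_fnequal` true, since NaN never compares equal to anything.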