path: root/src/gallium/drivers/panfrost
Commit message | Author | Age | Files | Lines
* panfrost/midgard: Allow flt to run on most units | Alyssa Rosenzweig | 2019-02-27 | 1 | -1/+1
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Expose perf counters in environment | Alyssa Rosenzweig | 2019-02-27 | 3 | -13/+11
  Previously, the perf counters were guarded by an #ifdef, which is generally bad form. This patch instead guards them behind an environment variable.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
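As a rough illustration of the change described above, a runtime check of this kind might look like the sketch below. The variable name `EXAMPLE_DUMP_COUNTERS` and both helper functions are hypothetical; the commit message does not name the actual variable or show the driver's code.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: treat any non-empty value other than "0" as enabled. */
bool example_value_enables(const char *value)
{
    return value != NULL && value[0] != '\0' && strcmp(value, "0") != 0;
}

/* Hypothetical runtime guard standing in for a compile-time #ifdef.
 * EXAMPLE_DUMP_COUNTERS is an illustrative name, not the driver's. */
bool example_counters_enabled(void)
{
    return example_value_enables(getenv("EXAMPLE_DUMP_COUNTERS"));
}
```

The practical difference from an #ifdef is that the same binary can be switched between both behaviours without a rebuild.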
* panfrost: Identify 4-bit channel texture formats | Alyssa Rosenzweig | 2019-02-27 | 3 | -0/+6
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add RGB565, RGB5A1 texture formats | Alyssa Rosenzweig | 2019-02-27 | 2 | -0/+4
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Decode render target swizzle/channels | Alyssa Rosenzweig | 2019-02-25 | 3 | -23/+81
  On MRT-capable systems, the framebuffer format is encoded as a 64-bit word in the render target descriptor. Previously, the two 32-bit words were exposed as opaque hex values. This commit identifies a 12-bit Mali swizzle and a 2-bit channel counter, removing some of the magic. It also adds decoding support for the AFBC and MSAA enable bits, which were already known but otherwise ignored in pandecode.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
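To make the bit counts concrete: a 12-bit swizzle is consistent with four 3-bit channel selectors. The sketch below packs and unpacks such a field under that assumption. The bit order (component 0 in the lowest bits) and all names are hypothetical and are not taken from pandecode.

```c
#include <stdint.h>

/* Assumption: 4 components, 3 bits each, component 0 in the low bits. */
#define EX_SWIZZLE_COMPONENTS 4
#define EX_SWIZZLE_BITS       3

/* Split a 12-bit swizzle into four 3-bit selectors. */
void example_unpack_swizzle(uint16_t swizzle, unsigned out[EX_SWIZZLE_COMPONENTS])
{
    for (unsigned i = 0; i < EX_SWIZZLE_COMPONENTS; ++i)
        out[i] = (swizzle >> (EX_SWIZZLE_BITS * i)) & 0x7u;
}

/* Inverse of the above: combine four selectors into one 12-bit field. */
uint16_t example_pack_swizzle(const unsigned in[EX_SWIZZLE_COMPONENTS])
{
    uint16_t swizzle = 0;
    for (unsigned i = 0; i < EX_SWIZZLE_COMPONENTS; ++i)
        swizzle |= (uint16_t)((in[i] & 0x7u) << (EX_SWIZZLE_BITS * i));
    return swizzle;
}
```

Printing the four selectors by name rather than the raw word is the kind of decoding that replaces an opaque hex value in a dump.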
* panfrost/midgard: Add fround(_even), ftrunc, ffma | Alyssa Rosenzweig | 2019-02-25 | 3 | -0/+14
  These ops were discovered by invoking the correspondingly named GLSL functions. The rounding ops here behave exactly as expected and are mapped to their corresponding NIR ops where applicable. The ffma behaves as a LUT instruction and requires some special argument packing (since Midgard normally only allows for 2 arguments); this quirk will be addressed in the future, but for now FMA is still lowered.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/nondrm: Split out dump_counters | Alyssa Rosenzweig | 2019-02-25 | 2 | -5/+10
  Previously, this function was implicitly a part of the job submit.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/nondrm: Make COHERENT_LOCAL explicit | Alyssa Rosenzweig | 2019-02-25 | 2 | -1/+2
  This flag corresponds to what was MEM_COHERENT_LOCAL in the vendor driver. It seems to influence the cache policy and is necessary for the varying temporary storage but nothing else.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/nondrm: Flag CPU-invisible regions | Alyssa Rosenzweig | 2019-02-25 | 2 | -3/+4
  Potentially, the kernel could optimize these allocations, or perhaps we can save on mapping costs.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/meson: Remove subdir for nondrm | Alyssa Rosenzweig | 2019-02-25 | 1 | -1/+0
  This change fixes cross builds with the (temporary) non-DRM overlay.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use tiler fast path (performance boost) | Alyssa Rosenzweig | 2019-02-25 | 1 | -4/+38
  For reasons that are still unclear (speculation included in the comment added in this patch), the tiler? metadata has a fast path that we were not enabling; there looks to be a possible time/memory tradeoff, but the details remain unclear.
  Regardless, this patch improves performance dramatically. Particular wins are for geometry-heavy scenes. For instance, glmark2-es2's Phong-shaded bunny, rendering at fullscreen (2400x1600) via GBM, jumped from ~20fps to hitting vsync cap at 60fps. Gains are even more obvious when vsync is disabled, as in glmark2-es2-wayland.
  With this patch, on GLES 2.0 samples not involving FBOs, it appears performance is converging with (and sometimes surpassing) the blob.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Verify and print brx condition in disasm | Alyssa Rosenzweig | 2019-02-21 | 1 | -1/+10
  The condition code in extended branches is repeated 8 times for unclear reasons; accordingly, the code would be disassembled as "unknown5555", "unknownAAAA", etc. This patch correctly masks off the lower two bits to find the true code to print, verifying that the code is repeated as believed to be necessary (providing some assurance for compiler quality and an assert trip in case we encounter a shader in the wild that breaks the convention).
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
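The masking and verification described above can be sketched as follows. This is not the disassembler's code: the function name is hypothetical, and the 16-bit field width is an assumption inferred from the "unknown5555" / "unknownAAAA" output (a 2-bit code repeated 8 times).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoder for a condition field holding a 2-bit code
 * repeated 8 times (assumed to fill 16 bits).
 * Returns false if the repetition convention is broken. */
bool example_decode_brx_cond(uint16_t field, unsigned *cond_out)
{
    unsigned cond = field & 0x3u;               /* the true 2-bit code */

    /* 0x5555 has the pattern 01 in every 2-bit slot, so multiplying by
     * the code replicates it into all 8 slots. */
    uint16_t expected = (uint16_t)(cond * 0x5555u);

    if (field != expected)
        return false;

    *cond_out = cond;
    return true;
}
```

A disassembler could assert on the return value, which gives the "assert trip" behaviour the message mentions for shaders that break the convention.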
* panfrost: Dynamically set discard branch targets | Alyssa Rosenzweig | 2019-02-21 | 1 | -20/+46
  discard and discard_if are both implemented with the branching pipeline on Midgard; essentially, we branch to the end of the fragment shader in a special "discard" mode, setting the condition as necessary. Previously, we hardcoded the form of this instruction, which worked for very simple shaders but was incorrect for anything remotely interesting. This patch instead emits logical branches in the IR, which are flattened to real discard ops the same way other branches are, allowing targets to be computed correctly.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Emit extended branches | Alyssa Rosenzweig | 2019-02-21 | 2 | -17/+84
  Previously, we only emitted compact branches; however, the offset range of these branches is too small for many real-world shaders. This patch implements support for emitting extended branches and switches to always using them for control flow. This incurs a code size and possibly performance penalty, but expands the range of working shaders and provides opportunity for further optimization.
  Support for emitting compact branches is retained but this code path is presently unused. In the future, we'll want to heuristically determine which type of branch should be emitted for optimal codegen.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Rectify doubleplusungood extended branch | Alyssa Rosenzweig | 2019-02-21 | 2 | -8/+8
  Midgard features "compact branches" and "extended branches", corresponding to short jumps and far jumps. The form of the extended branch was previously incorrect in the ISA headers; this patch corrects it and updates the disassembler (simultaneously, to preserve bisectability).
  Additionally, we fix a corner case in the disassembly of extended branches, and we now prefix extended branches with "brx", to visually differentiate them from compact branches prefixed with "br".
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Fix nested/chained if-else | Alyssa Rosenzweig | 2019-02-21 | 1 | -3/+3
  An if-else statement is compiled to a conditional branch (from the start to the second block) and an unconditional branch (from the end of the first block to the end of the else). We previously incorrectly computed the block index of the unconditional branch to be exactly one after that of the conditional branch, valid for a single if-else statement but nothing fancier. This patch correctly computes the unconditional branch target, fixing more complex if-else chains.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
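The difference between the old and new target computation can be illustrated with a small model. It assumes blocks are numbered linearly (then-blocks, then else-blocks, then the join block); the struct and function names are hypothetical and do not come from the compiler.

```c
/* Hypothetical model of branch targets for one if-else, assuming blocks
 * are laid out linearly: [then blocks][else blocks][join block]. */
struct example_if_targets {
    int cond_target;    /* conditional branch: first block of the else */
    int uncond_target;  /* unconditional branch: first block after the else */
};

struct example_if_targets
example_if_else_targets(int then_start, int then_blocks, int else_blocks)
{
    struct example_if_targets t;
    t.cond_target = then_start + then_blocks;
    /* The old rule was effectively cond_target + 1, which only holds
     * when the else side is exactly one block long. */
    t.uncond_target = t.cond_target + else_blocks;
    return t;
}
```

With a one-block else the two rules agree, which is why simple shaders worked; a nested or chained else spans several blocks and needs the full count.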
* panfrost/midgard: Refactor tag lookahead code | Alyssa Rosenzweig | 2019-02-21 | 1 | -27/+31
  Each Midgard instruction is scheduled to a particular instruction type ("tag"). Presumably the hardware prefetches memory based on tag, so it is required to report out the first tag to the command stream and the next tag of a branch target. This procedure was implemented in two separate parts of the compiler (one time with a slight bug relating to empty blocks); this patch refactors to unite the two routines and solve the bug when branching to empty blocks.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement pantrace (command stream dump) | Alyssa Rosenzweig | 2019-02-21 | 5 | -0/+188
  Historically, Panfrost debugging entailed the use of the LD_PRELOADable `panwrap` tool. This setup is a tad fragile; Panfrost can be traced directly without the intermediate layer. pantrace implements the equivalent functionality of panwrap in Panfrost proper, allowing dumps to work regardless of the kernel layer in use.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add pandecode (command stream debugger) | Alyssa Rosenzweig | 2019-02-21 | 6 | -3/+2289
  The `panwrap` utility can be LD_PRELOAD'd into a GLES app, intercepting communication between the driver and the kernel. Modern panwrap versions do no processing of their own; instead, they create a trace directory. This directory contains the following files:
  - control.log: a line-by-line plain text file, denoting important syscalls (mmaps and job submits) along with their arguments
  - memory_*.bin, shader_*.bin: binary dumps of mapped memory
  Together, these files contain enough information to reconstruct the command stream and shaders of (at minimum) a single frame.
  The `pandecode` utility takes this directory structure as input, reconstructing the mapped memory and using the job submit command as an entrypoint. It then walks the descriptors as the hardware would, parsing and pretty-printing. Its final output is the pretty-printed command stream interleaved with the disassembled shaders, suitable for driver debugging. For instance, the behaviour of two driver versions (one working, one broken) can be compared by diff'ing their decoded logs.
  pandecode/decode.c was originally a part of `panwrap`; it is the oldest living code in the project. Its history is generally not worth preserving.
  panwrap itself will continue to live downstream for the foreseeable future, as it is specifically written for the vendor kernel. It is possible, however, to produce equivalent traces directly from Panfrost, bypassing the intermediate wrapping layer for well-behaved drivers.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Stub out separate stencil functions | Alyssa Rosenzweig | 2019-02-21 | 2 | -4/+25
  This is not yet functional, but it resolves a crash in various apps and provides a framework for further work.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix clipping region | Alyssa Rosenzweig | 2019-02-18 | 1 | -4/+11
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Preserve w sign in perspective division | Alyssa Rosenzweig | 2019-02-18 | 1 | -2/+4
  This fixes issues where polygons that should be culled (due to negative w, for instance) may not be.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Cleanup mali_viewport (clipping) code | Alyssa Rosenzweig | 2019-02-18 | 2 | -17/+19
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Swap order of tiled texture (de)alloc | Alyssa Rosenzweig | 2019-02-18 | 1 | -6/+6
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Free imported BOs | Alyssa Rosenzweig | 2019-02-18 | 3 | -0/+12
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix various leaks unmapping resources | Alyssa Rosenzweig | 2019-02-18 | 2 | -9/+14
  v2: Don't check for NULL before free()
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Improve logging and patch memory leaks | Alyssa Rosenzweig | 2019-02-15 | 2 | -49/+48
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't align framebuffer dims | Alyssa Rosenzweig | 2019-02-15 | 1 | -2/+2
  Fixes regressions with EGL clients
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement PIPE_QUERY_OCCLUSION_COUNTER | Alyssa Rosenzweig | 2019-02-15 | 1 | -1/+8
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Identify MALI_OCCLUSION_PRECISE bit | Alyssa Rosenzweig | 2019-02-15 | 2 | -5/+7
  Setting this is required for desktop-style occlusion queries.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Backport driver to Mali T600/T700 | Alyssa Rosenzweig | 2019-02-15 | 6 | -262/+340
  There are a few differences between Mali T860 (Panfrost's primary reference target) and the older Midgard generations (T600/T700):
  - Miscellaneous different magic numbers. It's not clear what these numbers mean on either the old or new configurations yet.
  - Errata fixes. T800 is the final Midgard generation and presumably the least buggy. Older Midgard has some extra hardware errata we have to work around.
  - SFBD vs MFBD split. Essentially, older Midgard uses a Single FrameBuffer Descriptor (SFBD), which corresponds to single render-target rendering. Newer Midgard (T760+) uses a Multiple FrameBuffer Descriptor (MFBD), allowing multiple RTs. On ES 2.0, these descriptors serve the same function, but we implement both, depending on the version of the hardware.
  - CPU bitness. 32-bit systems generally use 32-bit GPU descriptors, and 64-bit systems generally use 64-bit descriptors. Our target T760 systems are 32-bit whereas our target T860 systems are 64-bit. More work is needed in this area.
  This patch fixes these areas to support older Midgard hardware. It is tested on Mali T760 and Mali T860.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix build; depend on libdrm | Alyssa Rosenzweig | 2019-02-15 | 1 | -0/+1
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* drm-uapi: use local files, not system libdrm | Eric Engestrom | 2019-02-14 | 4 | -4/+3
  There was an issue recently caused by the system header being included by mistake, so let's just get rid of this include path and always explicitly #include "drm-uapi/FOO.h"
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: Move panfrost's isign lowering to nir_opt_algebraic. | Eric Anholt | 2019-02-14 | 2 | -1/+1
  I wanted to reuse this from v3d.
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* panfrost: Specify supported draw modes per-context | Alyssa Rosenzweig | 2019-02-11 | 2 | -12/+11
  Midgard has native support for QUADS and POLYGONS; Bifrost seemingly does not. Thus, Midgard generally skips prim_convert whereas Bifrost needs the pass; this patch allows the setting of allowed primitives to occur on a per-context basis (for runtime hardware selection).
  v2: Use (POLYGONS + 1) instead of LINES_ADJACENCY.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Robert Foss <[email protected]>
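A per-context primitive mask of the kind described above might be built as in the sketch below. The local enum is a hypothetical stand-in assumed to follow the ordering of Gallium's primitive types; the helper names are illustrative and not the driver's.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the primitive enum; the ordering is an
 * assumption mirroring Gallium's primitive type list. */
enum example_prim {
    EX_PRIM_POINTS,
    EX_PRIM_LINES,
    EX_PRIM_LINE_LOOP,
    EX_PRIM_LINE_STRIP,
    EX_PRIM_TRIANGLES,
    EX_PRIM_TRIANGLE_STRIP,
    EX_PRIM_TRIANGLE_FAN,
    EX_PRIM_QUADS,
    EX_PRIM_QUAD_STRIP,
    EX_PRIM_POLYGON,
    EX_PRIM_LINES_ADJACENCY
};

/* Bitmask with one bit set for every primitive up to and including `last`. */
unsigned example_mask_through(enum example_prim last)
{
    return (1u << ((unsigned)last + 1u)) - 1u;
}

/* True if primitive `p` is drawn natively under the given mask. */
bool example_prim_supported(unsigned mask, enum example_prim p)
{
    return ((mask >> (unsigned)p) & 1u) != 0;
}
```

Writing the bound as "last supported primitive plus one" states the intent directly, instead of relying on whichever enum value happens to come next.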
* panfrost: Elucidate texture op scheduling comment | Alyssa Rosenzweig | 2019-02-10 | 1 | -8/+1
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove speculative if 0'd format bit code | Alyssa Rosenzweig | 2019-02-10 | 1 | -6/+0
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove if 0'd dead code | Alyssa Rosenzweig | 2019-02-10 | 5 | -83/+0
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add kernel-agnostic resource management | Alyssa Rosenzweig | 2019-02-10 | 2 | -15/+172
  Various methods relating to resource management were previously marked as kernel-specific, forcing them to stay downstream in the vendor overlay and eventually be duplicated for DRM code. This patch adds back this code in kernel-neutral space, allowing for code sharing and minimising the diff to downstream.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't hardcode number of nir_ssa_defs | Alyssa Rosenzweig | 2019-02-10 | 1 | -14/+14
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Clean-up one-argument passing quirk | Alyssa Rosenzweig | 2019-02-10 | 2 | -114/+112
  Most Midgard instructions take two arguments logically; there are always two arguments at the assembly level. For the few instructions that take only a single argument, generally the second argument slot is unused, with a zero inline constant occupying the space. fmov/imov are the exception, where the first argument is filled with r24 and the logical argument is in the second slot.
  Previously, these constraints were handled by a delicate, buggy series of hacks. This commit removes these hacks. Instead, we look at the logical number of arguments (from NIR), switching between two-argument and one-argument-one-zero style. We then introduce a quirk for the flipped style, which applies to fmov/imov.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
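The three argument styles described above can be modelled roughly as below. All types and names are hypothetical and only illustrate the slot assignment; the real instruction encoding is not shown in the commit message.

```c
#include <stdbool.h>

#define EX_REG_R24 24u   /* register the message says fills the first slot for fmov/imov */

enum ex_src_kind { EX_SRC_REG, EX_SRC_ZERO_CONST };

struct ex_src {
    enum ex_src_kind kind;
    unsigned reg;          /* meaningful only when kind == EX_SRC_REG */
};

struct ex_alu_srcs {
    struct ex_src src0, src1;   /* always two slots at the assembly level */
};

/* Hypothetical slot assignment driven by the logical argument count. */
struct ex_alu_srcs ex_pack_srcs(unsigned nr_logical, bool flipped,
                                unsigned a, unsigned b)
{
    struct ex_alu_srcs s;

    if (nr_logical == 2) {
        /* Two-argument style: both slots carry real operands. */
        s.src0 = (struct ex_src){ EX_SRC_REG, a };
        s.src1 = (struct ex_src){ EX_SRC_REG, b };
    } else if (!flipped) {
        /* One-argument-one-zero style: second slot is an inline zero. */
        s.src0 = (struct ex_src){ EX_SRC_REG, a };
        s.src1 = (struct ex_src){ EX_SRC_ZERO_CONST, 0 };
    } else {
        /* Flipped style (fmov/imov): r24 first, logical operand second. */
        s.src0 = (struct ex_src){ EX_SRC_REG, EX_REG_R24 };
        s.src1 = (struct ex_src){ EX_SRC_REG, a };
    }
    return s;
}
```

Keying the choice on the logical argument count keeps the flipped case as one explicit quirk flag rather than a set of per-opcode special cases.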
* gallium: add PIPE_CAP_MAX_VARYINGS | Karol Herbst | 2019-02-07 | 1 | -0/+3
  Some NVIDIA hardware can accept 128 fragment shader input components, but only has up to 124 varying-interpolated input components. We add a new cap to express this cleanly. For most drivers, this will have the same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader.
  Fixes KHR-GL45.limits.max_fragment_input_components
  Signed-off-by: Karol Herbst <[email protected]>
  [imirkin: rebased, improved docs/commit message]
  Signed-off-by: Ilia Mirkin <[email protected]>
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Cc: 19.0 <[email protected]>
* panfrost: Include glue for out-of-tree legacy code | Alyssa Rosenzweig | 2019-02-07 | 5 | -7/+29
  In addition to the DRM interface in active development, for legacy kernels Panfrost has a small, optional, out-of-tree glue repository. For various reasons, this legacy code should not be included in Mesa proper, but this commit allows it to coexist peacefully with upstream Panfrost.
  If the nondrm repo is cloned/symlinked to the directory `src/gallium/drivers/panfrost/nondrm`, legacy functionality will be built. Otherwise, the driver will build normally, though a runtime error message will be printed if a legacy kernel is detected.
  This workaround is icky, but it allows a nearly-upstream Panfrost to work on real hardware, today. Ideally, this patch will be reverted when the Panfrost kernel module is mature and we drop legacy support.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Check in sources for command stream | Alyssa Rosenzweig | 2019-02-07 | 22 | -5/+5441
  This patch includes the command stream portion of the driver, complementing the earlier compiler. It provides a base for future work, though it does not integrate with any particular winsys.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use u_pipe_screen_get_param_defaults | Alyssa Rosenzweig | 2019-02-07 | 1 | -151/+6
  Switching to the defaults function cleans up pan_screen.h markedly and futureproofs for when new PIPE_CAPs are added.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Suggested-by: Eric Anholt <[email protected]>
* panfrost: Implement Midgard shader toolchain | Alyssa Rosenzweig | 2019-02-05 | 13 | -2/+6383
  This patch implements the free Midgard shader toolchain: the assembler, the disassembler, and the NIR-based compiler. The assembler is a standalone inaccessible Python script for reference purposes. The disassembler and the compiler are implemented in C, accessible via the standalone `midgard_compiler` binary. Later patches will use these interfaces from the driver for online compilation.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Acked-by: Emil Velikov <[email protected]>
* panfrost: Initial stub for Panfrost driver | Alyssa Rosenzweig | 2019-02-05 | 11 | -0/+2984
  This patch adds an initial stub for the Gallium driver, containing simple screen functions and the majority of the driver headers but no actual functionality. It further adds the winsys glue for linking in this stub driver via kmsro on Rockchip/Amlogic boards.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Acked-by: Emil Velikov <[email protected]>