summaryrefslogtreecommitdiffstats
path: root/src/panfrost/shared
Commit message (Collapse)AuthorAgeFilesLines
* panfrost: Extend the tiled store fast-path to loadsIcecream952020-03-261-28/+47
| | | | | | | | | | | | The access functions are forced to be inline, so performance shouldn't be impacted for stores. WebGL performance in Firefox is more than doubled, and track loading in STK is noticeably faster. Reviewed-by: Alyssa Rosenzweig <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4317> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4317>
* panfrost: split index cache into shared partVasily Khoruzhick2020-03-103-0/+178
| | | | | | | | | Split it into shared part since we're going to re-use it in lima. Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Boris Brezillon <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4051>
* panfrost: Rework linear<--->tiled conversionsAlyssa Rosenzweig2020-01-212-147/+203
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a lot going on here (it's a ton of commits squashed together since otherwise this would be impossible to review...) 1. We have a fast path for linear->tiled for whole (aligned) tiles, but we have to use a slow path for unaligned accesses. We can get a pretty major win for partial updates by using this slow path simply on the borders of the update region, and then hit the fast path for the tile-aligned interior. This does require some shuffling. 2. Mark the LUTs constant, which allows the compiler to inline them, which pairs well with loop unrolling (eliminating the memory accesses and just becoming some immediates.. which are not as immediate on aarch64 as I'd like..) 3. Add fast path for bpp1/2/8/16. These use the same algorithm and we have native types for them, so may as well get the fast path. 4. Drop generic path for bpp != 1/2/8/16, since these formats are generally awful and there's no way to tile them efficienctly and honestly there's not a good reason too either. Lima doesn't support any of these formats; Panfrost can make the opinionated choice to make them linear. 5. Specialize the unaligned routines. They don't have to be fully generic, they just can't assume alignment. So now they should be nearly as fast as the aligned versions (which get some extra tricks to be even faster but the difference might be neglible on some workloads). 6. Specialize also for the size of the tile, to allow 4x4 tiling as well as 16x16 tiling. This allows compressed textures to be efficiently tiled with the same routines (so we add support for tiling ASTC/ETC textures while we're at it) Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
* panfrost,lima: De-Galliumize tiling routinesAlyssa Rosenzweig2020-01-212-21/+28
| | | | | | | | | | | | There's an implicit dependence on Gallium here that will add more complexity than needed when testing/optimizing out of driver as well as potentially Vulkanizing. We don't need a full pipe_box, just the x/y/w/h properties directly. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
* panfrost: Compile tiling routines with -O3Alyssa Rosenzweig2020-01-211-1/+1
| | | | | | | | | | These are major hot spots for panfrost and lima; better let the compiler do its thing even on debug builds. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
* panfrost: Extend software tiling to larger bppAlyssa Rosenzweig2019-07-011-9/+49
| | | | | | Should not affect lima. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Rewrite u-interleaving codeAlyssa Rosenzweig2019-07-011-101/+189
| | | | | | | | | | | | | | | | | | | | Rather than using a magic lookup table with no explanations, let's add liberal comments to the code to explain what this tiling scheme is and how to encode/decode it efficiently. It's not so mysterious after all -- just reordering bits with some XORs thrown in. v2: Correct copyright identifier. Fix spelling error. Switch space_4 to a LUT. Fix comment typo. Use LUT instead of space_x tricks. Fallback on generic rather than split up unaligned writes. v3: Correct stride order (fixes crash loading). Correct coordinate system mishap. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Andreas Baierl <[email protected]>
* lima,panfrost: Move lima_tiling.c/h to /src/panfrostAlyssa Rosenzweig2019-06-203-0/+262
This will allow both drivers to share this code. Both drivers build-tested with meson. Android build not tested. v2: Change naming from tiling->shared, in case Lima and Panfrost can share more in the future. Fix Android build system. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-and-tested-by: Qiang Yu <[email protected]>