path: root/src/panfrost
Commit message | Author | Age | Files | Lines
* panfrost/midgard: Add missing lowering passes for type/size conversion ops | Boris Brezillon | 2020-01-22 | 1 | -13/+34
  Replace the manual type/size conversion lowering description by one that's automatically generated and covers all type/size conversions. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
* panfrost/midgard: Add 64 bits float <-> int converters | Boris Brezillon | 2020-01-22 | 1 | -0/+5
  The 64 bit converter cases were missing, add them now. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
* panfrost/midgard: Fix mir_print_instruction() for branch instructions | Boris Brezillon | 2020-01-22 | 1 | -7/+31
  Branch instructions should not be treated as regular ALUs. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
* panfrost/midgard: Add f2f64 support | Boris Brezillon | 2020-01-22 | 1 | -2/+4
  So we can convert floats into doubles. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
* panfrost/midgard: Factorize f2f and u2u handling | Boris Brezillon | 2020-01-22 | 1 | -20/+7
  Those size conversion operations work the same way apart from f2f using an fmov op code and u2u using an imov. Let's handle them in the same case block to avoid code duplication. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
* panfrost/midgard: Make sure promote_fmov() only promotes 32-bit imovs | Boris Brezillon | 2020-01-22 | 1 | -0/+1
  mir_constant_float() assumes we're dealing with 32-bit integers/floats, which is only the case if reg_mode is equal to midgard_reg_mode_32. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
* panfrost/midgard: Rework mir_adjust_constants() to make it type/size agnostic | Boris Brezillon | 2020-01-22 | 1 | -94/+69
  Right now, constant combining is not supported in 16 bit mode, and 64 bit mode is simply ignored. Let's rework the function to make it type/bit-size agnostic. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
* panfrost/midgard: Use a union to manipulate embedded constants | Boris Brezillon | 2020-01-22 | 8 | -49/+85
  Each instruction bundle can contain up to 16 constant bytes. The meaning of those bytes is instruction dependent: it depends on the instruction native type (int, uint or float) and the instruction reg_mode (8, 16, 32 or 64 bit). Those different layouts can be exposed as a union to facilitate constants manipulation. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3478>
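The union idea in the commit above can be illustrated outside the driver. The following is only a sketch using Python's `ctypes`; the class name `ConstantBlock` and its field names are hypothetical and are not taken from the Mesa sources. It shows the same 16 bytes being read through several type/size layouts.

```python
import ctypes


class ConstantBlock(ctypes.Union):
    """Hypothetical 16-byte constant block; every field aliases the same storage."""

    _fields_ = [
        ("u8", ctypes.c_uint8 * 16),   # 8-bit view
        ("i16", ctypes.c_int16 * 8),   # 16-bit signed view
        ("u16", ctypes.c_uint16 * 8),  # 16-bit unsigned view
        ("i32", ctypes.c_int32 * 4),   # 32-bit signed view
        ("u32", ctypes.c_uint32 * 4),  # 32-bit unsigned view
        ("f32", ctypes.c_float * 4),   # 32-bit float view
        ("u64", ctypes.c_uint64 * 2),  # 64-bit unsigned view
        ("f64", ctypes.c_double * 2),  # 64-bit float view
    ]


consts = ConstantBlock()
consts.f32[0] = 1.0            # write through the float layout
print(hex(consts.u32[0]))      # read the same bytes as a uint: 0x3f800000
```

Writing through one member and reading through another is exactly what makes a union convenient here: the code that combines constants can pick the view matching the instruction's type and reg_mode instead of hand-rolling shifts and casts.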
* panfrost/midgard: Print the actual source register for store operations | Boris Brezillon | 2020-01-21 | 1 | -1/+1
  Store operations use r26/r27 but have a word->reg set to 0 or 1 (base is r26). Let's take this base offset into account in print_load_store_instr(). Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3482> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3482>
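The register arithmetic described in this commit is small enough to write down. This is a sketch only: `store_source_register` and `STORE_BASE_REG` are hypothetical names, and the only facts taken from the commit message are the base register (r26) and the 0/1 encoding.

```python
STORE_BASE_REG = 26  # r26 is the base, per the commit message


def store_source_register(word_reg: int) -> int:
    """Map the encoded 0/1 field of a store to the register actually read."""
    if word_reg not in (0, 1):
        raise ValueError("store instructions only encode 0 or 1")
    return STORE_BASE_REG + word_reg


print("r%d" % store_source_register(1))  # r27
```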
* panfrost: Add pandecode entries for ASTC/ETC formats | Alyssa Rosenzweig | 2020-01-21 | 1 | -0/+9
  Signed-off-by: Alyssa Rosenzweig <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
* panfrost: Add ASTC texture formats | Icecream95 | 2020-01-21 | 1 | -0/+2
  Acked-by: Daniel Stone <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
* panfrost: Add ETC1/ETC2 texture formats | Icecream95 | 2020-01-21 | 1 | -0/+11
  Acked-by: Daniel Stone <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
* panfrost: Rework linear<--->tiled conversions | Alyssa Rosenzweig | 2020-01-21 | 2 | -147/+203
  There's a lot going on here (it's a ton of commits squashed together since otherwise this would be impossible to review...)
  1. We have a fast path for linear->tiled for whole (aligned) tiles, but we have to use a slow path for unaligned accesses. We can get a pretty major win for partial updates by using this slow path simply on the borders of the update region, and then hit the fast path for the tile-aligned interior. This does require some shuffling.
  2. Mark the LUTs constant, which allows the compiler to inline them, which pairs well with loop unrolling (eliminating the memory accesses and just becoming some immediates.. which are not as immediate on aarch64 as I'd like..)
  3. Add fast path for bpp1/2/8/16. These use the same algorithm and we have native types for them, so may as well get the fast path.
  4. Drop generic path for bpp != 1/2/8/16, since these formats are generally awful and there's no way to tile them efficiently and honestly there's not a good reason to either. Lima doesn't support any of these formats; Panfrost can make the opinionated choice to make them linear.
  5. Specialize the unaligned routines. They don't have to be fully generic, they just can't assume alignment. So now they should be nearly as fast as the aligned versions (which get some extra tricks to be even faster but the difference might be negligible on some workloads).
  6. Specialize also for the size of the tile, to allow 4x4 tiling as well as 16x16 tiling. This allows compressed textures to be efficiently tiled with the same routines (so we add support for tiling ASTC/ETC textures while we're at it)
  Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
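Item 1 of the commit above (slow path on the borders, fast path on the tile-aligned interior) can be pictured as a rectangle split. The sketch below is an illustration under assumptions, not the routine from the Mesa tree: `split_update_region`, its `(x, y, w, h)` tuples and the particular way the border is cut into strips are all hypothetical.

```python
def align_up(value, alignment):
    return (value + alignment - 1) // alignment * alignment


def align_down(value, alignment):
    return value // alignment * alignment


def split_update_region(x, y, w, h, tile=16):
    """Split an update box into a tile-aligned interior and border strips.

    Returns (interior, borders). Boxes are (x, y, w, h) tuples. The interior
    could take an aligned fast path; the borders need the unaligned path.
    """
    x0, y0 = align_up(x, tile), align_up(y, tile)
    x1, y1 = align_down(x + w, tile), align_down(y + h, tile)

    # No whole tile fits inside the box: everything goes down the slow path.
    if x0 >= x1 or y0 >= y1:
        return None, [(x, y, w, h)]

    interior = (x0, y0, x1 - x0, y1 - y0)
    borders = [
        (x, y, w, y0 - y),            # strip above the interior
        (x, y1, w, y + h - y1),       # strip below the interior
        (x, y0, x0 - x, y1 - y0),     # strip left of the interior
        (x1, y0, x + w - x1, y1 - y0),  # strip right of the interior
    ]
    # Drop empty strips (the box was already aligned on that side).
    return interior, [b for b in borders if b[2] > 0 and b[3] > 0]


interior, borders = split_update_region(5, 3, 40, 30)
print(interior)  # (16, 16, 16, 16)
```

For a large partial update most pixels land in the interior, which is why restricting the slow path to the border strips can pay off.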
* panfrost,lima: De-Galliumize tiling routines | Alyssa Rosenzweig | 2020-01-21 | 2 | -21/+28
  There's an implicit dependence on Gallium here that will add more complexity than needed when testing/optimizing out of driver as well as potentially Vulkanizing. We don't need a full pipe_box, just the x/y/w/h properties directly. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
* panfrost: Compile tiling routines with -O3 | Alyssa Rosenzweig | 2020-01-21 | 1 | -1/+1
  These are major hot spots for panfrost and lima; better let the compiler do its thing even on debug builds. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]> #lima on Mali400 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3414>
* pan/midgard: Fix recursive csel scheduling | Alyssa Rosenzweig | 2020-01-18 | 1 | -0/+4
  Corner case causing invalid scheduling on shaders with nested csels, i.e. GLSL code resembling:
      (foo ? bool1 : bool2) ? x : y
  By explicitly disallowing csels this is fixed. Fixes INSTR_INVALID_ENC on a glamor shader (noticeable with slowdown and visual corruption when scrolling "too far" on GTK apps). Signed-off-by: Alyssa Rosenzweig <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3463> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3463>
* panfrost: Identify un/pack colour opcodes | Alyssa Rosenzweig | 2020-01-18 | 3 | -0/+9
  We still need to identify formats in the disassembler, but this will at least get the opcode name clear. Signed-off-by: Alyssa Rosenzweig <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462>
* pan/midgard: Bytemasks should round up, not round down | Alyssa Rosenzweig | 2020-01-18 | 3 | -9/+8
  Otherwise we'll lose components in DCE. Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3462>
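To make the rounding question concrete, here is a sketch of converting a per-byte mask into a per-component mask. It assumes, as an interpretation of the commit message rather than something it states, that "round up" means a component counts as live when any of its bytes is set, and "round down" means only when all of them are. The function name and parameters are hypothetical.

```python
def bytemask_to_component_mask(bytemask, bytes_per_component,
                               round_up=True, total_bytes=16):
    """Collapse a per-byte mask into a per-component mask.

    round_up=True  -> component is live if ANY of its bytes is set.
    round_down     -> component is live only if ALL of its bytes are set.
    """
    full = (1 << bytes_per_component) - 1
    mask = 0
    for comp in range(total_bytes // bytes_per_component):
        bits = (bytemask >> (comp * bytes_per_component)) & full
        live = bits != 0 if round_up else bits == full
        if live:
            mask |= 1 << comp
    return mask


# Component 0 fully written, component 1 only half written (32-bit components).
partial = 0b0011_1111
print(bin(bytemask_to_component_mask(partial, 4, round_up=True)))   # 0b11
print(bin(bytemask_to_component_mask(partial, 4, round_up=False)))  # 0b1
```

With rounding down, the half-written component vanishes from the mask, which is the kind of loss a dead-code-elimination pass would then act on.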
* panfrost/midgard: Fix swizzle for store instructions | Boris Brezillon | 2020-01-17 | 1 | -3/+15
  The current logic considers that the nir_intrinsic_component(store_intr) encodes the source components start, but it actually encodes the destination one. Source component offset adjustment is taken care of in install_registers_instr(), when offset_swizzle() is called. This fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.45 when PAN_MESA_DEBUG=deqp (looks like exposing GLES3 features has an impact on the varyings layout). Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3429> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3429>
* panfrost: Prefix schedule_program to prevent collision | Robert Foss | 2020-01-15 | 6 | -6/+6
  Currently the schedule_program implementation being used is picked at compile time, which on the Android platform means that the bifrost compiler & scheduler is used for all targets, including midgard based hardware. This commit disambiguates between the two schedule_program functions. Signed-off-by: Robert Foss <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix linear depth textures | Alyssa Rosenzweig | 2020-01-14 | 2 | -11/+27
  As pointed out by Boris, what we were calling PAN_LINEAR depth textures was in fact u-interleaved tiled (!), but we never noticed since we flipped the flag used for sampling, leading to all sorts of fun bugs when attempting to directly access depth textures from the CPU. Which begs the question -- if what we called LINEAR was tiled, how do we actually render linear depth textures? It turns out the flags for AFBC form a mali_block_format 2-bit code just like their render-target counterparts, so we can render to any of the above. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reported-by: Boris Brezillon <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3393> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3393>
* pan/midgard: Fix midgard_compile.h includes | Afonso Bordado | 2020-01-14 | 1 | -0/+1
  We now use enum mali_format which is defined in panfrost-job.h Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243>
* panfrost: Remove unneeded phi nodes | Boris Brezillon | 2020-01-13 | 1 | -0/+1
  Add a pass to remove unneeded phi nodes as done in other drivers. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3294> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3294>
* pan/midgard: Support indirect UBO offsets | Alyssa Rosenzweig | 2020-01-10 | 2 | -22/+7
  ...in case we have arrays in a UBO block that we'd like to access indirectly. Signed-off-by: Alyssa Rosenzweig <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3352> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3352>
* panfrost: Add negative lod bias support | Icecream95 | 2020-01-10 | 1 | -9/+11
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't double-flip Z/W for 2D arrays | Alyssa Rosenzweig | 2020-01-07 | 1 | -2/+5
  We need to be mindful that we don't clobber the shadow comparator. Fixes dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darrayshadow_* Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Account for z/w flip in texelFetch | Alyssa Rosenzweig | 2020-01-07 | 1 | -0/+9
  Required for proper txf of 2D arrays. Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetch.*2darray* Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Fix Android build | Roman Stratiienko | 2020-01-04 | 1 | -0/+1
  Include missing `encoder/pan_props.c` into the build. Signed-off-by: Roman Stratiienko <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Use upper ALU tags for MFBD writeout | Alyssa Rosenzweig | 2020-01-02 | 3 | -2/+22
  It's not clear yet what the distinction is. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Identity ld_color_buffer as 32-bit | Alyssa Rosenzweig | 2020-01-02 | 3 | -4/+4
  I'm not sure why I mistakenly identified it as an 8-bit op before. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove old comment | Alyssa Rosenzweig | 2020-01-02 | 1 | -1/+0
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Generate MRT writeout loops | Alyssa Rosenzweig | 2020-01-02 | 5 | -31/+84
  They need a very particular form; the naive way we did before is not sufficient in practice, it doesn't look like. So let's follow the rough structure of the blob's writeout since this is fixed code anyway. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Generalize IS_ALU and quadword_size | Alyssa Rosenzweig | 2020-01-02 | 8 | -98/+53
  There are more ALU tags, let's do some cleanup while we're at it. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Use better heuristic for shader termination | Alyssa Rosenzweig | 2020-01-02 | 1 | -24/+17
  This still may not be perfect (in the sense that legal shaders might still get cut off) but this fits how writeout is done with both Panfrost and the blob, so it's good enough for what we need and allows MRT shaders to be sanely disassembled. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix memory corruption in constant combining | Alyssa Rosenzweig | 2020-01-02 | 1 | -1/+1
  It's a long story... but we'd try to insert constants that weren't there and end up clobbering fields in the bundle following the constant array... Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Dynamically allocate array of texture pointers | Tomeu Vizoso | 2020-01-02 | 2 | -8/+6
  With 3D textures we can have lots of layers, so better allocate it dynamically at runtime. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Optimize branches with inverted arguments | Afonso Bordado | 2019-12-31 | 3 | -0/+26
  Remove the invert on arguments to branches, and invert the branch condition instead. This saves one instruction per inverted argument. Closes #2088 Signed-off-by: Afonso Bordado <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
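The rewrite described in this commit relies on a simple identity: branching on "not c" is the same as branching on c with the branch sense flipped. The toy model below is only a sketch; the `Branch` class and its fields are hypothetical and much simpler than the compiler's actual IR.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    cond_src: str           # name of the value the branch tests
    src_inverted: bool      # an explicit "not" is applied to the argument
    invert_condition: bool  # branch is taken when the condition is false


def taken(branch: Branch, value: bool) -> bool:
    """Evaluate whether the branch is taken for a given source value."""
    cond = (not value) if branch.src_inverted else value
    return (not cond) if branch.invert_condition else cond


def fold_inverted_argument(branch: Branch) -> Branch:
    """Drop the invert on the argument and flip the branch condition instead."""
    if not branch.src_inverted:
        return branch
    return Branch(branch.cond_src, False, not branch.invert_condition)


before = Branch("r0", src_inverted=True, invert_condition=False)
after = fold_inverted_argument(before)
print(after)
```

Because the folded branch no longer needs a separate instruction to compute the inverted argument, the behaviour stays the same while the instruction count drops by one.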
* pan/midgard: Move midgard_is_branch_unit to helpers | Afonso Bordado | 2019-12-31 | 2 | -7/+6
  Signed-off-by: Afonso Bordado <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove prepacked_branch | Alyssa Rosenzweig | 2019-12-31 | 6 | -39/+6
  It's an ugly hack that's no longer used. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Convert fragment writeout to proper branches | Alyssa Rosenzweig | 2019-12-31 | 1 | -3/+14
  This eliminates the only use of prepacked_branch, which is such a hack anyway. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove MRT indirection in blend shaders | Alyssa Rosenzweig | 2019-12-30 | 1 | -0/+4
  Since we have a separate blend shader for each render target, let's simplify this structure and reduce the options memory footprint by 88% or something goofy like that. Should also enable separate blending per render target. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement integer varyings | Alyssa Rosenzweig | 2019-12-30 | 2 | -0/+54
  We need to actually work out the varying format on demand, rather than assuming rgba32f. Fixes dEQP-GLES3.functional.fragment_out.basic.int.* Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Identify glProvokingVertex flag | Alyssa Rosenzweig | 2019-12-30 | 1 | -0/+6
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement flat shading | Alyssa Rosenzweig | 2019-12-30 | 2 | -6/+17
  We need to shuffle around some lowerings but it's just a flag. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Use type-appropriate st_vary | Alyssa Rosenzweig | 2019-12-30 | 1 | -0/+16
  We would like to store (u)ints as well. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Fix Makefile.sources | Caio Marcelo de Oliveira Filho | 2019-12-28 | 1 | -1/+1
  Add missing `\`. Fixes Android build. Reviewed-by: Eric Engestrom <[email protected]> Fixes: de077c20788e9cccd0ef ("panfrost: Remove mali_alt_func")
* panfrost: Remove 32-bit next_job path | Alyssa Rosenzweig | 2019-12-27 | 2 | -11/+3
  It has been unused for a while; let's just remove the abstraction. Technically the hardware does support 32-bit job descriptors, but we don't and we can't keep them from breaking so let's not pretend they work. Signed-off-by: Alyssa Rosenzweig <[email protected]> Suggested-by: Boris Brezillon <[email protected]>
* panfrost; Update comment about work/uniform_count | Alyssa Rosenzweig | 2019-12-27 | 1 | -3/+1
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove mali_alt_func | Alyssa Rosenzweig | 2019-12-27 | 6 | -38/+56
  There's only one way to encode comparison functions in the command stream, not two. It's just that the semantics for texture comparisons are flipped from the semantics of stencil comparison. We can factor out that flip to common Panfrost code, rather than tying it to a second Gallium routine. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add missing #include in common header | Alyssa Rosenzweig | 2019-12-27 | 1 | -0/+1
  Fixes way back when... Signed-off-by: Alyssa Rosenzweig <[email protected]>