path: root/src/panfrost
Commit message | Author | Age | Files | Lines
...
* pan/decode: Pretty-print sRGB format | Alyssa Rosenzweig | 2019-08-21 | 1 | -4/+11
  We can just stick an "s" in if it's sRGB.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: nr_mipmap_levels -> levels | Alyssa Rosenzweig | 2019-08-21 | 2 | -4/+5
  No need to be so verbose.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Validate texture dimensionality | Alyssa Rosenzweig | 2019-08-21 | 1 | -6/+33
  Textures of a smaller dimension don't need higher dimensions printed. This allows us to be more compact, while enforcing verification that higher dimensions must be zero.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Break out pandecode_texture function | Alyssa Rosenzweig | 2019-08-21 | 1 | -101/+108
  It's massive, with hugely nested indentation -- break it out so it's legible.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Guard texture unknowns as zero trips | Alyssa Rosenzweig | 2019-08-21 | 1 | -9/+13
  I think I've actually seen unknown3A set on T6xx, but we'll see what happens in traces going forward. We don't want the zero noise normally, and if nonzero values show up in the wild, we want to draw attention to them.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Use GLSL style formats/swizzles | Alyssa Rosenzweig | 2019-08-21 | 2 | -121/+132
  This dramatically reduces visual clutter: now an entire attribute/varying record looks something like:

      rgba32f attribute_0[16].bgra;

  which is equivalent to the raw structure:

      {
          .index = 0,
          .format = MALI_FORMAT_RGBA32F,
          .swizzle = (MALI_CHANNEL_BLUE << 9) | ....,
          .src_offset = 16,
      }

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
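The compact form described in this commit can be illustrated with a small formatter. This is only a sketch under assumptions: `struct attr_record` and `format_attr_record` are hypothetical names, the format and swizzle are stored as ready-made strings rather than decoded from `MALI_FORMAT_*` / `MALI_CHANNEL_*` bitfields, and none of it is taken from the pandecode source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an attribute record. The real
 * structure packs the format and swizzle into bitfields. */
struct attr_record {
    unsigned index;
    const char *format;  /* short format name, e.g. "rgba32f" */
    const char *swizzle; /* one letter per component, e.g. "bgra" */
    unsigned src_offset;
};

/* Write the record as a single GLSL-like declaration. Returns the number
 * of characters snprintf would have written, excluding the terminator. */
static int format_attr_record(char *out, size_t size,
                              const struct attr_record *rec)
{
    assert(rec != NULL && strlen(rec->swizzle) == 4);

    return snprintf(out, size, "%s attribute_%u[%u].%s;",
                    rec->format, rec->index, rec->src_offset, rec->swizzle);
}
```

Feeding it the record from the commit message (index 0, format "rgba32f", swizzle "bgra", offset 16) produces the one-line form quoted above.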
* pan/decode: Don't print the default swizzle | Alyssa Rosenzweig | 2019-08-21 | 1 | -2/+11
  It's just noise.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Validate swizzles against format | Alyssa Rosenzweig | 2019-08-21 | 1 | -1/+76
  We want to make sure we don't access a component in the swizzle that doesn't exist in the format, since that is (as far as I know) undefined.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
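A check of this kind might look roughly like the following. It is a sketch, not the pandecode implementation: the `enum chan` values, the treatment of constant selectors, and the function name `swizzle_valid_for_format` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical channel selectors; the hardware encoding may differ. */
enum chan { CHAN_R, CHAN_G, CHAN_B, CHAN_A, CHAN_ZERO, CHAN_ONE };

/* Return true if every component selected by the swizzle exists in a
 * format with nr_components components (1 to 4). */
static bool swizzle_valid_for_format(const enum chan swz[4],
                                     unsigned nr_components)
{
    assert(nr_components >= 1 && nr_components <= 4);

    for (unsigned i = 0; i < 4; ++i) {
        /* Constants never read from the format, so they are always fine
         * (an assumption of this sketch). */
        if (swz[i] == CHAN_ZERO || swz[i] == CHAN_ONE)
            continue;

        /* Selecting, say, blue from a two-component format is the case
         * the commit wants flagged. */
        if ((unsigned) swz[i] >= nr_components)
            return false;
    }

    return true;
}
```

With a two-component format, a swizzle of R, G, ZERO, ONE passes, while the identity swizzle R, G, B, A is rejected because B and A do not exist.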
* pan/decode: Treat RESERVED swizzles as errors | Alyssa Rosenzweig | 2019-08-21 | 1 | -2/+0
  We've never seen them, so if they come up in a trace, we want to draw attention to that.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Handle VARYING_DISCARD | Alyssa Rosenzweig | 2019-08-21 | 1 | -0/+32
  Varying discard is not used by Panfrost, but the blob uses it sometimes to have some padding in the varyings table, probably to minimize per-draw overhead. (...We should maybe consider this ourselves!) Let's check for this and ensure the rest of the record is consistent with a discarded varying.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't trip the prefix magic field | Alyssa Rosenzweig | 2019-08-21 | 1 | -4/+2
  What *is* this?
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Guard attribute unknowns | Alyssa Rosenzweig | 2019-08-21 | 1 | -2/+10
  One should be zero. The other has always been seen as set, so check this.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Normalize final instances of XXX | Alyssa Rosenzweig | 2019-08-21 | 1 | -3/+11
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Normalize case matching XXX format | Alyssa Rosenzweig | 2019-08-21 | 1 | -9/+16
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Mark tripped zeroes with XXX | Alyssa Rosenzweig | 2019-08-21 | 1 | -20/+22
  This normalizes the printed format. It also makes it easier for the future when we may introduce semantic _warn and _error handlers. A tripped zero is essentially a hazard to check for.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Check for MFBD preload chicken bit | Alyssa Rosenzweig | 2019-08-21 | 1 | -1/+10
  If this bit is clear, MFBD preload will be enabled, and you.. don't want that. (At least, when the bit is clear, the old contents of the framebuffer will be preserved. I'm assuming this is what "MFBD preload" refers to in kbase.) Validate that this bit is always set.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Validate AFBC fields are zero when AFBC is disabled | Alyssa Rosenzweig | 2019-08-21 | 2 | -33/+20
  There is no "chunknown" structure; that part of the union is an artefact from falsely believing vertex/tiler MFBDs could have render targets attached (they can't). These are just plain old AFBC fields, and if there is no AFBC, it's an error to set these fields.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Do not print uniform/buffers explicitly | Alyssa Rosenzweig | 2019-08-21 | 1 | -87/+28
  For our purposes of driver debugging, the contents of uniform buffers are rarely interesting; we're more concerned about the metadata setting them up. We do need to be careful to validate the sizes of both uniforms and uniform buffers.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Add static bounds checking utility | Alyssa Rosenzweig | 2019-08-21 | 1 | -0/+38
  Many structures in the command stream have a GPU address and size determined statically. We should check that the pointers we are passed are valid and the buffers they point to are big enough for the given size. If they're not, an MMU fault would be raised.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
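The idea of a static bounds check can be sketched as below. This assumes a hypothetical `struct mapped_region` describing one tracked mapping and a hypothetical helper `range_in_region`; the actual utility added by this commit may be structured quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one tracked GPU mapping. */
struct mapped_region {
    uint64_t gpu_va; /* GPU virtual address of the first byte */
    size_t length;   /* size of the mapping in bytes */
};

/* Return true if [addr, addr + size) lies entirely inside the region.
 * The arithmetic is arranged so that it cannot overflow. */
static bool range_in_region(const struct mapped_region *r,
                            uint64_t addr, size_t size)
{
    assert(r != NULL);

    /* Pointer starts before the mapping */
    if (addr < r->gpu_va)
        return false;

    uint64_t offset = addr - r->gpu_va;

    /* Pointer starts past the end of the mapping */
    if (offset > r->length)
        return false;

    /* Enough bytes must remain after the offset */
    return size <= r->length - offset;
}
```

A decoder could call such a helper before dereferencing any pointer taken from the command stream and print a warning instead of reading out of bounds.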
* pan/decode: Don't print unreferenced attribute memory | Alyssa Rosenzweig | 2019-08-21 | 1 | -1/+1
  This is a source of uninitialized memory leaking into the traces.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Check for a number of potential issues | Alyssa Rosenzweig | 2019-08-21 | 2 | -17/+71
  Verify sizes / masks / etc against something logical to cull down the trace space and automatically guard against a number of potential hazards.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Correct polygon size computations | Alyssa Rosenzweig | 2019-08-21 | 3 | -35/+69
  While the algorithm for computing the header size has been correct for a while, we used a major hack to conservatively guess the body size. Let's scrap that and figure out the algorithm we actually need to use to be bit-identical with what the hardware expects. We do have to be careful to add the header size to the total computed BO size.

  It's not clear how big the polygon list buffer needs to be in practice -- but it has to be somewhat bigger than the polygon list itself. This needs more investigation. If we size the polygon list exactly based on the polygon_list_size field, we get faults like:

      [ 1224.219886] panfrost ff9a0000.gpu: Unhandled Page fault in AS0 at VA 0x000000001BDE8000
      Reason: TODO
      raw fault status: 0x660003C3
      decoded fault status: SLAVE FAULT
      exception type 0xC3: TRANSLATION_FAULT_LEVEL3
      access type 0x3: WRITE
      source id 0x6600

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Print "just right" count of texture pointers | Alyssa Rosenzweig | 2019-08-21 | 1 | -8/+3
  The other commented lines just add noise/entropy we don't want, and can in fact crash the trace due to asserts failing.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Verify and omit polygon size | Alyssa Rosenzweig | 2019-08-21 | 1 | -5/+19
  The polygon sizes are computed from the width/height/flags, so we can reverse the computation and use our computation to verify the two computation algorithms are bit-identical. If they are, we can omit the computed fields.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Move pan_tiler.c outside of Gallium | Alyssa Rosenzweig | 2019-08-21 | 3 | -0/+309
  The routines in this file may be shared with Vulkan.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Bounds check polygon list and tiler heap | Alyssa Rosenzweig | 2019-08-21 | 1 | -3/+18
  We have the BOs available; ensure that the bounds specified in the command stream are actually the correct bounds.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Allow updating mmaps | Alyssa Rosenzweig | 2019-08-21 | 1 | -10/+31
  This allows the caller to call track_mmap multiple times for the same gpu_va for the purpose of updating the mmap. This is used to trace invisible BOs with kbase and doesn't apply to native traces.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Express tiler structures as offsets | Alyssa Rosenzweig | 2019-08-21 | 1 | -11/+13
  This allows us to catch a class of errors (for negative offsets, etc) automatically.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Don't print zero exception_status | Alyssa Rosenzweig | 2019-08-21 | 1 | -1/+1
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Fix missing NULL terminator | Alyssa Rosenzweig | 2019-08-21 | 1 | -2/+2
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Silence workgroups_x_shift_2 | Alyssa Rosenzweig | 2019-08-21 | 1 | -3/+5
  Since we're bit-identical we can compare the computed value.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement workgroups_x_shift_2 quirk | Alyssa Rosenzweig | 2019-08-21 | 1 | -2/+11
  I'm not sure why this is done this way, but let's follow the blob.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Don't print canonical workgroup encoding | Alyssa Rosenzweig | 2019-08-21 | 1 | -24/+60
  The on-the-wire representation of workgroups is not 1:1 to the decoded Gallium-level workgroups (there are multiple valid encodings; see the previous commit). Nevertheless, since we're now bit-identical in packing vs the blob, we can check for a canonical form and only print the verbose trace if we fail the canonical form.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Set workgroups z to 32 for non-instanced graphics | Alyssa Rosenzweig | 2019-08-21 | 2 | -3/+12
  This is a blob quirk; insofar as I know, the hardware doesn't care. But we're trying to be bit-identical to take as much entropy out of traces as possible, so let's introduce the quirk.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Move pan_invocation to shared panfrost/ | Alyssa Rosenzweig | 2019-08-21 | 4 | -1/+225
  The routines in this file have no dependency on Gallium. Let's share them so they can be used for a theoretical future Vulkan driver or, more immediately, consulted when tracing.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Don't print MALI_DRAW_NONE | Alyssa Rosenzweig | 2019-08-21 | 1 | -1/+2
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Eliminate DYN_MEMORY_PROP | Alyssa Rosenzweig | 2019-08-21 | 1 | -30/+7
  It's obvious that it's linked by virtue of us printing the struct it links against. No need to repeat ourselves.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Compute liveness per-block | Alyssa Rosenzweig | 2019-08-19 | 2 | -70/+169
  Rather than using a regalloc based on live intervals, computed hastily with repeated invocations of a forward-analysis pass, we switch to compute liveness information on a per-block basis.

  Within a given basic block, we compute liveness backwards with a linear-time algorithm; for common shaders, this may help RA terminate quicker. Across blocks, we use a work list (really a work set) and check if we're making progress. This isn't terribly efficient, but it gets the job done. Point is, we get the live_in/live_out for each block. From there, it's simple to rerun the linear-time update algorithm to compute the interference graph.

  The benefit of this technique is the ability to ignore "gaps" in liveness across intermediate blocks that are never executed. On simple shaders like the loops in glmark, this results in a minor reduction in register pressure.

  The motivation was a complex shader in Krita that failed register allocation due to an unfortunate interaction between texture pipeline registers and control flow. This shader now compiles successfully.

      total instructions in shared programs: 3439 -> 3438 (-0.03%)
      instructions in affected programs: 22 -> 21 (-4.55%)
      helped: 1
      HURT: 0

      total bundles in shared programs: 2077 -> 2076 (-0.05%)
      bundles in affected programs: 12 -> 11 (-8.33%)
      helped: 1
      HURT: 0

      total quadwords in shared programs: 3457 -> 3456 (-0.03%)
      quadwords in affected programs: 20 -> 19 (-5.00%)
      helped: 1
      HURT: 0

      total registers in shared programs: 341 -> 338 (-0.88%)
      registers in affected programs: 9 -> 6 (-33.33%)
      helped: 3
      HURT: 0
      helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
      helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
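The cross-block part of this scheme is a standard backward dataflow fixed point, which can be sketched as follows. This is not the Midgard compiler's code: `struct block`, `compute_liveness`, the 32-register bitmask, and the per-block `use`/`def` summaries (standing in for the backward walk over instructions) are all assumptions chosen to keep the example small.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SUCCS 2
#define MAX_BLOCKS 32

/* Hypothetical block summary: one bit per register. */
struct block {
    uint32_t use;        /* registers read before any write in the block */
    uint32_t def;        /* registers written in the block */
    int succ[MAX_SUCCS]; /* successor block indices, -1 if unused */
    uint32_t live_in, live_out;
};

/* Iterate until no block's live_in changes. Successor indices must be
 * -1 or in [0, count). */
static void compute_liveness(struct block *blocks, int count)
{
    assert(count > 0 && count <= MAX_BLOCKS);

    bool pending[MAX_BLOCKS]; /* the work set */
    for (int i = 0; i < count; ++i) {
        blocks[i].live_in = blocks[i].live_out = 0;
        pending[i] = true;
    }

    bool progress = true;
    while (progress) {
        progress = false;

        /* Visit later blocks first; liveness flows backwards */
        for (int i = count - 1; i >= 0; --i) {
            if (!pending[i])
                continue;
            pending[i] = false;
            progress = true;

            struct block *b = &blocks[i];

            /* live_out is the union of the successors' live_in */
            uint32_t out = 0;
            for (int s = 0; s < MAX_SUCCS; ++s)
                if (b->succ[s] >= 0)
                    out |= blocks[b->succ[s]].live_in;

            uint32_t in = b->use | (out & ~b->def);
            b->live_out = out;

            if (in != b->live_in) {
                b->live_in = in;

                /* Predecessors must be revisited */
                for (int p = 0; p < count; ++p)
                    for (int s = 0; s < MAX_SUCCS; ++s)
                        if (blocks[p].succ[s] == i)
                            pending[p] = true;
            }
        }
    }
}
```

Because `live_in` only ever gains bits, the loop terminates. A real implementation would derive `use` and `def` by walking each block's instructions in reverse, as the commit message describes.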
* pan/midgard: Analyze load/store for swizzle propagation | Alyssa Rosenzweig | 2019-08-19 | 1 | -3/+21
  If there's a nontrivial swizzle fed into an extra (shortened) argument, we bail on copyprop. No glmark changes (since it doesn't use fancy texturing/loads).
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Treat cubemaps "stores" as loads | Alyssa Rosenzweig | 2019-08-19 | 5 | -19/+15
  It's always been ambiguous which they are, but their primary register is their output, not their input; therefore, they are loads.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Clamp cubemap swizzle to XYXX | Alyssa Rosenzweig | 2019-08-19 | 1 | -0/+1
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Clamp st_vary swizzle by number of components | Alyssa Rosenzweig | 2019-08-19 | 1 | -1/+2
  Same issue with liveness analysis. If we store out a vec3, we should not reference the .w component.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Use type-appropriate swizzle for texture coordinate | Alyssa Rosenzweig | 2019-08-19 | 1 | -1/+7
  The texture coordinate for a 2D texture could be a vec2 or a vec3, depending on whether it's an array texture or not. If it's vec2 (non-array texture), we should not reference the z component; otherwise, liveness analysis will get very confused when z is never written.
  v2: Fix typo (Ilia).
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Set mask for lowered read-hazard moves | Alyssa Rosenzweig | 2019-08-19 | 1 | -0/+1
  If we need to lower a move for a read from a vec2 texture coordinate, we shouldn't write zw, even incidentally.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix texw lowering with complex control flow | Alyssa Rosenzweig | 2019-08-19 | 1 | -1/+1
  Fixes shaders with control flow like:

      out = 0;
      if (A) {
          if (B)
              out = texture(A, ...)
      } else {
          out = texture(B, ...)
      }

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_rewrite_index_dst_single helper | Alyssa Rosenzweig | 2019-08-19 | 2 | -2/+8
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Print predecessors in MIR | Alyssa Rosenzweig | 2019-08-19 | 1 | -0/+5
  Just as a sanity check.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Index blocks for printing | Alyssa Rosenzweig | 2019-08-19 | 3 | -2/+10
  Better than having pointers flying about.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_foreach_src | Alyssa Rosenzweig | 2019-08-19 | 1 | -0/+3
  This is repeated often enough.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_foreach_instr_in_block_rev | Alyssa Rosenzweig | 2019-08-19 | 1 | -0/+2
  Signed-off-by: Alyssa Rosenzweig <[email protected]>