path: root/src/panfrost
Commit message | Author | Age | Files | Lines
* panfrost: prepare for p_compiler.h dependency removal | Lionel Landwerlin | 2019-08-09 | 1 | -0/+1
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
* pan/midgard: Account for swizzle/mask in st_vary | Alyssa Rosenzweig | 2019-08-09 | 1 | -2/+14
  Register allocation for varying stores is a bit different, since the instructions ignore the writemask (varyings are normalized packed/vectorized..)
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Resolve crash with NULL attr/varyings | Alyssa Rosenzweig | 2019-08-09 | 1 | -0/+5
  This case needs more investigation, but this was found with geometry shaders.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Disassemble load/store barrel shift | Alyssa Rosenzweig | 2019-08-08 | 2 | -5/+30
  Arm assembly intensifies.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* Add libpanfrost_shared to Android build | Roman Stratiienko | 2019-08-08 | 1 | -1/+6
  1. Add missing directory to ./Android.mk
  2. Fix ./src/panfrost/Android.shared.mk
  Signed-off-by: Roman Stratiienko <[email protected]>
  Reviewed-by: Icenowy Zheng <[email protected]>
  Reviewed-by: Vasily Khoruzhick <[email protected]>
  Acked-by: Qiang Yu <[email protected]>
* panfrost: Take into account a index_bias for glDrawElementsBaseVertex calls | Rohan Garg | 2019-08-06 | 2 | -7/+22
  Midgard does not accept an index_bias directly and relies instead on a bias correction offset (offset_bias_correction) in order to calculate the unbiased vertex index. We need to make sure we adjust offset_start and vertex_count in order to take into account the index_bias as required by a glDrawElementsBaseVertex call and then supply an additional offset_bias_correction to the hardware.
  Signed-off-by: Rohan Garg <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
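The adjustment described in the commit above can be modelled in a few lines. This is only a sketch: the field names offset_start, vertex_count and offset_bias_correction come from the commit message, but the exact formulas and the `fetched_vertex` hardware model are assumptions made for illustration, not code taken from the driver.

```python
def biased_draw_params(min_index, max_index, index_bias):
    """Return (offset_start, vertex_count, offset_bias_correction).

    Assumed model: the hardware sees a window of vertices starting at
    offset_start, and the correction maps a raw index into that window.
    """
    offset_start = min_index + index_bias      # first vertex actually read
    vertex_count = max_index - min_index + 1   # size of the vertex window
    offset_bias_correction = -min_index        # raw index -> window-relative
    return offset_start, vertex_count, offset_bias_correction


def fetched_vertex(index, offset_start, offset_bias_correction):
    """Hypothetical hardware model: which vertex a raw index resolves to."""
    return offset_start + (index + offset_bias_correction)


params = biased_draw_params(min_index=10, max_index=20, index_bias=100)
offset_start, vertex_count, correction = params
# Every raw index i should resolve to vertex i + index_bias.
resolved = [fetched_vertex(i, offset_start, correction) for i in range(10, 21)]
```

Under this model every raw index i resolves to vertex i + index_bias, which is the behaviour glDrawElementsBaseVertex requires.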
* pan/midgard: Extend SSA concurrency checks to other args | Alyssa Rosenzweig | 2019-08-05 | 1 | -13/+12
  No glmark changes, but this seems like a good idea.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Rewrite bidirectionally when eliminating moves | Alyssa Rosenzweig | 2019-08-05 | 1 | -3/+2
  Symptom: the sky is black in SuperTuxKart (flashbacks to SMB/NES emulation intensify).
  Essentially, what happened is a fixed (special) move to r0 was eliminated but scheduling did not factor this in, so can_run_concurrent_ssa returned true even when there was a logical data dependency that needed to be resolved.
  Fixes: 20771ede1c0 ("pan/midgard: Add post-RA move elimination")
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* meson: drop unused dep_{thread,dl} | Eric Engestrom | 2019-08-03 | 1 | -2/+0
  Unused as of last commit.
  Signed-off-by: Eric Engestrom <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Tested-by: Vinson Lee <[email protected]>
* meson: replace libmesa_util with idep_mesautil | Eric Engestrom | 2019-08-03 | 1 | -5/+3
  This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually. Next commit will remove the ones that were only added for that reason.
  Signed-off-by: Eric Engestrom <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
  Tested-by: Vinson Lee <[email protected]>
* pan/midgard: Print texture outmod | Alyssa Rosenzweig | 2019-08-02 | 2 | -4/+8
  I have no idea who thought this was a good idea.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Promote all 16 uniforms | Alyssa Rosenzweig | 2019-08-02 | 3 | -9/+4
  Now that register spilling is in place, this is reasonable. It turns out for some shaders, it's actually better to cap at 8 work registers and extra >8 uniform registers and tolerate the spilling, since the extra resulting threads make up for the spillage.

  So incidentally, the shader that spills here is in -bterrain, which jumps from 19fps to 21fps as a result of this change.

  total instructions in shared programs: 3513 -> 3448 (-1.85%)
  instructions in affected programs: 776 -> 711 (-8.38%)
  helped: 20
  HURT: 0
  helped stats (abs) min: 1 max: 8 x̄: 3.25 x̃: 2
  helped stats (rel) min: 3.57% max: 16.00% x̄: 8.37% x̃: 7.19%
  95% mean confidence interval for instructions value: -4.28 -2.22
  95% mean confidence interval for instructions %-change: -10.02% -6.73%
  Instructions are helped.

  total bundles in shared programs: 2067 -> 2024 (-2.08%)
  bundles in affected programs: 515 -> 472 (-8.35%)
  helped: 19
  HURT: 1
  helped stats (abs) min: 1 max: 6 x̄: 2.37 x̃: 2
  helped stats (rel) min: 2.13% max: 17.86% x̄: 10.19% x̃: 11.11%
  HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
  HURT stats (rel) min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
  95% mean confidence interval for bundles value: -3.01 -1.29
  95% mean confidence interval for bundles %-change: -12.13% -6.91%
  Bundles are helped.

  total quadwords in shared programs: 3468 -> 3426 (-1.21%)
  quadwords in affected programs: 764 -> 722 (-5.50%)
  helped: 19
  HURT: 1
  helped stats (abs) min: 1 max: 5 x̄: 2.26 x̃: 2
  helped stats (rel) min: 1.41% max: 12.50% x̄: 6.76% x̃: 7.14%
  HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  HURT stats (rel) min: 1.08% max: 1.08% x̄: 1.08% x̃: 1.08%
  95% mean confidence interval for quadwords value: -2.83 -1.37
  95% mean confidence interval for quadwords %-change: -8.08% -4.65%
  Quadwords are helped.

  total registers in shared programs: 383 -> 360 (-6.01%)
  registers in affected programs: 112 -> 89 (-20.54%)
  helped: 19
  HURT: 0
  helped stats (abs) min: 1 max: 3 x̄: 1.21 x̃: 1
  helped stats (rel) min: 12.50% max: 27.27% x̄: 20.63% x̃: 20.00%
  95% mean confidence interval for registers value: -1.47 -0.95
  95% mean confidence interval for registers %-change: -22.39% -18.87%
  Registers are helped.

  total threads in shared programs: 432 -> 451 (4.40%)
  threads in affected programs: 19 -> 38 (100.00%)
  helped: 11
  HURT: 0
  helped stats (abs) min: 1 max: 2 x̄: 1.73 x̃: 2
  helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
  95% mean confidence interval for threads value: 1.41 2.04
  95% mean confidence interval for threads %-change: 100.00% 100.00%
  Threads are [helped].

  total loops in shared programs: 4 -> 4 (0.00%)
  loops in affected programs: 0 -> 0
  helped: 0
  HURT: 0

  total spills in shared programs: 0 -> 4
  spills in affected programs: 0 -> 4
  helped: 0
  HURT: 2

  total fills in shared programs: 0 -> 7
  fills in affected programs: 0 -> 7
  helped: 0
  HURT: 2

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
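The percentages in the report above are plain relative changes, which is easy to check by hand. A minimal sketch, assuming the usual (after - before) / before definition; `percent_change` and `report_line` are hypothetical helper names, not part of shader-db:

```python
def percent_change(before, after):
    """Relative change in percent, or None when there is no baseline."""
    if before == 0:
        return None  # e.g. "spills: 0 -> 4" has no meaningful percentage
    return (after - before) / before * 100.0


def report_line(label, before, after):
    """Format one summary line in the style of the report above."""
    change = percent_change(before, after)
    if change is None:
        return f"{label}: {before} -> {after}"
    return f"{label}: {before} -> {after} ({change:.2f}%)"


instructions = report_line("total instructions in shared programs", 3513, 3448)
threads = report_line("total threads in shared programs", 432, 451)
spills = report_line("total spills in shared programs", 0, 4)
```

The three lines reproduce the corresponding totals quoted in the commit message, including the missing percentage for the spill count, whose baseline is zero.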
* pan/midgard: Break mir_spill_register into its function | Alyssa Rosenzweig | 2019-08-02 | 1 | -117/+129
  No functional changes, just breaks out a megamonster function and fixes the indentation.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Switch sources to an array for trinary sources | Alyssa Rosenzweig | 2019-08-02 | 12 | -145/+133
  We need three independent sources to support indirect SSBO writes (as well as textures with both LOD/bias and offsets). Now is a good time to make sources just an array so we don't have to rewrite a ton of code if we ever needed a fourth source for some reason.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove "r27-only" register class | Alyssa Rosenzweig | 2019-08-02 | 5 | -97/+66
  As far as I know, there's no such thing as a load/store op that only takes its argument in r27. We just need to set the appropriate arg_1 field in the RA to specify other registers if we want them.

  To facilitate this, various RA-related changes are needed across the compiler; this should also fix indirect offsets which were implicitly interpreted as "r27-only" despite not even passing through RA yet.

  One ripple effect change is switching the move insertion point and adjusting the liveness analysis accordingly, so while this was intended as a purely functional change, there are some shader-db changes:

  total instructions in shared programs: 3511 -> 3498 (-0.37%)
  instructions in affected programs: 563 -> 550 (-2.31%)
  helped: 12
  HURT: 0
  helped stats (abs) min: 1 max: 2 x̄: 1.08 x̃: 1
  helped stats (rel) min: 0.93% max: 5.00% x̄: 2.58% x̃: 2.33%
  95% mean confidence interval for instructions value: -1.27 -0.90
  95% mean confidence interval for instructions %-change: -3.23% -1.93%
  Instructions are helped.

  total bundles in shared programs: 2067 -> 2067 (0.00%)
  bundles in affected programs: 398 -> 398 (0.00%)
  helped: 7
  HURT: 4
  helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  helped stats (rel) min: 1.54% max: 10.00% x̄: 5.04% x̃: 5.56%
  HURT stats (abs) min: 1 max: 2 x̄: 1.75 x̃: 2
  HURT stats (rel) min: 2.13% max: 4.26% x̄: 3.72% x̃: 4.26%
  95% mean confidence interval for bundles value: -0.95 0.95
  95% mean confidence interval for bundles %-change: -5.21% 1.50%
  Inconclusive result (value mean confidence interval includes 0).

  total quadwords in shared programs: 3464 -> 3454 (-0.29%)
  quadwords in affected programs: 1199 -> 1189 (-0.83%)
  helped: 18
  HURT: 4
  helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  helped stats (rel) min: 1.03% max: 5.26% x̄: 2.44% x̃: 1.79%
  HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
  HURT stats (rel) min: 2.56% max: 2.82% x̄: 2.63% x̃: 2.56%
  95% mean confidence interval for quadwords value: -0.98 0.07
  Inconclusive result (value mean confidence interval includes 0).

  total registers in shared programs: 383 -> 373 (-2.61%)
  registers in affected programs: 56 -> 46 (-17.86%)
  helped: 12
  HURT: 2
  helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  helped stats (rel) min: 9.09% max: 33.33% x̄: 29.58% x̃: 33.33%
  HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  HURT stats (rel) min: 20.00% max: 50.00% x̄: 35.00% x̃: 35.00%
  95% mean confidence interval for registers value: -1.13 -0.29
  95% mean confidence interval for registers %-change: -35.07% -5.63%
  Registers are helped.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Handle get/set_swizzle for load/store arguments | Alyssa Rosenzweig | 2019-08-02 | 2 | -3/+83
  Load/store's main "argument 0" already has its swizzle handled correctly (for stores, that is). But the tinier arguments, the compact ones with a component select but not a full swizzle, those are not yet handled. Let's do something about that!
* pan/midgard: Fix block successors | Alyssa Rosenzweig | 2019-08-02 | 2 | -29/+43
  Rather than an ersatz thing that sort of looks like successors but is in fact just the source order traversal with some backward jumps hacked in for loops... construct an actual flow graph so we can do analysis sanely.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
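To make the difference between source order and a real flow graph concrete, here is a small sketch of successor construction. The block representation (a branch target plus a "conditional" flag per block) and the function name `build_successors` are assumptions for illustration only; the compiler's actual data structures are not shown in this log.

```python
def build_successors(blocks):
    """Build a successor map from a list of (branch_target, conditional).

    branch_target is a block index or None (no branch at the block's end).
    A block falls through to the next block unless it ends in an
    unconditional branch.
    """
    successors = {i: [] for i in range(len(blocks))}
    for i, (target, conditional) in enumerate(blocks):
        falls_through = target is None or conditional
        if falls_through and i + 1 < len(blocks):
            successors[i].append(i + 1)
        if target is not None and target not in successors[i]:
            successors[i].append(target)
    return successors


# A simple loop: block 1 is the loop header with a conditional exit to
# block 3, block 2 is the body and jumps back to the header.
cfg = build_successors([
    (None, False),  # block 0: falls through to 1
    (3, True),      # block 1: exit to 3, or fall through to 2
    (1, False),     # block 2: unconditional back edge to 1
    (None, False),  # block 3: exit block, no successors
])
```

Note that block 2 has the loop header as its only successor, not block 3 as a source-order traversal would suggest; analyses such as liveness depend on that distinction.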
* pan/midgard: Add helper to pack load/store registers | Alyssa Rosenzweig | 2019-08-02 | 1 | -0/+18
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Decode register/component in load/store argument | Alyssa Rosenzweig | 2019-08-02 | 2 | -2/+24
  3-bits out of 8 down!
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix REGISTER_OFFSET | Alyssa Rosenzweig | 2019-08-02 | 2 | -3/+2
  r27 isn't the special one, usually.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Split ld/st unknown to arg_1/arg_2 fields | Alyssa Rosenzweig | 2019-08-02 | 7 | -17/+46
  The 16-bit field can be decomposed to two independent 8-bit fields, each representing a single (additional) argument to the load/store op, generally used for encoding registers. Addressable registers here are substantially limited compared to the main register in a load/store op.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
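The decomposition described above is ordinary bit slicing. A minimal sketch, assuming arg_1 occupies the low byte and arg_2 the high byte; that ordering and the helper names are assumptions, since the log does not state the layout:

```python
def split_load_store_args(field16):
    """Split a 16-bit field into two independent 8-bit arguments."""
    if not 0 <= field16 <= 0xFFFF:
        raise ValueError("expected a 16-bit value")
    arg_1 = field16 & 0xFF         # assumed: low byte
    arg_2 = (field16 >> 8) & 0xFF  # assumed: high byte
    return arg_1, arg_2


def join_load_store_args(arg_1, arg_2):
    """Inverse of split_load_store_args under the same assumed layout."""
    if not (0 <= arg_1 <= 0xFF and 0 <= arg_2 <= 0xFF):
        raise ValueError("expected 8-bit values")
    return (arg_2 << 8) | arg_1


arg_1, arg_2 = split_load_store_args(0xABCD)
roundtrip = join_load_store_args(arg_1, arg_2)
```

Because each argument has only 8 bits to encode a register and its selectors, it can address far fewer registers than the main register field, which matches the limitation the commit mentions.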
* pan/midgard: Print invert modifier | Alyssa Rosenzweig | 2019-08-02 | 1 | -0/+3
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Flip conditionals | Alyssa Rosenzweig | 2019-08-02 | 1 | -4/+45
  We would like to flip ops to have a constant in the second place to enable inlining of the constant.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
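Flipping a comparison means swapping its operands and mirroring the operator, so that `5 < x` becomes `x > 5`. The sketch below illustrates the idea with made-up op names ("lt", "gt", and so on) and a toy operand model in which integers are constants and strings are registers; none of this is the compiler's actual IR.

```python
# Mirror image of each comparison when its operands are swapped.
FLIPPED = {"lt": "gt", "gt": "lt", "le": "ge", "ge": "le", "eq": "eq", "ne": "ne"}

COMPARE = {
    "lt": lambda a, b: a < b, "gt": lambda a, b: a > b,
    "le": lambda a, b: a <= b, "ge": lambda a, b: a >= b,
    "eq": lambda a, b: a == b, "ne": lambda a, b: a != b,
}


def flip_constant_to_second(op, src0, src1):
    """Move a constant first operand into the second slot, if possible."""
    if isinstance(src0, int) and not isinstance(src1, int):
        return FLIPPED[op], src1, src0
    return op, src0, src1


def evaluate(op, src0, src1, registers):
    """Evaluate a comparison, reading string operands from `registers`."""
    read = lambda s: registers[s] if isinstance(s, str) else s
    return COMPARE[op](read(src0), read(src1))


flipped = flip_constant_to_second("lt", 5, "r0")
```

Here `flipped` is the equivalent comparison with the constant in the second position, where it is eligible for inlining.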
* pan/midgard: Add bitwise src/invert fusing | Alyssa Rosenzweig | 2019-08-02 | 3 | -0/+124
  De Morgan's Laws and some special ops basically.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
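De Morgan's laws are what allow two inverted sources to be folded into a single op with an inverted result. The following sketch checks the identities on 32-bit values; the op names "nor" and "nand" and the `fuse_inverted_sources` helper are hypothetical and stand in for whatever ops the hardware actually provides.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers


def inv(x):
    return ~x & MASK


def bitwise(op, a, b):
    """Evaluate a (hypothetically named) bitwise op on 32-bit values."""
    if op == "and":
        return a & b
    if op == "or":
        return a | b
    if op == "nor":
        return inv(a | b)
    if op == "nand":
        return inv(a & b)
    raise ValueError(op)


def fuse_inverted_sources(op):
    """Given `op(~a, ~b)`, return an op computing the same value from a, b.

    De Morgan: ~a & ~b == ~(a | b) and ~a | ~b == ~(a & b).
    """
    return {"and": "nor", "or": "nand"}[op]


a, b = 0xF0F0A5A5, 0x0FF03C3C
and_of_inverts = bitwise("and", inv(a), inv(b))
fused_and = bitwise(fuse_inverted_sources("and"), a, b)
```

The fused form needs one instruction instead of three, since neither source inversion has to be materialized.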
* pan/midgard: Add .not propagation pass | Alyssa Rosenzweig | 2019-08-02 | 3 | -0/+35
  Essentially .pos propagation but for bitwise.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fuse invert into bitwise ops | Alyssa Rosenzweig | 2019-08-02 | 3 | -0/+57
  We use the new invert flag to produce ops like inand.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Use standard list traversal to find initial tag | Alyssa Rosenzweig | 2019-08-01 | 1 | -7/+4
  Fixes a hang (and abort) on empty shaders, which you shouldn't have anyway but better safe than sorry. DCE going on the fritz is no reason to freeze the system.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add support for decoding gl_FrontFacing | Alyssa Rosenzweig | 2019-08-01 | 2 | -1/+11
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/decode: Use max varying index as varying buffer count | Alyssa Rosenzweig | 2019-08-01 | 1 | -6/+6
  This allows us to decode asymmetric varyings correctly, which occurs with e.g. gl_FrontFacing.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Don't special case inline_constant | Alyssa Rosenzweig | 2019-07-31 | 7 | -30/+13
  Another constant source of bugs. Ain't that special.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: De-special-case branching | Alyssa Rosenzweig | 2019-07-31 | 6 | -30/+11
  It's not that special.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add MALI_SAMP_NORM_COORDS flag | Alyssa Rosenzweig | 2019-07-31 | 2 | -0/+6
  Corresponds to the normalized coordinates? flag on images in OpenCL and evidently also shows up in GL, so let's wire it in.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Simplify filter_mode definition | Alyssa Rosenzweig | 2019-07-31 | 2 | -17/+18
  It's just a bit field containing some flags; there's no need for all the macro magic.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Shrink "compute FBD" | Alyssa Rosenzweig | 2019-07-31 | 1 | -1/+1
  We still don't know what it is, but from a newer trace we now know it's half the size we thought it was.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Flip texture/sampler fields | Alyssa Rosenzweig | 2019-07-31 | 2 | -2/+2
  We had them backwards in both the command stream and the Midgard stack. In OpenGL ES 2.0, they're always the same, but in Vulkan/later-GL/CL they diverge so we can fix this.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add MALI_ATTR_IMAGE value | Alyssa Rosenzweig | 2019-07-31 | 2 | -0/+7
  Images are implemented (in part) as special attributes, so include support for decoding this.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Simplify discard logic | Alyssa Rosenzweig | 2019-07-31 | 1 | -17/+1
  The "branch offset" is, in fact, ignored.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add units for more instructions | Alyssa Rosenzweig | 2019-07-31 | 2 | -6/+6
  For everything but freduce, we have some sense of what units the instruction takes.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix ball/bany opcode table | Alyssa Rosenzweig | 2019-07-31 | 2 | -17/+22
  This was seriously messed up beyond all recognition. How we're passing shaders.random.* is a mystery.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Document branch combination LUT | Alyssa Rosenzweig | 2019-07-31 | 3 | -5/+25
  This took way longer to figure out than it should have..
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Nothing to see here, move along folks | Alyssa Rosenzweig | 2019-07-30 | 1 | -4/+4
  Fixes: dee1e18fe4f ("pan/midgard: Cleanup ops table")
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Cleanup ops table | Alyssa Rosenzweig | 2019-07-30 | 1 | -7/+7
  Hopefully this should make a few ops make more sense. No functional changes.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend copy-propagation to swizzles | Alyssa Rosenzweig | 2019-07-30 | 3 | -4/+106
  We can compose them when we rewrite, which is.. more code.. but helps.

  total instructions in shared programs: 3611 -> 3513 (-2.71%)
  instructions in affected programs: 672 -> 574 (-14.58%)
  helped: 11
  HURT: 2
  helped stats (abs) min: 2 max: 14 x̄: 9.09 x̃: 10
  helped stats (rel) min: 5.71% max: 24.56% x̄: 17.99% x̃: 18.87%
  HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  HURT stats (rel) min: 1.19% max: 2.08% x̄: 1.64% x̃: 1.64%
  95% mean confidence interval for instructions value: -10.45 -4.62
  95% mean confidence interval for instructions %-change: -20.07% -9.87%
  Instructions are helped.

  total bundles in shared programs: 2117 -> 2067 (-2.36%)
  bundles in affected programs: 356 -> 306 (-14.04%)
  helped: 11
  HURT: 0
  helped stats (abs) min: 1 max: 7 x̄: 4.55 x̃: 5
  helped stats (rel) min: 4.55% max: 15.22% x̄: 13.63% x̃: 14.71%
  95% mean confidence interval for bundles value: -5.64 -3.45
  95% mean confidence interval for bundles %-change: -15.71% -11.55%
  Bundles are helped.

  total quadwords in shared programs: 3567 -> 3468 (-2.78%)
  quadwords in affected programs: 695 -> 596 (-14.24%)
  helped: 11
  HURT: 1
  helped stats (abs) min: 2 max: 14 x̄: 9.09 x̃: 10
  helped stats (rel) min: 5.56% max: 21.88% x̄: 14.97% x̃: 15.15%
  HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  HURT stats (rel) min: 2.38% max: 2.38% x̄: 2.38% x̃: 2.38%
  95% mean confidence interval for quadwords value: -10.96 -5.54
  95% mean confidence interval for quadwords %-change: -17.42% -9.63%
  Quadwords are helped.

  total registers in shared programs: 391 -> 383 (-2.05%)
  registers in affected programs: 46 -> 38 (-17.39%)
  helped: 9
  HURT: 1
  helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  helped stats (rel) min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00%
  HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
  HURT stats (rel) min: 10.00% max: 10.00% x̄: 10.00% x̃: 10.00%
  95% mean confidence interval for registers value: -1.25 -0.35
  95% mean confidence interval for registers %-change: -29.42% -13.58%
  Registers are helped.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
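Composing swizzles is what lets copy propagation look through a swizzled move: if `b = a.inner` and a later instruction reads `b.outer`, it can read `a` directly with the composed swizzle. A minimal sketch, with swizzles modelled as tuples of component indices (0 = x, 1 = y, 2 = z, 3 = w); the function names are illustrative and not taken from the compiler.

```python
def apply_swizzle(vec, swizzle):
    """Select components of vec according to swizzle."""
    return tuple(vec[c] for c in swizzle)


def compose_swizzle(outer, inner):
    """Swizzle equivalent to applying `inner` first, then `outer`."""
    return tuple(inner[c] for c in outer)


a = (10, 20, 30, 40)
inner = (3, 2, 1, 0)   # b = a.wzyx (the move being propagated)
outer = (0, 0, 1, 1)   # the use reads b.xxyy

via_move = apply_swizzle(apply_swizzle(a, inner), outer)
direct = apply_swizzle(a, compose_swizzle(outer, inner))
```

Both paths read the same components, so the move becomes dead once every use has been rewritten, which is where the instruction savings in the report above would come from.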
* pan/midgard: Extract simple source mod check | Alyssa Rosenzweig | 2019-07-30 | 3 | -4/+15
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Lower texr/texw mixed registers | Alyssa Rosenzweig | 2019-07-30 | 1 | -2/+2
  Conceptually, r28-r29 (as used for reading) and r28-r29 (as used for writing) aren't registers at all, merely push/pull arrangements. So you can't feed a texture result back into itself without explicitly moving in the middle.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Always set .cont for derivatives in loops | Alyssa Rosenzweig | 2019-07-30 | 1 | -0/+7
  We need to keep the helper invocations alive.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement derivatives | Alyssa Rosenzweig | 2019-07-30 | 5 | -1/+183
  Implement the fdd* and fdd* opcodes in the Midgard compiler.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Compose original texture swizzle in RA | Alyssa Rosenzweig | 2019-07-30 | 1 | -2/+4
  Used for lowering derivatives.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add new swizzles | Alyssa Rosenzweig | 2019-07-30 | 1 | -0/+3
  Used for derivatives.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add OP_IS_DERIVATIVE helper | Alyssa Rosenzweig | 2019-07-30 | 1 | -0/+5
  Signed-off-by: Alyssa Rosenzweig <[email protected]>