path: root/src/panfrost/midgard
Commit message | Author | Age | Files | Lines
* pan/midgard: Fix midgard_compile.h includes | Afonso Bordado | 2020-01-14 | 1 | -0/+1
| We now use enum mali_format, which is defined in panfrost-job.h.
| Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3243>
* panfrost: Remove unneeded phi nodes | Boris Brezillon | 2020-01-13 | 1 | -0/+1
| Add a pass to remove unneeded phi nodes as done in other drivers.
| Signed-off-by: Boris Brezillon <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
| Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3294>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3294>
* pan/midgard: Support indirect UBO offsets | Alyssa Rosenzweig | 2020-01-10 | 2 | -22/+7
| ...in case we have arrays in a UBO block that we'd like to access indirectly.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3352>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3352>
* panfrost: Don't double-flip Z/W for 2D arrays | Alyssa Rosenzweig | 2020-01-07 | 1 | -2/+5
| We need to be mindful that we don't clobber the shadow comparator.
| Fixes dEQP-GLES3.functional.shaders.texture_functions.texture.sampler2darrayshadow_*
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Account for z/w flip in texelFetch | Alyssa Rosenzweig | 2020-01-07 | 1 | -0/+9
| Required for proper txf of 2D arrays.
| Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetch.*2darray*
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Use upper ALU tags for MFBD writeout | Alyssa Rosenzweig | 2020-01-02 | 3 | -2/+22
| It's not clear yet what the distinction is.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Identify ld_color_buffer as 32-bit | Alyssa Rosenzweig | 2020-01-02 | 3 | -4/+4
| I'm not sure why I mistakenly identified it as an 8-bit op before.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove old comment | Alyssa Rosenzweig | 2020-01-02 | 1 | -1/+0
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Generate MRT writeout loops | Alyssa Rosenzweig | 2020-01-02 | 5 | -31/+84
| They need a very particular form; the naive approach we used before does not appear to be sufficient in practice. So let's follow the rough structure of the blob's writeout, since this is fixed code anyway.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Generalize IS_ALU and quadword_size | Alyssa Rosenzweig | 2020-01-02 | 8 | -98/+53
| There are more ALU tags; let's do some cleanup while we're at it.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Use better heuristic for shader termination | Alyssa Rosenzweig | 2020-01-02 | 1 | -24/+17
| This still may not be perfect (in the sense that legal shaders might still get cut off) but this fits how writeout is done with both Panfrost and the blob, so it's good enough for what we need and allows MRT shaders to be sanely disassembled.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix memory corruption in constant combining | Alyssa Rosenzweig | 2020-01-02 | 1 | -1/+1
| It's a long story... but we'd try to insert constants that weren't there and end up clobbering fields in the bundle following the constant array...
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Optimize branches with inverted arguments | Afonso Bordado | 2019-12-31 | 3 | -0/+26
| Remove the invert on arguments to branches, and invert the branch condition instead. This saves one instruction per inverted argument.
| Closes #2088
| Signed-off-by: Afonso Bordado <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
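The rewrite described in this commit can be pictured with a toy model. The sketch below is only an illustration: `struct toy_branch` and `toy_fold_branch_invert` are hypothetical names, not the compiler's real `midgard_instruction` or its actual optimization pass.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a conditional branch.
 * The real Midgard IR structures in Mesa are laid out differently. */
struct toy_branch {
    bool src_inverted; /* the condition source carries a logical NOT */
    bool invert_cond;  /* the branch is taken when the condition is false */
};

/* Fold an inverted condition source into the branch condition.
 * Branching on !x with the normal sense is equivalent to branching on x
 * with the opposite sense, so the separate invert is no longer needed.
 * Returns true if the branch was rewritten. */
bool toy_fold_branch_invert(struct toy_branch *br)
{
    if (!br->src_inverted)
        return false;

    br->src_inverted = false;
    br->invert_cond = !br->invert_cond;
    return true;
}
```

Calling the helper a second time on the same branch is a no-op, which is the property a fixed-point optimization loop would rely on.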
* pan/midgard: Move midgard_is_branch_unit to helpers | Afonso Bordado | 2019-12-31 | 2 | -7/+6
| Signed-off-by: Afonso Bordado <[email protected]>
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove prepacked_branch | Alyssa Rosenzweig | 2019-12-31 | 6 | -39/+6
| It's an ugly hack that's no longer used.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Convert fragment writeout to proper branches | Alyssa Rosenzweig | 2019-12-31 | 1 | -3/+14
| This eliminates the only use of prepacked_branch, which is such a hack anyway.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove MRT indirection in blend shaders | Alyssa Rosenzweig | 2019-12-30 | 1 | -0/+4
| Since we have a separate blend shader for each render target, let's simplify this structure and reduce the options memory footprint by 88% or something goofy like that.
| Should also enable separate blending per render target.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement integer varyings | Alyssa Rosenzweig | 2019-12-30 | 2 | -0/+54
| We need to actually work out the varying format on demand, rather than assuming rgba32f.
| Fixes dEQP-GLES3.functional.fragment_out.basic.int.*
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement flat shading | Alyssa Rosenzweig | 2019-12-30 | 2 | -6/+17
| We need to shuffle around some lowerings but it's just a flag.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Use type-appropriate st_vary | Alyssa Rosenzweig | 2019-12-30 | 1 | -0/+16
| We would like to store (u)ints as well.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix minor typo | Alyssa Rosenzweig | 2019-12-27 | 1 | -1/+1
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reported-by: Erik Faye-Lund <[email protected]>
* pan/midgard: Lower gl_VertexID/gl_InstanceID to attributes | Alyssa Rosenzweig | 2019-12-24 | 2 | -0/+35
| We have special records for these, put in a fixed location by convention per the blob.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Factor out emit_attr_read | Alyssa Rosenzweig | 2019-12-24 | 1 | -24/+33
| We will load attributes directly for gl_VertexID.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Unset vertex_id_zero_based | Alyssa Rosenzweig | 2019-12-24 | 1 | -1/+0
| We don't want the lowering; we have native gl_VertexID.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Compute destination override | Alyssa Rosenzweig | 2019-12-24 | 1 | -7/+25
| We shift over the mask in this case.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_upper_override helper | Alyssa Rosenzweig | 2019-12-24 | 2 | -0/+30
| Checks if we should emit a dest_override=upper, given a mask.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
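As a rough illustration of the idea behind this helper and the mask shift mentioned in the "Compute destination override" commit, here is a toy version. It assumes a plain per-component write mask over `count` components (with `count` even and small); the names `toy_upper_override` and `toy_shift_mask` are hypothetical, and the real `mir_upper_override` may use different inputs and return values.

```c
#include <stdbool.h>

/* Toy check: should the destination use the "upper" override?
 * Assumption: `mask` has one bit per component and `count` is the
 * (even) number of components in the full register. We pick the upper
 * override only when something is written and nothing in the lower
 * half is touched. */
bool toy_upper_override(unsigned mask, unsigned count)
{
    unsigned half = count / 2;
    unsigned lower = (1u << half) - 1u; /* bits of the lower half */

    return mask != 0 && (mask & lower) == 0;
}

/* When the upper override applies, the mask is expressed relative to
 * the upper half, so it is shifted down by half the component count. */
unsigned toy_shift_mask(unsigned mask, unsigned count)
{
    return toy_upper_override(mask, count) ? mask >> (count / 2) : mask;
}
```

For example, with eight components a mask of 0xC0 touches only the upper half, so the toy helper reports an override and the shifted mask becomes 0xC.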
* pan/midgard: Enable lower_(un)pack_* lowering | Alyssa Rosenzweig | 2019-12-24 | 2 | -2/+13
| These show up in some blend shaders. Let's use the shared lowering and remove our own.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement shadow cubemaps | Alyssa Rosenzweig | 2019-12-24 | 1 | -26/+22
| We need to reshuffle to sync up the shadow coordinate temporary with the cubemap coordinate temporary. Once that's in place, it's simple enough (we load the shadow coordinate into .z like 2D).
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Generalize temp coordinate to non-2D | Alyssa Rosenzweig | 2019-12-24 | 1 | -3/+5
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Do witchcraft on texture offsets | Alyssa Rosenzweig | 2019-12-24 | 3 | -49/+49
| My latest divination spell has uncovered a pattern in the aether. Although the swizzle is unaligned, its format is otherwise standard. Document this, removing the old incorrect understanding of the swizzle (which coincided on common special swizzles only).
| Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetchoffset.sampler2d_fixed_fragment
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix fallthrough from offset to comparator | Alyssa Rosenzweig | 2019-12-24 | 1 | -0/+1
| Fixes: ccbc9a4e678 ("pan/midgard: Implement textureOffset for 2D textures")
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Expand swizzle for texelFetch | Alyssa Rosenzweig | 2019-12-24 | 1 | -0/+6
| We zero the extra components anyway.
| Fixes dEQP-GLES3.functional.shaders.texture_functions.texelfetch.sampler2d_fixed_fragment
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Clamp LOD register swizzle | Alyssa Rosenzweig | 2019-12-24 | 1 | -0/+4
| Fixes register allocation failures with textureLodOffset.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend IS_VEC4_ONLY to arguments | Alyssa Rosenzweig | 2019-12-24 | 1 | -1/+5
| I think both need to be aligned at least for ld_cubemap_coords.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Bounds check lcra_restrict_range | Alyssa Rosenzweig | 2019-12-24 | 1 | -1/+1
| We may call it with sentinel values (~0 in particular) corresponding to unused arguments; ignore these.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
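The guard described here can be sketched as follows. This is a minimal illustration under stated assumptions: `struct toy_ra`, `TOY_MAX_NODES` and `toy_restrict_range` are hypothetical, and the real `lcra_restrict_range` operates on Mesa's own allocator state rather than a fixed array.

```c
#include <stdbool.h>

#define TOY_MAX_NODES 8

/* Hypothetical allocator state: one recorded range per node.
 * Assumption: node_count never exceeds TOY_MAX_NODES. */
struct toy_ra {
    unsigned node_count;
    unsigned range[TOY_MAX_NODES];
};

/* Restrict a node's range, ignoring out-of-bounds node indices.
 * An unused argument is represented by the sentinel ~0u, which is
 * always >= node_count, so the same comparison filters it out.
 * Returns true if the restriction was recorded. */
bool toy_restrict_range(struct toy_ra *ra, unsigned node, unsigned len)
{
    if (node >= ra->node_count)
        return false;

    ra->range[node] = len;
    return true;
}
```

Because the sentinel is the largest unsigned value, a single bounds comparison covers both genuinely invalid indices and unused arguments, so no separate sentinel check is needed.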
* pan/midgard: Fix disassembler cycle/quadword counting | Alyssa Rosenzweig | 2019-12-24 | 1 | -4/+6
| Due to the succeeding break we would fall into some off-by-one errors. These should be resolved now.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Optimize comparisons with similar operations | Afonso Bordado | 2019-12-20 | 4 | -1/+92
| Optimizes comparisons by removing the invert flag on operands which we can prove to be equal without the invert.
| Reviewed-by: Alyssa Rosenzweig <[email protected]>
| Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3036>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3036>
* pan/midgard: Lower txd with lower_tex | Alyssa Rosenzweig | 2019-12-20 | 1 | -1/+6
| This is a hack since we do have native gradient stuff, but for the moment I'm more interested in conformance and the lowered code is good enough.
| Fixes dEQP-GLES3.functional.shaders.texture_functions.texturegrad.sampler2d_fixed_fragment
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
| Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>
* pan/midgard: Fix crash with txs | Alyssa Rosenzweig | 2019-12-20 | 1 | -1/+3
| This regressed since we implemented RECT textures natively, oops.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>
* pan/midgard: Implement textureOffset for 2D textures | Alyssa Rosenzweig | 2019-12-20 | 9 | -14/+57
| Fixes dEQP-GLES3.functional.shaders.texture_functions.textureoffset.sampler2d_fixed_fragment.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3169>
* pan/midgard: Add uniform/work heuristic | Alyssa Rosenzweig | 2019-12-19 | 3 | -19/+100
| Uniform/work registers are partitioned on a shader-by-shader basis as determined by the compiler. We add a simple heuristic here running before scheduling that prioritizes mitigating spilling at all costs.
|
| A more sophisticated heuristic should run *after* scheduling, doing a dry run of the register allocator itself to determine spilling. Fitting this into our current scheduling model is difficult, so while this heuristic does hurt some shaders, overall the results are acceptable:
|
| total instructions in shared programs: 50065 -> 38747 (-22.61%)
| instructions in affected programs: 37187 -> 25869 (-30.44%)
| helped: 59
| HURT: 77
| helped stats (abs) min: 1 max: 757 x̄: 198.46 x̃: 151
| helped stats (rel) min: 0.48% max: 62.89% x̄: 32.95% x̃: 36.27%
| HURT stats (abs) min: 1 max: 9 x̄: 5.08 x̃: 6
| HURT stats (rel) min: 0.92% max: 14.29% x̄: 6.71% x̃: 4.60%
| 95% mean confidence interval for instructions value: -111.15 -55.29
| 95% mean confidence interval for instructions %-change: -14.33% -6.67%
| Instructions are helped.
|
| total bundles in shared programs: 30606 -> 19157 (-37.41%)
| bundles in affected programs: 23907 -> 12458 (-47.89%)
| helped: 58
| HURT: 74
| helped stats (abs) min: 6 max: 757 x̄: 203.09 x̃: 152
| helped stats (rel) min: 5.19% max: 77.00% x̄: 49.38% x̃: 53.79%
| HURT stats (abs) min: 1 max: 9 x̄: 4.46 x̃: 5
| HURT stats (rel) min: 1.85% max: 26.32% x̄: 11.70% x̃: 9.57%
| 95% mean confidence interval for bundles value: -115.46 -58.01
| 95% mean confidence interval for bundles %-change: -20.87% -9.41%
| Bundles are helped.
|
| total quadwords in shared programs: 31305 -> 32027 (2.31%)
| quadwords in affected programs: 20471 -> 21193 (3.53%)
| helped: 0
| HURT: 133
| HURT stats (abs) min: 1 max: 9 x̄: 5.43 x̃: 5
| HURT stats (rel) min: 0.76% max: 15.15% x̄: 5.47% x̃: 4.65%
| 95% mean confidence interval for quadwords value: 5.00 5.86
| 95% mean confidence interval for quadwords %-change: 4.85% 6.08%
| Quadwords are HURT.
|
| total registers in shared programs: 2256 -> 2545 (12.81%)
| registers in affected programs: 708 -> 997 (40.82%)
| helped: 0
| HURT: 95
| HURT stats (abs) min: 1 max: 8 x̄: 3.04 x̃: 3
| HURT stats (rel) min: 12.50% max: 100.00% x̄: 39.41% x̃: 37.50%
| 95% mean confidence interval for registers value: 2.64 3.45
| 95% mean confidence interval for registers %-change: 34.62% 44.19%
| Registers are HURT.
|
| total threads in shared programs: 1776 -> 1709 (-3.77%)
| threads in affected programs: 134 -> 67 (-50.00%)
| helped: 0
| HURT: 67
| HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
| HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
| 95% mean confidence interval for threads value: -1.00 -1.00
| 95% mean confidence interval for threads %-change: -50.00% -50.00%
| Threads are HURT.
|
| total spills in shared programs: 3868 -> 2 (-99.95%)
| spills in affected programs: 3868 -> 2 (-99.95%)
| helped: 60
| HURT: 0
|
| total fills in shared programs: 6456 -> 4 (-99.94%)
| fills in affected programs: 6456 -> 4 (-99.94%)
| helped: 60
| HURT: 0
|
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3150>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3150>
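The percentages in the "total ... in shared programs" lines are consistent with a plain relative change between the before and after counts. The helper below, a hypothetical `toy_percent_change` and not part of any shader-db tooling, is a small sketch for reproducing them by hand.

```c
/* Relative change from `before` to `after`, in percent.
 * Negative values mean the count went down.
 * Assumption: `before` is nonzero. */
double toy_percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

For instance, `toy_percent_change(50065, 38747)` is about -22.61 (total instructions), `toy_percent_change(31305, 32027)` is about 2.31 (total quadwords), and `toy_percent_change(3868, 2)` is about -99.95 (total spills), matching the figures quoted in the commit message.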
* pan/midgard: Set Z to shadow comparator for 2D | Alyssa Rosenzweig | 2019-12-17 | 1 | -2/+21
| We still need to generalize for other types of (non-2D / array) shadow samplers, but this is enough for sampler2DShadow to work with initial dEQP tests passing.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
| Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
* pan/midgard: Set .shadow for shadow samplers | Alyssa Rosenzweig | 2019-12-17 | 1 | -0/+1
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
* pan/midgard: Hoist temporary coordinate for cubemaps | Alyssa Rosenzweig | 2019-12-17 | 1 | -12/+18
| We'll reuse some of this code for shadow samplers, which are represented by a distinct source in NIR.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
* pan/midgard: Use a reg temporary for multiple writes | Alyssa Rosenzweig | 2019-12-17 | 1 | -1/+1
| Bug in the texelfetch implementation, found by inspection.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
* panfrost: Let precompile imply shaderdb | Alyssa Rosenzweig | 2019-12-17 | 2 | -3/+3
| This cuts down the number of random environment variables we need flying around; now PAN_MESA_DEBUG=precompile is sufficient and MIDGARD_MESA_DEBUG=shaderdb will be implied.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3125>
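The implication described above amounts to an OR of two conditions. The sketch below is an assumption-laden illustration: `toy_shaderdb_enabled` is a hypothetical helper, and the substring match stands in for whatever option parsing the driver really performs on these variables.

```c
#include <stdbool.h>
#include <string.h>

/* Decide whether shader-db style reporting should be on.
 * `pan_debug` and `midgard_debug` are the raw values of
 * PAN_MESA_DEBUG and MIDGARD_MESA_DEBUG (NULL if unset).
 * Assumption: a simple substring test is enough for this sketch. */
bool toy_shaderdb_enabled(const char *pan_debug, const char *midgard_debug)
{
    bool precompile = pan_debug != NULL &&
                      strstr(pan_debug, "precompile") != NULL;
    bool shaderdb = midgard_debug != NULL &&
                    strstr(midgard_debug, "shaderdb") != NULL;

    /* precompile implies shaderdb, so either one is sufficient */
    return precompile || shaderdb;
}
```

A caller would typically pass `getenv("PAN_MESA_DEBUG")` and `getenv("MIDGARD_MESA_DEBUG")` from `<stdlib.h>`; both may be NULL, which the helper tolerates.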
* pan/midgard: Set r1.w magic | Alyssa Rosenzweig | 2019-12-16 | 3 | -3/+32
| I'm honestly unsure what this is for, but it's needed on MFBD systems for unknown reasons, at least when MRT is actually in use and then sometimes without MRT (it fixes a blend shader issue on T760?)
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Fix liveness analysis with multiple epilogues | Alyssa Rosenzweig | 2019-12-16 | 3 | -1/+5
| Epilogues are special fixed-function blocks, so they need special handling for liveness analysis to work completely. This in turn fixes RA issues for many shaders using MRT.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Writeout per render target | Alyssa Rosenzweig | 2019-12-16 | 3 | -48/+38
| The flow is considerably more complicated. Instead of one writeout loop like usual, we have a separate write loop for each render target. This requires some scheduling shenanigans to get right.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Add schedule barrier after fragment writeout | Alyssa Rosenzweig | 2019-12-16 | 1 | -0/+1
| This is a branch, like discard, so we need a barrier to make it safe.
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Reviewed-by: Tomeu Vizoso <[email protected]>