aboutsummaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/a6xx: Fix UBWC mipmapping height alignment.Eric Anholt2020-05-132-6/+137
| | | | | | | After fixing the power of two sizing, pitches worked, but 1-pixel high and unaligned height miplevels were off. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno/a6xx: Fix UBWC mipmap sizing.Eric Anholt2020-05-132-14/+95
| | | | | | | | | The HW requires a log2 width/height of the level 0 meta_* size in the descriptors, making it pretty clear that UBWC mipmapping is all power-of-two sized. Fixes a bunch of failures in the upcoming unit UBWC layout unit tests. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno/a6xx: Fix UBWC blockheight for RG8.Eric Anholt2020-05-131-1/+4
| | | | | | | | Using texturator on a P3A at 1024x1024, RG8 has log2w/h of 6x7 instead of R16I/UI's 6x8. The other blockw/h I verified other than cpp=1 (R8/R8I/R8UI didn't use UBWC) and 32 (would need a bigger type). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno: Pull the tile_alignment lookup for a layout to a helper.Eric Anholt2020-05-131-20/+25
| | | | | | | The r8g8 case UBWC alignment will be changing in the next commit, so fdl6_get_ubwc_blockwidth needs to start paying attention to r8g8 too. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno/a6xx: Add a testcase for UBWC buffer sharing.Eric Anholt2020-05-131-4/+22
| | | | | | | These offsets are hand-computed referencing msm_media_info.h, and match our driver's current behavior. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno/a6xx: Improve layout testcase logging for UBWC fails.Eric Anholt2020-05-131-2/+2
| | | | Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno/a4xx+: Increase max texture size to 16384.Eric Anholt2020-05-134-6/+10
| | | | | | | | Noticed when poking around with texture layouts and found that my big texture layout from the blob buffer overflowed. Values come from http://vulkan.gpuinfo.org for Adreno 418, 512, 630. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* nir: reset ssa-defs as non-divergent during divergence analysis instead of ↵Daniel Schürmann2020-05-131-21/+36
| | | | | | | upfront Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: simplify phi handling in divergence analysisDaniel Schürmann2020-05-131-113/+116
| | | | | | | | | | This patch adds some control flow information to the state to keep track whether a loop contains divergent continue or break statements to not having to recalculate this property for every phi. Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: rework phi handling in divergence analysisDaniel Schürmann2020-05-131-173/+214
| | | | | | | | | | | | This patch splits the visit_phi() function into three different ones according to the kind of phi (merge-node, loop-header or loop-exit) and calls them when visiting the cf_nodes. This allows to revisit loops if the loop header's phis have changed, only. Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: refactor divergence analysis stateDaniel Schürmann2020-05-131-35/+37
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: add nir_intrinsic_elect to divergence analysisDaniel Schürmann2020-05-131-0/+1
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: Make "divergent" a property of an SSA valueJason Ekstrand2020-05-135-95/+122
| | | | | | | v2: fix usage in ACO (by Daniel Schürmann) Reviewed-by: Rhys Perry <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* gallium: remove more "state tracker" occurencesMarek Olšák2020-05-136-13/+11
| | | | | | Trivial. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
* gallium: rename PIPE_RESOURCE_FLAG_ST_PRIV to FRONTEND_PRIVMarek Olšák2020-05-132-3/+3
| | | | | | | Acked-by: Eric Anholt <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]> Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
* gallium: change comments to remove 'state tracker'Marek Olšák2020-05-1395-183/+178
| | | | | | | Acked-by: Eric Anholt <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]> Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
* gallium: rename 'state tracker' to 'frontend'Marek Olšák2020-05-13443-162/+348
| | | | | | | Acked-by: Eric Anholt <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]> Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
* tu: Implement fallback linear staging blit for CopyImageConnor Abbott2020-05-131-24/+173
| | | | | | | | | | | Also, rewrite the format decision code so that we correctly decide when the linear fallback is needed, even if UBWC is disabled. As part of that, I also moved around some of the code to handle compressed formats to make sure that copying compressed formats with a linear staging blit works (this is now possible since we started allowing tiled compressed textures). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
* tu: Add noubwc debug flag to disable UBWCConnor Abbott2020-05-133-1/+4
| | | | Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
* tu: Add a "scratch bo" allocation mechanismConnor Abbott2020-05-132-0/+74
| | | | | | | | | This is simpler than a full-blown memory reuse mechanism, but is good enough to make sure that repeatedly doing a copy that requires the linear staging buffer workaround won't use excessive memory or be slowed down due to repeated allocations. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
* aco: improve phi affinities with p_split_vectorRhys Perry2020-05-131-0/+19
| | | | | | | | | | | | Totals from 5860 (4.59% of 127638) affected shaders: VGPRs: 460212 -> 460216 (+0.00%) CodeSize: 65554356 -> 65464816 (-0.14%) Instrs: 12655972 -> 12633578 (-0.18%) Copies: 1309994 -> 1292163 (-1.36%) Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
* aco: consider affinities when creating v_mac_f32Rhys Perry2020-05-131-2/+8
| | | | | | | | | | | | Totals from 8487 (6.65% of 127638) affected shaders: CodeSize: 62061988 -> 62058020 (-0.01%); split: -0.01%, +0.01% Instrs: 11910757 -> 11885409 (-0.21%); split: -0.21%, +0.00% Copies: 1065244 -> 1040945 (-2.28%); split: -2.30%, +0.02% Branches: 349665 -> 348914 (-0.21%) Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
* aco: mark phi definitions as last-seen phi operandsRhys Perry2020-05-131-14/+14
| | | | | | | | | | | | | | | Totals from 14340 (11.23% of 127638) affected shaders: SGPRs: 1251648 -> 1251512 (-0.01%) VGPRs: 994556 -> 994104 (-0.05%); split: -0.06%, +0.01% CodeSize: 122894528 -> 121099604 (-1.46%); split: -1.49%, +0.03% MaxWaves: 106039 -> 106103 (+0.06%); split: +0.06%, -0.00% Instrs: 23860066 -> 23414317 (-1.87%); split: -1.90%, +0.03% Copies: 2448228 -> 2049305 (-16.29%); split: -16.37%, +0.07% Branches: 789381 -> 757921 (-3.99%); split: -4.62%, +0.64% Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
* aco: fix consecutively written vgprs from vmem instructionsRhys Perry2020-05-131-10/+26
| | | | | | | | | | | | | | | If one VMEM instruction uses a sampler and the other doesn't, we can't do this optimization. Totals from 47 (0.04% of 127638) affected shaders: CodeSize: 271744 -> 271656 (-0.03%); split: -0.04%, +0.01% Instrs: 52783 -> 52761 (-0.04%); split: -0.05%, +0.01% Cycles: 5547040 -> 5546952 (-0.00%); split: -0.00%, +0.00% VMEM: 10022 -> 9887 (-1.35%) Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>
* aco: simplify consecutive ordered vmem/lds writes optimizationRhys Perry2020-05-131-10/+2
| | | | | | | | This was unnecessary and messed with statistics Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>
* radv: add a LLVM version string workaround for SotTR and ACOSamuel Pitoiset2020-05-133-3/+36
| | | | | | | | | | | | | When the LLVM version is too old or missing, SotTR applies shader workarounds and that reduces performance by 2-5% with ACO. SotTR workarounds are applied with LLVM 8 and older, so reporting LLVM 9.0.1 should be fine. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Edmondo Tommasina <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4984>
* turnip: use the common code for generating extensions and dispatch tablesSamuel Pitoiset2020-05-132-204/+12
| | | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
* anv: use the common code for generating extensions and dispatch tablesSamuel Pitoiset2020-05-132-335/+14
| | | | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
* radv: use the common code for generating extensions and dispatch tablesSamuel Pitoiset2020-05-132-348/+13
| | | | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
* vulkan: import common code for generating extensionsSamuel Pitoiset2020-05-132-0/+370
| | | | | | | | | | | ANV and RADV have similar Python code for generating extensions and dispatch tables. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
* radv: implement VK_EXT_private_dataSamuel Pitoiset2020-05-132-0/+52
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>
* radv: use the base object struct typesSamuel Pitoiset2020-05-1312-7/+138
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>
* radv: use the common base object type for VkDeviceSamuel Pitoiset2020-05-1312-90/+87
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>
* etnaviv: Disable seamless cube map on GC880Marek Vasut2020-05-133-3/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GC880 on iMX6DL indicates in it's minorFeatures2 register that it does support SEAMLESS_CUBE_MAP, however when the TE.SAMPLER_CONFIG1 VIVS_TE_SAMPLER_CONFIG1_SEAMLESS_CUBE_MAP bit is set on GC880 on iMX6DL, the result is corrupted image. In particular, the following ~112 dEQPs are affected and fail: dEQP-GLES2.functional.texture.filtering.cube.* This only happens on MX6DL GC880, MX6Q GC2000 and STM32MP1 GC400(GCnano) do not report the minorFeatures2 SEAMLESS_CUBE_MAP bit and ignore the TE_SAMPLER_CONFIG1 VIVS_TE_SAMPLER_CONFIG1_SEAMLESS_CUBE_MAP bit (note that ss->seamless_cube_map is unconditionally set by mesa at times even PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE returns 0), so there is no visible problem and there are no failing dEQP tests on the GC2000 and GCnano. This might imply that the minorFeatures2 SEAMLESS_CUBE_MAP has some different meaning on GC880 or the SEAMLESS_CUBE_MAP behaves differently on the GC880. This patch does not set the SEAMLESS_CUBE_MAP bit on hardware which does not indicate support for seamless cube map and on GC880, which results in reduction in failed dEQPs: 635 to 186 on GC880, 274 to 270 on GC2000 and no change on GC400(GCnano). Fixes: 8dd26fa2f06 ("etnaviv: support GL_ARB_seamless_cubemap_per_texture") Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Marek Vasut <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4865>
* freedreno/a6xx: fix max-scissor optRob Clark2020-05-132-13/+10
| | | | | | | | | | On a6xx we need a 0,0 based scissor in the binning pass, but can use the blit-scissor to avoid restore/resolve of untouched pixels, and use the conditional execution if the IB to bin to skip bins with no geometry (due to the scissor). Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5021>
* freedreno/ir3/sched: try to avoid syncsRob Clark2020-05-131-13/+99
| | | | | | | | | | Similar to what we do in postsched. It is useful for pre-RA sched to be a bit aware of things that would cause syncs. In particular for the tex fetches, since the vecN src/dst tends to limit postsched's ability to re-order them. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/sched: avoid scheduling outputsRob Clark2020-05-133-22/+101
| | | | | | | | | | | | If an instruction's only use is as an output, and it increases register pressure, then try to avoid scheduling it until there are no other options. A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these immed loads to the end of the shader. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/postsched: try to avoid (sy) syncsRob Clark2020-05-131-2/+19
| | | | | | | | | | Similar to avoidance of `(ss)` syncs, it turns out to be helpful to avoid `(sy)` syncs as well. This helps us turn an tex, (sy)alu, tex, (sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in gfxbench gl_fill2. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/postsched: reset sfu_delay on syncRob Clark2020-05-132-4/+33
| | | | | | | | | Once we schedule an instruction that will require an `(ss)` sync flag, there is no need to delay any further instructions that consume an SFU result (until the next SFU instruction is scheduled). Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3: limit # of tex prefetch by shader sizeRob Clark2020-05-133-1/+40
| | | | | | | | | | | | It seems for short frag shaders, too much prefetch can be detrimental. I think what we *really* want to do is decide after pre-RA sched, when we also know about nop's and what the actual ir3 instruction count is. But that will require re-working how prefetch lowering works. For now this is a super crude heuristic to attempt to approximate a good solution. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3: fix indirect cb0 load_ubo loweringRob Clark2020-05-121-2/+2
| | | | | | | | | | | | | | | | We can no longer assume that `state->ranges[0]` is block 0. It *often* is, but when we encounter a "real" ubo that we lower to `load_uniform` before a block 0 `load_ubo`, it could end up another entry in the table. Resulting in the second pass after gathering ubo ranges, not finding a valid range. Which results in a `load_ubo` for a thing that is not actually a ubo making it's way into ir3 frontend. Resulting in grabbing what we think is a ubo address out of some unrelated const register, and trying to dereference that. Which as you can imagine, fails in amusing ways. Fixes: fc850080ee3 ("ir3: Rewrite UBO push analysis to support bindless") Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
* freedreno/ir3: don't allow negative const_offsetRob Clark2020-05-121-3/+14
| | | | | Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
* pan/mdg: Fix derivative swizzleAlyssa Rosenzweig2020-05-121-4/+2
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Set types for derivativesAlyssa Rosenzweig2020-05-121-0/+2
| | | | | | | Closes #2900 Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Remove texture_op_countAlyssa Rosenzweig2020-05-124-15/+0
| | | | | | | | Was used as a crude approximation of the terminate flag, which we now can do properly. Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Use analysis to set .cont/.last flagsAlyssa Rosenzweig2020-05-121-10/+2
| | | | | | | | | | | | | | Corresponds roughly to what we analyze. Note that "terminate AND execute" is a contradiction (rather: it's equivalent to just terminating), hence why there are only three possibilities for the states of the flags: .cont = continue, don't execute .last = don't continue, don't execute .cont.last = continue and execute Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Use the helper invo analyze passesAlyssa Rosenzweig2020-05-121-0/+5
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Analyze helper execution requirementsAlyssa Rosenzweig2020-05-125-8/+99
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Analyze helper invocation terminationAlyssa Rosenzweig2020-05-125-0/+112
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Explain helper invocations dataflow theoryAlyssa Rosenzweig2020-05-121-0/+63
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>