From f09fda5071813751ba3fa77c28e588689795e17e Mon Sep 17 00:00:00 2001
From: Paul Dagnelie
Date: Fri, 16 Aug 2019 08:08:21 -0700
Subject: Cap metaslab memory usage

On systems with large amounts of storage and high fragmentation, a huge
amount of space can be used by storing metaslab range trees. Since
metaslabs are only unloaded during a txg sync, and only if they have
been inactive for 8 txgs, it is possible to get into a state where all
of the system's memory is consumed by range trees and metaslabs, and
txgs cannot sync. While ZFS knows how to evict ARC data when needed, it
has no such mechanism for range tree data. This can result in boot hangs
for some system configurations.

First, we add the ability to unload metaslabs outside of syncing
context. Second, we store a multilist of all loaded metaslabs, sorted by
their selection txg, so we can quickly identify the oldest metaslabs.
We use a multilist to reduce lock contention during heavy write
workloads. Finally, we add logic that will unload a metaslab when we're
loading a new metaslab, if we're using more than a certain fraction of
the available memory on range trees.

Reviewed-by: Matt Ahrens
Reviewed-by: George Wilson
Reviewed-by: Sebastien Roy
Reviewed-by: Serapheim Dimitropoulos
Reviewed-by: Brian Behlendorf
Signed-off-by: Paul Dagnelie
Closes #9128
---
 module/zfs/spa.c | 4 ++++
 1 file changed, 4 insertions(+)
 (limited to 'module/zfs/spa.c')

diff --git a/module/zfs/spa.c b/module/zfs/spa.c
index 437efb50f..c404e876b 100644
--- a/module/zfs/spa.c
+++ b/module/zfs/spa.c
@@ -9013,6 +9013,10 @@ spa_sync(spa_t *spa, uint64_t txg)
 	while ((vd = txg_list_remove(&spa->spa_vdev_txg_list, TXG_CLEAN(txg)))
 	    != NULL)
 		vdev_sync_done(vd, txg);
+
+	metaslab_class_evict_old(spa->spa_normal_class, txg);
+	metaslab_class_evict_old(spa->spa_log_class, txg);
+
 	spa_sync_close_syncing_log_sm(spa);
 
 	spa_update_dspace(spa);
-- 
cgit v1.2.3
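
The eviction policy described in the commit message (unload the oldest loaded metaslabs, by selection txg, once range trees use more than a set fraction of memory) can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code from this commit: every name below (`sketch_metaslab_t`, `sketch_loaded_bytes`, `sketch_evict_over_limit`) is hypothetical, the memory figures are plain integers supplied by the caller, and a linear scan over an array stands in for the multilist that the real change uses to reduce lock contention.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a loaded metaslab. */
typedef struct sketch_metaslab {
	uint64_t sm_selected_txg;	/* txg in which it was last selected */
	uint64_t sm_tree_bytes;		/* memory held by its range trees */
	int sm_loaded;			/* nonzero while its trees are in memory */
} sketch_metaslab_t;

/* Sum the range tree memory of all currently loaded metaslabs. */
static uint64_t
sketch_loaded_bytes(const sketch_metaslab_t *ms, size_t n)
{
	uint64_t used = 0;

	for (size_t i = 0; i < n; i++) {
		if (ms[i].sm_loaded)
			used += ms[i].sm_tree_bytes;
	}
	return (used);
}

/*
 * Unload the loaded metaslab with the smallest selection txg, repeatedly,
 * until range tree usage is at or below limit_pct percent of total_mem or
 * nothing is left to unload. Returns the number of metaslabs unloaded.
 * The linear scan is an assumption made for brevity; it is not how the
 * real change finds the oldest metaslab.
 */
static size_t
sketch_evict_over_limit(sketch_metaslab_t *ms, size_t n,
    uint64_t total_mem, uint64_t limit_pct)
{
	uint64_t limit = total_mem / 100 * limit_pct;
	uint64_t used = sketch_loaded_bytes(ms, n);
	size_t evicted = 0;

	while (used > limit) {
		sketch_metaslab_t *oldest = NULL;

		for (size_t i = 0; i < n; i++) {
			if (ms[i].sm_loaded && (oldest == NULL ||
			    ms[i].sm_selected_txg < oldest->sm_selected_txg))
				oldest = &ms[i];
		}
		if (oldest == NULL)
			break;	/* everything is already unloaded */

		oldest->sm_loaded = 0;
		used -= oldest->sm_tree_bytes;
		evicted++;
	}
	return (evicted);
}
```

In this model a caller would run `sketch_evict_over_limit()` just before loading another metaslab, which mirrors the "unload when we're loading a new metaslab" step in the message. The two `metaslab_class_evict_old()` calls added to `spa_sync()` in the hunk above are a separate, per-txg pass and are not modeled here.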