author | Serapheim Dimitropoulos <[email protected]> | 2019-02-12 10:38:11 -0800 |
---|---|---|
committer | Brian Behlendorf <[email protected]> | 2019-02-12 10:38:11 -0800 |
commit | 425d3237ee88abc53d8522a7139c926d278b4b7f (patch) | |
tree | b5ac7302fdc38129013399b1dcb63eb8cf1fddb6 /include/sys/metaslab_impl.h | |
parent | d8d418ff0cc90776182534bce10b01e9487b63e4 (diff) |
Get rid of space_map_update() for ms_synced_length
Initially, metaslabs and space maps were the same thing
in ZFS. Later, we started differentiating them by referring
to the space map as the on-disk state of the metaslab, making
the metaslab a higher-level metadata concept that deals
with space accounting. Today we've managed to split that code
even further, with the space map being its own on-disk data
structure used in areas of ZFS besides metaslabs (e.g. the
vdev-wide space maps used for the zpool checkpoint and vdev
removal features).
This patch refactors the space map code to separate it further
from the metaslab code. It does so by getting rid of the idea
that a space map can have different in-core and on-disk
lengths (sm_length vs smp_length), a distinction that only the
metaslab code uses but that every other consumer of space maps
has had to deal with. Instead, this patch moves the old in-core
length of the metaslab's space map into the metaslab structure
itself (see the ms_synced_length field) and makes the space map
code care only about the space map's actual on-disk length.
As a result, space map consumers no longer have to keep two
different lengths in sync for the same structure (e.g.
space_map_update() goes away), while metaslab-specific
behavior stays within the metaslab code. Specifically,
the ms_synced_length field keeps track of the amount of data
metaslab_load() can read from the metaslab's space map while
working concurrently with metaslab_sync(), which may be
appending to that same space map.
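The idea of a reader bounded by a synced length can be sketched as follows. This is a simplified, single-threaded illustration and not the ZFS implementation: every type and function name here (toy_space_map_t, toy_metaslab_t, toy_append(), toy_sync_done(), toy_load()) is hypothetical, and the real code also has to deal with locking and on-disk I/O.

```c
#include <assert.h>
#include <stdint.h>

#define	TOY_SM_CAPACITY	64

/* Hypothetical stand-in for an append-only space map object. */
typedef struct toy_space_map {
	uint64_t entries[TOY_SM_CAPACITY];	/* log entries */
	uint64_t length;			/* entries appended so far */
} toy_space_map_t;

/* Hypothetical stand-in for the metaslab that owns the space map. */
typedef struct toy_metaslab {
	toy_space_map_t sm;
	uint64_t synced_length;	/* similar in spirit to ms_synced_length */
} toy_metaslab_t;

/* Writer side: append one entry. Returns 0 on success, -1 if full. */
static int
toy_append(toy_metaslab_t *ms, uint64_t entry)
{
	if (ms->sm.length >= TOY_SM_CAPACITY)
		return (-1);
	ms->sm.entries[ms->sm.length++] = entry;
	return (0);
}

/* Writer side: publish the new length once the appends are complete. */
static void
toy_sync_done(toy_metaslab_t *ms)
{
	ms->synced_length = ms->sm.length;
}

/* Reader side: sum only the entries covered by the synced length. */
static uint64_t
toy_load(const toy_metaslab_t *ms)
{
	uint64_t sum = 0;

	/* The reader must never look past what the map actually holds. */
	assert(ms->synced_length <= ms->sm.length);
	for (uint64_t i = 0; i < ms->synced_length; i++)
		sum += ms->sm.entries[i];
	return (sum);
}
```

In this sketch, entries appended after the last toy_sync_done() call are invisible to toy_load() until the next call, which is the property the synced length is meant to provide to a reader working alongside an appender.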
As a side note, the patch also adds a few comments around
the metaslab code documenting some assumptions and expected
behavior.
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Pavel Zakharov <[email protected]>
Signed-off-by: Serapheim Dimitropoulos <[email protected]>
Closes #8328
Diffstat (limited to 'include/sys/metaslab_impl.h')
-rw-r--r-- | include/sys/metaslab_impl.h | 36 |
1 files changed, 36 insertions, 0 deletions
diff --git a/include/sys/metaslab_impl.h b/include/sys/metaslab_impl.h
index 137a84769..02ce02226 100644
--- a/include/sys/metaslab_impl.h
+++ b/include/sys/metaslab_impl.h
@@ -340,8 +340,34 @@ struct metaslab_group {
  * being written.
  */
 struct metaslab {
+	/*
+	 * This is the main lock of the metaslab and its purpose is to
+	 * coordinate our allocations and frees [e.g metaslab_block_alloc(),
+	 * metaslab_free_concrete(), ..etc] with our various syncing
+	 * procedures [e.g. metaslab_sync(), metaslab_sync_done(), ..etc].
+	 *
+	 * The lock is also used during some miscellaneous operations like
+	 * using the metaslab's histogram for the metaslab group's histogram
+	 * aggregation, or marking the metaslab for initialization.
+	 */
 	kmutex_t	ms_lock;
+
+	/*
+	 * Acquired together with the ms_lock whenever we expect to
+	 * write to metaslab data on-disk (i.e flushing entries to
+	 * the metaslab's space map). It helps coordinate readers of
+	 * the metaslab's space map [see spa_vdev_remove_thread()]
+	 * with writers [see metaslab_sync()].
+	 *
+	 * Note that metaslab_load(), even though a reader, uses
+	 * a completely different mechanism to deal with the reading
+	 * of the metaslab's space map based on ms_synced_length. That
+	 * said, the function still uses the ms_sync_lock after it
+	 * has read the ms_sm [see relevant comment in metaslab_load()
+	 * as to why].
+	 */
 	kmutex_t	ms_sync_lock;
+
 	kcondvar_t	ms_load_cv;
 	space_map_t	*ms_sm;
 	uint64_t	ms_id;
@@ -351,6 +377,7 @@ struct metaslab {

 	range_tree_t	*ms_allocating[TXG_SIZE];
 	range_tree_t	*ms_allocatable;
+	uint64_t	ms_allocated_this_txg;

 	/*
 	 * The following range trees are accessed only from syncing context.
@@ -375,6 +402,12 @@ struct metaslab {
 	boolean_t	ms_loaded;
 	boolean_t	ms_loading;

+	/*
+	 * Tracks the exact amount of allocated space of this metaslab
+	 * (and specifically the metaslab's space map) up to the most
+	 * recently completed sync pass [see usage in metaslab_sync()].
+	 */
+	uint64_t	ms_allocated_space;
 	int64_t		ms_deferspace;	/* sum of ms_defermap[] space	*/
 	uint64_t	ms_weight;	/* weight vs. others in group	*/
 	uint64_t	ms_activation_weight;	/* activation weight	*/
@@ -411,6 +444,9 @@ struct metaslab {
 	avl_node_t	ms_group_node;	/* node in metaslab group tree	*/
 	txg_node_t	ms_txg_node;	/* per-txg dirty metaslab links	*/

+	/* updated every time we are done syncing the metaslab's space map */
+	uint64_t	ms_synced_length;
+
 	boolean_t	ms_new;
 };
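The division of labor between the two locks described in the new comments can be illustrated with a small userland sketch. It uses POSIX mutexes in place of kmutex_t, and all names (toy_ms_t, toy_init(), toy_alloc(), toy_flush(), toy_read_flushed()) are hypothetical. The acquisition order shown, sync lock first and main lock second, is an assumption of this sketch rather than something stated in the diff.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userland stand-in for a metaslab with two locks. */
typedef struct toy_ms {
	pthread_mutex_t lock;		/* plays the role of ms_lock */
	pthread_mutex_t sync_lock;	/* plays the role of ms_sync_lock */
	uint64_t pending;		/* in-core state, guarded by lock */
	uint64_t flushed;		/* "on-disk" state, guarded by sync_lock */
} toy_ms_t;

static void
toy_init(toy_ms_t *ms)
{
	int rc;

	rc = pthread_mutex_init(&ms->lock, NULL);
	assert(rc == 0);
	rc = pthread_mutex_init(&ms->sync_lock, NULL);
	assert(rc == 0);
	(void) rc;
	ms->pending = 0;
	ms->flushed = 0;
}

/* In-core allocation: only the main lock is needed. */
static void
toy_alloc(toy_ms_t *ms, uint64_t size)
{
	pthread_mutex_lock(&ms->lock);
	ms->pending += size;
	pthread_mutex_unlock(&ms->lock);
}

/*
 * Writer: flushes in-core state to the "on-disk" log, so it holds
 * both locks. Taking sync_lock before lock is this sketch's choice.
 */
static void
toy_flush(toy_ms_t *ms)
{
	pthread_mutex_lock(&ms->sync_lock);
	pthread_mutex_lock(&ms->lock);
	ms->flushed += ms->pending;
	ms->pending = 0;
	pthread_mutex_unlock(&ms->lock);
	pthread_mutex_unlock(&ms->sync_lock);
}

/* Reader of the "on-disk" log: the sync lock alone excludes writers. */
static uint64_t
toy_read_flushed(toy_ms_t *ms)
{
	uint64_t val;

	pthread_mutex_lock(&ms->sync_lock);
	val = ms->flushed;
	pthread_mutex_unlock(&ms->sync_lock);
	return (val);
}
```

With this split, a reader of the flushed log never blocks in-core allocations, and only the flush path has to hold both locks at once.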