author | Serapheim Dimitropoulos <[email protected]> | 2019-02-20 09:59:57 -0800 |
---|---|---|
committer | Brian Behlendorf <[email protected]> | 2019-02-20 09:59:56 -0800 |
commit | 928e8ad47d3478a3d5d01f0dd6ae74a9371af65e (patch) | |
tree | 877e0d0920705d870aa96183c305957e94dc5e9d /module/zfs/vdev.c | |
parent | bb1be77a35d3cf5389a36f1d13935811962278c3 (diff) |
Introduce auxiliary metaslab histograms
This patch introduces 3 new histograms per metaslab. These
histograms track segments that have made it to the metaslab's
space map histogram (and are part of the space map) but have
not yet reached the ms_allocatable tree on loaded metaslabs
because these metaslabs are currently syncing and haven't
gone through metaslab_sync_done() yet.
The histograms help when we decide whether to load an unloaded
metaslab in order to allocate from it. Traditionally, when
calculating the weight of an unloaded metaslab, we look at the
highest bucket of its space map's histogram. The problem is
that we are not guaranteed to be able to allocate that
segment when we load the metaslab because it may still be in
the freeing, freed, or defer trees. The new histograms are
used when we calculate an unloaded metaslab's weight to deal
with this issue by removing segments that would not be in
the allocatable tree at runtime. Note that this method is
not completely accurate, as adjacent segments are not always
consolidated in the space map histogram of a metaslab.
In addition, to make things deterministic, we always reset
the weight of unloaded metaslabs based on their space map
weight (instead of doing that on an as-needed basis). Thus,
every time a metaslab is loaded and its weight is reset again
(from the weight based on its space map to the one based on
its allocatable range tree), we expect (and assert) that the
weight either stays the same or improves.
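The expectation in the paragraph above can be sketched as follows.
This is an illustration only: the sketch_metaslab_t type and both
functions are hypothetical, and it assumes weights are plain integers
where a larger value means a better metaslab.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily simplified stand-in for a metaslab. */
typedef struct sketch_metaslab {
	uint64_t	weight;	/* weight currently used for selection */
	bool		loaded;
} sketch_metaslab_t;

/* Unloaded metaslabs always take the space-map-based weight. */
static void
sketch_reset_unloaded_weight(sketch_metaslab_t *ms, uint64_t sm_weight)
{
	assert(!ms->loaded);
	ms->weight = sm_weight;
}

/*
 * On load the weight is recomputed from the allocatable range tree.
 * Because the unloaded weight was conservative, the new weight is
 * expected to be equal or better, never worse.
 */
static void
sketch_load(sketch_metaslab_t *ms, uint64_t range_tree_weight)
{
	assert(range_tree_weight >= ms->weight);
	ms->weight = range_tree_weight;
	ms->loaded = true;
}
```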
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Signed-off-by: Serapheim Dimitropoulos <[email protected]>
Closes #8358
Diffstat (limited to 'module/zfs/vdev.c')
-rw-r--r-- | module/zfs/vdev.c | 19 |
1 files changed, 13 insertions, 6 deletions
diff --git a/module/zfs/vdev.c b/module/zfs/vdev.c
index 81c34da07..b17682d81 100644
--- a/module/zfs/vdev.c
+++ b/module/zfs/vdev.c
@@ -1346,12 +1346,12 @@ vdev_metaslab_fini(vdev_t *vd)
 	}
 
 	if (vd->vdev_ms != NULL) {
-		uint64_t count = vd->vdev_ms_count;
+		metaslab_group_t *mg = vd->vdev_mg;
+		metaslab_group_passivate(mg);
 
-		metaslab_group_passivate(vd->vdev_mg);
+		uint64_t count = vd->vdev_ms_count;
 		for (uint64_t m = 0; m < count; m++) {
 			metaslab_t *msp = vd->vdev_ms[m];
-
 			if (msp != NULL)
 				metaslab_fini(msp);
 		}
@@ -1359,6 +1359,9 @@ vdev_metaslab_fini(vdev_t *vd)
 		vd->vdev_ms = NULL;
 
 		vd->vdev_ms_count = 0;
+
+		for (int i = 0; i < RANGE_TREE_HISTOGRAM_SIZE; i++)
+			ASSERT0(mg->mg_histogram[i]);
 	}
 	ASSERT0(vd->vdev_ms_count);
 	ASSERT3U(vd->vdev_pending_fastwrite, ==, 0);
@@ -3006,7 +3009,10 @@ vdev_load(vdev_t *vd)
 			    "asize=%llu", (u_longlong_t)vd->vdev_ashift,
 			    (u_longlong_t)vd->vdev_asize);
 			return (SET_ERROR(ENXIO));
-		} else if ((error = vdev_metaslab_init(vd, 0)) != 0) {
+		}
+
+		error = vdev_metaslab_init(vd, 0);
+		if (error != 0) {
 			vdev_dbgmsg(vd, "vdev_load: metaslab_init failed "
 			    "[error=%d]", error);
 			vdev_set_state(vd, B_FALSE, VDEV_STATE_CANT_OPEN,
@@ -3021,9 +3027,10 @@ vdev_load(vdev_t *vd)
 		ASSERT(vd->vdev_asize != 0);
 		ASSERT3P(vd->vdev_checkpoint_sm, ==, NULL);
 
-		if ((error = space_map_open(&vd->vdev_checkpoint_sm,
+		error = space_map_open(&vd->vdev_checkpoint_sm,
 		    mos, checkpoint_sm_obj, 0, vd->vdev_asize,
-		    vd->vdev_ashift))) {
+		    vd->vdev_ashift);
+		if (error != 0) {
 			vdev_dbgmsg(vd, "vdev_load: space_map_open "
 			    "failed for checkpoint spacemap (obj %llu) "
 			    "[error=%d]",
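
The ASSERT0 loop added to vdev_metaslab_fini() checks that the
metaslab group's histogram is empty once every metaslab has been
torn down. A minimal sketch of that kind of invariant follows. It
assumes the group histogram is maintained as the bucket-wise sum of
its members' histograms; the sketch_group_* names and the bucket
count are hypothetical and do not come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define	SKETCH_GROUP_HIST_SIZE	64	/* hypothetical bucket count */

/* Hypothetical stand-in for a metaslab group's aggregate histogram. */
typedef struct sketch_group {
	uint64_t	hist[SKETCH_GROUP_HIST_SIZE];
} sketch_group_t;

/* Account for a metaslab's histogram when it joins the group. */
static void
sketch_group_add(sketch_group_t *g, const uint64_t *ms_hist)
{
	for (int i = 0; i < SKETCH_GROUP_HIST_SIZE; i++)
		g->hist[i] += ms_hist[i];
}

/* Remove a metaslab's contribution when it is torn down. */
static void
sketch_group_remove(sketch_group_t *g, const uint64_t *ms_hist)
{
	for (int i = 0; i < SKETCH_GROUP_HIST_SIZE; i++) {
		assert(g->hist[i] >= ms_hist[i]);
		g->hist[i] -= ms_hist[i];
	}
}

/* After every member is removed, each bucket should be zero. */
static bool
sketch_group_is_empty(const sketch_group_t *g)
{
	for (int i = 0; i < SKETCH_GROUP_HIST_SIZE; i++) {
		if (g->hist[i] != 0)
			return (false);
	}
	return (true);
}
```

If any add is not matched by a remove, the final emptiness check
fails, which is the kind of accounting error such an assertion is
meant to surface.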