author: Matthew Ahrens <mahrens@delphix.com> 2018-02-26 15:33:55 -0800
committer: Brian Behlendorf <behlendorf1@llnl.gov> 2018-05-24 10:18:07 -0700
commit: 0dc2f70c5cece6ef2474e14552111ae098d9f5b4
tree: 8414edcb42c28aecbc4e9422eb02d15d7e98d035 /man
parent: ba863d0be4cbfbea938b10e49fb6ff459ac9ec20
OpenZFS 9486 - reduce memory used by device removal on fragmented pools

Device removal allocates a new location for each allocated segment on the disk that's being removed. Each allocation results in one entry in the mapping table, which maps from old location + length to new location. When a fragmented disk is removed, this can result in a large number of mapping entries, and thus a large amount of memory consumed by the mapping table. In the worst real-world cases, we've seen around 1GB of RAM per 1TB of storage removed.

We can improve on this situation by allocating larger segments, which span across both allocated and free regions of the device being removed. By including free regions in the allocation (and thus mapping), we reduce the number of mapping entries. For example, if we have a 4K allocation followed by 1K free and then 4K allocated, we would allocate 4+1+4 = 9KB, and then move the entire region (including allocated and free parts). In this case we used one mapping where previously we would have used two, but often the ratio is much higher (up to 20:1 in real-world use). We then need to mark the regions that were free on the removing device as free in the new locations, and also obsolete in the mapping entry.

This method preserves the fragmentation of the removing device, rather than consolidating its allocated space into a small number of chunks where possible. But it results in drastic reduction of memory used by the mapping table - around 20x in the most-fragmented cases.

In the most fragmented real-world cases, this reduces memory used by the mapping from ~1GB to ~50MB of RAM per 1TB of storage removed. Less fragmented cases will typically also see around 50-100MB of RAM per 1TB of storage.
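
The span-coalescing idea described above can be illustrated with a minimal sketch. This is not the kernel implementation: the function name `coalesce_segments`, the `(offset, length)` tuple representation, and the inclusive gap comparison are assumptions made for illustration. Only the 32,768-byte default is taken from the `vdev_removal_max_span` documentation added by this commit. The input is assumed to be a list of non-overlapping allocated segments.

```python
from typing import List, Tuple

# (offset, length) in bytes; a hypothetical representation for this sketch.
Segment = Tuple[int, int]


def coalesce_segments(allocated: List[Segment],
                      max_span: int = 32768) -> List[Segment]:
    """Merge allocated segments whose free gap is at most max_span bytes.

    Each returned segment would correspond to one mapping entry; the free
    gaps swallowed by a merge are copied along with the allocated data.
    """
    merged: List[Segment] = []
    for offset, length in sorted(allocated):
        if merged:
            prev_off, prev_len = merged[-1]
            gap = offset - (prev_off + prev_len)
            # Assumption: a gap exactly equal to max_span is still merged.
            if 0 <= gap <= max_span:
                merged[-1] = (prev_off, offset + length - prev_off)
                continue
        merged.append((offset, length))
    return merged


KB = 1024
# The example from the commit message: 4K allocated, 1K free, 4K allocated.
segments = [(0, 4 * KB), (5 * KB, 4 * KB)]
print(coalesce_segments(segments))              # [(0, 9216)]
print(coalesce_segments(segments, max_span=0))  # [(0, 4096), (5120, 4096)]
```

With the default span the two allocations collapse into a single 9KB entry, while a span of 0 keeps them as two entries, which mirrors the one-mapping-versus-two comparison in the commit message. The sketch leaves out the follow-up step of marking the copied free regions as free and obsolete.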

Porting notes:
* Add the following as module parameters:
    * zfs_condense_indirect_vdevs_enable
    * zfs_condense_max_obsolete_bytes
* Document the following module parameters:
    * zfs_condense_indirect_vdevs_enable
    * zfs_condense_max_obsolete_bytes
    * zfs_condense_min_mapping_bytes

Authored by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov>
Ported-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Tim Chase <tim@chase2k.com>

OpenZFS-issue: https://illumos.org/issues/9486
OpenZFS-commit: https://github.com/ahrens/illumos/commit/07152e142e44c
External-issue: DLPX-57962
Closes #7536

Diffstat (limited to 'man')
-rw-r--r-- man/man5/zfs-module-parameters.5 | 59
1 files changed, 59 insertions, 0 deletions
diff --git a/man/man5/zfs-module-parameters.5 b/man/man5/zfs-module-parameters.5
index 886dffce8..dbfa8806a 100644
--- a/man/man5/zfs-module-parameters.5
+++ b/man/man5/zfs-module-parameters.5
@@ -428,6 +428,24 @@ Default value: \fB5\fR.
.sp
.ne 2
.na
+\fBvdev_removal_max_span\fR (int)
+.ad
+.RS 12n
+During top-level vdev removal, the chunks of data copied from the vdev
+may include free space in order to trade bandwidth for IOPS.
+This parameter determines the maximum span of free space (in bytes)
+which will be included as "unnecessary" data in a chunk of copied data.
+
+The default value here was chosen to align with
+\fBzfs_vdev_read_gap_limit\fR, which is a similar concept when doing
+regular reads (but there's no reason it has to be the same).
+.sp
+Default value: \fB32,768\fR.
+.RE
+
+.sp
+.ne 2
+.na
\fBzfetch_array_rd_sz\fR (ulong)
.ad
.RS 12n
@@ -871,6 +889,47 @@ Default value: \fB5\fR%.
.sp
.ne 2
.na
+\fBzfs_condense_indirect_vdevs_enable\fR (int)
+.ad
+.RS 12n
+Enable condensing indirect vdev mappings. When set to a non-zero value,
+attempt to condense indirect vdev mappings if the mapping uses more than
+\fBzfs_condense_min_mapping_bytes\fR bytes of memory and if the obsolete
+space map object uses more than \fBzfs_condense_max_obsolete_bytes\fR
+bytes on-disk. The condensing process is an attempt to save memory by
+removing obsolete mappings.
+.sp
+Default value: \fB1\fR.
+.RE
+
+.sp
+.ne 2
+.na
+\fBzfs_condense_max_obsolete_bytes\fR (ulong)
+.ad
+.RS 12n
+Only attempt to condense indirect vdev mappings if the on-disk size
+of the obsolete space map object is greater than this number of bytes
+(see \fBzfs_condense_indirect_vdevs_enable\fR).
+.sp
+Default value: \fB1,073,741,824\fR.
+.RE
+
+.sp
+.ne 2
+.na
+\fBzfs_condense_min_mapping_bytes\fR (ulong)
+.ad
+.RS 12n
+Minimum size vdev mapping to attempt to condense (see
+\fBzfs_condense_indirect_vdevs_enable\fR).
+.sp
+Default value: \fB131,072\fR.
+.RE
+
+.sp
+.ne 2
+.na
\fBzfs_dbgmsg_enable\fR (int)
.ad
.RS 12n