author Brian Behlendorf <[email protected]> 2019-03-29 09:13:20 -0700
committer GitHub <[email protected]> 2019-03-29 09:13:20 -0700
commit 1b939560be5c51deecf875af9dada9d094633bf7
tree 2a780b838134636ddbc65f89d227e37c74abe17b /man/man5
parent f94b3cbf43d62f4962e71cfe7ba8c6f0602e2a45
Add TRIM support
UNMAP/TRIM support is a frequently-requested feature to help prevent performance from degrading on SSDs and on various other SAN-like storage back-ends. By issuing UNMAP/TRIM commands for sectors which are no longer allocated, the underlying device can often manage itself more efficiently.

This TRIM implementation is modeled on the `zpool initialize` feature, which writes a pattern to all unallocated space in the pool. The new `zpool trim` command uses the same vdev_xlate() code to calculate which sectors are unallocated, the same per-vdev TRIM thread model and locking, and the same basic CLI for a consistent user experience. The core difference is that instead of writing a pattern it issues UNMAP/TRIM commands for those extents.

The zio pipeline was updated to accommodate this by adding a new ZIO_TYPE_TRIM type and an associated spa taskq. This new type makes it straightforward to add the platform-specific TRIM/UNMAP calls to vdev_disk.c and vdev_file.c. These new ZIO_TYPE_TRIM zios are handled largely the same way as ZIO_TYPE_READs or ZIO_TYPE_WRITEs, which makes it possible to largely avoid changing the pipeline. One exception is that TRIM zios may exceed the 16M block size limit since they contain no data.

In addition to the manual `zpool trim` command, a background automatic TRIM was added and is controlled by the 'autotrim' property. It relies on the exact same infrastructure as the manual TRIM. However, instead of relying on the extents in a metaslab's ms_allocatable range tree, a ms_trim tree is kept per metaslab. When 'autotrim=on', ranges added back to the ms_allocatable tree are also added to the ms_trim tree. The ms_trim tree is then periodically consumed by an autotrim thread which systematically walks a top-level vdev's metaslabs.

Since the automatic TRIM will skip ranges it considers too small, there is value in occasionally running a full `zpool trim`. Ranges may be skipped when the freed blocks are small and not enough time was allowed to aggregate them.

An automatic TRIM and a manual `zpool trim` may be run concurrently, in which case the automatic TRIM will yield to the manual TRIM.

Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Tim Chase <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: George Wilson <[email protected]>
Reviewed-by: Serapheim Dimitropoulos <[email protected]>
Contributions-by: Saso Kiselkov <[email protected]>
Contributions-by: Tim Chase <[email protected]>
Contributions-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #8419
Closes #598
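
The skip-and-split behavior described in this commit, and documented for zfs_trim_extent_bytes_min and zfs_trim_extent_bytes_max in the man page diff, can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the OpenZFS implementation: the function names `coalesce` and `plan_trims` are hypothetical, free ranges are modeled as plain (offset, length) tuples rather than range trees, and only the two default values are taken from the documentation.

```python
# Illustrative model only; names and data layout are hypothetical.
EXTENT_BYTES_MAX = 134_217_728  # documented default of zfs_trim_extent_bytes_max
EXTENT_BYTES_MIN = 32_768       # documented default of zfs_trim_extent_bytes_min


def coalesce(ranges):
    """Merge adjacent or overlapping (offset, length) free ranges."""
    merged = []
    for offset, length in sorted(ranges):
        if merged and offset <= merged[-1][0] + merged[-1][1]:
            prev_off, prev_len = merged[-1]
            # Extend the previous range to cover this one.
            merged[-1] = (prev_off, max(prev_len, offset + length - prev_off))
        else:
            merged.append((offset, length))
    return merged


def plan_trims(ranges, bytes_min=EXTENT_BYTES_MIN, bytes_max=EXTENT_BYTES_MAX):
    """Return the (offset, length) TRIM commands for a list of free ranges.

    A range shorter than bytes_min is skipped entirely. A longer range is
    split into chunks of at most bytes_max; its final chunk is issued even
    when that chunk is shorter than bytes_min.
    """
    commands = []
    for offset, length in ranges:
        if length < bytes_min:
            continue  # too small to be worth a TRIM
        while length > 0:
            chunk = min(length, bytes_max)
            commands.append((offset, chunk))
            offset += chunk
            length -= chunk
    return commands


if __name__ == "__main__":
    small_frees = [(16_384, 16_384), (0, 16_384)]
    print(plan_trims(small_frees))            # each free alone is below the minimum
    print(plan_trims(coalesce(small_frees)))  # merged, they reach the minimum
```

In this model, two adjacent 16 KiB frees are each skipped on their own but are trimmed once merged, which is the reason a longer aggregation window (or an occasional full `zpool trim`) can reach space that automatic TRIM leaves behind.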
Diffstat (limited to 'man/man5')
-rw-r--r-- man/man5/zfs-module-parameters.5 | 121
1 files changed, 120 insertions, 1 deletions
diff --git a/man/man5/zfs-module-parameters.5 b/man/man5/zfs-module-parameters.5
index c1994f340..a1a586df1 100644
--- a/man/man5/zfs-module-parameters.5
+++ b/man/man5/zfs-module-parameters.5
@@ -14,7 +14,7 @@
.\" CDDL HEADER, with the fields enclosed by brackets "[]" replaced with your
.\" own identifying information:
.\" Portions Copyright [yyyy] [name of copyright owner]
-.TH ZFS-MODULE-PARAMETERS 5 "Feb 8, 2019"
+.TH ZFS-MODULE-PARAMETERS 5 "Feb 15, 2019"
.SH NAME
zfs\-module\-parameters \- ZFS module parameters
.SH DESCRIPTION
@@ -1535,6 +1535,30 @@ Default value: \fB10\fR.
.sp
.ne 2
.na
+\fBzfs_vdev_trim_max_active\fR (int)
+.ad
+.RS 12n
+Maximum trim/discard I/Os active to each device.
+See the section "ZFS I/O SCHEDULER".
+.sp
+Default value: \fB2\fR.
+.RE
+
+.sp
+.ne 2
+.na
+\fBzfs_vdev_trim_min_active\fR (int)
+.ad
+.RS 12n
+Minimum trim/discard I/Os active to each device.
+See the section "ZFS I/O SCHEDULER".
+.sp
+Default value: \fB1\fR.
+.RE
+
+.sp
+.ne 2
+.na
\fBzfs_vdev_queue_depth_pct\fR (int)
.ad
.RS 12n
@@ -1619,6 +1643,12 @@ _
_
512 ZFS_DEBUG_SET_ERROR
Enable SET_ERROR and dprintf entries in the debug log.
+_
+1024 ZFS_DEBUG_INDIRECT_REMAP
+ Verify split blocks created by device removal.
+_
+2048 ZFS_DEBUG_TRIM
+ Verify TRIM ranges are always within the allocatable range tree.
.TE
.sp
* Requires debug build.
@@ -2344,6 +2374,82 @@ Default value: \fB75\fR%.
.sp
.ne 2
.na
+\fBzfs_trim_extent_bytes_max\fR (unsigned int)
+.ad
+.RS 12n
+Maximum size of a TRIM command. Ranges larger than this will be split into
+chunks no larger than \fBzfs_trim_extent_bytes_max\fR bytes before being
+issued to the device.
+.sp
+Default value: \fB134,217,728\fR.
+.RE
+
+.sp
+.ne 2
+.na
+\fBzfs_trim_extent_bytes_min\fR (unsigned int)
+.ad
+.RS 12n
+Minimum size of TRIM commands. TRIM ranges smaller than this will be skipped
+unless they're part of a larger range which was broken into chunks. This is
+done because it's common for these small TRIMs to negatively impact overall
+performance. This value can be set to 0 to TRIM all unallocated space.
+.sp
+Default value: \fB32,768\fR.
+.RE
+
+.sp
+.ne 2
+.na
+\fBzfs_trim_metaslab_skip\fR (unsigned int)
+.ad
+.RS 12n
+Skip uninitialized metaslabs during the TRIM process. This option is useful
+for pools constructed from large thinly-provisioned devices where TRIM
+operations are slow. As a pool ages, an increasing fraction of the pool's
+metaslabs will be initialized, progressively degrading the usefulness of
+this option. This setting is stored when starting a manual TRIM and will
+persist for the duration of the requested TRIM.
+.sp
+Default value: \fB0\fR.
+.RE
+
+.sp
+.ne 2
+.na
+\fBzfs_trim_queue_limit\fR (unsigned int)
+.ad
+.RS 12n
+Maximum number of queued TRIMs outstanding per leaf vdev. The number of
+concurrent TRIM commands issued to the device is controlled by the
+\fBzfs_vdev_trim_min_active\fR and \fBzfs_vdev_trim_max_active\fR module
+options.
+.sp
+Default value: \fB10\fR.
+.RE
+
+.sp
+.ne 2
+.na
+\fBzfs_trim_txg_batch\fR (unsigned int)
+.ad
+.RS 12n
+The number of transaction groups worth of frees which should be aggregated
+before TRIM operations are issued to the device. This setting represents a
+trade-off between issuing larger, more efficient TRIM operations and the
+delay before the recently trimmed space is available for use by the device.
+.sp
+Increasing this value will allow frees to be aggregated for a longer time.
+This will result in larger TRIM operations and potentially increased memory
+usage. Decreasing this value will have the opposite effect. The default
+value of 32 was determined to be a reasonable compromise.
+.sp
+Default value: \fB32\fR.
+.RE
+
+.sp
+.ne 2
+.na
\fBzfs_txg_history\fR (int)
.ad
.RS 12n
@@ -2367,6 +2473,19 @@ Default value: \fB5\fR.
.sp
.ne 2
.na
+\fBzfs_vdev_aggregate_trim\fR (int)
+.ad
+.RS 12n
+Allow TRIM I/Os to be aggregated. This is normally not helpful because
+the extents to be trimmed will already have been aggregated by the
+metaslab. This option is provided for debugging and performance analysis.
+.sp
+Default value: \fB0\fR.
+.RE
+
+.sp
+.ne 2
+.na
\fBzfs_vdev_aggregation_limit\fR (int)
.ad
.RS 12n