author Brian Behlendorf <[email protected]> 2020-11-13 13:51:51 -0800
committer GitHub <[email protected]> 2020-11-13 13:51:51 -0800
commit b2255edcc0099e62ad46a3dd9d64537663c6aee3
tree 6cfe0d0fd30fb451396551a991d50f4bdc0cf353 /man/man8
parent a724db03740133c46b9a577b41a6f7221acd3e1f
Distributed Spare (dRAID) Feature
This patch adds a new top-level vdev type called dRAID, which stands
for Distributed parity RAID.  This pool configuration allows all dRAID
vdevs to participate when rebuilding to a distributed hot spare device.
This can substantially reduce the total time required to restore full
parity to a pool with a failed device.

A dRAID pool can be created using the new top-level `draid` type.
Like `raidz`, the desired redundancy is specified after the type:
`draid[1,2,3]`.  No additional information is required to create the
pool, and reasonable default values will be chosen based on the number
of child vdevs in the dRAID vdev.

    zpool create <pool> draid[1,2,3] <vdevs...>

Unlike raidz, additional optional dRAID configuration values can be
provided as part of the draid type as colon-separated values.  This
allows administrators to fully specify a layout for either performance
or capacity reasons.  The supported options include:

    zpool create <pool> \
        draid[<parity>][:<data>d][:<children>c][:<spares>s] \
        <vdevs...>

    - draid[parity]       - Parity level (default 1)
    - draid[:<data>d]     - Data devices per group (default 8)
    - draid[:<children>c] - Expected number of child vdevs
    - draid[:<spares>s]   - Distributed hot spares (default 0)

Abbreviated example `zpool status` output for a 68-disk dRAID pool
with two distributed spares using special allocation classes.

```
  pool: tank
 state: ONLINE
config:

    NAME                  STATE     READ WRITE CKSUM
    slag7                 ONLINE       0     0     0
      draid2:8d:68c:2s-0  ONLINE       0     0     0
        L0                ONLINE       0     0     0
        L1                ONLINE       0     0     0
        ...
        U25               ONLINE       0     0     0
        U26               ONLINE       0     0     0
        spare-53          ONLINE       0     0     0
          U27             ONLINE       0     0     0
          draid2-0-0      ONLINE       0     0     0
        U28               ONLINE       0     0     0
        U29               ONLINE       0     0     0
        ...
        U42               ONLINE       0     0     0
        U43               ONLINE       0     0     0
    special
      mirror-1            ONLINE       0     0     0
        L5                ONLINE       0     0     0
        U5                ONLINE       0     0     0
      mirror-2            ONLINE       0     0     0
        L6                ONLINE       0     0     0
        U6                ONLINE       0     0     0
    spares
      draid2-0-0          INUSE     currently in use
      draid2-0-1          AVAIL
```

When adding test coverage for the new dRAID vdev type, the following
options were added to the ztest command.  These options are leveraged
by zloop.sh to test a wide range of dRAID configurations.

    -K draid|raidz|random - kind of RAID to test
    -D <value>            - dRAID data drives per group
    -S <value>            - dRAID distributed hot spares
    -R <value>            - RAID parity (raidz or dRAID)

The zpool_create, zpool_import, redundancy, replacement and fault
test groups have all been updated to provide test coverage for the
dRAID feature.

Co-authored-by: Isaac Huang <[email protected]>
Co-authored-by: Mark Maybee <[email protected]>
Co-authored-by: Don Brady <[email protected]>
Co-authored-by: Matthew Ahrens <[email protected]>
Co-authored-by: Brian Behlendorf <[email protected]>
Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #10102
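As an illustration of the `draid[<parity>][:<data>d][:<children>c][:<spares>s]` syntax described in the commit message, the following Python sketch splits such a string into its fields. It is not the parser ZFS uses: `parse_draid_spec` is a hypothetical helper, the defaults are the ones quoted in the message (parity 1, data 8, spares 0), and accepting the suffixed options in any order is an assumption of this sketch. It also does not model the man page's caveat that the data default changes when N-P-S is less than 8.

```python
import re

def parse_draid_spec(spec):
    """Parse a string such as 'draid2:8d:68c:2s' into a dict.

    Hypothetical helper for illustration; not the parser ZFS uses.
    """
    head, *options = spec.split(":")
    match = re.fullmatch(r"draid([1-3])?", head)
    if match is None:
        raise ValueError(f"not a draid spec: {spec!r}")
    # Defaults quoted in the commit message: parity 1, data 8, spares 0.
    # 'children' stays None when no cross-check value was given.
    layout = {"parity": int(match.group(1) or 1), "data": 8,
              "children": None, "spares": 0}
    keys = {"d": "data", "c": "children", "s": "spares"}
    seen = set()
    for option in options:
        m = re.fullmatch(r"(\d+)([dcs])", option)
        if m is None or m.group(2) in seen:
            raise ValueError(f"bad or repeated option {option!r} in {spec!r}")
        seen.add(m.group(2))
        layout[keys[m.group(2)]] = int(m.group(1))
    return layout

print(parse_draid_spec("draid2:8d:68c:2s"))
```

Applied to the vdev name from the `zpool status` example (without its trailing `-0` index), this yields parity 2, 8 data devices per group, 68 children and 2 distributed spares.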
Diffstat (limited to 'man/man8')
-rw-r--r-- man/man8/zpool-create.8  |  2
-rw-r--r-- man/man8/zpool-scrub.8   |  2
-rw-r--r-- man/man8/zpoolconcepts.8 | 78
3 files changed, 78 insertions, 4 deletions
diff --git a/man/man8/zpool-create.8 b/man/man8/zpool-create.8
index 7f3f27b9b..7406a493e 100644
--- a/man/man8/zpool-create.8
+++ b/man/man8/zpool-create.8
@@ -73,12 +73,14 @@ and period
The pool names
.Sy mirror ,
.Sy raidz ,
+.Sy draid ,
.Sy spare
and
.Sy log
are reserved, as are names beginning with
.Sy mirror ,
.Sy raidz ,
+.Sy draid ,
.Sy spare ,
and the pattern
.Sy c[0-9] .
diff --git a/man/man8/zpool-scrub.8 b/man/man8/zpool-scrub.8
index ede569978..6ff2eb261 100644
--- a/man/man8/zpool-scrub.8
+++ b/man/man8/zpool-scrub.8
@@ -52,7 +52,7 @@ Begins a scrub or resumes a paused scrub.
The scrub examines all data in the specified pools to verify that it checksums
correctly.
For replicated
-.Pq mirror or raidz
+.Pq mirror, raidz, or draid
devices, ZFS automatically repairs any damage discovered during the scrub.
The
.Nm zpool Cm status
diff --git a/man/man8/zpoolconcepts.8 b/man/man8/zpoolconcepts.8
index f9c262f4b..d999b0354 100644
--- a/man/man8/zpoolconcepts.8
+++ b/man/man8/zpoolconcepts.8
@@ -64,7 +64,7 @@ A file must be specified by a full path.
A mirror of two or more devices.
Data is replicated in an identical fashion across all components of a mirror.
A mirror with N disks of size X can hold X bytes and can withstand (N-1) devices
-failing before data integrity is compromised.
+failing without losing data.
.It Sy raidz , raidz1 , raidz2 , raidz3
A variation on RAID-5 that allows for better distribution of parity and
eliminates the RAID-5
@@ -88,11 +88,75 @@ vdev type is an alias for
.Sy raidz1 .
.Pp
A raidz group with N disks of size X with P parity disks can hold approximately
-(N-P)*X bytes and can withstand P device(s) failing before data integrity is
-compromised.
+(N-P)*X bytes and can withstand P device(s) failing without losing data.
The minimum number of devices in a raidz group is one more than the number of
parity disks.
The recommended number is between 3 and 9 to help increase performance.
+.It Sy draid , draid1 , draid2 , draid3
+A variant of raidz that provides integrated distributed hot spares which
+allows for faster resilvering while retaining the benefits of raidz.
+A dRAID vdev is constructed from multiple internal raidz groups, each with D
+data devices and P parity devices.
+These groups are distributed over all of the children in order to fully
+utilize the available disk performance.
+.Pp
+Unlike raidz, dRAID uses a fixed stripe width (padding as necessary with
+zeros) to allow fully sequential resilvering.
+This fixed stripe width significantly affects both usable capacity and IOPS.
+For example, with the default D=8 and 4k disk sectors the minimum allocation
+size is 32k.
+If using compression, this relatively large allocation size can reduce the
+effective compression ratio.
+When using ZFS volumes and dRAID the default volblocksize property is increased
+to account for the allocation size.
+If a dRAID pool will hold a significant amount of small blocks, it is
+recommended to also add a mirrored
+.Sy special
+vdev to store those blocks.
+.Pp
+With regard to IOPS, performance is similar to raidz since for any read all D
+data disks must be accessed.
+Delivered random IOPS can be reasonably approximated as
+floor((N-S)/(D+P))*<single-drive-IOPS>.
+.Pp
+Like raidz, a dRAID can have single-, double-, or triple-parity. The
+.Sy draid1 ,
+.Sy draid2 ,
+and
+.Sy draid3
+types can be used to specify the parity level.
+The
+.Sy draid
+vdev type is an alias for
+.Sy draid1 .
+.Pp
+A dRAID with N disks of size X, D data disks per redundancy group, P parity
+level, and S distributed hot spares can hold approximately (N-S)*(D/(D+P))*X
+bytes and can withstand P device(s) failing without losing data.
+.It Sy draid[<parity>][:<data>d][:<children>c][:<spares>s]
+A non-default dRAID configuration can be specified by appending one or more
+of the following optional arguments to the
+.Sy draid
+keyword.
+.Pp
+.Em parity
+- The parity level (1-3).
+.Pp
+.Em data
+- The number of data devices per redundancy group.
+In general a smaller value of D will increase IOPS, improve the compression ratio, and speed up resilvering at the expense of total usable capacity.
+Defaults to 8, unless N-P-S is less than 8.
+.Pp
+.Em children
+- The expected number of children.
+Useful as a cross-check when listing a large number of devices.
+An error is returned when the provided number of children differs.
+.Pp
+.Em spares
+- The number of distributed hot spares.
+Defaults to zero.
+.Pp
+.Pp
.It Sy spare
A pseudo-vdev which keeps track of available hot spares for a pool.
For more information, see the
@@ -273,6 +337,14 @@ If the original faulted device is detached, then the hot spare assumes its
place in the configuration, and is removed from the spare list of all active
pools.
.Pp
+The
+.Sy draid
+vdev type provides distributed hot spares.
+These hot spares are named after the dRAID vdev they're a part of (
+.Qq draid1-2-3 specifies spare 3 of vdev 2, which is a single parity dRAID
+) and may only be used by that dRAID vdev.
+Otherwise, they behave the same as normal hot spares.
+.Pp
Spares cannot replace log devices.
.Ss Intent Log
The ZFS Intent Log (ZIL) satisfies POSIX requirements for synchronous
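
The approximations that this patch adds to zpoolconcepts.8 (usable capacity, delivered random IOPS, and minimum allocation size) can be checked numerically. The sketch below simply evaluates those formulas for the 68-disk `draid2:8d:68c:2s` layout from the commit message. `draid_estimates` is a hypothetical helper, and the 10 TB disk size and 200 IOPS per drive are made-up inputs, not figures from the patch.

```python
def draid_estimates(n, d, p, s, disk_bytes, sector_bytes, drive_iops):
    """Evaluate the approximations quoted in the zpoolconcepts.8 text.

    n: child disks, d: data disks per group, p: parity level,
    s: distributed spares. Hypothetical helper, not part of ZFS.
    """
    return {
        # (N-S)*(D/(D+P))*X, kept in integer arithmetic.
        "usable_bytes": (n - s) * d * disk_bytes // (d + p),
        # floor((N-S)/(D+P)) * <single-drive-IOPS>
        "random_iops": (n - s) // (d + p) * drive_iops,
        # D data sectors per stripe (8 * 4k = 32k in the text's example).
        "min_alloc_bytes": d * sector_bytes,
    }

# Assumed inputs: 10 TB disks, 4k sectors, 200 random IOPS per drive.
est = draid_estimates(n=68, d=8, p=2, s=2, disk_bytes=10 * 10**12,
                      sector_bytes=4096, drive_iops=200)
print(est)
```

With these assumed inputs the formulas give roughly 528 TB of usable space, 1200 random IOPS, and a 32k minimum allocation, the last of which matches the D=8, 4k-sector example in the man page text.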