summaryrefslogtreecommitdiffstats
path: root/man/man8
diff options
context:
space:
mode:
authorSerapheim Dimitropoulos <[email protected]>2016-12-16 14:11:29 -0800
committerBrian Behlendorf <[email protected]>2018-06-26 10:07:42 -0700
commitd2734cce68cf740e015312314415f9034c67851c (patch)
treeb7a140a3cf2a19bb7c88f2d277f3b5a33c121cea /man/man8
parent88eaf610d9c7056f0946e5090cba1e6288ff2b70 (diff)
OpenZFS 9166 - zfs storage pool checkpoint
Details about the motivation of this feature and its usage can be found in this blogpost: https://sdimitro.github.io/post/zpool-checkpoint/ A lightning talk of this feature can be found here: https://www.youtube.com/watch?v=fPQA8K40jAM Implementation details can be found in big block comment of spa_checkpoint.c Side-changes that are relevant to this commit but not explained elsewhere: * renames members of "struct metaslab trees to be shorter without losing meaning * space_map_{alloc,truncate}() accept a block size as a parameter. The reason is that in the current state all space maps that we allocate through the DMU use a global tunable (space_map_blksz) which defauls to 4KB. This is ok for metaslab space maps in terms of bandwirdth since they are scattered all over the disk. But for other space maps this default is probably not what we want. Examples are device removal's vdev_obsolete_sm or vdev_chedkpoint_sm from this review. Both of these have a 1:1 relationship with each vdev and could benefit from a bigger block size. Porting notes: * The part of dsl_scan_sync() which handles async destroys has been moved into the new dsl_process_async_destroys() function. * Remove "VERIFY(!(flags & FWRITE))" in "kernel.c" so zhack can write to block device backed pools. * ZTS: * Fix get_txg() in zpool_sync_001_pos due to "checkpoint_txg". * Don't use large dd block sizes on /dev/urandom under Linux in checkpoint_capacity. * Adopt Delphix-OS's setting of 4 (spa_asize_inflation = SPA_DVAS_PER_BP + 1) for the checkpoint_capacity test to speed its attempts to fill the pool * Create the base and nested pools with sync=disabled to speed up the "setup" phase. * Clear labels in test pool between checkpoint tests to avoid duplicate pool issues. * The import_rewind_device_replaced test has been marked as "known to fail" for the reasons listed in its DISCLAIMER. * New module parameters: zfs_spa_discard_memory_limit, zfs_remove_max_bytes_pause (not documented - debugging only) vdev_max_ms_count (formerly metaslabs_per_vdev) vdev_min_ms_count Authored by: Serapheim Dimitropoulos <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: John Kennedy <[email protected]> Reviewed by: Dan Kimmel <[email protected]> Reviewed by: Brian Behlendorf <[email protected]> Approved by: Richard Lowe <[email protected]> Ported-by: Tim Chase <[email protected]> Signed-off-by: Tim Chase <[email protected]> OpenZFS-issue: https://illumos.org/issues/9166 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7159fdb8 Closes #7570
Diffstat (limited to 'man/man8')
-rw-r--r--man/man8/zdb.85
-rw-r--r--man/man8/zpool.893
2 files changed, 97 insertions, 1 deletions
diff --git a/man/man8/zdb.8 b/man/man8/zdb.8
index 0ce4d852d..f00d9c0cf 100644
--- a/man/man8/zdb.8
+++ b/man/man8/zdb.8
@@ -23,7 +23,7 @@
.Nd display zpool debugging and consistency information
.Sh SYNOPSIS
.Nm
-.Op Fl AbcdDFGhiLMPsvX
+.Op Fl AbcdDFGhikLMPsvX
.Op Fl e Oo Fl V Oc Op Fl p Ar path ...
.Op Fl I Ar inflight I/Os
.Oo Fl o Ar var Ns = Ns Ar value Oc Ns ...
@@ -172,6 +172,9 @@ Display information about intent log
.Pq ZIL
entries relating to each dataset.
If specified multiple times, display counts of each intent log transaction type.
+.It Fl k
+Examine the checkpointed state of the pool.
+Note, the on disk format of the pool is not reverted to the checkpointed state.
.It Fl l Ar device
Read the vdev labels from the specified device.
.Nm Fl l
diff --git a/man/man8/zpool.8 b/man/man8/zpool.8
index 228051809..ddcba2e72 100644
--- a/man/man8/zpool.8
+++ b/man/man8/zpool.8
@@ -47,6 +47,10 @@
.Oo Fl o Ar property Ns = Ns Ar value Oc
.Ar pool device new_device
.Nm
+.Cm checkpoint
+.Op Fl d, -discard
+.Ar pool
+.Nm
.Cm clear
.Ar pool
.Op Ar device
@@ -93,6 +97,7 @@
.Fl a
.Op Fl DflmN
.Op Fl F Oo Fl n Oc Oo Fl T Oc Oo Fl X Oc
+.Op Fl -rewind-to-checkpoint
.Op Fl c Ar cachefile Ns | Ns Fl d Ar dir Ns | Ns device
.Op Fl o Ar mntopts
.Oo Fl o Ar property Ns = Ns Ar value Oc Ns ...
@@ -101,6 +106,7 @@
.Cm import
.Op Fl Dflm
.Op Fl F Oo Fl n Oc Oo Fl T Oc Oo Fl X Oc
+.Op Fl -rewind-to-checkpoint
.Op Fl c Ar cachefile Ns | Ns Fl d Ar dir Ns | Ns device
.Op Fl o Ar mntopts
.Oo Fl o Ar property Ns = Ns Ar value Oc Ns ...
@@ -470,6 +476,50 @@ configuration.
.Pp
The content of the cache devices is considered volatile, as is the case with
other system caches.
+.Ss Pool checkpoint
+Before starting critical procedures that include destructive actions (e.g
+.Nm zfs Cm destroy
+), an administrator can checkpoint the pool's state and in the case of a
+mistake or failure, rewind the entire pool back to the checkpoint.
+Otherwise, the checkpoint can be discarded when the procedure has completed
+successfully.
+.Pp
+A pool checkpoint can be thought of as a pool-wide snapshot and should be used
+with care as it contains every part of the pool's state, from properties to vdev
+configuration.
+Thus, while a pool has a checkpoint certain operations are not allowed.
+Specifically, vdev removal/attach/detach, mirror splitting, and
+changing the pool's guid.
+Adding a new vdev is supported but in the case of a rewind it will have to be
+added again.
+Finally, users of this feature should keep in mind that scrubs in a pool that
+has a checkpoint do not repair checkpointed data.
+.Pp
+To create a checkpoint for a pool:
+.Bd -literal
+# zpool checkpoint pool
+.Ed
+.Pp
+To later rewind to its checkpointed state, you need to first export it and
+then rewind it during import:
+.Bd -literal
+# zpool export pool
+# zpool import --rewind-to-checkpoint pool
+.Ed
+.Pp
+To discard the checkpoint from a pool:
+.Bd -literal
+# zpool checkpoint -d pool
+.Ed
+.Pp
+Dataset reservations (controlled by the
+.Nm reservation
+or
+.Nm refreservation
+zfs properties) may be unenforceable while a checkpoint exists, because the
+checkpoint is allowed to consume the dataset's reservation.
+Finally, data that is part of the checkpoint but has been freed in the
+current state of the pool won't be scanned during a scrub.
.Ss Properties
Each pool has several properties associated with it.
Some properties are read-only statistics while others are configurable and
@@ -856,6 +906,39 @@ supported at the moment is ashift.
.El
.It Xo
.Nm
+.Cm checkpoint
+.Op Fl d, -discard
+.Ar pool
+.Xc
+Checkpoints the current state of
+.Ar pool
+, which can be later restored by
+.Nm zpool Cm import --rewind-to-checkpoint .
+The existence of a checkpoint in a pool prohibits the following
+.Nm zpool
+commands:
+.Cm remove ,
+.Cm attach ,
+.Cm detach ,
+.Cm split ,
+and
+.Cm reguid .
+In addition, it may break reservation boundaries if the pool lacks free
+space.
+The
+.Nm zpool Cm status
+command indicates the existence of a checkpoint or the progress of discarding a
+checkpoint from a pool.
+The
+.Nm zpool Cm list
+command reports how much space the checkpoint takes from the pool.
+.Bl -tag -width Ds
+.It Fl d, -discard
+Discards an existing checkpoint from
+.Ar pool .
+.El
+.It Xo
+.Nm
.Cm clear
.Ar pool
.Op Ar device
@@ -1290,6 +1373,16 @@ and the
.Sy altroot
property to
.Ar root .
+.It Fl -rewind-to-checkpoint
+Rewinds pool to the checkpointed state.
+Once the pool is imported with this flag there is no way to undo the rewind.
+All changes and data that were written after the checkpoint are lost!
+The only exception is when the
+.Sy readonly
+mounting option is enabled.
+In this case, the checkpointed state of the pool is opened and an
+administrator can see how the pool would look like if they were
+to fully rewind.
.It Fl s
Scan using the default search path, the libblkid cache will not be
consulted. A custom search path may be specified by setting the