author | Matthew Ahrens <[email protected]> | 2020-04-10 10:39:55 -0700 |
---|---|---|
committer | GitHub <[email protected]> | 2020-04-10 10:39:55 -0700 |
commit | c618f87cd2e96438468a391246d63ba1803f35c8 (patch) | |
tree | bdd9beb37d34e04c17543d99e10e21980f0f760a /man/man8 | |
parent | 77f6826b83b7e27f0996f6d192202c36f65e41fd (diff) |
Add `zstream redup` command to convert deduplicated send streams
Deduplicated send and receive is deprecated. To ease migration to the
new dedup-send-less world, this commit adds a `zstream redup` utility to
convert deduplicated send streams to normal streams, so that they can
continue to be received indefinitely.
The new `zstream` command also replaces the functionality of
`zstreamdump`, by way of the `zstream dump` subcommand. The
`zstreamdump` command is replaced by a shell script which invokes
`zstream dump`.
The way that `zstream redup` works under the hood is that as we read the
send stream, we build up a hash table which maps from `<GUID, object,
offset> -> <file_offset>`.
Whenever we see a WRITE record, we add a new entry to the hash table,
which indicates where in the stream file to find the WRITE record for
this block. (The key is `drr_toguid, drr_object, drr_offset`.)
For records other than WRITE_BYREF, we pass them through unchanged
(except for the running checksum, which is recalculated).
For WRITE_BYREF records, we change them to WRITE records. We find the
referenced WRITE record by looking in the hash table (for the record
with key `drr_refguid, drr_refobject, drr_refoffset`), and then reading
the record header and payload from the specified offset in the stream
file. Because this requires seeking backward, the stream cannot be a pipe. The found WRITE record
replaces the WRITE_BYREF record, with its `drr_toguid`, `drr_object`,
and `drr_offset` fields changed to be the same as the WRITE_BYREF's
(i.e. we are writing the same logical block, but with the data supplied
by the previous WRITE record).
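The conversion described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the code in this commit: the `Record` class and the `redup` function are hypothetical, the stream is modeled as an in-memory list whose indices stand in for file offsets, and the running checksum recalculation is left out.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Record:
    """Hypothetical, simplified stand-in for a send stream record."""
    kind: str            # "WRITE", "WRITE_BYREF", or any other record type
    toguid: int = 0
    obj: int = 0
    offset: int = 0
    payload: bytes = b""
    refguid: int = 0     # only meaningful for WRITE_BYREF
    refobject: int = 0
    refoffset: int = 0

def redup(stream):
    """Convert WRITE_BYREF records to WRITE records.

    `stream` is a list of Records; the list index plays the role of the
    file offset that the hash table maps to.
    """
    table = {}  # (guid, object, offset) -> position of the WRITE record
    out = []
    for pos, rec in enumerate(stream):
        if rec.kind == "WRITE":
            # Remember where this block's WRITE record lives in the stream.
            table[(rec.toguid, rec.obj, rec.offset)] = pos
            out.append(rec)
        elif rec.kind == "WRITE_BYREF":
            # Look up the referenced WRITE and re-read it from the stream.
            # This backward read is why the input cannot be a pipe.
            src = stream[table[(rec.refguid, rec.refobject, rec.refoffset)]]
            # Same data, but addressed as the WRITE_BYREF's logical block.
            out.append(replace(src, toguid=rec.toguid, obj=rec.obj,
                               offset=rec.offset))
        else:
            # Everything else passes through unchanged.
            out.append(rec)
    return out

if __name__ == "__main__":
    stream = [
        Record("BEGIN"),
        Record("WRITE", toguid=1, obj=5, offset=0, payload=b"abc"),
        Record("WRITE_BYREF", toguid=2, obj=7, offset=8192,
               refguid=1, refobject=5, refoffset=0),
        Record("END"),
    ]
    for rec in redup(stream):
        print(rec.kind, rec.toguid, rec.obj, rec.offset, rec.payload)
```

In this model the table stores only a position per WRITE record, which reflects why the per-record memory cost stays small: the payload itself is re-read from the stream file when needed rather than cached.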
This algorithm requires memory proportional to the number of WRITE
records (same as `zfs send -D`), but the size per WRITE record is
relatively low (40 bytes, vs. 72 for `zfs send -D`). A 1TB send stream
with 8KB blocks (`recordsize=8k`) would use around 5GB of RAM to
"redup".
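The 5GB figure can be checked with a short calculation. The sketch below assumes binary units (1TB read as 2^40 bytes, 8KB as 2^13 bytes) and roughly one WRITE record per 8KB of stream, ignoring record headers; the variable names are illustrative only.

```python
stream_bytes = 2**40          # "1TB" stream, read as 2**40 bytes (assumption)
block_bytes = 8 * 1024        # recordsize=8k
entry_bytes_redup = 40        # per-WRITE cost quoted for zstream redup
entry_bytes_send_dedup = 72   # per-WRITE cost quoted for zfs send -D

# Approximate number of WRITE records: one per block of stream data.
n_writes = stream_bytes // block_bytes

redup_ram = n_writes * entry_bytes_redup
send_d_ram = n_writes * entry_bytes_send_dedup

print(n_writes)               # 134217728 records
print(redup_ram / 2**30)      # 5.0 (GiB)
print(send_d_ram / 2**30)     # 9.0 (GiB)
```

Under the same assumptions, the 72-byte entries of `zfs send -D` would need about 9GB for the same stream.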
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #10124
Closes #10156
Diffstat (limited to 'man/man8')
-rw-r--r-- | man/man8/Makefile.am | 1
-rw-r--r-- | man/man8/zstream.8 | 101
2 files changed, 102 insertions, 0 deletions
diff --git a/man/man8/Makefile.am b/man/man8/Makefile.am
index 8239c2157..b7d26570e 100644
--- a/man/man8/Makefile.am
+++ b/man/man8/Makefile.am
@@ -78,6 +78,7 @@ dist_man_MANS = \
 	zpool-trim.8 \
 	zpool-upgrade.8 \
 	zpool-wait.8 \
+	zstream.8 \
 	zstreamdump.8
 
 nodist_man_MANS = \
diff --git a/man/man8/zstream.8 b/man/man8/zstream.8
new file mode 100644
index 000000000..1c4d3fa9a
--- /dev/null
+++ b/man/man8/zstream.8
@@ -0,0 +1,101 @@
+.\"
+.\" CDDL HEADER START
+.\"
+.\" The contents of this file are subject to the terms of the
+.\" Common Development and Distribution License (the "License").
+.\" You may not use this file except in compliance with the License.
+.\"
+.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE
+.\" or http://www.opensolaris.org/os/licensing.
+.\" See the License for the specific language governing permissions
+.\" and limitations under the License.
+.\"
+.\" When distributing Covered Code, include this CDDL HEADER in each
+.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE.
+.\" If applicable, add the following below this CDDL HEADER, with the
+.\" fields enclosed by brackets "[]" replaced with your own identifying
+.\" information: Portions Copyright [yyyy] [name of copyright owner]
+.\"
+.\" CDDL HEADER END
+.\"
+.\"
+.\" Copyright (c) 2020 by Delphix. All rights reserved.
+.Dd March 25, 2020
+.Dt ZSTREAM 8
+.Os Linux
+.Sh NAME
+.Nm zstream
+.Nd manipulate zfs send streams
+.Sh SYNOPSIS
+.Nm
+.Cm dump
+.Op Fl Cvd
+.Op Ar file
+.Nm
+.Cm redup
+.Op Fl v
+.Ar file
+.Sh DESCRIPTION
+.sp
+.LP
+The
+.Sy zstream
+utility manipulates zfs send streams, which are the output of the
+.Sy zfs send
+command.
+.Bl -tag -width ""
+.It Xo
+.Nm
+.Cm dump
+.Op Fl Cvd
+.Op Ar file
+.Xc
+Print information about the specified send stream, including headers and
+record counts.
+The send stream may either be in the specified
+.Ar file ,
+or provided on standard input.
+.Bl -tag -width "-D"
+.It Fl C
+Suppress the validation of checksums.
+.It Fl v
+Verbose.
+Print metadata for each record.
+.It Fl d
+Dump data contained in each record.
+Implies verbose.
+.El
+.It Xo
+.Nm
+.Cm redup
+.Op Fl v
+.Ar file
+.Xc
+Deduplicated send streams can be generated by using the
+.Nm zfs Cm send Fl D
+command.
+The ability to send deduplicated send streams is deprecated.
+In the future, the ability to receive a deduplicated send stream with
+.Nm zfs Cm receive
+will be removed.
+However, deduplicated send streams can still be received by utilizing
+.Nm zstream Cm redup .
+.Pp
+The
+.Nm zstream Cm redup
+command is provided a
+.Ar file
+containing a deduplicated send stream, and outputs an equivalent
+non-deduplicated send stream on standard output.
+Therefore, a deduplicated send stream can be received by running:
+.Bd -literal
+# zstream redup DEDUP_STREAM_FILE | zfs receive ...
+.Ed
+.Bl -tag -width "-D"
+.It Fl v
+Verbose.
+Print summary of converted records.
+.Sh SEE ALSO
+.Xr zfs 8 ,
+.Xr zfs-send 8 ,
+.Xr zfs-receive 8