path: root/lib
Commit message | Author | Age | Files | Lines
* Add 'zfs umount -u' for encrypted datasets | Tom Caputi | 2019-06-28 | 1 | -1/+27
  This patch adds the ability for the user to unload keys for datasets as they are being unmounted. This is analogous to 'zfs mount -l'.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Alek Pinchuk <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes: #8917
  Closes: #8952
* Remove code for zfs remap | Matthew Ahrens | 2019-06-24 | 2 | -40/+0
  The "zfs remap" command was disabled by 6e91a72fe3ff8bb282490773bd687632f3e8c79d, because it has little utility and introduced some tricky bugs. This commit removes the code for it, the associated ZFS_IOC_REMAP ioctl, and tests.

  Note that the ioctl and property will remain, but have no functionality. This allows older software to fail gracefully if it attempts to use these, and avoids a backwards incompatibility that would be introduced if we renumbered the later ioctls/props.

  Reviewed-by: Tom Caputi <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matthew Ahrens <[email protected]>
  Closes #8944
* Fix error message on promoting encrypted dataset | Tom Caputi | 2019-06-24 | 1 | -0/+10
  This patch corrects the error message reported when attempting to promote a dataset outside of its encryption root.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8905
  Closes #8935
* OpenZFS 9425 - channel programs can be interrupted | Don Brady | 2019-06-22 | 1 | -0/+7
  Problem Statement
  =================
  ZFS Channel program scripts currently require a timeout, so that hung or long-running scripts return a timeout error instead of causing ZFS to get wedged. This limit can currently be set up to 100 million Lua instructions. Even with a limit in place, it would be desirable to have a sys admin (support engineer) be able to cancel a script that is taking a long time.

  Proposed Solution
  =================
  Make it possible to abort a channel program by sending an interrupt signal. In the underlying txg_wait_sync function, switch the cv_wait to a cv_wait_sig to catch the signal. Once a signal is encountered, the dsl_sync_task function can install a Lua hook that will get called before the Lua interpreter executes a new line of code. The dsl_sync_task can resume with a standard txg_wait_sync call and wait for the txg to complete. Meanwhile, the hook will abort the script and indicate that the channel program was canceled. The kernel returns an EINTR to indicate that the channel program run was canceled.

  Porting notes: Added missing return value from cv_wait_sig()

  Authored by: Don Brady <[email protected]>
  Reviewed by: Sebastien Roy <[email protected]>
  Reviewed by: Serapheim Dimitropoulos <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Reviewed by: Sara Hartse <[email protected]>
  Reviewed by: Brian Behlendorf <[email protected]>
  Approved by: Robert Mustacchi <[email protected]>
  Ported-by: Don Brady <[email protected]>
  Signed-off-by: Don Brady <[email protected]>
  OpenZFS-issue: https://www.illumos.org/issues/9425
  OpenZFS-commit: https://github.com/illumos/illumos-gate/commit/d0cb1fb926
  Closes #8904
* Add libnvpair to libzfs pkg-config | Harry Mallon | 2019-06-22 | 1 | -1/+1
  Functions such as `fnvlist_lookup_nvlist` need libnvpair to be linked. The default pkg-config file did not contain it.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Harry Mallon <[email protected]>
  Closes #8919
* Allow unencrypted children of encrypted datasets | Tom Caputi | 2019-06-20 | 3 | -79/+23
  When encryption was first added to ZFS, we made a decision to prevent users from creating unencrypted children of encrypted datasets. The idea was to prevent users from inadvertently leaving some of their data unencrypted. However, since the release of 0.8.0, some legitimate reasons have been brought up for this behavior to be allowed. This patch simply removes this limitation from all code paths that had checks for it and updates the tests accordingly.

  Reviewed-by: Jason King <[email protected]>
  Reviewed-by: Sean Eric Fagan <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8737
  Closes #8870
* Remove dedupditto functionality | Matthew Ahrens | 2019-06-19 | 1 | -9/+2
  If dedup is in use, the `dedupditto` property can be set, causing ZFS to keep an extra copy of data that is referenced many times (>100x). The idea was that this data is more important than other data and thus we want to be really sure that it is not lost if the disk experiences a small amount of random corruption.

  ZFS (and system administrators) rely on the pool-level redundancy to protect their data (e.g. mirroring or RAIDZ). Since the user/sysadmin doesn't have control over what data will be offered extra redundancy by dedupditto, this extra redundancy is not very useful. The bulk of the data is still vulnerable to loss based on the pool-level redundancy. For example, if particle strikes corrupt 0.1% of blocks, you will either be saved by mirror/raidz, or you will be sad. This is true even if dedupditto saved another 0.01% of blocks from being corrupted.

  Therefore, the dedupditto functionality is rarely enabled (i.e. the property is rarely set), and it fulfills its promise of increased redundancy even more rarely.

  Additionally, this feature does not work as advertised (on existing releases), because scrub/resilver did not repair the extra (dedupditto) copy (see https://github.com/zfsonlinux/zfs/pull/8270).

  In summary, this seldom-used feature doesn't work, and even if it did it wouldn't provide useful data protection. It has a non-trivial maintenance burden (again see https://github.com/zfsonlinux/zfs/pull/8270).

  We should remove the dedupditto functionality. For backwards compatibility with the existing CLI, "zpool set dedupditto" will still "succeed" (exit code zero), but won't have any effect. For backwards compatibility with existing pools that had dedupditto enabled at some point, the code will still be able to understand dedupditto blocks and free them when appropriate. However, ZFS won't write any new dedupditto blocks.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Igor Kozhukhov <[email protected]>
  Reviewed-by: Alek Pinchuk <[email protected]>
  Issue #8270
  Closes #8310
* Use ZFS_DEV macro instead of literals | Tomohiro Kusumi | 2019-06-19 | 2 | -4/+4
  The rest of the code/comments use ZFS_DEV, so sync with that.

  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Richard Elling <[email protected]>
  Signed-off-by: Tomohiro Kusumi <[email protected]>
  Closes #8912
* Implement Redacted Send/Receive | Paul Dagnelie | 2019-06-19 | 8 | -179/+949
  Redacted send/receive allows users to send subsets of their data to a target system. One possible use case for this feature is to not transmit sensitive information to a data warehousing, test/dev, or analytics environment. Another is to save space by not replicating unimportant data within a given dataset, for example in backup tools like zrepl.

  Redacted send/receive is a three-stage process. First, a clone (or clones) is made of the snapshot to be sent to the target. In this clone (or clones), all unnecessary or unwanted data is removed or modified. This clone is then snapshotted to create the "redaction snapshot" (or snapshots). Second, the new zfs redact command is used to create a redaction bookmark. The redaction bookmark stores the list of blocks in a snapshot that were modified by the redaction snapshot(s). Finally, the redaction bookmark is passed as a parameter to zfs send. When sending to the snapshot that was redacted, the redaction bookmark is used to filter out blocks that contain sensitive or unwanted information, and those blocks are not included in the send stream. When sending from the redaction bookmark, the blocks it contains are considered as candidate blocks in addition to those blocks in the destination snapshot that were modified since the creation_txg of the redaction bookmark. This step is necessary to allow the target to rehydrate data in the case where some blocks are accidentally or unnecessarily modified in the redaction snapshot.

  The changes to bookmarks to enable fast space estimation involve adding deadlists to bookmarks. There is also logic to manage the life cycles of these deadlists.

  The new size estimation process operates in cases where previously an accurate estimate could not be provided. In those cases, a send is performed where no data blocks are read, reducing the runtime significantly and providing a byte-accurate size estimate.

  Reviewed-by: Dan Kimmel <[email protected]>
  Reviewed-by: Matt Ahrens <[email protected]>
  Reviewed-by: Prashanth Sreenivasa <[email protected]>
  Reviewed-by: John Kennedy <[email protected]>
  Reviewed-by: George Wilson <[email protected]>
  Reviewed-by: Chris Williamson <[email protected]>
  Reviewed-by: Pavel Zhakarov <[email protected]>
  Reviewed-by: Sebastien Roy <[email protected]>
  Reviewed-by: Prakash Surya <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Paul Dagnelie <[email protected]>
  Closes #7958
* Restrict filesystem creation if name referred either '.' or '..' | Tulsi Jain | 2019-06-13 | 1 | -0/+10
  This change restricts filesystem creation if the given name contains either '.' or '..'

  Reviewed-by: Matt Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Richard Elling <[email protected]>
  Signed-off-by: TulsiJain <[email protected]>
  Closes #8842
  Closes #8564
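A minimal sketch of such a name check, assuming the restriction applies to whole '/'-separated components rather than to any dot inside a name. The helper `ex_name_has_dot_component` is hypothetical and is not the libzfs function touched by this commit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not the libzfs implementation): returns true when
 * any '/'-separated component of a dataset name is exactly "." or "..".
 */
bool
ex_name_has_dot_component(const char *name)
{
	const char *p = name;

	while (*p != '\0') {
		size_t len = strcspn(p, "/");	/* length of this component */

		if ((len == 1 && p[0] == '.') ||
		    (len == 2 && p[0] == '.' && p[1] == '.'))
			return (true);

		p += len;
		if (*p == '/')
			p++;			/* skip the separator */
	}
	return (false);
}
```

Under this assumption a name such as `pool/../fs` would be rejected, while `pool/fs.1` would still be accepted.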
* Refactor parent dataset handling in libzfs zfs_rename() | Tomohiro Kusumi | 2019-05-28 | 1 | -9/+4
  For recursive renaming, simplify the code by moving `zhrp` and `parentname` to inner scope. `zhrp` is only used to test existence of a parent dataset for recursive dataset dir scan since ba6a24026c.

  Reviewed by: Brian Behlendorf <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Signed-off-by: Tomohiro Kusumi <[email protected]>
  Closes #8815
* zfs: don't pretty-print objsetid property | loli10K | 2019-05-24 | 1 | -1/+4
  The objsetid property, while being stored as a number, is a dataset identifier and should not be pretty-printed.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Chris Dunlop <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #8784
* Fix wrong assertion in libzfs diff error handling | Ryan Moeller | 2019-05-19 | 1 | -1/+1
  In compare(), all error cases set the error code to EPIPE, so when an error is set, the correct assertion to make is that the error is EPIPE, not EINVAL.

  Reviewed-by: Richard Elling <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #8743
* Fix send/recv lost spill block | Brian Behlendorf | 2019-05-07 | 1 | -0/+7
  When receiving a DRR_OBJECT record the receive_object() function needs to determine how to handle a spill block associated with the object. It may need to be removed or kept depending on how the object was modified at the source.

  This determination is currently accomplished using a heuristic which takes into account the DRR_OBJECT record and the existing object properties. This is a problem because there isn't quite enough information available to do the right thing under all circumstances. For example, when only the block size changes the spill block is removed when it should be kept.

  What's needed to resolve this is an additional flag in the DRR_OBJECT which indicates if the object being received references a spill block. The DRR_OBJECT_SPILL flag was added for this purpose. When set, the object references a spill block and it must be kept. Either it is up to date, or it will be replaced by a subsequent DRR_SPILL record. Conversely, if the object being received doesn't reference a spill block then any existing spill block should always be removed.

  Since previous versions of ZFS do not understand this new flag, additional DRR_SPILL records will be inserted into the stream. This has the advantage of being fully backward compatible. Existing ZFS systems receiving this stream will recreate the spill block if it was incorrectly removed. Updated ZFS versions will correctly ignore the additional spill blocks, which can be identified by checking for the DRR_SPILL_UNMODIFIED flag.

  The small downside to this approach is that it may increase the size of the stream and of the received snapshot on previous versions of ZFS. Additionally, when receiving streams generated by previous unpatched versions of ZFS, spill blocks may still be lost.

  OpenZFS-issue: https://www.illumos.org/issues/9952
  FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=233277
  Reviewed-by: Paul Dagnelie <[email protected]>
  Reviewed-by: Matt Ahrens <[email protected]>
  Reviewed-by: Tom Caputi <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8668
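The flag-based decision described above can be sketched as follows. This is only an illustration: the bit value `EX_DRR_OBJECT_SPILL`, the enum, and `ex_spill_action` are hypothetical stand-ins, and the real receive path may apply additional conditions not shown here.

```c
#include <stdint.h>

/* Hypothetical bit value; the real DRR_OBJECT_SPILL bit may differ. */
#define	EX_DRR_OBJECT_SPILL	(1U << 0)

typedef enum {
	EX_SPILL_KEEP,		/* object references a spill block */
	EX_SPILL_REMOVE		/* any existing spill block is stale */
} ex_spill_action_t;

/*
 * Decide what to do with an existing spill block from the record flag
 * alone, instead of a heuristic over the object's properties.
 */
ex_spill_action_t
ex_spill_action(uint32_t drr_flags)
{
	if (drr_flags & EX_DRR_OBJECT_SPILL)
		return (EX_SPILL_KEEP);
	return (EX_SPILL_REMOVE);
}
```

The point of the sketch is that the sender states the answer explicitly, so the receiver no longer has to infer it.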
* Add feature check for 'zpool resilver' command | Tom Caputi | 2019-05-02 | 2 | -0/+9
  The 'zpool resilver' command requires that the resilver_defer feature is active on the pool. Unfortunately, the check for this was left out of the original patch. This commit simply corrects this so that the command properly returns an error in this case.

  Reviewed by: Brian Behlendorf <[email protected]>
  Reviewed-by: Igor Kozhukhov <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8700
* Correct snprintf() size argument | Tomohiro Kusumi | 2019-04-30 | 1 | -3/+3
  The size argument of snprintf(3) in glibc and of snprintf() in the Linux kernel includes the trailing \0. The snprintf(3) man page describes it as "write at most size bytes (including the trailing null byte ('\0'))", i.e. snprintf() can just take the buffer size.

  e.g. For snprintf() in module/zfs/zfs_ctldir.c, a buffer size is MAXPATHLEN, and a caller is passing MAXPATHLEN to snprintf(), so size should just be `path_len` to do what the caller is trying to do.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Signed-off-by: Tomohiro Kusumi <[email protected]>
  Closes #8692
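As a small illustration of the point about the size argument, the sketch below passes the full buffer size to snprintf() and uses the return value to detect truncation. The helper `ex_build_path` is hypothetical and is not the code changed by this commit.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "<dir>/<name>" into buf.
 * The size passed to snprintf() is the whole buffer, because snprintf()
 * already reserves room for the trailing '\0' within that size.
 * Returns 0 on success, -1 if the output was truncated or on error.
 */
int
ex_build_path(char *buf, size_t buflen, const char *dir, const char *name)
{
	int n = snprintf(buf, buflen, "%s/%s", dir, name);

	if (n < 0 || (size_t)n >= buflen)
		return (-1);
	return (0);
}
```

Subtracting one from the buffer size would be harmless but would waste the last usable byte.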
* Fix typo "/zbin/zpool" -> "/sbin/zpool" | Tomohiro Kusumi | 2019-04-19 | 1 | -1/+1
  Reviewed-by: Richard Elling <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Signed-off-by: Tomohiro Kusumi <[email protected]>
  Closes #8643
* Add option [-V|--version] to emit version string | TerraTech | 2019-04-16 | 1 | -0/+64
  Add the 'zfs version' and 'zpool version' subcommands to display the version of the user space utilities and loaded zfs kernel module. For example:

      $ zfs version
      zfs-0.8.0-rc3_169_g67e0366b88
      zfs-kmod-0.8.0-rc3_169_g67e0366b88

  The '-V' and '--version' aliases were added to support the common convention of using 'zfs --version' to obtain the version information.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Signed-off-by: TerraTech <[email protected]>
  Closes #2501
  Closes #8567
* Fix hierarchy misspellings | Richard Laager | 2019-04-14 | 1 | -5/+5
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reported-by: Matthew Ahrens <[email protected]>
  Signed-off-by: Richard Laager <[email protected]>
  Closes #8563
  Closes #8622
* Don't assume pthread_t is uint_t for portability | Tomohiro Kusumi | 2019-04-09 | 1 | -1/+2
  POSIX doesn't define pthread_t as uint_t. It could be a pointer. This code causes the compile error below on a platform that uses a pointer for pthread_t.

  --
  kernel.c:815:25: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
      (void) printf("%u ", (uint_t)pthread_self());

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Igor Kozhukhov <[email protected]>
  Signed-off-by: Tomohiro Kusumi <[email protected]>
  Closes #8558
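One way to print a thread identifier without assuming its width is to go through `uintptr_t`, which is wide enough for either an integer or a pointer representation. This is a sketch of that general approach, not necessarily the change made in this commit, and `ex_format_thread_id` is a hypothetical name. It would still not compile on a platform where pthread_t is a structure.

```c
#include <inttypes.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the calling thread's id into buf.
 * Casting to uintptr_t avoids the "cast from pointer to integer of
 * different size" error when pthread_t is a pointer type.
 * Returns the snprintf() result (number of characters needed).
 */
int
ex_format_thread_id(char *buf, size_t buflen)
{
	uintptr_t id = (uintptr_t)pthread_self();

	return (snprintf(buf, buflen, "%" PRIuPTR, id));
}
```

The `PRIuPTR` macro keeps the format specifier in step with the width of `uintptr_t` on the target platform.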
* Fix buffer length in strlcpy() | Brian Behlendorf | 2019-04-08 | 1 | -1/+1
  The length passed to strlcpy() was the size of zv_value when it should have been the size of zc_name. Correct this typo.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed-by: Igor Kozhukhov <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8595
  Closes #8596
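The class of bug fixed here can be illustrated with a sketch in which the copy is sized by the destination field itself. The structure `ex_cmd` and both functions are hypothetical; `ex_strlcpy` is a local stand-in with strlcpy() semantics, included only because not every libc provides strlcpy().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical structure with two differently sized string fields. */
struct ex_cmd {
	char name[8];
	char value[32];
};

/* Local stand-in with strlcpy() semantics: always NUL-terminates. */
size_t
ex_strlcpy(char *dst, const char *src, size_t dstsize)
{
	size_t srclen = strlen(src);

	if (dstsize != 0) {
		size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return (srclen);
}

/*
 * Size the copy by the destination field. Using sizeof (cmd->value)
 * here would allow writes past the end of cmd->name.
 */
void
ex_set_name(struct ex_cmd *cmd, const char *name)
{
	(void) ex_strlcpy(cmd->name, name, sizeof (cmd->name));
}
```

Writing `sizeof (cmd->name)` next to `cmd->name` keeps the two in sync if the field is ever resized.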
* Restrict kstats and print real pointers | Sara Hartse | 2019-04-04 | 1 | -0/+1
  There are several places where we use zfs_dbgmsg and %p to print pointers. In the Linux kernel, these values are obfuscated to prevent information leaks, which means the pointers aren't very useful for debugging crash dumps. We decided to restrict the permissions of dbgmsg (and some other kstats while we were at it) and print pointers with %px in zfs_dbgmsg as well as spl_dumpstack.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: John Gallagher <[email protected]>
  Signed-off-by: sara hartse <[email protected]>
  Closes #8467
  Closes #8476
* Append snapshot name to "TIME SENT SNAPSHOT" output | TerraTech | 2019-04-01 | 1 | -1/+2
  Simply appends zhp->zfs_name to the "TIME SENT SNAPSHOT" output.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Matt Ahrens <[email protected]>
  Reviewed-by: Richard Elling <[email protected]>
  Signed-off-by: TerraTech <[email protected]>
  Closes #8543
* Add TRIM support | Brian Behlendorf | 2019-03-29 | 4 | -41/+224
  UNMAP/TRIM support is a frequently-requested feature to help prevent performance from degrading on SSDs and on various other SAN-like storage back-ends. By issuing UNMAP/TRIM commands for sectors which are no longer allocated the underlying device can often more efficiently manage itself.

  This TRIM implementation is modeled on the `zpool initialize` feature which writes a pattern to all unallocated space in the pool. The new `zpool trim` command uses the same vdev_xlate() code to calculate what sectors are unallocated, the same per-vdev TRIM thread model and locking, and the same basic CLI for a consistent user experience. The core difference is that instead of writing a pattern it will issue UNMAP/TRIM commands for those extents.

  The zio pipeline was updated to accommodate this by adding a new ZIO_TYPE_TRIM type and associated spa taskq. This new type makes it straightforward to add the platform specific TRIM/UNMAP calls to vdev_disk.c and vdev_file.c. These new ZIO_TYPE_TRIM zios are handled largely the same way as ZIO_TYPE_READs or ZIO_TYPE_WRITEs. This makes it possible to largely avoid changing the pipeline; one exception is that TRIM zios may exceed the 16M block size limit since they contain no data.

  In addition to the manual `zpool trim` command, a background automatic TRIM was added and is controlled by the 'autotrim' property. It relies on the exact same infrastructure as the manual TRIM. However, instead of relying on the extents in a metaslab's ms_allocatable range tree, a ms_trim tree is kept per metaslab. When 'autotrim=on', ranges added back to the ms_allocatable tree are also added to the ms_trim tree. The ms_trim tree is then periodically consumed by an autotrim thread which systematically walks a top level vdev's metaslabs.

  Since the automatic TRIM will skip ranges it considers too small there is value in occasionally running a full `zpool trim`. This may occur when the freed blocks are small and not enough time was allowed to aggregate them. An automatic TRIM and a manual `zpool trim` may be run concurrently, in which case the automatic TRIM will yield to the manual TRIM.

  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed-by: Tim Chase <[email protected]>
  Reviewed-by: Matt Ahrens <[email protected]>
  Reviewed-by: George Wilson <[email protected]>
  Reviewed-by: Serapheim Dimitropoulos <[email protected]>
  Contributions-by: Saso Kiselkov <[email protected]>
  Contributions-by: Tim Chase <[email protected]>
  Contributions-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8419
  Closes #598
* Send stream should only list included snaps | Tom Caputi | 2019-03-28 | 1 | -24/+83
  Currently, zfs send streams will include a list of all snapshots on the source side if the '-p' option is provided. This can cause performance problems on the receive side, especially if those snapshots aren't present on the destination. These problems arise because guid_to_name(), which is used for several receive side functions, will search the entire receive-side pool if it can't find a snapshot with a matching guid.

  This patch corrects the issue by ensuring only streams that require this list of snapshots include them.

  Reviewed-by: Alek Pinchuk <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Matt Ahrens <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8533
* ZFS Reads may result in unnecessary calls to zil_commit | George Wilson | 2019-03-22 | 1 | -1/+0
  ZFS supports O_RSYNC for read operations and when specified will ensure the same level of data integrity that O_DSYNC and O_SYNC provide for writes. O_RSYNC by itself has no effect so it must be combined with either O_DSYNC or O_SYNC. However, many platforms don't support O_RSYNC and have mapped O_SYNC to mean O_RSYNC within ZFS. This is incorrect and causes unnecessary calls to zil_commit. Only platforms which support O_RSYNC should implement the zil_commit functionality in the read code path.

  Reviewed-by: Matt Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: George Wilson <[email protected]>
  Closes #8523
* Improve `zpool labelclear` | Brian Behlendorf | 2019-03-21 | 1 | -3/+46
  1) As implemented the `zpool labelclear` command overwrites the calculated offsets of all four vdev labels even when only a single valid label is found. If the device has been re-purposed but still contains a valid label this can result in space no longer owned by ZFS being zeroed. Prevent this by verifying every label removed is intact before it's overwritten.

  2) Address a small bug in zpool_do_labelclear() which prevented labelclear from working on file vdevs. Only block devices support BLKFLSBUF, try the ioctl() but when it's reported as unsupported this should not be fatal.

  3) Fix `zpool labelclear` so it can be run on vdevs which were removed from the pool with `zpool remove`. Additionally, allow intact but partial labels to be cleared as in the case of a failed `zpool attach` or `zpool replace`.

  4) Remove LABELCLEAR and LABELREAD variables for test cases.

  Reviewed-by: Matt Ahrens <[email protected]>
  Reviewed-by: Tim Chase <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8500
  Closes #8373
  Closes #6261
* Add space in error message | Tom Caputi | 2019-03-19 | 1 | -2/+2
  This patch simply adds a missing space in the ZFS_ERR_FROM_IVSET_GUID_MISSING error message.

  Reviewed-by: Richard Laager <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Don Brady <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8514
* Detect and prevent mixed raw and non-raw sends | Tom Caputi | 2019-03-13 | 3 | -0/+16
  Currently, there is an issue in the raw receive code where raw receives are allowed to happen on top of previously non-raw received datasets. This is a problem because the source-side dataset doesn't know about how the blocks on the destination were encrypted. As a result, any MAC in the objset's checksum-of-MACs tree that is a parent of both blocks encrypted on the source and blocks encrypted by the destination will be incorrect. This will result in authentication errors when we decrypt the dataset.

  This patch fixes this issue by adding a new check to the raw receive code. The code now maintains an "IVset guid", which acts as an identifier for the set of IVs used to encrypt a given snapshot. When a snapshot is raw received, the destination snapshot will take this value from the DRR_BEGIN payload. Non-raw receives and normal "zfs snap" operations will cause ZFS to generate a new IVset guid. When a raw incremental stream is received, ZFS will check that the "from" IVset guid in the stream matches that of the "from" destination snapshot. If they do not match, the code will error out the receive, preventing the problem.

  This patch requires an on-disk format change to add the IVset guids to snapshots and bookmarks. As a result, this patch has errata handling and a tunable to help affected users resolve the issue with as little interruption as possible.

  Reviewed-by: Paul Dagnelie <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Matt Ahrens <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8308
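The comparison described above might look roughly like the sketch below. It assumes the check applies only to raw incremental receives; the function name, parameters, and the error value `EX_ERR_IVSET_GUID_MISMATCH` are hypothetical and do not reproduce the real receive code or its error codes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical error value used only in this sketch. */
#define	EX_ERR_IVSET_GUID_MISMATCH	1

/*
 * Returns 0 when the receive may proceed. For a raw incremental stream
 * the "from" IVset guid carried by the stream must equal the IVset guid
 * of the "from" snapshot already on the destination. Non-raw receives
 * skip the check because they generate a fresh IVset guid.
 */
int
ex_check_from_ivset_guid(bool raw_incremental, uint64_t stream_from_guid,
    uint64_t dest_from_guid)
{
	if (!raw_incremental)
		return (0);
	if (stream_from_guid != dest_from_guid)
		return (EX_ERR_IVSET_GUID_MISMATCH);
	return (0);
}
```

Failing early on a mismatch is what prevents the authentication errors that would otherwise only surface later, at decryption time.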
* Avoid retrieving unused snapshot props | Alek P | 2019-03-12 | 3 | -43/+90
  This patch modifies the zfs_ioc_snapshot_list_next() ioctl to enable it to take input parameters that alter the way looping through the list of snapshots is performed. The idea here is to restrict functions that throw away some of the snapshots returned by the ioctl to a range of snapshots that these functions actually use. This improves efficiency and execution speed for some rollback and send operations.

  Reviewed-by: Tom Caputi <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Signed-off-by: Alek Pinchuk <[email protected]>
  Closes #8077
* config: better libtirpc detection | Rafael Kitover | 2019-03-02 | 3 | -17/+8
  Improve the autoconf code for finding libtirpc and do not assume the headers are in /usr/include/tirpc.

  Also remove this assumption from the `rpc/xdr.h` header in libspl and use the same `#include_next` mechanism that is used for other libspl headers.

  Include pkg.m4 from pkg-config in config/ for PKG_CHECK_MODULES(), the file license allows this.

  Include ax_save_flags.m4 and ax_restore_flags.m4 from autoconf-archive, the file licenses are compatible. Use the 2012 versions so as not to rely on a more recent autoconf feature AS_VAR_COPY(), which breaks some build slaves.

  Add new macro library `config/find_system_library.m4` which defines the FIND_SYSTEM_LIBRARY() macro which is a convenience wrapper over using PKG_CHECK_MODULES() with a fallback to standard library locations and some sanity checks.

  The parameters are:

  ```
  FIND_SYSTEM_LIBRARY(VARIABLE-PREFIX, MODULE, HEADER, HEADER-PREFIXES, LIBRARY, FUNCTIONS, [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
  ```

  `HEADER-PREFIXES` and `FUNCTIONS` are comma-separated m4 lists.

  For libtirpc we are using:

  ```
  FIND_SYSTEM_LIBRARY(LIBTIRPC, [libtirpc], [rpc/xdr.h], [tirpc], [tirpc], [xdrmem_create], [], [...])
  ```

  The headers are first checked for without the prefixes and then with.

  This system works with pkg-config and falls back on checking standard header/library locations, it can be easily overridden by the user by setting the `PREFIX_CFLAGS` and `PREFIX_LIBS` variables which are automatically added to the `./configure --help` output.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Rafael Kitover <[email protected]>
  Closes #7422
  Closes #8313
* Sort by full path name instead of by GUID when importing | kpande | 2019-02-26 | 1 | -3/+3
  Preferentially sort by the full path name instead of GUID when determining which device links to use. This helps ensure that the pool vdevs are named consistently when multiple links for a device appear in the same directory. For example, the /dev/disk/by-id/scsi* and /dev/disk/by-id/wwn* links.

  Reviewed-by: Alek Pinchuk <[email protected]>
  Reviewed-by: Richard Elling <[email protected]>
  Authored-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Kash Pande <[email protected]>
  Closes #8108
  Closes #8440
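A comparator along these lines can be sketched for qsort(). The structure `ex_link` and the function `ex_link_compare` are hypothetical and only illustrate ordering by full path first with the GUID as a tie-breaker; the real import code may weigh other criteria before either of these.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one device link found during import. */
struct ex_link {
	const char *path;	/* full path, e.g. /dev/disk/by-id/... */
	uint64_t guid;		/* vdev guid read from the label */
};

/*
 * qsort() comparator: order by full path name first, and fall back to
 * the GUID only when the paths are identical.
 */
int
ex_link_compare(const void *a, const void *b)
{
	const struct ex_link *la = a;
	const struct ex_link *lb = b;
	int c = strcmp(la->path, lb->path);

	if (c != 0)
		return (c < 0 ? -1 : 1);
	if (la->guid != lb->guid)
		return (la->guid < lb->guid ? -1 : 1);
	return (0);
}
```

With this ordering, two links in the same directory sort the same way on every import regardless of the GUID values involved.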
* Improve error message for zfs create with @ or # in name | Damian Wojsław | 2019-02-25 | 1 | -38/+39
  Reorder the `zfs create` error messages in order to return the most specific one first. If none of them apply then an expanded version of the invalid name message is used.

  Reviewed by: Tom Caputi <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Signed-off-by: Damian Wojsław <[email protected]>
  Closes #8155
  Closes #8352
* Fix zdb crash | Igor K | 2019-02-19 | 1 | -2/+2
  We have to use umem_free() instead of free() when the memory was allocated with umem_zalloc().

  Reviewed-by: Olaf Faaland <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Igor Kozhukhov <[email protected]>
  Closes #8402
* zfs should optionally send holds | Paul Zuchowski | 2019-02-15 | 1 | -12/+84
  Add the -h switch to the zfs send command to send dataset holds. If holds are present in the stream, zfs receive will create them on the target dataset, unless the zfs receive -h option is used to skip receiving holds.

  Reviewed-by: Alek Pinchuk <[email protected]>
  Reviewed-by: loli10K <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed by: Paul Dagnelie <[email protected]>
  Signed-off-by: Paul Zuchowski <[email protected]>
  Closes #7513
* ZVOLs should not be allowed to have children | loli10K | 2019-02-08 | 3 | -13/+41
  zfs create, receive and rename can bypass this hierarchy rule. Update both userland and kernel module to prevent this issue and use pyzfs unit tests to exercise the ioctls directly.

  Note: this commit slightly changes the zfs_ioc_create() ABI. This makes it possible to differentiate a generic error (EINVAL) from the specific case where we tried to create a dataset below a ZVOL (ZFS_ERR_WRONG_PARENT).

  Reviewed-by: Paul Dagnelie <[email protected]>
  Reviewed-by: Matt Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Tom Caputi <[email protected]>
  Signed-off-by: loli10K <[email protected]>
* Include third party licenses in dist tarballs | Neal Gompa (ニール・ゴンパ) | 2019-01-08 | 1 | -0/+3
  Since the merge of the Linux Solaris Porting Layer source tree into the ZFS codebase, ZFS is now a double-licensed codebase, with the former SPL codebase retaining its license (GPLv2+) within the ZFS source tree.

  However, the license files for SPL were not being included in the tarballs generated by autotools. This change corrects that.

  In addition, all the other third party licenses in the codebase are now properly declared to be included in the dist tarballs.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Neal Gompa <[email protected]>
  Closes #8242
* OpenZFS 9102 - zfs should be able to initialize storage devices (George Wilson, 2019-01-07, 4 files, -0/+139)

  PROBLEM
  ========
  The first access to a block incurs a performance penalty on some
  platforms (e.g. AWS's EBS, VMware VMDKs). Therefore we recommend that
  volumes are "thick provisioned", where supported by the platform
  (VMware). This can create a large delay in getting new virtual machines
  up and running (or adding storage to an existing Engine). If the thick
  provision step is omitted, write performance will be suboptimal until
  all blocks on the LUN have been written.

  SOLUTION
  =========
  This feature introduces a way to 'initialize' the disks at install or in
  the background to make sure we don't incur this first-access penalty.

  When an entire LUN is added to ZFS, we make all space available
  immediately, and allow ZFS to find unallocated space and zero it out.
  This works with concurrent writes to arbitrary offsets, ensuring that we
  don't zero out something that has been (or is in the middle of being)
  written. This scheme can also be applied to existing pools (affecting
  only free regions on the vdev).

  Detailed design:
  - new subcommand: zpool initialize [-cs] <pool> [<vdev> ...]
    - start, suspend, or cancel initialization
  - Creates new open-context thread for each vdev
  - Thread iterates through all metaslabs in this vdev
  - Each metaslab:
    - select a metaslab
    - load the metaslab
    - mark the metaslab as being zeroed
    - walk all free ranges within that metaslab and translate them to
      ranges on the leaf vdev
    - issue a "zeroing" I/O on the leaf vdev that corresponds to a free
      range on the metaslab we're working on
    - continue until all free ranges for this metaslab have been "zeroed"
    - reset/unmark the metaslab being zeroed
    - if more metaslabs exist, then repeat above tasks.
    - if no more metaslabs, then we're done.
  - progress for the initialization is stored on-disk in the vdev’s leaf
    zap object. The following information is stored:
    - the last offset that has been initialized
    - the state of the initialization process (i.e. active, suspended, or
      canceled)
    - the start time for the initialization
  - progress is reported via the zpool status command and shows
    information for each of the vdevs that are initializing

  Porting notes:
  - Added zfs_initialize_value module parameter to set the pattern written
    by "zpool initialize".
  - Added zfs_vdev_{initializing,removal}_{min,max}_active module options.

  Authored by: George Wilson <[email protected]>
  Reviewed by: John Wren Kennedy <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Pavel Zakharov <[email protected]>
  Reviewed by: Prakash Surya <[email protected]>
  Reviewed by: loli10K <[email protected]>
  Reviewed by: Brian Behlendorf <[email protected]>
  Approved by: Richard Lowe <[email protected]>
  Signed-off-by: Tim Chase <[email protected]>
  Ported-by: Tim Chase <[email protected]>

  OpenZFS-issue: https://www.illumos.org/issues/9102
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c3963210eb
  Closes #8230
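
The per-metaslab walk described in the design above can be illustrated with a small user-space sketch. This is not the code from the commit: the "device" is a plain byte array, `struct range` and `initialize_ranges()` are hypothetical names, and the free ranges are assumed to be sorted by offset and non-overlapping. It only shows the idea of writing a pattern over free ranges while skipping everything below a saved "last offset" so that a suspended run can resume.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical free range on a leaf device: [start, start + len). */
struct range {
	size_t start;
	size_t len;
};

/*
 * Write `pattern` over every free range at or beyond `resume_from`.
 * Allocated space between ranges is never touched. Returns the offset
 * just past the last byte written, which a caller could persist as the
 * "last offset initialized" progress marker. Ranges are assumed sorted.
 */
static size_t
initialize_ranges(uint8_t *dev, const struct range *freep, size_t nfree,
    size_t resume_from, uint8_t pattern)
{
	size_t last = resume_from;

	for (size_t i = 0; i < nfree; i++) {
		size_t start = freep[i].start;
		size_t end = start + freep[i].len;

		if (end <= resume_from)
			continue;	/* fully initialized by an earlier run */
		if (start < resume_from)
			start = resume_from;	/* partially done: resume mid-range */

		memset(dev + start, pattern, end - start);
		last = end;
	}
	return (last);
}
```

Passing the returned offset back in as `resume_from` on a later call mirrors the suspend and resume behaviour that the on-disk progress record makes possible.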
* Add missing MMP status code to libzfs_status (bunder2015, 2019-01-03, 1 file, -19/+33)

  When MMP was merged the status codes in libzfs_status were not updated
  to add the status code for ZPOOL_STATUS_IO_FAILURE_MMP. This commit
  corrects this and adds comments to help keep track of which code is
  used for which status.

  Reviewed-by: Olaf Faaland <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: bunder2015 <[email protected]>
  Closes #8148
  Closes #8222
* OpenZFS 9284 - arc_reclaim_thread has 2 jobs (Brad Lewis, 2018-12-26, 1 file, -0/+6)

  Following the fix for 9018 (Replace kmem_cache_reap_now() with
  kmem_cache_reap_soon), the arc_reclaim_thread() no longer blocks while
  reaping. However, the code is still confusing and error-prone, because
  this thread has two responsibilities. We should instead separate this
  into two threads each with their own responsibility:

  1. keep `arc_size` under `arc_c`, by calling `arc_adjust()`, which
     improves `arc_is_overflowing()`
  2. keep enough free memory in the system, by calling
     `arc_kmem_reap_now()` plus `arc_shrink()`, which improves
     `arc_available_memory()`.

  Furthermore, we can use the zthr infrastructure to separate the "should
  we do something" from "do it" parts of the logic, and normalize the
  start up / shut down of the threads.

  Authored by: Brad Lewis <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Reviewed by: Serapheim Dimitropoulos <[email protected]>
  Reviewed by: Pavel Zakharov <[email protected]>
  Reviewed by: Dan Kimmel <[email protected]>
  Reviewed by: Paul Dagnelie <[email protected]>
  Reviewed by: Dan McDonald <[email protected]>
  Reviewed by: Tim Kordas <[email protected]>
  Reviewed by: Tim Chase <[email protected]>
  Reviewed by: Brian Behlendorf <[email protected]>
  Ported-by: Brad Lewis <[email protected]>
  Signed-off-by: Brad Lewis <[email protected]>

  OpenZFS-issue: https://www.illumos.org/issues/9284
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/de753e34f9
  Closes #8165
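
The split between "should we do something" and "do it" can be pictured with a pair of callbacks. The sketch below is only an illustration of that pattern and is not the zthr API: `struct worker`, `worker_step()`, and the toy `struct cache` are hypothetical, and no real thread is created. A real implementation would run the step in a loop on its own thread and sleep while the check returns false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback types: one decides, the other acts. */
typedef bool (*check_fn)(void *arg);
typedef void (*work_fn)(void *arg);

struct worker {
	check_fn check;	/* "should we do something?" */
	work_fn work;	/* "do it" */
	void *arg;
};

/* One pass of the worker: returns true if work was performed. */
static bool
worker_step(struct worker *w)
{
	if (!w->check(w->arg))
		return (false);
	w->work(w->arg);
	return (true);
}

/* Toy stand-in for a cache that must stay at or below a target size. */
struct cache {
	size_t size;
	size_t target;
};

static bool
cache_needs_adjust(void *arg)
{
	struct cache *c = arg;
	return (c->size > c->target);
}

static void
cache_adjust(void *arg)
{
	struct cache *c = arg;
	c->size = c->target;	/* pretend to evict down to the target */
}
```

Keeping each responsibility in its own check/work pair is what lets the two jobs named in the commit message run as separate workers with identical start-up and shut-down handling.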
* OpenZFS 9559 - zfs diff handles files on delete queue in fromsnap poorly (Paul Dagnelie, 2018-12-14, 1 file, -6/+6)

  Authored by: Paul Dagnelie <[email protected]>
  Reviewed by: Joshua M. Clulow <[email protected]>
  Reviewed by: Tom Caputi <[email protected]>
  Approved by: Richard Lowe <[email protected]>
  Ported-by: Brian Behlendorf <[email protected]>

  OpenZFS-issue: https://www.illumos.org/issues/9559
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/d7e45412
  Closes #8211
* OpenZFS 9630 - add lzc_rename and lzc_destroy to libzfs_core (Andriy Gapon, 2018-12-14, 3 files, -34/+54)

  Porting Notes:
  * Additional changes to recv_rename_impl() were required due to
    encryption code not being merged in OpenZFS yet.
  * libzfs_core python bindings (pyzfs) were updated to fully support
    both lzc_rename() and lzc_destroy()

  Authored by: Andriy Gapon <[email protected]>
  Reviewed by: Andy Stormont <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Reviewed by: Serapheim Dimitropoulos <[email protected]>
  Reviewed by: Brian Behlendorf <[email protected]>
  Approved by: Dan McDonald <[email protected]>
  Ported-by: loli10K <[email protected]>

  OpenZFS-issue: https://www.illumos.org/issues/9630
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/049ba63
  Closes #8207
* Check for strlcat and strlcpy (Brian Behlendorf, 2018-12-11, 4 files, -49/+124)

  This partially reverts commit 8005ca4 by moving the strlcat() and
  strlcpy() compatibility implementations back to their original
  location. In addition, these two functions were added to the
  AC_CHECK_FUNCS macro. When these functions are available from the C
  library, HAVE_STRLCAT and HAVE_STRLCPY will be defined and the library
  version is used. Otherwise the compatibility version is built.

  Reviewed-by: Sebastian Gottschall <[email protected]>
  Reviewed-by: Alek Pinchuk <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8157
  Closes #8202
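
The selection between the C library's function and a fallback can be sketched as below. This is a minimal illustration, not the compatibility code from the commit: `compat_strlcpy()` and the `copy_string()` macro are hypothetical names, and `HAVE_STRLCPY` is assumed to come from the configure-generated header when the AC_CHECK_FUNCS probe succeeds.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical fallback with strlcpy() semantics: copy at most
 * dstsize - 1 bytes, always NUL-terminate when dstsize > 0, and return
 * strlen(src) so the caller can detect truncation (result >= dstsize).
 */
static size_t
compat_strlcpy(char *dst, const char *src, size_t dstsize)
{
	size_t srclen = strlen(src);

	if (dstsize != 0) {
		size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return (srclen);
}

/* Prefer the C library's version when the configure check found one. */
#ifdef HAVE_STRLCPY
#define	copy_string(d, s, n)	strlcpy((d), (s), (n))
#else
#define	copy_string(d, s, n)	compat_strlcpy((d), (s), (n))
#endif
```

A return value greater than or equal to the buffer size signals that the source did not fit and the copy was truncated.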
* OpenZFS 9880 - Race in ZFS parallel mount (Andy Fiddaman, 2018-12-07, 1 file, -3/+31)

  Porting Notes:
  * Not required for Linux since the zone is always global. But we'll
    want this change if we start using the zones code.

  Authored by: Andy Fiddaman <[email protected]>
  Reviewed by: Jason King <[email protected]>
  Reviewed by: Sebastien Roy <[email protected]>
  Reviewed by: Tom Caputi <[email protected]>
  Approved by: Joshua M. Clulow <[email protected]>
  Ported-by: Brian Behlendorf <[email protected]>

  OpenZFS-issue: https://www.illumos.org/issues/9880
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/bc4c0ff134
  Closes #8189
* Fix error message when zfs module is not loaded (Tom Caputi, 2018-12-07, 1 file, -3/+3)

  This patch corrects a small issue where the wrong error message was
  being displayed when the zfs kernel module was not loaded. This also
  avoids waiting for the (by default) 10s timeout to see if the /dev/zfs
  device appears.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8187
* Fix 'zfs receive -F' message when destination has snapshots (loli10K, 2018-12-05, 1 file, -1/+1)

  When receiving a send stream with forced rollback on a dataset with
  snapshots zfs suggests said snapshots must be removed to successfully
  receive the stream; however the message is misleading because it prints
  the dataset name instead of one of its snapshots.

    $ sudo zfs snap pp/recvfs@snap-orig
    $ sudo zfs recv -F pp/recvfs < sendstream
    cannot receive new filesystem stream: destination has snapshots (eg. pp/recvfs)
    must destroy them to overwrite it

  This change simply restores the snapshot name in the error message.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #8167
* Fix ASSERT in zfs_receive_one() (LOLi, 2018-12-04, 1 file, -2/+3)

  This commit fixes the following ASSERT in zfs_receive_one() when
  receiving a send stream from a root dataset with the "-e" option:

    $ sudo zfs snap source@snap
    $ sudo zfs send source@snap | sudo zfs recv -e destination/recv
    chopprefix > drrb->drr_toname
    ASSERT at libzfs_sendrecv.c:3804:zfs_receive_one()

  Reviewed-by: Tom Caputi <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed by: Paul Dagnelie <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #8121
* Move strlcat, strlcpy, and strnlen (Brian Behlendorf, 2018-11-20, 5 files, -158/+50)

  Move strlcat() and strlcpy() from .c source files into the libspl
  string.h header. By changing these compatibility functions to static
  inline functions they can be included as needed without requiring
  linking with the libspl.so library.

  Remove strnlen() which is barely used in the source, and has been
  provided by glibc since v2.10.

  Finally, convert four instances of strncpy() to strlcpy() in
  libzfs_input_check.c which were causing build warnings when compiling
  with gcc 8.2.1. For example:

    libzfs_input_check.c: In function ‘zfs_destroy’:
    libzfs_input_check.c:651:9: error: ‘strncpy’ specified bound \
        4096 equals destination size [-Werror=stringop-truncation]
      (void) strncpy(zc.zc_name, dataset, sizeof (zc.zc_name));
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Reviewed-by: Olaf Faaland <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8116
* OpenZFS 8115 - parallel zfs mount (Sebastien Roy, 2018-11-15, 2 files, -98/+368)

  Porting Notes:
  * Use thread pools (tpool) API instead of introducing taskq interfaces
    to libzfs.
  * Use pthread_mutex_t for locks as mutex_t isn't available.
  * Ignore alternative libshare initialization since OpenZFS-7955 is not
    present on zfsonlinux.

  Authored by: Sebastien Roy <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Pavel Zakharov <[email protected]>
  Reviewed by: Brad Lewis <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed by: Paul Dagnelie <[email protected]>
  Reviewed by: Prashanth Sreenivasa <[email protected]>
  Authored by: Brian Behlendorf <[email protected]>
  Approved by: Matt Ahrens <[email protected]>
  Ported-by: Don Brady <[email protected]>

  OpenZFS-issue: https://www.illumos.org/issues/8115
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a3f0e2b569
  Closes #8092
* Fix libudev dependency in libzutil (Don Brady, 2018-11-06, 1 file, -0/+2)

  ZFS should be able to build without libudev installed. The recent
  change for libzutil inadvertently broke that. Make the libudev code
  conditional in zutil_import.c to resolve the build failure.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Don Brady <[email protected]>
  Closes #8097
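
Making library-specific code conditional usually means guarding both the include and every call site with the same preprocessor symbol. The sketch below illustrates the idea and is not the change made to zutil_import.c: `lookup_serial()` is a hypothetical helper, and `HAVE_LIBUDEV` is assumed to be the symbol the build system defines when libudev is found. Without that symbol the file compiles and links with no reference to libudev at all.

```c
#include <stddef.h>
#include <string.h>

#ifdef HAVE_LIBUDEV
#include <libudev.h>
#endif

/*
 * Hypothetical helper: copy the ID_SERIAL property of a block device
 * (e.g. "sda") into buf. Returns 0 on success, or -1 when the property
 * is unavailable or udev support was not compiled in.
 */
static int
lookup_serial(const char *sysname, char *buf, size_t buflen)
{
#ifdef HAVE_LIBUDEV
	int ret = -1;
	struct udev *udev = udev_new();
	struct udev_device *dev;

	if (udev == NULL)
		return (-1);

	dev = udev_device_new_from_subsystem_sysname(udev, "block", sysname);
	if (dev != NULL) {
		const char *val =
		    udev_device_get_property_value(dev, "ID_SERIAL");

		if (val != NULL && strlen(val) < buflen) {
			strcpy(buf, val);	/* length checked above */
			ret = 0;
		}
		udev_device_unref(dev);
	}
	udev_unref(udev);
	return (ret);
#else
	/* Built without libudev: report "not available" to the caller. */
	(void) sysname;
	(void) buf;
	(void) buflen;
	return (-1);
#endif
}
```

Callers treat -1 as "no udev information" and fall back to whatever they would do on a system without udev, so the rest of the code needs no conditionals of its own.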