Commit message | Author | Age | Files | Lines
* OpenZFS 9962 - zil_commit should omit cache thrash | Prakash Surya | 2018-12-07 | 6 | -78/+206

  As a result of the changes made in 8585, it's possible for an excessive number of vdev flush commands to be issued under some workloads.

  Specifically, when the workload consists of mostly async write activity, interspersed with some sync write and/or fsync activity, we can end up issuing more flush commands to the underlying storage than is actually necessary. As a result of these flush commands, the write latency and overall throughput of the pool can be adversely affected (latency increases, throughput decreases).

  Currently, any time an lwb completes, the vdev(s) written to as a result of that lwb will be issued a flush command. The intention is for the data written to that vdev to be on stable storage before we tell any waiting threads that their data is safe on disk.

  The problem with this scheme is that sometimes an lwb will not have any threads waiting for it to complete. This can occur when there's async activity that gets "converted" to sync requests, as a result of calling the zil_async_to_sync() function via zil_commit_impl(). When this occurs, the current code may issue many lwbs that don't have waiters associated with them, resulting in many flush commands, potentially to the same vdev(s).

  For example, given a pool with a single vdev, and a single fsync() call that results in 10 lwbs being written out (e.g. due to other async writes), that will result in 10 flush commands to that single vdev (a flush issued after each lwb write completes). Ideally, we'd only issue a single flush command to that vdev, after all 10 lwb writes completed.

  Further, and most important as it pertains to this change, since the flush commands often have a large impact on the performance of the pool's underlying storage, unnecessarily issuing these flush commands can hurt the performance of the lwb writes themselves. Thus, we need to avoid issuing flush commands when possible, in order to achieve the best possible performance out of the pool's underlying storage.

  This change attempts to address this problem by changing the ZIL's logic to only issue a vdev flush command when it detects an lwb that has a thread waiting for it to complete. When an lwb does not have threads waiting for it, the responsibility of issuing the flush command to the vdevs involved with that lwb's write is passed on to the "next" lwb. Only once a write for an lwb with waiters completes do we issue the vdev flush command(s). As a result, when we issue the flush(es), we will issue them to the vdevs involved with that specific lwb's write, but potentially also to vdevs involved with "previous" lwb writes (i.e. if the previous lwbs did not have waiters associated with them).

  Thus, in our prior example with 10 lwbs, only once the last lwb completes (which will be the lwb containing the waiter for the thread that called fsync) will we issue the vdev flush command; all of the other lwbs will find they have no waiters, so they'll pass the responsibility of the flush to the "next" lwb (until reaching the last lwb that has the waiter).

  Porting Notes:
  * Reconciled conflicts with the fastwrite feature.

  Authored by: Prakash Surya <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Reviewed by: Brad Lewis <[email protected]>
  Reviewed by: Patrick Mooney <[email protected]>
  Reviewed by: Jerry Jelinek <[email protected]>
  Approved by: Joshua M. Clulow <[email protected]>
  Ported-by: Signed-off-by: Brian Behlendorf <[email protected]>
  OpenZFS-issue: https://www.illumos.org/issues/9962
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/545190c6
  Closes #8188
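
The 10-lwb example in the message above can be sketched as a small counting model. This is only an illustration under simplifying assumptions: a single vdev, lwbs completing in order, and hypothetical function names (flushes_eager, flushes_deferred) that do not exist in the ZIL code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Old scheme (as described in the commit message): every completed lwb
 * flushes the vdev it wrote to, whether or not anyone is waiting.
 */
size_t
flushes_eager(const bool *has_waiter, size_t nlwbs)
{
	assert(has_waiter != NULL || nlwbs == 0);
	return (nlwbs);
}

/*
 * New scheme: an lwb without waiters hands its pending flush to the
 * "next" lwb; a flush is issued only when an lwb with waiters completes.
 */
size_t
flushes_deferred(const bool *has_waiter, size_t nlwbs)
{
	size_t flushes = 0;
	bool pending = false;	/* unflushed writes on the single vdev */

	assert(has_waiter != NULL || nlwbs == 0);
	for (size_t i = 0; i < nlwbs; i++) {
		pending = true;			/* this lwb wrote to the vdev */
		if (has_waiter[i] && pending) {
			flushes++;		/* covers this and earlier lwbs */
			pending = false;
		}
	}
	return (flushes);
}
```

With ten lwbs where only the last one carries the fsync() waiter, the eager model counts ten flushes and the deferred model counts one, which matches the example in the message.
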
* OpenZFS 9963 - Separate tunable for disabling ZIL vdev flush | Prakash Surya | 2018-12-07 | 3 | -12/+35

  Porting Notes:
  * Add options to zfs-module-parameters(5) man page.
  * Move zfs_nocacheflush to vdev.c instead of vdev_disk.c, since the latter doesn't get built for user space.

  Authored by: Prakash Surya <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Reviewed by: Brad Lewis <[email protected]>
  Reviewed by: Patrick Mooney <[email protected]>
  Reviewed by: Tom Caputi <[email protected]>
  Reviewed by: George Melikov <[email protected]>
  Approved by: Dan McDonald <[email protected]>
  Ported-by: Signed-off-by: Brian Behlendorf <[email protected]>
  OpenZFS-issue: https://www.illumos.org/issues/9963
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/f8fdf68125
  Closes #8186
* OpenZFS 9993 - zil writes can get delayed in zio pipeline | George Wilson | 2018-12-07 | 1 | -1/+2

  Authored by: George Wilson <[email protected]>
  Reviewed by: Prakash Surya <[email protected]>
  Reviewed by: Brad Lewis <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Reviewed by: Tom Caputi <[email protected]>
  Reviewed by: George Melikov <[email protected]>
  Approved by: Dan McDonald <[email protected]>
  Ported-by: Brian Behlendorf <[email protected]>
  OpenZFS-issue: https://www.illumos.org/issues/9993
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/2258ad0b
  Closes #8185
* OpenZFS 9880 - Race in ZFS parallel mount | Andy Fiddaman | 2018-12-07 | 1 | -3/+31

  Porting Notes:
  * Not required for Linux since the zone is always global. But we'll want this change if we start using the zones code.

  Authored by: Andy Fiddaman <[email protected]>
  Reviewed by: Jason King <[email protected]>
  Reviewed by: Sebastien Roy <[email protected]>
  Reviewed by: Tom Caputi <[email protected]>
  Approved by: Joshua M. Clulow <[email protected]>
  Ported-by: Brian Behlendorf <[email protected]>
  OpenZFS-issue: https://www.illumos.org/issues/9880
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/bc4c0ff134
  Closes #8189
* Fix error message when zfs module is not loaded | Tom Caputi | 2018-12-07 | 1 | -3/+3

  This patch corrects a small issue where the wrong error message was being displayed when the zfs kernel module was not loaded. This also avoids waiting for the (by default) 10s timeout to see if the /dev/zfs device appears.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8187
* Do not enable stack tracer for ZFS performance test | Tony Nguyen | 2018-12-07 | 2 | -6/+21

  The Linux ZFS test suite runs with /proc/sys/kernel/stack_tracer_enabled=1, via the zfs.sh script, which has a negative performance impact of up to 40%. Since large stack usage is a rare issue now, the preferred behavior would be:
  - making stack tracer an opt-in feature for zfs.sh
  - zfs-test.sh enables stack tracer only when requested

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Richard Elling <[email protected]>
  Reviewed-by: John Kennedy <[email protected]>
  Signed-off-by: Tony Nguyen <[email protected]>
  #8173
* Ensure dsl scan prefetch queue is emptied | Tom Caputi | 2018-12-06 | 1 | -0/+20

  This patch simply ensures that scn->scn_prefetch_queue is emptied before the kernel module is unloaded and when scanning completes.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Alek Pinchuk <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8178
* Fix 'zfs receive -F' message when destination has snapshots | loli10K | 2018-12-05 | 1 | -1/+1

  When receiving a send stream with forced rollback on a dataset with snapshots, zfs suggests that said snapshots must be removed to successfully receive the stream; however, the message is misleading because it prints the dataset name instead of one of its snapshots.

      $ sudo zfs snap pp/recvfs@snap-orig
      $ sudo zfs recv -F pp/recvfs < sendstream
      cannot receive new filesystem stream: destination has snapshots (eg. pp/recvfs)
      must destroy them to overwrite it

  This change simply restores the snapshot name in the error message.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #8167
* Use autoconf variable for C preprocessor | Ben Wolsieffer | 2018-12-05 | 1 | -1/+1

  This fixes the build when cross-compiling, where the preprocessor might be prefixed.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Ben Wolsieffer <[email protected]>
  Closes #8180
* Move assert in dump_dir() in zdb | Tom Caputi | 2018-12-05 | 1 | -3/+3

  This one-line patch moves an assert in the function dump_dir() below an error check that ensures it ran correctly. This ensures zdb dumps the error that actually caused the problem, as opposed to one of its symptoms.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8171
* Fix dnode_hold() freeing dnode behavior | Brian Behlendorf | 2018-12-05 | 1 | -2/+4

  Commit 4c5b89f59 refactored dnode_hold() and in the process accidentally introduced a slight change in behavior which was not intended. The required behavior is that once the ZPL, or other consumer, declares its intent to free a dnode then dnode_hold() should immediately start failing. The updated code wouldn't return the failure until after the dnode was freed.

  When DNODE_MUST_BE_ALLOCATED is set it must return ENOENT, and when DNODE_MUST_BE_FREE is set it must return EEXIST.

  This issue was uncovered by ztest_remap() which attempted to remap a freeing object which should have been skipped as described by the comment in dmu_objset_remap_indirects_impl().

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Tom Caputi <[email protected]>
  Reviewed-by: Olaf Faaland <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8172
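
The required return values described in the message above can be written down as a small decision function. This is a sketch only: the enum names and sketch_dnode_hold_check() are hypothetical and are not the real dnode_hold() interface; the three-state model (free, allocated, freeing) is an assumption made for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names for this sketch; not the real ZFS definitions. */
enum sketch_dnode_state {
	SKETCH_DN_FREE,
	SKETCH_DN_ALLOCATED,
	SKETCH_DN_FREEING	/* consumer has declared intent to free */
};

enum sketch_hold_flag {
	SKETCH_MUST_BE_ALLOCATED,
	SKETCH_MUST_BE_FREE
};

/*
 * Returns 0 if a hold would be granted, otherwise the errno the commit
 * message requires. A freeing dnode fails both kinds of hold at once:
 * it no longer counts as allocated, and it is not yet free.
 */
int
sketch_dnode_hold_check(enum sketch_dnode_state state,
    enum sketch_hold_flag flag)
{
	if (flag == SKETCH_MUST_BE_ALLOCATED)
		return (state == SKETCH_DN_ALLOCATED ? 0 : ENOENT);

	assert(flag == SKETCH_MUST_BE_FREE);
	return (state == SKETCH_DN_FREE ? 0 : EEXIST);
}
```

In this model a hold on a freeing dnode fails immediately with ENOENT or EEXIST, depending on the flag, instead of succeeding until the free has finished.
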
* Fix 'zpool list -v' alignment | Brian Behlendorf | 2018-12-04 | 2 | -52/+111

  The verbose output of 'zpool list' was not correctly aligned due to differences in the vdev name lengths. Minimally update the code to correct the alignment using the same strategy employed by 'zpool status'.

  Missing dashes were added for the empty defaults columns, and the vdev state is now printed for all vdevs.

  Reviewed-by: Tom Caputi <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #7308
  Closes #8147
* zfs-functions.in: is_mounted() always returns 1 | TerraTech | 2018-12-04 | 1 | -2/+7

  The 'while read line; ...; done' loop is run in a piped subshell, therefore the 'return 0' would not cause a return from the is_mounted() function. As a result, this function always returns 1.

  The fix is to 'return 1' from the subshell on a successful match (no match == return 0), and then negate the final return value.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: TerraTech <[email protected]>
  Closes #8151
* Fix ztest deadlock in spa_vdev_remove() | Tom Caputi | 2018-12-04 | 1 | -12/+19

  This patch corrects an issue where spa_vdev_remove() would call spa_history_log_internal() while holding the spa config lock. This function may decide to block until the next txg if the current one seems too full. However, since the thread is holding the config lock, the txg sync thread cannot progress and the system ends up deadlocked. This patch simply moves all calls to spa_history_log_internal() outside of the config lock.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8162
* Fix ztest deadlock in ztest_zil_remount() | Tom Caputi | 2018-12-04 | 1 | -0/+8

  This patch fixes a small race condition in ztest_zil_remount() that could result in a deadlock. ztest_device_removal() calls spa_vdev_remove() which may eventually call spa_reset_logs(). If ztest_zil_remount() attempts to call zil_close() while this is happening, it may fail when it asserts !zilog_is_dirty(zilog). This patch simply adds locking to correct the issue.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8154
* Fix ASSERT in zfs_receive_one() | LOLi | 2018-12-04 | 5 | -5/+128

  This commit fixes the following ASSERT in zfs_receive_one() when receiving a send stream from a root dataset with the "-e" option:

      $ sudo zfs snap source@snap
      $ sudo zfs send source@snap | sudo zfs recv -e destination/recv
      chopprefix > drrb->drr_toname
      ASSERT at libzfs_sendrecv.c:3804:zfs_receive_one()

  Reviewed-by: Tom Caputi <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed by: Paul Dagnelie <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #8121
* Detect IO errors during device removal | Brian Behlendorf | 2018-12-04 | 9 | -19/+350

  While device removal cannot verify the checksums of individual blocks during device removal, it can reasonably detect hard IO errors from the leaf vdevs. Failure to perform this error checking can result in device removal completing successfully, but moving no data, which will permanently corrupt the pool.

  Situation 1: faulted/degraded vdevs

  In the configuration shown below, the removal of mirror-0 will permanently corrupt the pool. Device removal will preferentially copy data from 'vdev1 -> vdev3' and from 'vdev2 -> vdev4', which in this case will result in nothing being copied since one vdev in each of those groups is unavailable. However, device removal will complete successfully since all IO errors are ignored.

      tank                  DEGRADED     0     0     0
        mirror-0            DEGRADED     0     0     0
          /var/tmp/vdev1    FAULTED      0     0     0  external fault
          /var/tmp/vdev2    ONLINE       0     0     0
        mirror-1            DEGRADED     0     0     0
          /var/tmp/vdev3    ONLINE       0     0     0
          /var/tmp/vdev4    FAULTED      0     0     0  external fault

  This issue is resolved by updating the source child selection logic to exclude unreadable leaf vdevs. Additionally, unwritable destination child vdevs which can never succeed are skipped to prevent generating a large number of write IO errors.

  Situation 2: individual hard IO errors

  During removal, if an unexpected hard IO error is encountered when either reading or writing the child vdev, the entire removal operation is cancelled. While it may be possible to reconstruct the data after removal, that cannot be guaranteed. The only strictly safe thing to do is to cancel the removal.

  As a future improvement we may want to instead suspend the removal process and allow the damaged region to be retried. But that work is left for another time; hard IO errors during the removal process are expected to be exceptionally rare.

  Reviewed-by: Serapheim Dimitropoulos <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Tom Caputi <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #6900
  Closes #8161
* Fix consistency of ztest_device_removal_active | Tom Caputi | 2018-11-28 | 3 | -10/+23

  ztest currently uses the boolean flag ztest_device_removal_active to protect some tests that may not run successfully if they occur at the same time as ztest_device_removal(). Unfortunately, in the event that ztest is in the middle of a device removal when it decides to issue a SIGKILL, the device removal will be automatically restarted (without setting the flag) when the pool is re-imported on the next run. This patch corrects this by ensuring that any in-progress removals are completed before running further tests after the re-import.

  This patch also makes a few small changes to prevent race conditions involving the creation and destruction of spa->spa_vdev_removal, since this field is not protected by any locks. Some checks that may run concurrently with setting / unsetting this field have been updated to check spa->spa_removing_phys.sr_state instead. The most significant change here is that spa_removal_get_stats() no longer accounts for in-flight work done, since that could result in a NULL pointer dereference.

  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed-by: Serapheim Dimitropoulos <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8105
* zfs_dbgmsg() is not safe from every context | LOLi | 2018-11-28 | 1 | -3/+10

  This commit reverts to using printk() instead of zfs_dbgmsg() to log messages in vdev_disk_error(): this is necessary because vdev_disk_error() can be called from interrupt context, where we are not allowed to sleep. Unfortunately zfs_dbgmsg() performs its allocations by calling kmalloc() with the KM_SLEEP flag, which may result in the following oops:

      BUG: scheduling while atomic: swapper/4/0/0x10000100
      Call Trace:
       <IRQ>
       [<0>] dump_stack+0x19/0x1b
       ...
       [<0>] spl_kmem_alloc+0xdf/0x140 [spl] <-- kmem_alloc(size, KM_SLEEP)
       [<0>] __dprintf+0x69/0x150 [zfs]
       [<0>] ? kmem_cache_free+0x1e2/0x200
       [<0>] vdev_disk_error.part.15+0x5f/0x70 [zfs]
       [<0>] vdev_disk_io_flush_completion+0x48/0x70 [zfs]
       [<0>] bio_endio+0x67/0xb0
       [<0>] blk_update_request+0x90/0x360
       ...
       [<0>] scsi_finish_command+0xdc/0x140
       [<0>] scsi_softirq_done+0x132/0x160
       [<0>] blk_done_softirq+0x96/0xc0
       [<0>] __do_softirq+0xf5/0x280
       [<0>] call_softirq+0x1c/0x30
       [<0>] do_softirq+0x65/0xa0
       [<0>] irq_exit+0x105/0x110
       [<0>] do_IRQ+0x56/0xf0
       [<0>] common_interrupt+0x162/0x162
       <EOI>
       [<0>] ? cpuidle_enter_state+0x54/0xd0
       [<0>] cpuidle_idle_call+0xde/0x230
       [<0>] arch_cpu_idle+0xe/0xb0
       [<0>] cpu_startup_entry+0x14a/0x1e0
       [<0>] start_secondary+0x1f7/0x270
       [<0>] start_cpu+0x5/0x14

  Reviewed-by: Olaf Faaland <[email protected]>
  Reviewed by: Brian Behlendorf <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #8137
  Closes #8150
* Remove races from scrub / resilver tests | Tom Caputi | 2018-11-28 | 14 | -83/+65

  Currently, several tests in the ZFS Test Suite that attempt to test scrub and resilver behavior occasionally fail. A big reason for this is that these tests use a combination of zinject and zfs_scan_vdev_limit to attempt to slow these operations enough to verify their test commands. This method works most of the time, but provides no guarantees and leads to flaky behavior. This patch adds a new tunable, zfs_scan_suspend_progress, that ensures that scans make no progress, guaranteeing that tests can be run without racing.

  This patch also changes zfs_remove_max_bytes_pause to match this new tunable. This provides some consistency between these two similar tunables and ensures that the tunable will not misbehave on 32-bit systems.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8111
* ZTS: fix "not found" errors | LOLi | 2018-11-27 | 14 | -37/+28

  This commit fixes several "not found" errors caused by calling undefined or incorrect shell functions in the following ZFS Test Suite groups:
  * alloc_class
  * channel_program/lua_core
  * channel_program/synctask_core
  * cli_root/zpool_import
  * cli_user/misc

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Reviewed-by: bunder2015 <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #8152
* Fix typo in update to zfs-module-parameters(5) | Rich Ercolani | 2018-11-26 | 1 | -1/+1

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Reviewed-by: bunder2015 <[email protected]>
  Signed-off-by: Rich Ercolani <[email protected]>
  Closes #8153
* Move strlcat, strlcpy, and strnlen | Brian Behlendorf | 2018-11-20 | 6 | -162/+54

  Move strlcat() and strlcpy() from .c source files into the libspl string.h header. By changing these compatibility functions to static inline functions they can be included as needed without requiring linking with the libspl.so library.

  Remove strnlen() which is barely used in the source, and has been provided by glibc since v2.10.

  Finally, convert four instances of strncpy() to strlcpy() in libzfs_input_check.c which were causing build warnings when compiling with gcc 8.2.1. For example:

      libzfs_input_check.c: In function ‘zfs_destroy’:
      libzfs_input_check.c:651:9: error: ‘strncpy’ specified bound \
          4096 equals destination size [-Werror=stringop-truncation]
        (void) strncpy(zc.zc_name, dataset, sizeof (zc.zc_name));
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

  Reviewed-by: Olaf Faaland <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8116
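
For readers unfamiliar with why strlcpy() avoids the truncation warning, here is a minimal header-style sketch of a static inline copy routine with the usual strlcpy() contract: always NUL-terminate when the destination size is nonzero, and return the length of the source. The name sketch_strlcpy is hypothetical; this is not the libspl implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, writing at most dstsize bytes including the
 * terminating NUL. Returns strlen(src), so a caller can detect
 * truncation by checking whether the result is >= dstsize.
 */
static inline size_t
sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
	size_t srclen;

	assert(dst != NULL && src != NULL);
	srclen = strlen(src);
	if (dstsize != 0) {
		size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';	/* unlike strncpy(), always terminated */
	}
	return (srclen);
}
```

Because the function is static inline, any source file that includes the header gets its own copy and no extra library has to be linked, which is the property the commit message relies on.
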
* zpool: allow split with whole-disk devices | LOLi | 2018-11-20 | 11 | -4/+116

  This change allows 'zpool split' to work with whole-disk devices and updates the ZFS Test Suite with a new script to exercise this functionality.

  Reviewed by: Brian Behlendorf <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #6643
  Closes #8133
* man/zfs.8: document 'received' property source | Christian Schwarz | 2018-11-20 | 1 | -2/+3

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Signed-off-by: Christian Schwarz <[email protected]>
  Closes #8134
* ZTS: Fix parsing of zpool status in checksum test | John Wren Kennedy | 2018-11-20 | 1 | -3/+7

  filetest_001_pos consumes the output using read -r, assigning each field to a variable. The problem comes when a vdev is marked degraded, which appends extra fields to the line. This causes the trailing text to be treated as part of the `cksum` variable. Using awk instead of read -r allows us to extract the checksum error count from the output whether the vdev is degraded or not.

  Reviewed-by: loli10K <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: John Wren Kennedy <[email protected]>
  Closes #8136
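
The test itself is a shell script and the fix uses awk; the same idea (take the fifth whitespace-separated field and ignore whatever follows) can be illustrated in C with sscanf(). This is a sketch under assumptions: the function name is hypothetical, the column order is NAME, STATE, READ, WRITE, CKSUM, and the counters are plain integers.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Extract the CKSUM column from one vdev line of 'zpool status' output.
 * The first four fields are skipped with assignment suppression, and
 * anything after the fifth field (such as the extra text printed for a
 * degraded vdev) is simply never read.
 * Returns 0 on success and -1 if the line has fewer than five fields.
 */
int
sketch_parse_cksum(const char *line, unsigned long *cksum)
{
	assert(line != NULL && cksum != NULL);
	return (sscanf(line, "%*s %*s %*s %*s %lu", cksum) == 1 ? 0 : -1);
}
```

A line with trailing text and a line without it yield the same count, which is the behavior the awk-based fix is after.
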
* ZTS: "checksum" test group needs "lscpu" | LOLi | 2018-11-20 | 1 | -0/+1

  This change adds "lscpu" to the list of commands used by the ZFS Test Suite: this is required by the "checksum" test group to read the CPU frequency which is used in EdonR, Skein and SHA2 performance tests.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #8139
* OpenZFS 8115 - parallel zfs mount | Sebastien Roy | 2018-11-15 | 10 | -139/+716

  Porting Notes:
  * Use thread pools (tpool) API instead of introducing taskq interfaces to libzfs.
  * Use pthread_mutex_t for locks as mutex_t isn't available.
  * Ignore alternative libshare initialization since OpenZFS-7955 is not present on zfsonlinux.

  Authored by: Sebastien Roy <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Pavel Zakharov <[email protected]>
  Reviewed by: Brad Lewis <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed by: Paul Dagnelie <[email protected]>
  Reviewed by: Prashanth Sreenivasa <[email protected]>
  Authored by: Brian Behlendorf <[email protected]>
  Approved by: Matt Ahrens <[email protected]>
  Ported-by: Don Brady <[email protected]>
  OpenZFS-issue: https://www.illumos.org/issues/8115
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a3f0e2b569
  Closes #8092
* Tag 0.8.0-rc2 (tag: zfs-0.8.0-rc2) | Brian Behlendorf | 2018-11-12 | 1 | -2/+2

  Signed-off-by: Brian Behlendorf <[email protected]>
* Allow spaces in pool names for cmdline argument | kpande | 2018-11-11 | 1 | -0/+3

  PR #8114 quoted the ${ENCRYPTIONROOT} parameter to ensure we don't lose spaces when unlocking the root filesystem in the off chance that it has a space in its name.

  Unfortunately, dracut and initramfs-tools do not actually get the quotes from the cmdline. If we use root=ZFS="root pool/filesystem name", the script still only sees root=ZFS=root and no quotation marks.

  Because + is a reserved character in ZFS, it's used as a placeholder for spaces in the kernel cmdline. In this way, root=ZFS=root+pool/filesystem+name will properly expand by replacing the character with sed (a POSIX-compliant method).

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: bunder2015 <[email protected]>
  Signed-off-by: Kash Pande <[email protected]>
  Issue #8114
  Closes #8117
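
The actual change is a sed substitution inside a shell script. The placeholder expansion itself is simple enough to show as a C sketch of the same transformation; the function name is hypothetical and this is not code from the initramfs scripts.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Replace every '+' placeholder in a dataset name taken from the kernel
 * command line with a space, in place. Returns the number of
 * replacements made. This is safe only because '+' cannot appear in a
 * real ZFS dataset name, as the commit message notes.
 */
size_t
sketch_plus_to_space(char *name)
{
	size_t replaced = 0;

	assert(name != NULL);
	for (; *name != '\0'; name++) {
		if (*name == '+') {
			*name = ' ';
			replaced++;
		}
	}
	return (replaced);
}
```

Applied to the string root+pool/filesystem+name from the message, this yields "root pool/filesystem name".
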
* Fix coverity defects: CID 184285 | LOLi | 2018-11-11 | 1 | -2/+1

  CID 184285: Read from pointer after free (USE_AFTER_FREE)

  This patch fixes a use-after-free in vdev_config_generate_stats() by moving the kmem_free() call to the end of the function.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #8120
* Fix systemd spec file macros | Brian Behlendorf | 2018-11-11 | 5 | -5/+29

  Ensure that the _unitdir, _presetdir, _modulesloaddir, and _systemdgeneratordir macros are always defined. If not, set them to the expected default values. Pass all of these options to ./configure and package the resulting files in those locations.

  Additionally, set __brp_mangle_shebangs_exclude_from until the conversion to Python 3 is complete so they may be built cleanly under mock.

  Reviewed-by: Neal Gompa <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #7567
  Closes #8119
* Make initramfs-tools script encryption aware | Garrett Fields | 2018-11-09 | 1 | -64/+34

  - Changed decrypt_fs zfs command to "load-key"
  - Plymouth case code based on "contrib/dracut/90zfs/zfs-lib.sh.in"
  - Systemd case based on "contrib/dracut/90zfs/zfs-load-key.sh.in"
  - Cleaned up misspelling of "available" throughout
  - Code style fixes
  - Single quote for ${ENCRYPTIONROOT}
  - Changed "${DECRYPT_CMD}" to "eval ${DECRYPT_CMD}"

  Reviewed-by: Kash Pande <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Tom Caputi <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Signed-off-by: Garrett Fields <[email protected]>
  Closes #8093
* zed: detect and offline physically removed devices | loli10K | 2018-11-09 | 11 | -43/+341

  This commit adds a new test case to the ZFS Test Suite to verify ZED can detect when a device is physically removed from a running system: the device will be offlined if a spare is not available in the pool.

  We implement this by using the existing libudev functionality and without relying solely on the FM kernel module capabilities which have been observed to be unreliable with some kernels.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Don Brady <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #1537
  Closes #7926
* Add quotations for ${ENCRYPTIONROOT} | kpande | 2018-11-09 | 1 | -1/+1

  Add quotations for ${ENCRYPTIONROOT} to avoid breaking systems with a space in the name.

  Reviewed-by: bunder2015 <[email protected]>
  Reviewed-by: Tom Caputi <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Kash Pande <[email protected]>
  Related-to: #8093
  Closes #8114
* Add zpool status -s (slow I/Os) and -p (parseable) | Tony Hutter | 2018-11-08 | 16 | -136/+321

  This patch adds a new slow I/Os (-s) column to zpool status to show the number of VDEV slow I/Os. This is the number of I/Os that didn't complete in zio_slow_io_ms milliseconds. It also adds a new parseable (-p) flag to display exact values.

      NAME         STATE     READ WRITE CKSUM  SLOW
      testpool     ONLINE       0     0     0     -
        mirror-0   ONLINE       0     0     0     -
          loop0    ONLINE       0     0     0    20
          loop1    ONLINE       0     0     0     0

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Signed-off-by: Tony Hutter <[email protected]>
  Closes #7756
  Closes #6885
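
The definition of the SLOW column given above (I/Os that did not complete within zio_slow_io_ms milliseconds) can be sketched as a simple count over observed latencies. The function name and the array-of-latencies input are assumptions made for this illustration; the real counter is maintained per vdev inside the kernel module, and treating "didn't complete in" as a strict greater-than comparison is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count how many I/Os took longer than the slow-I/O threshold.
 * latency_ms holds one completion latency per I/O, in milliseconds.
 */
uint64_t
sketch_count_slow_ios(const uint64_t *latency_ms, size_t n,
    uint64_t slow_io_ms)
{
	uint64_t slow = 0;

	assert(latency_ms != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (latency_ms[i] > slow_io_ms)
			slow++;
	}
	return (slow);
}
```
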
* Update zfs_admin_snapshot value (disabled) | George Melikov | 2018-11-08 | 6 | -14/+30

  It's disabled by default; update code and tests to reflect the documentation. Minor cleanup in delegate_common.kshlib.

  Reviewed-by: Gregor Kopka <[email protected]>
  Reviewed-by: John Kennedy <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: George Melikov <[email protected]>
  Closes #7835
  Closes #8045
* ZTS: Fix and reenable zfs_rename tests | Tom Caputi | 2018-11-07 | 3 | -9/+8

  zfs_rename_006_pos has been flaky in the past because it was missing a call to block_device_wait to ensure the zvols it creates are present before running dd. Whenever this happened, zfs_rename_009_neg would also fail because the first test would leak a zvol clone that it did not know how to clean up.

  This patch fixes the root cause and reenables the test. It also fixes some minor grammar errors.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #5647
  Closes #5648
  Closes #8088
* ZTS: Fix test zfs_mount_006_pos | Paul Zuchowski | 2018-11-07 | 2 | -19/+38

  For Linux, place a file in the mount point folder so it will be considered "busy". Fix the while loop so it doesn't rm in directories above the testdir. Add Linux-specific code to test overlay on|off.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Paul Zuchowski <[email protected]>
  Closes #4990
  Closes #8081
* Add BuildRequires gcc, make, elfutils-libelf-devel | Tony Hutter | 2018-11-07 | 2 | -0/+5

  This adds a BuildRequires for gcc, make, and elfutils-libelf-devel into our spec files. gcc has been a packaging requirement for a while now:

  https://fedoraproject.org/wiki/Packaging:C_and_C%2B%2B

  These additional BuildRequires allow us to mock build in Fedora 29.

  Reviewed-by: Neal Gompa <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tony Hutter <[email protected]>
  Closes #8095
  Closes #8102
* Fix !zilog_is_dirty() assert during ztest | Tom Caputi | 2018-11-07 | 1 | -3/+9

  ztest occasionally hits an assert that !zilog_is_dirty() during zil_close(). This is caused by an interaction between 2 threads. First, ztest_run() waits for each test thread to complete and closes the associated dataset as soon as the thread joins. At the same time, the ztest_vdev_add_remove() test may attempt to remove the slog, which will open, dirty, and reset the logs on every dataset in the pool (including those of other threads). This patch simply ensures that we always join all of the test threads before closing any datasets.

  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8094
* Fix divide by zero during indirect split damage | Tom Caputi | 2018-11-07 | 1 | -1/+8

  This patch simply ensures that vdev_indirect_splits_damage() cannot hit a divide by zero exception if a split has no children with valid data. The normal reconstruction code path in vdev_indirect_reconstruct_io_done() already has this check.

  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #8086
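
The class of bug fixed here is a modulo (or division) by a count that can be zero. A minimal sketch of the guard, assuming the code picks a child by reducing a random value modulo the number of children with valid data; the function name and the -1 "nothing to pick" convention are hypothetical, not taken from vdev_indirect.c.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick one of valid_children children using random_value.
 * Returns an index in [0, valid_children), or -1 when there are no
 * children with valid data, so the modulo below never sees a zero
 * divisor.
 */
int
sketch_pick_child(uint64_t random_value, int valid_children)
{
	assert(valid_children >= 0);
	if (valid_children == 0)
		return (-1);	/* caller must skip this split */
	return ((int)(random_value % (uint64_t)valid_children));
}
```
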
* Fix dirtying vdev config on with RO spaTom Caputi2018-11-071-2/+3
| | | | | | | | | | This patch simply corrects an issue where vdev_dtl_reassess() could attempt to dirty the vdev config even when the spa was not eligible for writing. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #8085
* Replay logs before starting ztest workersTom Caputi2018-11-072-13/+77
| | | | | | | | | | | | | | | This patch ensures that logs are replayed on all datasets prior to starting ztest workers. This ensures that the call to vdev_offline() on a log device in ztest_fault_inject() will not fail due to the log device being required for replay. This patch also fixes a small issue found during testing where spa_keystore_load_wkey() does not check that the dataset specified is an encryption root. This check was present in libzfs, however. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #8084
* Fix vdev removal finishing raceTom Caputi2018-11-071-9/+6
| | | | | | | | | | | | | This patch fixes a race condition where the end of vdev_remove_replace_with_indirect(), which holds svr_lock, would race against spa_vdev_removal_destroy(), which destroys the same lock and is called asynchronously via dsl_sync_task_nowait(). Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Issue #6900 Closes #8083
* Make vdev_set_deferred_resilver() recursiveTom Caputi2018-11-071-1/+8
| | | | | | | | | | | | | | | vdev_clear() can call vdev_set_deferred_resilver() with a non-leaf vdev to set up a deferred resilver. However, this function is currently written to only handle leaf vdevs. This bug was introduced with deferred resilvers in 80a91e74. This patch makes this function recursive so that it can find appropriate vdevs to resilver and set vdev_resilver_deferred on them. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Issue #7732 Closes #8082
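The shape of a recursive fix like this can be shown on a toy tree. The sketch below is only an illustration under stated assumptions: `node_t`, its fields, and `set_deferred()` are hypothetical names, and the `needs_resilver` flag stands in for whatever criteria the real function applies to a leaf vdev.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a vdev tree node; not the real vdev_t. */
typedef struct node {
    struct node **children;  /* NULL for a leaf */
    size_t nchildren;        /* 0 for a leaf */
    int needs_resilver;      /* assumed per-leaf eligibility flag */
    int deferred;            /* set when a resilver is deferred */
} node_t;

/*
 * Walk the tree rooted at n and mark every eligible leaf as deferred.
 * Interior nodes are never marked; they only forward the call to their
 * children. Returns the number of leaves that were marked.
 */
int
set_deferred(node_t *n)
{
    assert(n != NULL);

    if (n->nchildren == 0) {
        if (!n->needs_resilver)
            return (0);
        n->deferred = 1;
        return (1);
    }

    int marked = 0;
    for (size_t i = 0; i < n->nchildren; i++)
        marked += set_deferred(n->children[i]);
    return (marked);
}
```

With this structure a caller can pass either a leaf or any interior node, which is the property the original leaf-only version lacked.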
* Fix libudev dependency in libzutilDon Brady2018-11-062-1/+9
| | | | | | | | | | ZFS should be able to build without libudev installed. The recent change for libzutil inadvertently broke that. Make the libudev code conditional in zutil_import.c to resolve the build failure. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Don Brady <[email protected]> Closes #8097
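Making library-specific code conditional usually follows the pattern below. This is a generic sketch, not the contents of zutil_import.c: the `HAVE_LIBUDEV` macro name is assumed to be supplied by the build system when libudev is detected, and `device_db_available()` is a hypothetical helper.

```c
#ifdef HAVE_LIBUDEV
#include <libudev.h>
#endif

/*
 * Report whether a udev context can be created.
 * When the build has no libudev (HAVE_LIBUDEV undefined), neither the
 * header nor any udev symbol is referenced, so the file still compiles
 * and links, and the function simply reports "unavailable".
 */
int
device_db_available(void)
{
#ifdef HAVE_LIBUDEV
    struct udev *u = udev_new();

    if (u == NULL)
        return (0);
    udev_unref(u);
    return (1);
#else
    return (0);
#endif
}
```

Keeping both the `#include` and every call site under the same macro is what lets the build succeed on systems where the library and its headers are absent.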
* zpool: bogus error for invalid dedupditto valueLOLi2018-11-061-3/+15
| | | | | | | | | | | | | | | When provided with an invalid 'dedupditto' value zpool prints a misleading error message: $ sudo zpool set dedupditto=99 pp cannot set property for 'pp': property 'dedupditto'(14) not defined Fix this by printing a meaningful error description for unsupported 'dedupditto' values. Reviewed-by: Olaf Faaland <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #8079
* ztest: reduce gangblock creationBrian Behlendorf2018-11-052-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to validate the gang block code ztest is configured to artificially force a fraction of large blocks to be written as gang blocks. The default setting chosen for this was to write 25% of all blocks 32k or larger using gang blocks. The confluence of an unrealistically large number of gang blocks, the aggressive fault injection done by ztest, and the split segment reconstruction logic introduced by device removal has resulted in the following type of failure: zdb -bccsv -G -d ... exit code 3 Specifically, zdb was unable to open the pool because it was unable to reconstruct a damaged block. Manual investigation of multiple failures clearly showed that the block could be reconstructed. However, due to the large number of damaged segments (>35) it could not be done in the allotted time. Furthermore, the large number of gang blocks was determined to be the reason for the unrealistically large number of damaged segments. In order to make this situation less likely, this change both increases the forced gang block size to 64k and reduces the frequency to 3% of blocks. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Tom Caputi <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #8080
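The new policy (blocks of 64k or larger, forced to gang about 3% of the time) can be expressed as a small predicate. This is a sketch under stated assumptions rather than the actual ztest or metaslab code: the function name, the constants, and the convention that the caller passes a uniform random value in the range 0 to 99 are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Values taken from the commit message; the names are hypothetical. */
#define FORCE_GANG_MIN_SIZE (64 * 1024)  /* previously 32k */
#define FORCE_GANG_PCT      3            /* previously 25% */

/*
 * Decide whether a block of the given size should be artificially
 * written as a gang block. rnd_pct must be a uniformly distributed
 * value in [0, 100) chosen by the caller.
 */
int
should_force_gang(size_t size, unsigned rnd_pct)
{
    assert(rnd_pct < 100);

    if (size < FORCE_GANG_MIN_SIZE)
        return (0);
    return (rnd_pct < FORCE_GANG_PCT);
}
```

Passing the random value in as a parameter keeps the predicate deterministic, which makes the threshold and the percentage easy to check independently.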
* Add libzutil for libzfs or libzpool consumersDon Brady2018-11-0539-3394/+3723
| | | | | | | | | | | Adds a libzutil for utility functions that are common to libzfs and libzpool consumers (most of what was in libzfs_import.c). This removes the need for utilities to link against both libzpool and libzfs. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Don Brady <[email protected]> Closes #8050