path: root/tests
Commit message | Author | Age | Files | Lines
* Fix ASSERT in zfs_receive_one()LOLi2018-12-043-2/+109
This commit fixes the following ASSERT in zfs_receive_one() when receiving a send stream from a root dataset with the "-e" option:

    $ sudo zfs snap source@snap
    $ sudo zfs send source@snap | sudo zfs recv -e destination/recv
    chopprefix > drrb->drr_toname
    ASSERT at libzfs_sendrecv.c:3804:zfs_receive_one()

Reviewed-by: Tom Caputi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Paul Dagnelie <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #8121
* Detect IO errors during device removalBrian Behlendorf2018-12-046-5/+241
While device removal cannot verify the checksums of individual blocks, it can reasonably detect hard IO errors from the leaf vdevs. Failure to perform this error checking can result in device removal completing successfully but moving no data, which will permanently corrupt the pool.

Situation 1: faulted/degraded vdevs

In the configuration shown below, the removal of mirror-0 will permanently corrupt the pool. Device removal will preferentially copy data from 'vdev1 -> vdev3' and from 'vdev2 -> vdev4', which in this case will result in nothing being copied since one vdev in each of those groups is unavailable. However, device removal will complete successfully since all IO errors are ignored.

    tank                  DEGRADED     0     0     0
      mirror-0            DEGRADED     0     0     0
        /var/tmp/vdev1    FAULTED      0     0     0  external fault
        /var/tmp/vdev2    ONLINE       0     0     0
      mirror-1            DEGRADED     0     0     0
        /var/tmp/vdev3    ONLINE       0     0     0
        /var/tmp/vdev4    FAULTED      0     0     0  external fault

This issue is resolved by updating the source child selection logic to exclude unreadable leaf vdevs. Additionally, unwritable destination child vdevs which can never succeed are skipped to prevent generating a large number of write IO errors.

Situation 2: individual hard IO errors

During removal, if an unexpected hard IO error is encountered when either reading or writing the child vdev, the entire removal operation is cancelled. While it may be possible to reconstruct the data after removal, that cannot be guaranteed. The only strictly safe thing to do is to cancel the removal.

As a future improvement we may want to instead suspend the removal process and allow the damaged region to be retried. That work is left for another time, since hard IO errors during the removal process are expected to be exceptionally rare.

Reviewed-by: Serapheim Dimitropoulos <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tom Caputi <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #6900
Closes #8161
* Remove races from scrub / resilver testsTom Caputi2018-11-2812-78/+34
| | | | | | | | | | | | | | | | | | | | | | Currently, several tests in the ZFS Test Suite that attempt to test scrub and resilver behavior occasionally fail. A big reason for this is that these tests use a combination of zinject and zfs_scan_vdev_limit to attempt to slow these operations enough to verify their test commands. This method works most of the time, but provides no guarantees and leads to flaky behavior. This patch adds a new tunable, zfs_scan_suspend_progress, that ensures that scans make no progress, guaranteeing that tests can be run without racing. This patch also changes zfs_remove_max_bytes_pause to match this new tunable. This provides some consistency between these two similar tunables and ensures that the tunable will not misbehave on 32-bit systems. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #8111
* ZTS: fix "not found" errorsLOLi2018-11-2714-37/+28
This commit fixes several "not found" errors caused by calling undefined or incorrect shell functions in the following ZFS Test Suite groups:

* alloc_class
* channel_program/lua_core
* channel_program/synctask_core
* cli_root/zpool_import
* cli_user/misc

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: bunder2015 <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #8152
* Move strlcat, strlcpy, and strnlenBrian Behlendorf2018-11-201-4/+4
Move strlcat() and strlcpy() from .c source files into the libspl string.h header. By changing these compatibility functions to static inline functions they can be included as needed without requiring linking with the libspl.so library.

Remove strnlen(), which is barely used in the source and has been provided by glibc since v2.10.

Finally, convert four instances of strncpy() to strlcpy() in libzfs_input_check.c which were causing build warnings when compiling with gcc 8.2.1. For example:

    libzfs_input_check.c: In function ‘zfs_destroy’:
    libzfs_input_check.c:651:9: error: ‘strncpy’ specified bound \
        4096 equals destination size [-Werror=stringop-truncation]
      (void) strncpy(zc.zc_name, dataset, sizeof (zc.zc_name));
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Reviewed-by: Olaf Faaland <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #8116
* zpool: allow split with whole-disk devicesLOLi2018-11-2010-3/+115
| | | | | | | | | | This change allows 'zpool split' to work with whole-disk devices and updates the ZFS Test Suite with a new script to exercise this functionality. Reviewed by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #6643 Closes #8133
* ZTS: Fix parsing of zpool status in checksum testJohn Wren Kennedy2018-11-201-3/+7
| | | | | | | | | | | | | | filetest_001_pos consumes the output using read -r, assigning each field to a variable. The problem comes when a vdev is marked degraded, which appends extra fields to the line. This causes the trailing text to be treated as part of the `cksum` variable. Using awk instead of read -r allows us to extract the checksum error count from the output whether the vdev is degraded or not. Reviewed-by: loli10K <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: John Wren Kennedy <[email protected]> Closes #8136
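The parsing problem described above can be sketched in Python. This is only an illustration of selecting a single field by position, the way an awk field reference does; the sample lines are made up, and the column order (NAME STATE READ WRITE CKSUM) is assumed from the `zpool status` output quoted elsewhere in this log.

```python
def cksum_count(status_line):
    """Return the CKSUM column (5th field) of a `zpool status` vdev line.

    Trailing fields such as "too many errors" are ignored. This mirrors
    picking one awk field instead of letting the last `read -r` variable
    absorb the rest of the line.
    """
    fields = status_line.split()
    return int(fields[4])


# Illustrative lines only; not captured from a real pool.
healthy = "  /var/tmp/file1  ONLINE    0  0  3"
degraded = "  /var/tmp/file1  DEGRADED  0  0  3  too many errors"

print(cksum_count(healthy), cksum_count(degraded))
```

Both calls yield the same count because the extra words on the degraded line never reach the selected field.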
* ZTS: "checksum" test group needs "lscpu"LOLi2018-11-201-0/+1
| | | | | | | | | | | This change adds "lscpu" to the list of commands used by the ZFS Test Suite: this is required by the "checksum" test group to read the CPU frequency which is used in EdonR, Skein and SHA2 performance tests. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #8139
* OpenZFS 8115 - parallel zfs mountSebastien Roy2018-11-155-7/+265
Porting Notes:
* Use thread pools (tpool) API instead of introducing taskq interfaces to libzfs.
* Use pthread_mutex_t for locks as mutex_t isn't available.
* Ignore alternative libshare initialization since OpenZFS-7955 is not present on zfsonlinux.

Authored by: Sebastien Roy <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Paul Dagnelie <[email protected]>
Reviewed by: Prashanth Sreenivasa <[email protected]>
Authored by: Brian Behlendorf <[email protected]>
Approved by: Matt Ahrens <[email protected]>
Ported-by: Don Brady <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8115
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a3f0e2b569
Closes #8092
* zed: detect and offline physically removed devicesloli10K2018-11-094-8/+184
| | | | | | | | | | | | | | | | This commit adds a new test case to the ZFS Test Suite to verify ZED can detect when a device is physically removed from a running system: the device will be offlined if a spare is not available in the pool. We implement this by using the existing libudev functionality and without relying solely on the FM kernel module capabilities which have been observed to be unreliable with some kernels. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Don Brady <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #1537 Closes #7926
* Add zpool status -s (slow I/Os) and -p (parseable)Tony Hutter2018-11-084-3/+81
This patch adds a new slow I/Os (-s) column to zpool status to show the number of VDEV slow I/Os. This is the number of I/Os that didn't complete in zio_slow_io_ms milliseconds. It also adds a new parseable (-p) flag to display exact values.

    NAME         STATE     READ WRITE CKSUM  SLOW
    testpool     ONLINE       0     0     0     -
      mirror-0   ONLINE       0     0     0     -
        loop0    ONLINE       0     0     0    20
        loop1    ONLINE       0     0     0     0

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #7756
Closes #6885
* Update zfs_admin_snapshot value (disabled)George Melikov2018-11-085-13/+28
| | | | | | | | | | | | | It's disabled by default, update code and tests to reflect the documentation. Minor cleanup in delegate_common.kshlib. Reviewed-by: Gregor Kopka <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #7835 Closes #8045
* ZTS: Fix and reenable zfs_rename testsTom Caputi2018-11-073-9/+8
zfs_rename_006_pos has been flaky in the past because it was missing a call to block_device_wait to ensure the zvols it creates are present before running dd. Whenever this happened, zfs_rename_009_neg would also fail because the first test would leak a zvol clone that it did not know how to clean up. This patch fixes the root cause and reenables the test. It also fixes some minor grammar errors.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #5647
Closes #5648
Closes #8088
* ZTS: Fix test zfs_mount_006_posPaul Zuchowski2018-11-072-19/+38
| | | | | | | | | | | For Linux, place a file in the mount point folder so it will be considered "busy". Fix the while loop so it doesn't rm in directories above the testdir. Add Linux-specific code to test overlay on|off. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Zuchowski <[email protected]> Closes #4990 Closes #8081
* ZTS: Fix posix ACL tests that should passPaul Zuchowski2018-10-313-7/+48
| | | | | | | | | Make sure tests have proper include files. Make sure underlying "chmod" style permissions don't interfere with ACLs. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Zuchowski <[email protected]> Closes #8069
* ZTS: change `$(cat)` to `$(<)` for speedupGeorge Melikov2018-10-3115-40/+40
| | | | | | | | | It's better to use ksh/bash built in methods, rather than spawn new processes every time. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Wren Kennedy <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #8071
* zdb -k does not work on Linux when used with -eSerapheim Dimitropoulos2018-10-302-5/+32
| | | | | | | | | | | | This minor bug was introduced with the port of the feature from OpenZFS to ZoL. This patch fixes the issue that was caused by a minor re-ordering from the original code. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Tim Chase <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Serapheim Dimitropoulos <[email protected]> Closes #8001
* ZTS: Fix auto_replace_001_pos testBrian Behlendorf2018-10-292-81/+37
| | | | | | | | | | | | | | | | | | | | | | | The root cause of these failures is that udev can notify the ZED of newly created partition before its links are created. Handle this by allowing an auto-replace to briefly wait until udev confirms the links exist. Distill this test case down to its essentials so it can be run reliably. What we need to check is that: 1) A new disk, in the same physical location, is automatically brought online when added to the system, 2) It completes the replacement process, and 3) The pool is now ONLINE and healthy. There is no need to remove the scsi_debug module. After exporting the pool the disk can be zeroed, removed, and then re-added to the system as a new disk. Reviewed by: loli10K <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #8051
* Fix flake8 "invalid escape sequence 'x'" warningBrian Behlendorf2018-10-241-2/+3
| | | | | | | | | | | | | | | | From, https://lintlyci.github.io/Flake8Rules/rules/W605.html As of Python 3.6, a backslash-character pair that is not a valid escape sequence now generates a DeprecationWarning. Although this will eventually become a SyntaxError, that will not be for several Python releases. Note 'float_pobj' was simply removed from arcstat.py since it was entirely unused. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Richard Elling <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #8056
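As a minimal sketch of the usual W605 fix (not the actual change made to the scripts in this commit), a raw string keeps the backslash literal so the regex engine, rather than the Python string parser, interprets it. The pattern and helper name below are hypothetical.

```python
import re

# A raw string leaves the backslash untouched, so "\d" reaches the regex
# engine as written and flake8 has no invalid escape sequence to report.
DIGITS = re.compile(r"\d+")


def first_number(text):
    """Return the first run of digits in text as an int, or None."""
    match = DIGITS.search(text)
    return int(match.group()) if match else None


# The raw form is equivalent to doubling the backslash by hand.
same_pattern = r"\d+" == "\\d+"
```

Either spelling produces the same two-character sequence in the pattern; the raw form is simply harder to get wrong.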
* ZTS: Update project quota testsBrian Behlendorf2018-10-232-10/+10
| | | | | | | | | | | | | e2fsprogs v1.44.1, which provides lsattr, added a new attribute for ext3 called "verity". It is reported after the project quota flag as a 'V' character in the `lsattr` output. Update projectid_001_pos.ksh and projecttree_001_pos.ksh to use a pattern which will match the expected output in both cases. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #8043
* Defer new resilvers until the current one endsTom Caputi2018-10-1813-5/+289
| | | | | | | | | | | | | | | | | | | | | | | | | Currently, if a resilver is triggered for any reason while an existing one is running, zfs will immediately restart the existing resilver from the beginning to include the new drive. This causes problems for system administrators when a drive fails while another is already resilvering. In this case, the optimal thing to do to reduce risk of data loss is to wait for the current resilver to end before immediately replacing the second failed drive, which allows the system to operate with two incomplete drives for the minimum amount of time. This patch introduces the resilver_defer feature that essentially does this for the admin without forcing them to wait and monitor the resilver manually. The change requires an on-disk feature since we must mark drives that are part of a deferred resilver in the vdev config to ensure that we do not assume they are done resilvering when an existing resilver completes. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: @mmaybee Signed-off-by: Tom Caputi <[email protected]> Closes #7732
* zpool: allow sharing of spare device among poolsLOLi2018-10-174-1/+123
ZFS allows, by default, sharing of spare devices among different pools; this commit simply restores this functionality for disk devices and adds an additional test case to the ZFS Test Suite to prevent future regressions.

Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7999
* deadlock between mm_sem and tx assign in zfs_write() and page faultilbsmart2018-10-162-44/+104
The bug time sequence:

1. Thread #1, in `zfs_write`, is assigned to txg "n".
2. In the same process, thread #2 takes an mmap page fault (which means `mm_sem` is held). `zfs_dirty_inode` fails to open a txg and waits for the previous txg "n" to complete.
3. Thread #1 calls `uiomove` to write, but a page fault occurs in `uiomove`, which means it needs `mm_sem`. Since `mm_sem` is held by thread #2, thread #1 is stuck and cannot complete, so txg "n" will not complete.

So thread #1 and thread #2 are deadlocked.

Reviewed-by: Chunwei Chen <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Grady Wong <[email protected]>
Closes #7939
* Fix changelist mounted-dataset iterationAlek P2018-10-105-4/+157
| | | | | | | | | | | Commit 0c6d093 caused a regression in the inherit codepath. The fix is to restrict the changelist iteration on mountpoints and add proper handling for 'legacy' mountpoints Reviewed by: Serapheim Dimitropoulos <[email protected]> Reviewed by: Brian Behlendorf <[email protected]> Signed-off-by: Alek Pinchuk <[email protected]> Closes #7988 Closes #7991
* Print "(repairing)" in zpool status againTony Hutter2018-10-093-2/+78
Historically, zpool status prints "(repairing)" for any drives that have errors during a scrub:

    NAME            STATE     READ WRITE CKSUM
    mypool          ONLINE       0     0     0
      mirror-0      ONLINE       0     0     0
        /tmp/file1  ONLINE      13     0     0  (repairing)
        /tmp/file2  ONLINE       0     0     0
        /tmp/file3  ONLINE       0     0     0

This was accidentally broken in "OpenZFS 9166 - zfs storage pool checkpoint" (d2734cc). This patch adds it back in.

Reviewed-by: Serapheim Dimitropoulos <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #7779
Closes #7978
* Verify 'zfs destroy' will unshare the datasetPrakash Surya2018-10-034-22/+104
| | | | | | | | | | | | This change adds a new test case to the zfs-test suite to verify that when 'zfs destroy' is used on a shared dataset, the dataset will be unshared after the destroy operation completes. Reviewed by: loli10K <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Prakash Surya <[email protected]> Closes #7941
* Fix "zfs destroy" when "sharenfs=on" is usedPrakash Surya2018-10-032-0/+5
When using "zfs destroy" on a dataset that is using "sharenfs=on" and has been automatically exported (by libzfs), the dataset will not be automatically unexported as it should be. This workflow appears to have been broken by this commit:

    3fd3e56cfd543d7d7a1bf502bfc0db6e24139668

In that change, the "zfs_unmount" function was modified to use the "mnt.mnt_special" field when determining the mount point that is being unmounted, rather than "mnt.mnt_mountp".

As a result, when "mntpt" is passed into "zfs_unshare_proto", its value is now the dataset name rather than the mountpoint. Thus, when this value is used with the "is_shared" function (via "zfs_unshare_proto") it will not find a match (since that function assumes it'll be passed the mountpoint) and incorrectly reports that the dataset is not shared.

This can be easily reproduced with the following commands:

    $ sudo zpool create tank xvdb
    $ sudo zfs create -o sharenfs=on tank/fish
    $ sudo zfs destroy tank/fish

    $ sudo zfs list -r tank
    NAME   USED  AVAIL  REFER  MOUNTPOINT
    tank  97.5K  7.27G    24K  /tank

    $ sudo exportfs
    /tank/fish      <world>

    $ sudo cat /etc/dfs/sharetab
    /tank/fish      -       nfs     rw,crossmnt

At this point, the "tank/fish" filesystem doesn't exist, but it's still listed as exported when looking at "exportfs" and "/etc/dfs/sharetab".

Also note, this change brings us back in-sync with the illumos code, as it pertains to this one line; on illumos, "mnt.mnt_mountp" is used.

Reviewed by: loli10K <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Co-authored-by: George Wilson <[email protected]>
Signed-off-by: Prakash Surya <[email protected]>
Issue #6143
Closes #7941
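The lookup mismatch can be illustrated with a small Python model. This is a sketch only: the dictionary stands in for the share table and `is_shared` here is a hypothetical stand-in for the libzfs function of the same name, assuming only what the message states, namely that the lookup is keyed by mountpoint.

```python
# Stand-in for the share table, keyed by mountpoint as in /etc/dfs/sharetab.
SHARETAB = {"/tank/fish": "nfs"}


def is_shared(mountpoint):
    """Hypothetical model of the lookup: it only matches mountpoints."""
    return mountpoint in SHARETAB


mnt_special = "tank/fish"   # dataset name (what the regression passed in)
mnt_mountp = "/tank/fish"   # mountpoint (what the lookup expects)

print(is_shared(mnt_special), is_shared(mnt_mountp))
```

Passing the dataset name never matches, so the share is left in place; passing the mountpoint finds the entry and allows it to be unshared.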
* changelist should be able to iter on mountsAlek P2018-10-023-2/+192
Modified changelist_gather()ing for the mountpoint property. Now instead of iterating on all dataset descendants, we read /proc/self/mounts and iterate on the mounted descendant datasets only.

Switched changelist implementation from a uu_list_* to uu_avl_* in order to reduce the changelist code-path's worst case time complexity.

Reviewed by: Don Brady <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alek Pinchuk <[email protected]>
Closes #7967
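The idea of walking only mounted descendants can be sketched as follows. This is an illustrative Python model, not the libzfs implementation: the function name and sample text are hypothetical, and the only assumption taken from the system is the standard /proc/self/mounts field order (device, mountpoint, filesystem type, options, dump, pass).

```python
def mounted_descendants(mounts_text, dataset):
    """Return (dataset, mountpoint) pairs for dataset and its mounted children.

    mounts_text follows the /proc/self/mounts layout:
    device mountpoint fstype options dump pass
    """
    prefix = dataset + "/"
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[2] != "zfs":
            continue  # skip malformed lines and non-ZFS filesystems
        name, mountpoint = fields[0], fields[1]
        # Match the dataset itself or anything below it, but not siblings
        # that merely share a name prefix (e.g. "tankard" vs "tank").
        if name == dataset or name.startswith(prefix):
            found.append((name, mountpoint))
    return found


# Illustrative content only; not read from a real system.
SAMPLE = """\
tank /tank zfs rw,xattr,noacl 0 0
tank/fish /tank/fish zfs rw,xattr,noacl 0 0
tank/fish/a /tank/fish/a zfs rw,xattr,noacl 0 0
tankard /tankard zfs rw,xattr,noacl 0 0
/dev/sda1 / ext4 rw 0 0
"""

print(mounted_descendants(SAMPLE, "tank/fish"))
```

The cost scales with the number of mounted filesystems rather than the number of descendant datasets, which is the point of the change when most descendants are unmounted.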
* ZTS: Fix snapshot_009_pos, snapshot_010_posBrian Behlendorf2018-10-014-10/+9
Mitigate the likelihood of the newly created volumes being busy when the 'zfs destroy -r' is issued by waiting for udev to settle. Since this is not an ironclad fix I've added the test case to the known list of possible failures and referenced issue #7961.

Finally, in the case this test does fail, fix the cleanup logic so subsequent tests won't incorrectly fail.

Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: John Kennedy <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7961
Closes #7962
* Fixes for procfs files backed by linked listsJohn Gallagher2018-09-2610-14/+294
There are some issues with the way the seq_file interface is implemented for kstats backed by linked lists (zfs_dbgmsgs and certain per-pool debugging info):

* We don't account for the fact that seq_file sometimes visits a node multiple times, which results in missing messages when read through procfs.
* We don't keep separate state for each reader of a file, so concurrent readers will receive incorrect results.
* We don't account for the fact that entries may have been removed from the list between read syscalls, so reading from these files in procfs can cause the system to crash.

This change fixes these issues and adds procfs_list, a wrapper around a linked list which abstracts away the details of implementing the seq_file interface for a list and exposing the contents of the list through procfs.

Reviewed by: Don Brady <[email protected]>
Reviewed-by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: John Gallagher <[email protected]>
External-issue: LX-1211
Closes #7819
* Fix flake 8 style warningsGregor Kopka2018-09-262-35/+42
Ran zts-report.py and test-runner.py from ./tests/test-runner/bin/ through 2to3 (https://docs.python.org/2/library/2to3.html).

Checked the result, fixed:
- 'maxint' -> 'maxsize' that 2to3 missed.
- 'cmp=' parameter for a 'sorted()' with a 'key=' version.
- try/except wrapping of configparser import as there are still python 2.7 systems that lack a compatibility shim

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Gregor Kopka <[email protected]>
Closes #7925
Closes #7952
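The three fixes listed in this message correspond to common Python 2/3 compatibility idioms. The snippet below is a generic sketch of those idioms, not the code from zts-report.py or test-runner.py; the comparison function and sample names are hypothetical.

```python
import sys
from functools import cmp_to_key

# Python 2 named the module ConfigParser; fall back to it when the
# Python 3 name is unavailable.
try:
    import configparser
except ImportError:
    import ConfigParser as configparser

# sys.maxint was removed in Python 3; sys.maxsize exists in both.
LARGEST = sys.maxsize


def by_length_then_name(a, b):
    """Old-style comparison: negative, zero, or positive result."""
    return (len(a) > len(b)) - (len(a) < len(b)) or (a > b) - (a < b)


names = ["zpool_split", "zdb", "zfs_send"]

# sorted(names, cmp=...) is Python 2 only. cmp_to_key adapts an existing
# comparison function; a plain key function is usually simpler.
ordered = sorted(names, key=cmp_to_key(by_length_then_name))
simpler = sorted(names, key=lambda n: (len(n), n))
```

When a comparison function already exists, `cmp_to_key` is the smallest change; writing a direct `key=` function avoids the adapter entirely.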
* Revert "Fix flake 8 style warnings"Brian Behlendorf2018-09-242-37/+35
| | | | | | | | This reverts commit b8fd4310c54444eecb66140d99a6156f4353b29b which accidentally introduced a regression for some versions of python. Signed-off-by: Brian Behlendorf <[email protected]> Issue #7929
* ZTS: Fix removal_resume_exportLOLi2018-09-242-48/+4
This change simplifies the test case by removing part of the logic which was introducing a race condition and thus causing spurious failures: we use attempt_during_removal() from removal.kshlib instead, which has been observed to be more stable.

Reviewed by: Matthew Ahrens <[email protected]>
Reviewed-by: John Kennedy <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7894
Closes #7913
* Fix flake 8 style warningsGregor Kopka2018-09-242-35/+37
Ran zts-report.py and test-runner.py from ./tests/test-runner/bin/ through 2to3 (https://docs.python.org/2/library/2to3.html).

Checked the result, fixed:
- 'maxint' -> 'maxsize' that 2to3 missed.
- 'cmp=' parameter for a 'sorted()' with a 'key=' version.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: John Wren Kennedy <[email protected]>
Signed-off-by: Gregor Kopka <[email protected]>
Closes #7925
Closes #7929
* zpool should detect invalid fs property on createLOLi2018-09-131-16/+36
This change improves the handling of invalid filesystem properties when specified at pool creation: this is useful when 'zpool create -n' (dry run) is executed to detect invalid fs-level options (-O) before the actual command is run.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7620
Closes #7878
* Add removal_resume_export to zts-report.pyBrian Behlendorf2018-09-132-1/+2
Add the removal_resume_export test case to the possible failure section of zts-report.py and reference the GitHub issue. In the CI environment this test has proven to be unreliable due to the way it detects the removal thread. This is a flaw in the test and not in device removal, so update the result summary accordingly.

Additionally, increase the allowed timeout in an effort to reduce the observed rate of false positives.

Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7895
Issue #7894
* zpool split can create a corrupted poolRoman Strashkin2018-09-123-2/+113
| | | | | | | | | | | Added vdev_resilver_needed() check to verify VDEVs are fully synced, so that after split the new pool will not be corrupted. Reviewed by: Pavel Zakharov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Roman Strashkin <[email protected]> Closes #7865 Closes #7881
* Fix 'zfs allow' for create time permissionsLOLi2018-09-061-1/+1
When no permission set is defined for a dataset, the create time permissions are incorrectly shown as if they were a permission set. This change simply corrects how allow permissions are displayed.

This commit also fixes a small manpage formatting issue and adds the "zfs_allow_003_pos" test case to the ZFS Test Suite.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7519
Closes #7860
* Pool allocation classesDon Brady2018-09-0522-1/+888
| | | | | | | | | | | | | | | | | | | | Allocation Classes add the ability to have allocation classes in a pool that are dedicated to serving specific block categories, such as DDT data, metadata, and small file blocks. A pool can opt-in to this feature by adding a 'special' or 'dedup' top-level VDEV. Reviewed by: Pavel Zakharov <[email protected]> Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Alek Pinchuk <[email protected]> Reviewed-by: Håkan Johansson <[email protected]> Reviewed-by: Andreas Dilger <[email protected]> Reviewed-by: DHE <[email protected]> Reviewed-by: Richard Elling <[email protected]> Reviewed-by: Gregor Kopka <[email protected]> Reviewed-by: Kash Pande <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Signed-off-by: Don Brady <[email protected]> Closes #5182
* Add basic zfs ioc input nvpair validationDon Brady2018-09-028-2/+956
We want newer versions of libzfs_core to run against an existing zfs kernel module (i.e. a deferred reboot or module reload after an update).

Programmatically document, via a zfs_ioc_key_t, the valid arguments for the ioc commands that rely on nvpair input arguments (i.e. non legacy commands from libzfs_core). Automatically verify the expected pairs before dispatching a command.

This initial phase focuses on the non-legacy ioctls. A follow-on change can address the legacy ioctl input from the zfs_cmd_t.

The zfs_ioc_key_t for zfs_keys_channel_program looks like:

    static const zfs_ioc_key_t zfs_keys_channel_program[] = {
        {"program",     DATA_TYPE_STRING,           0},
        {"arg",         DATA_TYPE_UNKNOWN,          0},
        {"sync",        DATA_TYPE_BOOLEAN_VALUE,    ZK_OPTIONAL},
        {"instrlimit",  DATA_TYPE_UINT64,           ZK_OPTIONAL},
        {"memlimit",    DATA_TYPE_UINT64,           ZK_OPTIONAL},
    };

Introduce four input errors to identify specific input failures (in addition to generic argument value errors like EINVAL, ERANGE, EBADF, and E2BIG).

    ZFS_ERR_IOC_CMD_UNAVAIL   the ioctl number is not supported by kernel
    ZFS_ERR_IOC_ARG_UNAVAIL   an input argument is not supported by kernel
    ZFS_ERR_IOC_ARG_REQUIRED  a required input argument is missing
    ZFS_ERR_IOC_ARG_BADTYPE   an input argument has an invalid type

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #7780
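The table-driven check described in this message might be modelled in Python roughly as below. This is a sketch under stated assumptions, not the kernel code: Python types stand in for the nvpair DATA_TYPE_* values, `validate` is a hypothetical name, and the order in which the checks run is a guess. Only the key names, the ZK_OPTIONAL flag, and the error names come from the message.

```python
ZK_OPTIONAL = 1 << 0  # flag name from the commit message; the value is assumed

# (name, expected Python type or None for "any type", flags). A stand-in
# for the zfs_ioc_key_t table, not the kernel data structure itself.
CHANNEL_PROGRAM_KEYS = [
    ("program", str, 0),
    ("arg", None, 0),
    ("sync", bool, ZK_OPTIONAL),
    ("instrlimit", int, ZK_OPTIONAL),
    ("memlimit", int, ZK_OPTIONAL),
]


def validate(args, keys):
    """Return None if args satisfy keys, else the name of the input error."""
    known = {name: (kind, flags) for name, kind, flags in keys}
    # Every supplied argument must be known and have the expected type.
    for name, value in args.items():
        if name not in known:
            return "ZFS_ERR_IOC_ARG_UNAVAIL"
        kind = known[name][0]
        if kind is not None and not isinstance(value, kind):
            return "ZFS_ERR_IOC_ARG_BADTYPE"
    # Every key not marked optional must be present.
    for name, (_kind, flags) in known.items():
        if not flags & ZK_OPTIONAL and name not in args:
            return "ZFS_ERR_IOC_ARG_REQUIRED"
    return None


print(validate({"program": "return 1", "arg": []}, CHANNEL_PROGRAM_KEYS))
```

Keeping the expectations in a table means a new ioctl only needs a new table rather than hand-written checks, which is the design benefit the message points to.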
* Add zfs module feature and property info to sysfsDon Brady2018-09-0211-0/+406
This extends our sysfs '/sys/module/zfs' entry to include feature and property attributes. The primary consumer of this information is user processes, like the zfs CLI, that need to know what the current loaded ZFS module supports. The libzfs binary will consult this information when instantiating the zfs and zpool property tables and the pool features table.

This introduces 4 kernel objects (dirs) into '/sys/module/zfs' with corresponding attributes (files):

    features.runtime
    features.pool
    properties.dataset
    properties.pool

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #7706
* ZTS: Fix EBUSY volume destroy failuresBrian Behlendorf2018-08-3126-52/+41
| | | | | | | | | | | | It's possible for an unrelated process, like blkid, to have the volume open when 'zfs destroy' is run. Switch the cleanup functions to the destroy_dataset() helper which handles this case by retrying the destroy when the dataset is busy. This was done not only for volumes but also for file systems for consistency. Reviewed-by: Richard Elling <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7854
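The retry behaviour attributed to destroy_dataset() can be sketched generically in Python. The real helper is a ksh function in the test suite library; the version below is only a model of the retry-on-busy pattern, with a hypothetical `destroy` callable, attempt count, and delay.

```python
import time


def destroy_with_retry(destroy, attempts=5, delay=0.0):
    """Call destroy() until it succeeds or the attempts run out.

    destroy is a hypothetical callable that returns True on success and
    False when the dataset is busy (for example, held open by blkid).
    """
    for attempt in range(attempts):
        if destroy():
            return True
        if attempt + 1 < attempts:
            time.sleep(delay)  # give the other process time to close it
    return False


# Simulated destroy that reports "busy" twice before succeeding.
state = {"calls": 0}


def fake_destroy():
    state["calls"] += 1
    return state["calls"] >= 3


result = destroy_with_retry(fake_destroy)
```

A bounded retry keeps a transient holder from failing the cleanup while still surfacing a dataset that stays busy.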
* ZTS: pool_checkpoint path cleanupbunder20152018-08-301-1/+1
| | | | | | | | | Removing hardcoded paths in pool_checkpoint.kshlib Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: bunder2015 <[email protected]> Closes #7840
* ZTS: Fix DEV_DSKDIR trim from diskRichard Elling2018-08-301-3/+3
| | | | | | Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Elling <[email protected]> Closes #7848
* ZTS: zvol_swap_003 path cleanupbunder20152018-08-301-2/+2
| | | | | | | | | Removing hardcoded paths in zvol_swap_003 Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: bunder2015 <[email protected]> Closes #7839
* ZTS: path cleanupbernie19952018-08-3062-164/+180
| | | | | | | | | Removing hardcoded paths in many scripts. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: bernie1995 <[email protected]> Issue #7507 Closes #7843
* ZTS: Fix zfs_create_013_posBrian Behlendorf2018-08-301-2/+1
| | | | | | | | | | It's possible for an unrelated process, like blkid, to have the volume open when 'zfs destroy' is run. Switch the cleanup function to the destroy_dataset() helper which handles this case by retrying the destroy when the dataset is busy. Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7847
* OpenZFS 9403 - assertion failed in arc_buf_destroy()Tom Caputi2018-08-294-3/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Assertion failed in arc_buf_destroy() when concurrently reading block with checksum error. Porting notes: * The ability to zinject decompression errors has been added, but this only works at the zio_decompress() level, where we have all of the info we need to match against the user's zinject options. * The decompress_fault test has been added to test the new zinject functionality. * We attempted to set zio_decompress_fail_fraction to (1 << 18) in ztest for further test coverage. Although this did uncover a few low priority issues, this unfortunately also causes ztest to ASSERT in many locations where the code is working correctly since it is designed to fail on IO errors. Developers can manually set this variable with the '-o' option to find and debug issues. Authored by: Matt Ahrens <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Paul Dagnelie <[email protected]> Reviewed by: Pavel Zakharov <[email protected]> Reviewed by: Brian Behlendorf <[email protected]> Approved by: Matt Ahrens <[email protected]> Ported-by: Tom Caputi <[email protected]> OpenZFS-issue: https://illumos.org/issues/9403 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/fa98e487a9 Closes #7822
* Always wait for txg sync when umounting datasetTom Caputi2018-08-271-4/+16
| | | | | | | | | | | | | | | | | | | | | | | | | Currently, when unmounting a filesystem, ZFS will only wait for a txg sync if the dataset is dirty and not readonly. However, this can be problematic in cases where a dataset is remounted readonly immediately before being unmounted, which often happens when the system is being shut down. Since encrypted datasets require that all I/O is completed before the dataset is disowned, this issue causes problems when write I/Os leak into the txgs after the dataset is disowned, which can happen when sync=disabled. While looking into fixes for this issue, it was discovered that dsl_dataset_is_dirty() does not return B_TRUE when the dataset has been removed from the txg dirty datasets list, but has not actually been processed yet. Furthermore, the implementation is completely different from dmu_objset_is_dirty(), adding to the confusion. Rather than relying on this function, this patch forces the umount code path (and the remount readonly code path) to always perform a txg sync on read-write datasets and removes the function altogether. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #7753 Closes #7795
* Direct IO supportBrian Behlendorf2018-08-2713-0/+454
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Direct IO via the O_DIRECT flag was originally introduced in XFS by IRIX for database workloads. Its purpose was to allow the database to bypass the page and buffer caches to prevent unnecessary IO operations (e.g. readahead) while preventing contention for system memory between the database and kernel caches. On Illumos, there is a library function called directio(3C) that allows user space to provide a hint to the file system that Direct IO is useful, but the file system is free to ignore it. The semantics are also entirely a file system decision. Those that do not implement it return ENOTTY. Since the semantics were never defined in any standard, O_DIRECT is implemented such that it conforms to the behavior described in the Linux open(2) man page as follows. 1. Minimize cache effects of the I/O. By design the ARC is already scan-resistant which helps mitigate the need for special O_DIRECT handling. Data which is only accessed once will be the first to be evicted from the cache. This behavior is consistent with Illumos and FreeBSD. Future performance work may wish to investigate the benefits of immediately evicting data from the cache which has been read or written with the O_DIRECT flag. Functionally this behavior is very similar to applying the 'primarycache=metadata' property per open file. 2. O_DIRECT _MAY_ impose restrictions on IO alignment and length. No additional alignment or length restrictions are imposed. 3. O_DIRECT _MAY_ perform unbuffered IO operations directly between user memory and block device. No unbuffered IO operations are currently supported. In order to support features such as transparent compression, encryption, and checksumming a copy must be made to transform the data. 4. O_DIRECT _MAY_ imply O_DSYNC (XFS). O_DIRECT does not imply O_DSYNC for ZFS. 
Callers must provide O_DSYNC to request synchronous semantics. 5. O_DIRECT _MAY_ disable file locking that serializes IO operations. Applications should avoid mixing O_DIRECT and normal IO or mmap(2) IO to the same file. This is particularly true for overlapping regions. All I/O in ZFS is locked for correctness and this locking is not disabled by O_DIRECT. However, concurrently mixing O_DIRECT, mmap(2), and normal I/O on the same file is not recommended. This change is implemented by layering the aops->direct_IO operations on the existing AIO operations. Code already existed in ZFS on Linux for bypassing the page cache when O_DIRECT is specified. References: * http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch02s09.html * https://blogs.oracle.com/roch/entry/zfs_and_directio * https://ext4.wiki.kernel.org/index.php/Clarifying_Direct_IO's_Semantics * https://illumos.org/man/3c/directio Reviewed-by: Richard Elling <[email protected]> Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #224 Closes #7823
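Point 4 above means that an application wanting synchronous semantics has to pass O_DSYNC explicitly alongside O_DIRECT. The following is a minimal sketch of that flag combination from user space; the helper names `direct_sync_flags` and `write_durably` are hypothetical and not part of this change, and `os.O_DIRECT` is only defined on platforms that support it.

```python
import os


def direct_sync_flags(base=os.O_WRONLY | os.O_CREAT):
    """Return open(2) flags that request both O_DIRECT and O_DSYNC.

    O_DSYNC is added explicitly because, per the commit message,
    O_DIRECT alone does not imply synchronous semantics on ZFS.
    """
    flags = base | os.O_DSYNC
    # os.O_DIRECT is missing on some platforms; fall back to 0 there.
    flags |= getattr(os, "O_DIRECT", 0)
    return flags


def write_durably(path, data):
    """Write `data` to `path` using the combined flags (hypothetical helper).

    Assumption: the target file system accepts O_DIRECT without alignment
    constraints, as the commit message states for ZFS. Other file systems
    may reject unaligned buffers or lengths with EINVAL.
    """
    fd = os.open(path, direct_sync_flags(), 0o644)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)
```

Keeping the flag construction in its own function makes it easy to confirm that O_DSYNC is always present, independent of whether the platform defines O_DIRECT.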