summaryrefslogtreecommitdiffstats
path: root/tests/test-runner
Commit message (Collapse)AuthorAgeFilesLines
* Scrub mirror children without BPsBrian Behlendorf2022-07-141-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When scrubbing a raidz/draid pool, which contains a replacing or sparing mirror with multiple online children, only one child will be read. This is not normally a serious concern because the DTL records are used to determine where a good copy of the data is. As long as the data can be read from one child the mirror vdev will use it to repair gaps in any of its children. Furthermore, even if the data which was read is corrupt the raidz code will detect this and issue its own repair I/O to correct the damage in the mirror vdev. However, in the scenario where the DTL is wrong due to silent data corruption (say due to overwriting one child) and the scrub happens to read from a child with good data, then the other damaged mirror child will not be detected nor repaired. While this is possible for both raidz and draid vdevs, it's most pronounced when using draid. This is because by default the zed will sequentially rebuild a draid pool to a distributed spare, and the distributed spare half of the mirror is always preferred since it delivers better performance. This means the damaged half of the mirror will go undetected even after scrubbing. For system administrations this behavior is non-intuitive and in a worst case scenario could result in the only good copy of the data being unknowingly detached from the mirror. This change resolves the issue by reading all replacing/sparing mirror children when scrubbing. When the BP isn't available for verification, then compare the data buffers from each child. They must all be identical, if not there's silent damage and an error is returned to prompt the top-level vdev to issue a repair I/O to rewrite the data on all of the mirror children. Since we can't tell which child was wrong a checksum error is logged against the replacing or sparing mirror vdev. Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13555
* ZTS: Retry in import_rewind_config_changed.kshBrian Behlendorf2022-03-021-2/+0
| | | | | | | | | | | | | | | | | | | | As explained by the disclaimer in the test case, "This test can fail since nothing guarantees that old MOS blocks aren't overwritten." This behavior is expected and correct, but results in a flaky test case which is problematic for the CI. The best we can do to resolve this is to retry the sub-test which failed when the MOS blocks have clearly been overwritten. When testing failures were rare enough that a single retry should normally be sufficient. However, we allow up to five for good measure. Reviewed by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13119
* ZTS: Modify receive-o-x_props_override.ksh exceptionBrian Behlendorf2022-03-011-2/+2
| | | | | | | | | | | | As previously noted in #12272 the receive-o-x_props_override.ksh test reliably fails on FreeBSD. Since we don't expect this test to pass move the exception from the "maybe" to "known" section. This way we don't retry the FAILED test when it is not expected to pass. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13167
* ZTS: Fix zpool_expand_001_posBrian Behlendorf2022-02-161-2/+0
| | | | | | | | | | | | | | | | | The dRAID section of the zpool_expand_001_pos test would reliably fail because the calculated expansion size assumed the dRAID top-level vdev was created with a distributed spare. Create the vdev as expected to resolve the test failure. This test case flaw was accidentally caused by changing the default number of dRAID distributed spares from one to zero while dRAID was being developed. Additionally, remove zpool_expand_005_pos from the list of possible faulty tests. It appears to be passing consistently in my testing. Reviewed by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13091
* ZTS: Update enospc_002_pos test caseBrian Behlendorf2022-02-161-1/+0
| | | | | | | | | | | The on-disk cost of creating a snapshot or bookmark is sufficiently low that it is difficult to make it reliably fail even when the pool is "full". In order to avoid false positives remove these two checks from the test case. Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13060
* ZTS: Fix rollback_003_pos.kshBrian Behlendorf2022-02-161-1/+0
| | | | | | | | | | | | | | | | | | Under Linux when rolling back a mounted filesystem negative dentries may not be dropped from the cache. This can result in an ENOENT being incorrectly returned on first access. Issuing a `df` before the unmount results in the negative dentries being invalidated and side steps the issue. This is solely a workaround for the test case on Linux and not correct behavior. The core issue of invalidating negative dentries needs to be handled with a kernel side change. This is being tracked as issue #6143. Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12898 Issue #6143
* Update zts-report.py with additional testsBrian Behlendorf2022-02-161-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following test cases may still occasionally fail and are being added to the "maybe" list for Linux until they can be updated to be entirely reliable. cli_root/zfs_rename/zfs_rename_002_pos.ksh cli_root/zpool_reopen/zpool_reopen_003_pos.ksh refreserv/refreserv_raidz These 6 tests consistently fail only on Fedora 31+, the failures are related to the kernel rescanning the partition table on loopback devices which is no longer reliable unless partprobe is used. In order to enable the Fedora bot by default they are also being added to the list until the tests can be updated. Any significant regression in functionality covered by these tests will still be detected by the FreeBSD builders. alloc_class/alloc_class_009_pos alloc_class/alloc_class_010_pos cli_root/zpool_expand/zpool_expand_001_pos cli_root/zpool_expand/zpool_expand_005_pos rsend/rsend_007_pos rsend/rsend_010_pos rsend/rsend_011_pos snapshot/rollback_003_pos Signed-off-by: Brian Behlendorf <[email protected]> Closes #10489
* Exclude zvol_misc_volmode for nowRich Ercolani2022-02-161-0/+1
| | | | | | | | | | | | It keeps failing, on changes which aren't related at all. So until someone runs down why, I'd like it to stop being the sole reason for CI failures. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #12733
* ZTS: Add known exceptionsBrian Behlendorf2022-02-161-0/+3
| | | | | | | | | | | | | | | | | | Add the following test failures to the exception list for FreeBSD to ensure we notice new unexpected failures. pool_checkpoint/checkpoint_big_rewind pool_checkpoint/checkpoint_indirect And the following for Linux. zvol/zvol_misc/zvol_misc_snapdev Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #12621 Issue #12622 Issue #12623 Closes #12624
* ZTS: Minimize udev_wait in zvol_misc testsRyan Moeller2022-02-161-1/+0
| | | | | | | | | | | | | | | | | | | The zvol_misc tests, in particular zvol_misc_volmode, make use of a common udev_wait function to wait for zvol devices in /dev to quiesce on Linux. On other platforms this function currently only sleeps for one second before returning. This is insufficient, and zvol_misc_volmode has been flaky on FreeBSD as a result. Replace udev_wait with block_device_wait, passing through the optional device parameter where possible. Rearrange a few checks to strengthen the verifications we are making and avoid unnecessarily sleeping. We must keep udev_wait in a couple places to pass in Github CI workflows. Remove zvol_misc_volmode from the maybe failing tests on FreeBSD in zts-report.py. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #12583
* ZTS: Enable punch-hole tests on FreeBSDKa Ho Ng2022-02-161-0/+8
| | | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ka Ho Ng <[email protected]> Sponsored-by: The FreeBSD Foundation Closes #12458
* ZTS: Fix refreserv_raidz.kshBrian Behlendorf2022-02-161-1/+0
| | | | | | | | | | The rerefreserv_raidz test was failing on Linux because the sync being issued doesn't guarantee a pool sync. Switch to using the sync_pool function and remove the ZTS exception for Linux. Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12897
* ZTS: rsend_007_pos failuresBrian Behlendorf2022-02-161-9/+0
| | | | | | | | | | | | | | | | | | | | | | | The rsend_007_pos test reliably fails on Linux in the cleanup function. This is caused by an unmount error when attempting to recursively destroy the newly received datasets. Invoking `df` prior to the `zfs destroy` interestingly avoids the unmont error. Why this should matter is unclear and should be investigated. However, this minor tweak may allow us to remove the ZTS rsend exceptions. The subsequent rsend_010_pos and rsend_011_pos failures were a result of this initial failure. The other "maybe" failures I was unable to reproduce and have not been recently observed in the master branch. Reviewed-by: Tony Nguyen <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #5665 Closes #6086 Closes #6087 Closes #6446 Closes #12876
* ZTS: alloc_class.ksh must wait for the process to exitBrian Behlendorf2022-02-101-5/+0
| | | | | | | | | | | | | | | The alloc_class_* tests may fail on Linux with an EBUSY error if `zfs destroy` is run before the `dd` process has had a chance to terminate. Wait on the pid after the `kill -9` to make sure. When testing I didn't observe any failures for the alloc_class tests. Remove them from the exceptions list, the CI was used to verify the tests pass on all platforms. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Rich Ercolani <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12873
* ZTS: import_rewind_device_replaced reliably failsBrian Behlendorf2021-12-081-2/+2
| | | | | | | | | | | | | | | The import_rewind_device_replaced.ksh test was never entirely reliable because it depends on MOS data not being overwritten. The MOS data is not protected by the snapshot so occasional failures were always expected. However, this test is now failing reliably on all platforms indicating something has changed in the code since the test was marked "maybe". Convert the test to a "known" failure until the root cause is identified and resolved. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12821
* Linux 5.13 compat: retry zvol_open() when contendedBrian Behlendorf2021-12-061-1/+0
| | | | | | | | | | | | | | | | | | | | | | | Due to a possible lock inversion the zvol open call path on Linux needs to be able to retry in the case where the spa_namespace_lock cannot be acquired. For Linux 5.12 an older kernel this was accomplished by returning -ERESTARTSYS from zvol_open() to request that blkdev_get() drop the bdev->bd_mutex lock, reaquire it, then call the open callback again. However, as of the 5.13 kernel this behavior was removed. Therefore, for 5.12 and older kernels we preserved the existing retry logic, but for 5.13 and newer kernels we retry internally in zvol_open(). This should always succeed except in the case where a pool's vdev are layed on zvols, in which case it may fail. To handle this case vdev_disk_open() has been updated to retry when opening a device when -ERESTARTSYS is returned. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #12301 Closes #12759
* Add zfs-test facility to automatically rerun failing testsPaul Dagnelie2021-12-062-24/+107
| | | | | | | | | | | | This was a project proposed as part of the Quality theme for the hackthon for the 2021 OpenZFS Developer Summit. The idea is to improve the usability of the automated tests that get run when a PR is created by having failing tests automatically rerun in order to make flaky tests less impactful. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes #12740
* Exclude zfs_copies_003_pos on LinuxBrian Behlendorf2021-11-121-0/+1
| | | | | | | | | | | This test case may fail on 5.13 and newer Linux kernels if the /dev/zvol/ device is not created by udev. Reviewed-by: Rich Ercolani <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #12301 Closes #12738
* ZTS: Remove exceptions for flaky zhack on FreeBSDRyan Moeller2021-09-141-7/+0
| | | | | | | | | | Issue #11854 has been resolved, so we can remove the exceptions for it. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #12527
* ZTS: Add tests for creation timeRyan Moeller2021-09-141-0/+7
| | | | | | | | | Reviewed-by: Tony Nguyen <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #12432
* Fixes in persistent L2ARCGeorge Amanakis2021-09-141-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In l2arc_add_vdev() first decide whether the device is eligible for L2ARC rebuild or whole device trim and then add it to the list of cache devices. Otherwise l2arc_feed_thread() might already start writing on the device invalidating previous content as l2ad_hand = l2ad_start. However l2arc_rebuild_vdev() needs the device present in the cache device list to figure out its l2arc_dev_t. Fix this by moving most of l2arc_rebuild_vdev() in a new function l2arc_rebuild_dev() which does not need to search in the cache device list. In contrast to l2arc_add_vdev() we do not have to worry about l2arc_feed_thread() invalidating previous content when onlining a cache device. The device parameters (l2ad*) are not cleared when offlining the device and writing new buffers will not invalidate all previous content. In worst case only buffers that have not had their log block written to the device will be lost. Retire persist_l2arc_00{4,5,8} tests since they cover code already covered by the remaining ones. Test persist_l2arc_006 is renamed to persist_l2arc_004 and persist_l2arc_007 is renamed to persist_l2arc_005. Fix a typo in persist_l2arc_004, and remove an assertion that is not always true from l2arc_arcstats_pos. Also update an assertion in persist_l2arc_005 and explain why in a comment. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #12365
* ZED: Match added disk by pool/vdev GUID if found (#12217)Ryan Moeller2021-09-141-0/+1
| | | | | | | | | | This enables ZED to auto-online vdevs that are not wholedisk managed by ZFS. Signed-off-by: Ryan Moeller <[email protected]> Reviewed-by: Don Brady <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
* ZTS: Add known exceptionsBrian Behlendorf2021-06-241-0/+3
| | | | | | | | | | | | | | | The receive-o-x_props_override test case reliably fails on the FreeBSD main builders (but not on Linux), until the root cause is understood add this test to the FreeBSD exception list. On Linux the alloc_class_012_pos test case may occasionally fail. This is a known false positive which has also been added to the Linux exception list until the test can be made entirely reliable. Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12272
* Widen mancheck to all of man and test-runnerнаб2021-06-091-43/+54
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12125
* test-runner.1: moderniseнаб2021-06-091-325/+222
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12125
* ZTS: Add known exceptionsBrian Behlendorf2021-05-271-0/+9
| | | | | | | | | | | The following seven tests been observed to occasionally fail during CI testing. This commit adds them to the list of known somewhat flaky test cases. Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12023
* ZTS: Add known exceptionsBrian Behlendorf2021-05-101-0/+3
| | | | | | | | | | Both the zpool_initialize_import_export and checkpoint_discard_busy test cases a known to occasionally fail. Add them to the list of known possible failures and reference the appropriate issue on the tracker. Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11949
* ZTS: fix removal_condense_export test caseBrian Behlendorf2021-04-141-1/+0
| | | | | | | | | | | | | It's been observed in the CI that the required 25% of obsolete bytes in the mapping can be to high a threshold for this test resulting in condensing never being triggered and a test failure. To prevent these failures make the existing zfs_condense_indirect_obsolete_pct tuning available so the obsolete percentage can be reduced from 25% to 5% during this test. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11869
* ZTS: Add known exceptionsBrian Behlendorf2021-04-141-0/+3
| | | | | | | | | | | The fault/auto_spare_shared, l2arc/persist_l2arc_007_pos, and alloc_class/alloc_class_013_pos test cases are not entirely reliable and may occasionally fail resulting in a false positive in the CI. Add these tests to known list of possible failures until they can be made 100% reliable. Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11890
* ZTS: Improve cleanup in removal_with_exportRyan Moeller2021-04-141-1/+0
| | | | | | | | | Kill the removal operation on every platform, not just Linux. The test has been fixed and is now stable on FreeBSD. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11856
* ZTS: Tests using zhack may fail on FreeBSDRyan Moeller2021-04-141-0/+7
| | | | | | | | | | | | As described in #11854, zhack is occasionally segfaulting on FreeBSD. Debugging this is proving to be tricky. To avoid false positives in the CI add entries for the tests that use zhack in zts-report to accept that they may occasionally fail on FreeBSD. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Issue #11854 Closes #11855
* ZTS: inheritance/inherit_001_pos is flakyRyan Moeller2021-04-071-0/+1
| | | | | | | | Add inheritance/inherit_001_pos to the maybe fails on FreeBSD list. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11830
* zts-report.py: ignore some skipped tests in Github CIGeorge Melikov2021-02-021-0/+34
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #11554
* logapi: cat output file instead of printingWill Andrews2021-01-251-3/+3
| | | | | | | | | This avoids globbing together multiple lines in the log, if you happen to specify LOGAPI_DEBUG because you want to see it. Signed-off-by: Will Andrews <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11515
* Add basic io_uring testMatthew Macy2021-01-231-0/+1
| | | | | | | | Provide a basic test coverage for io_uring I/O. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #11497
* special device removal space accounting fixesMatthew Ahrens2020-12-171-2/+0
| | | | | | | | | | | | | | | | The space in special devices is not included in spa_dspace (or dsl_pool_adjustedsize(), or the zfs `available` property). Therefore there is always at least as much free space in the normal class, as there is allocated in the special class(es). And therefore, there is always enough free space to remove a special device. However, the checks for free space when removing special devices did not take this into account. This commit corrects that. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Don Brady <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #11329
* Distributed Spare (dRAID) FeatureBrian Behlendorf2020-11-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a new top-level vdev type called dRAID, which stands for Distributed parity RAID. This pool configuration allows all dRAID vdevs to participate when rebuilding to a distributed hot spare device. This can substantially reduce the total time required to restore full parity to pool with a failed device. A dRAID pool can be created using the new top-level `draid` type. Like `raidz`, the desired redundancy is specified after the type: `draid[1,2,3]`. No additional information is required to create the pool and reasonable default values will be chosen based on the number of child vdevs in the dRAID vdev. zpool create <pool> draid[1,2,3] <vdevs...> Unlike raidz, additional optional dRAID configuration values can be provided as part of the draid type as colon separated values. This allows administrators to fully specify a layout for either performance or capacity reasons. The supported options include: zpool create <pool> \ draid[<parity>][:<data>d][:<children>c][:<spares>s] \ <vdevs...> - draid[parity] - Parity level (default 1) - draid[:<data>d] - Data devices per group (default 8) - draid[:<children>c] - Expected number of child vdevs - draid[:<spares>s] - Distributed hot spares (default 0) Abbreviated example `zpool status` output for a 68 disk dRAID pool with two distributed spares using special allocation classes. ``` pool: tank state: ONLINE config: NAME STATE READ WRITE CKSUM slag7 ONLINE 0 0 0 draid2:8d:68c:2s-0 ONLINE 0 0 0 L0 ONLINE 0 0 0 L1 ONLINE 0 0 0 ... U25 ONLINE 0 0 0 U26 ONLINE 0 0 0 spare-53 ONLINE 0 0 0 U27 ONLINE 0 0 0 draid2-0-0 ONLINE 0 0 0 U28 ONLINE 0 0 0 U29 ONLINE 0 0 0 ... U42 ONLINE 0 0 0 U43 ONLINE 0 0 0 special mirror-1 ONLINE 0 0 0 L5 ONLINE 0 0 0 U5 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 L6 ONLINE 0 0 0 U6 ONLINE 0 0 0 spares draid2-0-0 INUSE currently in use draid2-0-1 AVAIL ``` When adding test coverage for the new dRAID vdev type the following options were added to the ztest command. These options are leverages by zloop.sh to test a wide range of dRAID configurations. -K draid|raidz|random - kind of RAID to test -D <value> - dRAID data drives per group -S <value> - dRAID distributed hot spares -R <value> - RAID parity (raidz or dRAID) The zpool_create, zpool_import, redundancy, replacement and fault test groups have all been updated provide test coverage for the dRAID feature. Co-authored-by: Isaac Huang <[email protected]> Co-authored-by: Mark Maybee <[email protected]> Co-authored-by: Don Brady <[email protected]> Co-authored-by: Matthew Ahrens <[email protected]> Co-authored-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10102
* ZTS: ztest may cause mmp tests failuresBrian Behlendorf2020-08-171-0/+2
| | | | | | | | | | | | The mmp_exported_import and mmp_inactive_import tests depend on ztest simulating an active pool. If ztest unexpectedly terminates due to an unrelated issue the test case will fail. Since ztest is not yet 100% reliable I've added these tests to the maybe exception list. They can be removed when the issues with ztest are resolved or if the test cases are updated to handle these unexpected failures. Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10726
* ZTS: zvol_misc_volmode is flaky on FreeBSDRyan Moeller2020-07-311-0/+1
| | | | | | | Mark this as a known issue. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #10655
* Linux 4.10 compat: has_capability()Brian Behlendorf2020-07-191-0/+2
| | | | | | | | | | | | Stock kernels older than 4.10 do not export the has_capability() function which is required by commit e59a377. To avoid breaking the build on older kernels revert to the safe legacy behavior and return EACCES when privileges cannot be checked. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10565 Closes #10573
* Update zts-report.py with additional testsBrian Behlendorf2020-07-151-0/+4
| | | | | | | | | | | | | | | The following test cases have been observed to fail frequently enough to be a problem when reporting CI results. Until they can be updated to be entirely reliable add them to the zts-report.py script. alloc_class/alloc_class_011_neg cli_root/zpool_import/zpool_import_012_pos mmp/mmp_on_uberblocks rsend/send_partial_dataset Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10578
* Centralize variable substitutionArvind Sankar2020-07-144-15/+9
| | | | | | | | | | | | A bunch of places need to edit files to incorporate the configured paths i.e. bindir, sbindir etc. Move this logic into a common file. Create arc_summary by copying arc_summary[23] as appropriate at build time instead of install time. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Arvind Sankar <[email protected]> Closes #10559
* Remove dependency on sharetab file and refactor sharing logicGeorge Wilson2020-07-132-1/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | == Motivation and Context The current implementation of 'sharenfs' and 'sharesmb' relies on the use of the sharetab file. The use of this file is os-specific and not required by linux or freebsd. Currently the code must maintain updates to this file which adds complexity and presents a significant performance impact when sharing many datasets. In addition, concurrently running 'zfs sharenfs' command results in missing entries in the sharetab file leading to unexpected failures. == Description This change removes the sharetab logic from the linux and freebsd implementation of 'sharenfs' and 'sharesmb'. It still preserves an os-specific library which contains the logic required for sharing NFS or SMB. The following entry points exist in the vastly simplified libshare library: - sa_enable_share -- shares a dataset but may not commit the change - sa_disable_share -- unshares a dataset but may not commit the change - sa_is_shared -- determine if a dataset is shared - sa_commit_share -- notify NFS/SMB subsystem to commit the shares - sa_validate_shareopts -- determine if sharing options are valid The sa_commit_share entry point is provided as a performance enhancement and is not required. The sa_enable_share/sa_disable_share may commit the share as part of the implementation. Libshare provides a framework for both NFS and SMB but some operating systems may not fully support these protocols or all features of the protocol. NFS Operation: For linux, libshare updates /etc/exports.d/zfs.exports to add and remove shares and then commits the changes by invoking 'exportfs -r'. This file, is automatically read by the kernel NFS implementation which makes for better integration with the NFS systemd service. For FreeBSD, libshare updates /etc/zfs/exports to add and remove shares and then commits the changes by sending a SIGHUP to mountd. SMB Operation: For linux, libshare adds and removes files in /var/lib/samba/usershares by calling the 'net' command directly. There is no need to commit the changes. FreeBSD does not support SMB. == Performance Results To test sharing performance we created a pool with an increasing number of datasets and invoked various zfs actions that would enable and disable sharing. The performance testing was limited to NFS sharing. The following tests were performed on an 8 vCPU system with 128GB and a pool comprised of 4 50GB SSDs: Scale testing: - Share all filesystems in parallel -- zfs sharenfs=on <dataset> & - Unshare all filesystems in parallel -- zfs sharenfs=off <dataset> & Functional testing: - share each filesystem serially -- zfs share -a - unshare each filesystem serially -- zfs unshare -a - reset sharenfs property and unshare -- zfs inherit -r sharenfs <pool> For 'zfs sharenfs=on' scale testing we saw an average reduction in time of 89.43% and for 'zfs sharenfs=off' we saw an average reduction in time of 83.36%. Functional testing also shows a huge improvement: - zfs share -- 97.97% reduction in time - zfs unshare -- 96.47% reduction in time - zfs inhert -r sharenfs -- 99.01% reduction in time Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Bryant G. Ly <[email protected]> Signed-off-by: George Wilson <[email protected]> External-Issue: DLPX-68690 Closes #1603 Closes #7692 Closes #7943 Closes #10300
* filesystem_limit/snapshot_limit is incorrectly enforced against rootMatthew Ahrens2020-07-111-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The filesystem_limit and snapshot_limit properties limit the number of filesystems or snapshots that can be created below this dataset. According to the manpage, "The limit is not enforced if the user is allowed to change the limit." Two types of users are allowed to change the limit: 1. Those that have been delegated the `filesystem_limit` or `snapshot_limit` permission, e.g. with `zfs allow USER filesystem_limit DATASET`. This works properly. 2. A user with elevated system privileges (e.g. root). This does not work - the root user will incorrectly get an error when trying to create a snapshot/filesystem, if it exceeds the `_limit` property. The problem is that `priv_policy_ns()` does not work if the `cred_t` is not that of the current process. This happens when `dsl_enforce_ds_ss_limits()` is called in syncing context (as part of a sync task's check func) to determine the permissions of the corresponding user process. This commit fixes the issue by passing the `task_struct` (typedef'ed as a `proc_t`) to syncing context, and then using `has_capability()` to determine if that process is privileged. Note that we still need to pass the `cred_t` to syncing context so that we can check if the user was delegated this permission with `zfs allow`. This problem only impacts Linux. Wrappers are added to FreeBSD but it continues to use `priv_check_cred()`, which works on arbitrary `cred_t`. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #8226 Closes #10545
* pam: implement a zfs_key pam modulefelixdoerre2020-06-241-0/+1
| | | | | | | | | | | | | | | | | Implements a pam module for automatically loading zfs encryption keys for home datasets. The pam module: - loads a zfs key and mounts the dataset when a session opens. - unmounts the dataset and unloads the key when the session closes. - when the user is logged on and changes the password, the module changes the encryption key. Reviewed-by: Richard Laager <[email protected]> Reviewed-by: @jengelh <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Felix Dörre <[email protected]> Closes #9886 Closes #9903
* Fix check for sed --in-placeArvind Sankar2020-06-241-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The test added in commit 4313a5b4c51e ("Detect if sed supports --in-place") doesn't work at least on my system (autoconfig-2.69). The issue is that SED has already been found and cached before this function is evaluated, with the result that the test is completely skipped. ... checking for a sed that does not truncate output... /usr/bin/sed ... checking for sed --in-place... (cached) /usr/bin/sed The first test is executed by libtool.m4. This looks to have been around in libtool for at least 15 years or so, not sure why this was not encountered at the time of the original commit. Fix this by caching the value of the ac_inplace flag rather than the path to SED. Also use $SED and add AC_REQUIRE to ensure that we use the sed that was located by the standard configure test. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Arvind Sankar <[email protected]> Closes #10493
* Update zts-report.py with additional testsBrian Behlendorf2020-06-221-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The following test cases may still occasionally fail and are being added to the "maybe" list for Linux until they can be updated to be entirely reliable. cli_root/zfs_rename/zfs_rename_002_pos.ksh cli_root/zpool_reopen/zpool_reopen_003_pos.ksh refreserv/refreserv_raidz These 6 tests consistently fail only on Fedora 31+, the failures are related to the kernel rescanning the partition table on loopback devices which is no longer reliable unless partprobe is used. In order to enable the Fedora bot by default they are also being added to the list until the tests can be updated. Any significant regression in functionality covered by these tests will still be detected by the FreeBSD builders. alloc_class/alloc_class_009_pos alloc_class/alloc_class_010_pos cli_root/zpool_expand/zpool_expand_001_pos cli_root/zpool_expand/zpool_expand_005_pos rsend/rsend_007_pos rsend/rsend_010_pos rsend/rsend_011_pos snapshot/rollback_003_pos Signed-off-by: Brian Behlendorf <[email protected]> Closes #10489
* flake8 E741 variable name warningBrian Behlendorf2020-05-141-3/+3
| | | | | | | | | | | | | | | Update the zts-report.py script to conform to the flake8 E741 rule. "Variables named I, O, and l can be very hard to read. This is because the letter I and the letter l are easily confused, and the letter O and the number 0 can be easily confused." - https://www.flake8rules.com/rules/E741.html Reviewed-by: George Melikov <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10323
* Add FreeBSD support to OpenZFSMatthew Macy2020-04-141-2/+3
| | | | | | | | | | | | | | | | | | Add the FreeBSD platform code to the OpenZFS repository. As of this commit the source can be compiled and tested on FreeBSD 11 and 12. Subsequent commits are now required to compile on FreeBSD and Linux. Additionally, they must pass the ZFS Test Suite on FreeBSD which is being run by the CI. As of this commit 1230 tests pass on FreeBSD and there are no unexpected failures. Reviewed-by: Sean Eric Fagan <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Ryan Moeller <[email protected]> Signed-off-by: Matt Macy <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #898 Closes #8987
* ZTS: Fix and change testcase cache_010_negalex2020-04-131-1/+0
| | | | | | | | | | | | | | | | Commit 379ca9c removed the requirement on aux devices to be block devices only but the test case cache_010_neg was not updated, making it fail consistently. This change changes the test to check that cache devices _can_ be anything that presents a block interface. The testcase is renamed to cache_010_pos and the exceptions for known failure removed from the test runner. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reported-by: Richard Elling <[email protected]> Signed-off-by: Alex John <[email protected]> Closes #10172