aboutsummaryrefslogtreecommitdiffstats
path: root/tests/test-runner/bin
Commit message (Collapse)AuthorAgeFilesLines
* Speed up WB_SYNC_NONE when a WB_SYNC_ALL occurs simultaneouslyShaan Nobee2022-05-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Page writebacks with WB_SYNC_NONE can take several seconds to complete since they wait for the transaction group to close before being committed. This is usually not a problem since the caller does not need to wait. However, if we're simultaneously doing a writeback with WB_SYNC_ALL (e.g via msync), the latter can block for several seconds (up to zfs_txg_timeout) due to the active WB_SYNC_NONE writeback since it needs to wait for the transaction to complete and the PG_writeback bit to be cleared. This commit deals with 2 cases: - No page writeback is active. A WB_SYNC_ALL page writeback starts and even completes. But when it's about to check if the PG_writeback bit has been cleared, another writeback with WB_SYNC_NONE starts. The sync page writeback ends up waiting for the non-sync page writeback to complete. - A page writeback with WB_SYNC_NONE is already active when a WB_SYNC_ALL writeback starts. The WB_SYNC_ALL writeback ends up waiting for the WB_SYNC_NONE writeback. The fix works by carefully keeping track of active sync/non-sync writebacks and committing when beneficial. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Shaan Nobee <[email protected]> Closes #12662 Closes #12790
* ZTS: Retry auto_spare_multiple.kshBrian Behlendorf2022-04-131-0/+1
| | | | | | | | | | The auto_spare_multiple.ksh test may incorrectly fail for a similar reason as the auto_spare_shared.ksh test. Add it to known list of exceptions which should be retried to prevent failures in the CI. Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13318
* ZTS: Retry redundancy_draid_spare[1,3].kshBrian Behlendorf2022-04-131-1/+2
| | | | | | | | | | | | The redundancy_draid_spare1.ksh and redundancy_draid_spare3.ksh test cases are a little to strict for the sequential resilver case. While unlikely it is possible that a handful of correctable checksum errors will be reported resulting in a test failure. Update the zts-report.py script to allow this the test case to be retried if requested. Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13318
* tests: zts-report: add #13215 to known list for FreeBSDнаб2022-04-011-0/+1
| | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13259
* tests: zts-report: issue numbers are numbersнаб2022-04-011-34/+34
| | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13259
* tests: zts-report: simplifyнаб2022-04-011-25/+12
| | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13259
* tests: zts-report: prune unused reasonsнаб2022-04-011-9/+0
| | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13259
* tests: zts-report: only known-SKIP zfs_unshare_002_pos on Linuxнаб2022-04-011-1/+1
| | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13259
* tests: clean out unused/single-use/useless commands from the listнаб2022-04-011-3/+3
| | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13259
* tests: zfs_unshare_006: log_unsupported iff usershares are actually offнаб2022-04-011-1/+1
| | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13259
* tests: don't use share/unshare exportfs aliases, support FreeBSD NFSнаб2022-04-011-8/+0
| | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13259
* ZTS: Log test name to /dev/kmsg on LinuxTony Hutter2022-03-231-9/+24
| | | | | | | | | Add a -K option to the test suite to log each test name to /dev/kmsg (on Linux), so if there's a kernel warning we'll be able to match it up to a particular test. Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #13227
* ZTS: Modify receive-o-x_props_override.ksh exceptionBrian Behlendorf2022-03-011-2/+2
| | | | | | | | | | | As previously noted in #12272 the receive-o-x_props_override.ksh test reliably fails on FreeBSD. Since we don't expect this test to pass move the exception from the "maybe" to "known" section. This way we don't retry the FAILED test when it is not expected to pass. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13167
* Add Linux kmemleak support to ZTSDamian Szuberski2022-02-241-9/+45
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | - Kmemleak `clear` is invoked right before every test case run. - Kmemleak `scan` is requested right after each test case is finished. - Kmemleak instrumentation is not used for setup/cleanup/pretest/posttest/failsafe stages to shorten the test case execution time. - Kmemleak periodic scan is disabled (`scan=0`) before the test suite run to avoid interfering with the on-demand scan results. - There are unavoidable potential false positives coming from kernel areas other than OpenZFS module. - The ZTS with kmemleak enabled duration is increased by ~50%. Example run ``` Running Time: 07:12:13 Percent passed: 98.3% unreferenced object 0xffff9da82aea5410 (size 80): comm "kworker/u32:10", pid 942206, jiffies 4296749716 (age 2615.516s) hex dump (first 32 bytes): 00 30 30 00 00 00 00 00 ff 8f 30 00 00 00 00 00 .00.......0..... 51 e6 77 05 a8 9d ff ff 00 00 00 00 00 00 00 00 Q.w............. backtrace: [<000000005cf1fea2>] alloc_extent_state+0x1d/0xb0 [btrfs] [<0000000083f78ae5>] set_extent_bit+0x2ff/0x670 [btrfs] [<00000000de29249e>] lock_extent_bits+0x6b/0xa0 [btrfs] [<00000000b241f424>] lock_and_cleanup_extent_if_need+0xaf/0x1c0 [btrfs] [<0000000093ca72b5>] btrfs_buffered_write+0x297/0x7d0 [btrfs] [<000000002c2938c8>] btrfs_file_write_iter+0x127/0x390 [btrfs] [<00000000b888f720>] do_iter_readv_writev+0x152/0x1b0 [<00000000320f0bcc>] do_iter_write+0x7c/0x1c0 [<000000000b5a8fe0>] lo_write_bvec+0x62/0x150 [loop] [<000000009aa03c73>] loop_process_work+0x250/0xbd0 [loop] [<00000000c7487d8a>] process_one_work+0x1f1/0x390 [<000000000b236831>] worker_thread+0x53/0x3e0 [<0000000023cb3e57>] kthread+0x127/0x150 [<000000002d48676a>] ret_from_fork+0x22/0x30 ``` Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: szubersk <[email protected]> Closes #13084
* ZTS: Retry in import_rewind_config_changed.kshBrian Behlendorf2022-02-201-2/+0
| | | | | | | | | | | | | | | | | | | As explained by the disclaimer in the test case, "This test can fail since nothing guarantees that old MOS blocks aren't overwritten." This behavior is expected and correct, but results in a flaky test case which is problematic for the CI. The best we can do to resolve this is to retry the sub-test which failed when the MOS blocks have clearly been overwritten. When testing failures were rare enough that a single retry should normally be sufficient. However, we allow up to five for good measure. Reviewed by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13119
* ZTS: Fix vdev_zaps_004_pos.kshBrian Behlendorf2022-02-171-1/+0
| | | | | | | | | | When attaching a vdev to a mirror wait for the resilver to complete before invoking `zdb` to inspect the pool. This ensures the pool is essentially idle which allows `zdb` to open the imported pool reliably. Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13112 Closes #6935
* ZTS: Fix zpool_expand_001_posBrian Behlendorf2022-02-131-2/+0
| | | | | | | | | | | | | | | | | The dRAID section of the zpool_expand_001_pos test would reliably fail because the calculated expansion size assumed the dRAID top-level vdev was created with a distributed spare. Create the vdev as expected to resolve the test failure. This test case flaw was accidentally caused by changing the default number of dRAID distributed spares from one to zero while dRAID was being developed. Additionally, remove zpool_expand_005_pos from the list of possible faulty tests. It appears to be passing consistently in my testing. Reviewed by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13091
* ZTS: Update enospc_002_pos test caseBrian Behlendorf2022-02-041-1/+0
| | | | | | | | | | | The on-disk cost of creating a snapshot or bookmark is sufficiently low that it is difficult to make it reliably fail even when the pool is "full". In order to avoid false positives remove these two checks from the test case. Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13060
* Fix test-runner on FreeBSDнаб2022-01-211-2/+2
| | | | | | | | | | | CLOCK_MONOTONIC_RAW is only a thing on Linux and macOS. I'm not actually sure why the previous hardcoding of a constant didn't error out, but when we removed it, it sure does now. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Rich Ercolani <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #12995
* Removed Python 2 and Python 3.5- supportDamian Szuberski2022-01-132-16/+8
| | | | | | | | | | | | Deprecation of Python versions below 3.6 gives opportunity to unify the build and install requirements for OpenZFS packages. The minimal supported Python version is 3.6 as this is the most recent Python package CentOS/RHEL 7 users can get. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Rich Ercolani <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: szubersk <[email protected]> Closes #12925
* ZTS: Fix rollback_003_pos.kshBrian Behlendorf2021-12-221-1/+0
| | | | | | | | | | | | | | | | | | Under Linux when rolling back a mounted filesystem negative dentries may not be dropped from the cache. This can result in an ENOENT being incorrectly returned on first access. Issuing a `df` before the unmount results in the negative dentries being invalidated and side steps the issue. This is solely a workaround for the test case on Linux and not correct behavior. The core issue of invalidating negative dentries needs to be handled with a kernel side change. This is being tracked as issue #6143. Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12898 Issue #6143
* ZTS: Fix refreserv_raidz.kshBrian Behlendorf2021-12-221-1/+0
| | | | | | | | | | The rerefreserv_raidz test was failing on Linux because the sync being issued doesn't guarantee a pool sync. Switch to using the sync_pool function and remove the ZTS exception for Linux. Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12897
* ZTS: rsend_007_pos failuresBrian Behlendorf2021-12-211-9/+0
| | | | | | | | | | | | | | | | | | | | | | | The rsend_007_pos test reliably fails on Linux in the cleanup function. This is caused by an unmount error when attempting to recursively destroy the newly received datasets. Invoking `df` prior to the `zfs destroy` interestingly avoids the unmont error. Why this should matter is unclear and should be investigated. However, this minor tweak may allow us to remove the ZTS rsend exceptions. The subsequent rsend_010_pos and rsend_011_pos failures were a result of this initial failure. The other "maybe" failures I was unable to reproduce and have not been recently observed in the master branch. Reviewed-by: Tony Nguyen <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #5665 Closes #6086 Closes #6087 Closes #6446 Closes #12876
* ZTS: alloc_class.ksh must wait for the process to exitBrian Behlendorf2021-12-171-5/+0
| | | | | | | | | | | | | | The alloc_class_* tests may fail on Linux with an EBUSY error if `zfs destroy` is run before the `dd` process has had a chance to terminate. Wait on the pid after the `kill -9` to make sure. When testing I didn't observe any failures for the alloc_class tests. Remove them from the exceptions list, the CI was used to verify the tests pass on all platforms. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Rich Ercolani <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12873
* ZTS: import_rewind_device_replaced reliably failsBrian Behlendorf2021-12-061-2/+2
| | | | | | | | | | | | | | The import_rewind_device_replaced.ksh test was never entirely reliable because it depends on MOS data not being overwritten. The MOS data is not protected by the snapshot so occasional failures were always expected. However, this test is now failing reliably on all platforms indicating something has changed in the code since the test was marked "maybe". Convert the test to a "known" failure until the root cause is identified and resolved. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12821
* Linux 5.13 compat: retry zvol_open() when contendedBrian Behlendorf2021-12-011-1/+0
| | | | | | | | | | | | | | | | | | | | | | | Due to a possible lock inversion the zvol open call path on Linux needs to be able to retry in the case where the spa_namespace_lock cannot be acquired. For Linux 5.12 an older kernel this was accomplished by returning -ERESTARTSYS from zvol_open() to request that blkdev_get() drop the bdev->bd_mutex lock, reaquire it, then call the open callback again. However, as of the 5.13 kernel this behavior was removed. Therefore, for 5.12 and older kernels we preserved the existing retry logic, but for 5.13 and newer kernels we retry internally in zvol_open(). This should always succeed except in the case where a pool's vdev are layed on zvols, in which case it may fail. To handle this case vdev_disk_open() has been updated to retry when opening a device when -ERESTARTSYS is returned. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #12301 Closes #12759
* Add zfs-test facility to automatically rerun failing testsPaul Dagnelie2021-12-012-24/+107
| | | | | | | | | | | | This was a project proposed as part of the Quality theme for the hackthon for the 2021 OpenZFS Developer Summit. The idea is to improve the usability of the automated tests that get run when a PR is created by having failing tests automatically rerun in order to make flaky tests less impactful. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes #12740
* Exclude zfs_copies_003_pos on LinuxBrian Behlendorf2021-11-101-0/+1
| | | | | | | | | | | This test case may fail on 5.13 and newer Linux kernels if the /dev/zvol/ device is not created by udev. Reviewed-by: Rich Ercolani <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #12301 Closes #12738
* Exclude zvol_misc_volmode for nowRich Ercolani2021-11-081-0/+1
| | | | | | | | | | | | It keeps failing, on changes which aren't related at all. So until someone runs down why, I'd like it to stop being the sole reason for CI failures. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #12733
* ZTS: Add known exceptionsBrian Behlendorf2021-10-111-0/+3
| | | | | | | | | | | | | | | | | | Add the following test failures to the exception list for FreeBSD to ensure we notice new unexpected failures. pool_checkpoint/checkpoint_big_rewind pool_checkpoint/checkpoint_indirect And the following for Linux. zvol/zvol_misc/zvol_misc_snapdev Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #12621 Issue #12622 Issue #12623 Closes #12624
* ZTS: Minimize udev_wait in zvol_misc testsRyan Moeller2021-10-011-1/+0
| | | | | | | | | | | | | | | | | | | The zvol_misc tests, in particular zvol_misc_volmode, make use of a common udev_wait function to wait for zvol devices in /dev to quiesce on Linux. On other platforms this function currently only sleeps for one second before returning. This is insufficient, and zvol_misc_volmode has been flaky on FreeBSD as a result. Replace udev_wait with block_device_wait, passing through the optional device parameter where possible. Rearrange a few checks to strengthen the verifications we are making and avoid unnecessarily sleeping. We must keep udev_wait in a couple places to pass in Github CI workflows. Remove zvol_misc_volmode from the maybe failing tests on FreeBSD in zts-report.py. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #12583
* ZTS: Remove exceptions for flaky zhack on FreeBSDRyan Moeller2021-09-011-7/+0
| | | | | | | | | Issue #11854 has been resolved, so we can remove the exceptions for it. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #12527
* ZTS: Enable punch-hole tests on FreeBSDKa Ho Ng2021-08-301-0/+8
| | | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ka Ho Ng <[email protected]> Sponsored-by: The FreeBSD Foundation Closes #12458
* ZTS: Add tests for creation timeRyan Moeller2021-08-171-0/+7
| | | | | | | | | Reviewed-by: Tony Nguyen <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #12432
* Fixes in persistent L2ARCGeorge Amanakis2021-07-261-2/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | In l2arc_add_vdev() first decide whether the device is eligible for L2ARC rebuild or whole device trim and then add it to the list of cache devices. Otherwise l2arc_feed_thread() might already start writing on the device invalidating previous content as l2ad_hand = l2ad_start. However l2arc_rebuild_vdev() needs the device present in the cache device list to figure out its l2arc_dev_t. Fix this by moving most of l2arc_rebuild_vdev() in a new function l2arc_rebuild_dev() which does not need to search in the cache device list. In contrast to l2arc_add_vdev() we do not have to worry about l2arc_feed_thread() invalidating previous content when onlining a cache device. The device parameters (l2ad*) are not cleared when offlining the device and writing new buffers will not invalidate all previous content. In worst case only buffers that have not had their log block written to the device will be lost. Retire persist_l2arc_00{4,5,8} tests since they cover code already covered by the remaining ones. Test persist_l2arc_006 is renamed to persist_l2arc_004 and persist_l2arc_007 is renamed to persist_l2arc_005. Fix a typo in persist_l2arc_004, and remove an assertion that is not always true from l2arc_arcstats_pos. Also update an assertion in persist_l2arc_005 and explain why in a comment. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #12365
* ZED: Match added disk by pool/vdev GUID if found (#12217)Ryan Moeller2021-06-301-0/+1
| | | | | | | | | This enables ZED to auto-online vdevs that are not wholedisk managed by ZFS. Signed-off-by: Ryan Moeller <[email protected]> Reviewed-by: Don Brady <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
* ZTS: Add known exceptionsBrian Behlendorf2021-06-231-0/+3
| | | | | | | | | | | | | | | The receive-o-x_props_override test case reliably fails on the FreeBSD main builders (but not on Linux), until the root cause is understood add this test to the FreeBSD exception list. On Linux the alloc_class_012_pos test case may occasionally fail. This is a known false positive which has also been added to the Linux exception list until the test can be made entirely reliable. Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12272
* ZTS: Add known exceptionsBrian Behlendorf2021-05-111-0/+9
| | | | | | | | | | | The following seven tests been observed to occasionally fail during CI testing. This commit adds them to the list of known somewhat flaky test cases. Reviewed-by: George Melikov <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12023
* ZTS: Add known exceptionsBrian Behlendorf2021-04-271-0/+3
| | | | | | | | | | Both the zpool_initialize_import_export and checkpoint_discard_busy test cases a known to occasionally fail. Add them to the list of known possible failures and reference the appropriate issue on the tracker. Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11949
* ZTS: fix removal_condense_export test caseBrian Behlendorf2021-04-111-1/+0
| | | | | | | | | | | | | It's been observed in the CI that the required 25% of obsolete bytes in the mapping can be to high a threshold for this test resulting in condensing never being triggered and a test failure. To prevent these failures make the existing zfs_condense_indirect_obsolete_pct tuning available so the obsolete percentage can be reduced from 25% to 5% during this test. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11869
* ZTS: Add known exceptionsBrian Behlendorf2021-04-111-0/+3
| | | | | | | | | | | The fault/auto_spare_shared, l2arc/persist_l2arc_007_pos, and alloc_class/alloc_class_013_pos test cases are not entirely reliable and may occasionally fail resulting in a false positive in the CI. Add these tests to known list of possible failures until they can be made 100% reliable. Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11890
* ZTS: Improve cleanup in removal_with_exportRyan Moeller2021-04-081-1/+0
| | | | | | | | | Kill the removal operation on every platform, not just Linux. The test has been fixed and is now stable on FreeBSD. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11856
* ZTS: Tests using zhack may fail on FreeBSDRyan Moeller2021-04-081-0/+7
| | | | | | | | | | | | As described in #11854, zhack is occasionally segfaulting on FreeBSD. Debugging this is proving to be tricky. To avoid false positives in the CI add entries for the tests that use zhack in zts-report to accept that they may occasionally fail on FreeBSD. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Issue #11854 Closes #11855
* ZTS: inheritance/inherit_001_pos is flakyRyan Moeller2021-04-021-0/+1
| | | | | | | | Add inheritance/inherit_001_pos to the maybe fails on FreeBSD list. Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11830
* zts-report.py: ignore some skipped tests in Github CIGeorge Melikov2021-02-021-0/+34
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #11554
* Add basic io_uring testMatthew Macy2021-01-231-0/+1
| | | | | | | | Provide a basic test coverage for io_uring I/O. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #11497
* special device removal space accounting fixesMatthew Ahrens2020-12-171-2/+0
| | | | | | | | | | | | | | | | The space in special devices is not included in spa_dspace (or dsl_pool_adjustedsize(), or the zfs `available` property). Therefore there is always at least as much free space in the normal class, as there is allocated in the special class(es). And therefore, there is always enough free space to remove a special device. However, the checks for free space when removing special devices did not take this into account. This commit corrects that. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Don Brady <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #11329
* Distributed Spare (dRAID) FeatureBrian Behlendorf2020-11-131-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds a new top-level vdev type called dRAID, which stands for Distributed parity RAID. This pool configuration allows all dRAID vdevs to participate when rebuilding to a distributed hot spare device. This can substantially reduce the total time required to restore full parity to pool with a failed device. A dRAID pool can be created using the new top-level `draid` type. Like `raidz`, the desired redundancy is specified after the type: `draid[1,2,3]`. No additional information is required to create the pool and reasonable default values will be chosen based on the number of child vdevs in the dRAID vdev. zpool create <pool> draid[1,2,3] <vdevs...> Unlike raidz, additional optional dRAID configuration values can be provided as part of the draid type as colon separated values. This allows administrators to fully specify a layout for either performance or capacity reasons. The supported options include: zpool create <pool> \ draid[<parity>][:<data>d][:<children>c][:<spares>s] \ <vdevs...> - draid[parity] - Parity level (default 1) - draid[:<data>d] - Data devices per group (default 8) - draid[:<children>c] - Expected number of child vdevs - draid[:<spares>s] - Distributed hot spares (default 0) Abbreviated example `zpool status` output for a 68 disk dRAID pool with two distributed spares using special allocation classes. ``` pool: tank state: ONLINE config: NAME STATE READ WRITE CKSUM slag7 ONLINE 0 0 0 draid2:8d:68c:2s-0 ONLINE 0 0 0 L0 ONLINE 0 0 0 L1 ONLINE 0 0 0 ... U25 ONLINE 0 0 0 U26 ONLINE 0 0 0 spare-53 ONLINE 0 0 0 U27 ONLINE 0 0 0 draid2-0-0 ONLINE 0 0 0 U28 ONLINE 0 0 0 U29 ONLINE 0 0 0 ... U42 ONLINE 0 0 0 U43 ONLINE 0 0 0 special mirror-1 ONLINE 0 0 0 L5 ONLINE 0 0 0 U5 ONLINE 0 0 0 mirror-2 ONLINE 0 0 0 L6 ONLINE 0 0 0 U6 ONLINE 0 0 0 spares draid2-0-0 INUSE currently in use draid2-0-1 AVAIL ``` When adding test coverage for the new dRAID vdev type the following options were added to the ztest command. These options are leverages by zloop.sh to test a wide range of dRAID configurations. -K draid|raidz|random - kind of RAID to test -D <value> - dRAID data drives per group -S <value> - dRAID distributed hot spares -R <value> - RAID parity (raidz or dRAID) The zpool_create, zpool_import, redundancy, replacement and fault test groups have all been updated provide test coverage for the dRAID feature. Co-authored-by: Isaac Huang <[email protected]> Co-authored-by: Mark Maybee <[email protected]> Co-authored-by: Don Brady <[email protected]> Co-authored-by: Matthew Ahrens <[email protected]> Co-authored-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10102
* ZTS: ztest may cause mmp tests failuresBrian Behlendorf2020-08-171-0/+2
| | | | | | | | | | | | The mmp_exported_import and mmp_inactive_import tests depend on ztest simulating an active pool. If ztest unexpectedly terminates due to an unrelated issue the test case will fail. Since ztest is not yet 100% reliable I've added these tests to the maybe exception list. They can be removed when the issues with ztest are resolved or if the test cases are updated to handle these unexpected failures. Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10726
* ZTS: zvol_misc_volmode is flaky on FreeBSDRyan Moeller2020-07-311-0/+1
| | | | | | | Mark this as a known issue. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #10655