author | Tom Caputi <[email protected]> | 2018-11-28 23:47:09 -0500 |
---|---|---|
committer | Brian Behlendorf <[email protected]> | 2018-11-28 20:47:09 -0800 |
commit | c40a1124e1d1010b665909ad31d2904630018f6f | |
tree | 10e4f67bf0127d85dfa5085ca0ff09bf37605d6c /cmd/ztest | |
parent | c71c8c715b7a4f6b842f8f04c18a93086012e2a0 | |
Fix consistency of ztest_device_removal_active
ztest currently uses the boolean flag ztest_device_removal_active
to protect some tests that may not run successfully if they occur
at the same time as ztest_device_removal(). Unfortunately, in the
event that ztest is in the middle of a device removal when it
decides to issue a SIGKILL, the device removal will be
automatically restarted (without setting the flag) when the pool
is re-imported on the next run. This patch corrects this by
ensuring that any in-progress removals are completed before running
further tests after the re-import.
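
The wait described above can be sketched with a small single-threaded model. This is only an illustration under stated assumptions: `toy_pool`, `toy_wait_synced()` and `toy_drain_after_reimport()` are hypothetical names that stand in for the spa, `txg_wait_synced()` and the new check in `ztest_run()`; they are not ZFS APIs, and the model restarts the scrub unconditionally rather than reproducing what `spa_scan()` does when a scrub is already running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the removal state kept in the pool. */
enum toy_state { TOY_NONE, TOY_SCANNING, TOY_FINISHED };

/* Hypothetical toy pool; none of these fields are real ZFS fields. */
struct toy_pool {
	enum toy_state removal_state;	/* models sr_state */
	bool has_indirect_vdev;		/* models sr_prev_indirect_vdev != -1 */
	int removal_txgs_left;		/* txgs until the removal completes */
	int scrub_txgs_left;		/* txgs until a running scrub completes */
};

/* Models one txg passing: background removal and scrub each advance. */
static void
toy_wait_synced(struct toy_pool *p)
{
	if (p->removal_state == TOY_SCANNING) {
		if (--p->removal_txgs_left <= 0) {
			p->removal_state = TOY_FINISHED;
			p->has_indirect_vdev = true;
		}
	}
	if (p->scrub_txgs_left > 0)
		p->scrub_txgs_left--;
}

/*
 * Drain an interrupted removal and then a full scrub before any test
 * threads start. Returns the number of txgs waited.
 */
static int
toy_drain_after_reimport(struct toy_pool *p, int scrub_len)
{
	int waited = 0;

	assert(p != NULL && scrub_len >= 0);

	/* Nothing to do if no removal is running and none ever finished. */
	if (p->removal_state != TOY_SCANNING && !p->has_indirect_vdev)
		return (0);

	/* The removal restarted on import; wait for it to finish. */
	while (p->removal_state == TOY_SCANNING) {
		toy_wait_synced(p);
		waited++;
	}

	/* Always scrub: we cannot tell whether an earlier scrub completed. */
	p->scrub_txgs_left = scrub_len;
	while (p->scrub_txgs_left > 0) {
		toy_wait_synced(p);
		waited++;
	}
	return (waited);
}
```

In this model a pool killed mid-removal waits for the remaining removal txgs and then for the whole scrub, while a pool that never had a removal returns immediately.
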
This patch also makes a few small changes to prevent race conditions
involving the creation and destruction of spa->spa_vdev_removal,
since this field is not protected by any locks. Some checks that
may run concurrently with setting / unsetting this field have been
updated to check spa->spa_removing_phys.sr_state instead. The most
significant change here is that spa_removal_get_stats() no longer
accounts for in-flight work done, since that could result in a NULL
pointer dereference.
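
As a rough illustration of the second change, the sketch below contrasts a transient in-memory pointer with a persistent state record. All names (`demo_spa`, `demo_removal_active()`, `demo_copied_for_stats()`) are hypothetical and only loosely model `spa->spa_vdev_removal` and `spa->spa_removing_phys`. The model is single-threaded, so it does not reproduce the race itself; it only shows that neither helper ever touches the pointer that may be NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the removal state enum. */
enum demo_state { DEMO_NONE, DEMO_SCANNING, DEMO_FINISHED };

/* Hypothetical model of the persistent removal record. */
struct demo_removing_phys {
	enum demo_state state;
	uint64_t copied;	/* bytes already recorded as copied */
	uint64_t to_copy;	/* total bytes to copy */
};

/* Hypothetical model of the transient in-memory removal object. */
struct demo_vdev_removal {
	uint64_t inflight;	/* bytes copied but not yet recorded */
};

struct demo_spa {
	struct demo_removing_phys phys;
	struct demo_vdev_removal *removal;	/* may be NULL at any time */
};

/* Decide whether a removal is active from the state, not the pointer. */
static int
demo_removal_active(const struct demo_spa *spa)
{
	assert(spa != NULL);
	return (spa->phys.state == DEMO_SCANNING);
}

/*
 * Report progress from the persistent record only. In-flight bytes are
 * deliberately ignored so the unprotected pointer is never dereferenced.
 */
static uint64_t
demo_copied_for_stats(const struct demo_spa *spa)
{
	assert(spa != NULL);
	return (spa->phys.copied);
}
```

The cost of this choice is that reported progress can lag slightly behind the work actually done, which matches the trade-off the commit message describes for spa_removal_get_stats().
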
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Serapheim Dimitropoulos <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #8105
Diffstat (limited to 'cmd/ztest')
-rw-r--r-- | cmd/ztest/ztest.c | 22 |
1 files changed, 21 insertions, 1 deletions
diff --git a/cmd/ztest/ztest.c b/cmd/ztest/ztest.c
index eab8940fb..111d45b9d 100644
--- a/cmd/ztest/ztest.c
+++ b/cmd/ztest/ztest.c
@@ -3573,7 +3573,7 @@ ztest_device_removal(ztest_ds_t *zd, uint64_t id)
 		 */
 		txg_wait_synced(spa_get_dsl(spa), 0);

-		while (spa->spa_vdev_removal != NULL)
+		while (spa->spa_removing_phys.sr_state == DSS_SCANNING)
 			txg_wait_synced(spa_get_dsl(spa), 0);
 	} else {
 		mutex_exit(&ztest_vdev_lock);
@@ -6887,6 +6887,26 @@ ztest_run(ztest_shared_t *zs)
 	}
 	zs->zs_enospc_count = 0;

+	/*
+	 * If we were in the middle of ztest_device_removal() and were killed
+	 * we need to ensure the removal and scrub complete before running
+	 * any tests that check ztest_device_removal_active. The removal will
+	 * be restarted automatically when the spa is opened, but we need to
+	 * initate the scrub manually if it is not already in progress. Note
+	 * that we always run the scrub whenever an indirect vdev exists
+	 * because we have no way of knowing for sure if ztest_device_removal()
+	 * fully completed its scrub before the pool was reimported.
+	 */
+	if (spa->spa_removing_phys.sr_state == DSS_SCANNING ||
+	    spa->spa_removing_phys.sr_prev_indirect_vdev != -1) {
+		while (spa->spa_removing_phys.sr_state == DSS_SCANNING)
+			txg_wait_synced(spa_get_dsl(spa), 0);
+
+		(void) spa_scan(spa, POOL_SCAN_SCRUB);
+		while (dsl_scan_scrubbing(spa_get_dsl(spa)))
+			txg_wait_synced(spa_get_dsl(spa), 0);
+	}
+
 	run_threads = umem_zalloc(ztest_opts.zo_threads * sizeof (kthread_t *),
 	    UMEM_NOFAIL);