From a11c7aaec9c10f22f3259545e2697005cfd19863 Mon Sep 17 00:00:00 2001
From: Pavel Zakharov
Date: Wed, 19 Oct 2016 17:46:08 -0400
Subject: OpenZFS 9187 - racing condition between vdev label and
 spa_last_synced_txg in vdev_validate

ztest failed with uncorrectable IO error despite having the fix for
7163. Both sides of the mirror have CANT_OPEN_BAD_LABEL, which also
distinguishes it from that issue.

Definitely seems like a racing condition between the vdev_validate and
spa_sync:
1. Thread A (spa_sync): vdev label is updated to latest txg
2. Thread B (vdev_validate): vdev label's txg is compared to
   spa_last_synced_txg and is ahead.
3. Thread A (spa_sync): spa_last_synced_txg is updated to latest txg.

Solution: do not check txg in vdev_validate unless config lock is held.

Authored by: Pavel Zakharov
Reviewed by: George Wilson
Reviewed by: Matt Ahrens
Reviewed-by: Giuseppe Di Natale
Approved by: Robert Mustacchi
Ported-by: Brian Behlendorf
OpenZFS-issue: https://illumos.org/issues/9187
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/805fda72
Closes #7529
---
 module/zfs/vdev.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/module/zfs/vdev.c b/module/zfs/vdev.c
index 2a28b1c9d..cebac3bb9 100644
--- a/module/zfs/vdev.c
+++ b/module/zfs/vdev.c
@@ -1692,8 +1692,11 @@ vdev_validate(vdev_t *vd)
 	/*
 	 * If we are performing an extreme rewind, we allow for a label that
 	 * was modified at a point after the current txg.
+	 * If config lock is not held do not check for the txg. spa_sync could
+	 * be updating the vdev's label before updating spa_last_synced_txg.
 	 */
-	if (spa->spa_extreme_rewind || spa_last_synced_txg(spa) == 0)
+	if (spa->spa_extreme_rewind || spa_last_synced_txg(spa) == 0 ||
+	    spa_config_held(spa, SCL_CONFIG, RW_WRITER) != SCL_CONFIG)
 		txg = UINT64_MAX;
 	else
 		txg = spa_last_synced_txg(spa);
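
Reviewer note: the effect of the new condition can be illustrated with a small standalone model. This is only a sketch, not the ZFS code. The names model_spa_t, model_max_label_txg and model_label_acceptable are hypothetical, the boolean config_writer_held stands in for the spa_config_held(spa, SCL_CONFIG, RW_WRITER) == SCL_CONFIG test, and the model assumes a label is rejected when its txg exceeds the chosen bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the spa state that vdev_validate consults. */
typedef struct model_spa {
	bool		extreme_rewind;		/* models spa->spa_extreme_rewind */
	uint64_t	last_synced_txg;	/* models spa_last_synced_txg(spa) */
	bool		config_writer_held;	/* models the SCL_CONFIG writer check */
} model_spa_t;

/*
 * Pick the highest label txg that will be accepted. Without the config
 * lock held as writer, the txg check is effectively disabled, because
 * spa_sync may have written the label but not yet advanced the last
 * synced txg.
 */
static uint64_t
model_max_label_txg(const model_spa_t *spa)
{
	if (spa->extreme_rewind || spa->last_synced_txg == 0 ||
	    !spa->config_writer_held)
		return (UINT64_MAX);
	return (spa->last_synced_txg);
}

/* Assumed acceptance rule: the label must not be ahead of the bound. */
static bool
model_label_acceptable(const model_spa_t *spa, uint64_t label_txg)
{
	return (label_txg <= model_max_label_txg(spa));
}
```

In the racing window from the commit message (label already at txg 101, last synced txg still 100, config lock not held), the model accepts the label. With the lock held as writer, the same label is rejected and a label at txg 100 is accepted.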