diff options
author | Matthew Ahrens <[email protected]> | 2022-12-22 11:48:49 -0800 |
---|---|---|
committer | GitHub <[email protected]> | 2022-12-22 11:48:49 -0800 |
commit | 018f26041d67447de28c293c35e1f664ce3c1abb (patch) | |
tree | 30d8ffa1ef9a5e61281da311a582139eed401765 /cmd | |
parent | 29e1b089c14bf0f09d16801652cd82376fa3feb2 (diff) |
deadlock between spa_errlog_lock and dp_config_rwlock
There is a lock order inversion deadlock between `spa_errlog_lock` and
`dp_config_rwlock`:
A thread in `spa_delete_dataset_errlog()` is running from a sync task.
It is holding the `dp_config_rwlock` for writer (see
`dsl_sync_task_sync()`), and waiting for the `spa_errlog_lock`.
A thread in `dsl_pool_config_enter()` is holding the `spa_errlog_lock`
(see `spa_get_errlog_size()`) and waiting for the `dp_config_rwlock` (as
reader).
Note that this was introduced by #12812.
This commit address this by defining the lock ordering to be
dp_config_rwlock first, then spa_errlog_lock / spa_errlist_lock.
spa_get_errlog() and spa_get_errlog_size() can acquire the locks in this
order, and then process_error_block() and get_head_and_birth_txg() can
verify that the dp_config_rwlock is already held.
Additionally, a buffer overrun in `spa_get_errlog()` is corrected. Many
code paths didn't check if `*count` got to zero, instead continuing to
overwrite past the beginning of the userspace buffer at `uaddr`.
Tested by having some errors in the pool (via `zinject -t data
/path/to/file`), one thread running `zpool iostat 0.001`, and another
thread runs `zfs destroy` (in a loop, although it hits the first time).
This reproduces the problem easily without the fix, and works with the
fix.
Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: George Amanakis <[email protected]>
Reviewed-by: George Wilson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #14239
Closes #14289
Diffstat (limited to 'cmd')
-rw-r--r-- | cmd/zpool/zpool_main.c | 32 | ||||
-rw-r--r-- | cmd/ztest.c | 2 |
2 files changed, 7 insertions, 27 deletions
diff --git a/cmd/zpool/zpool_main.c b/cmd/zpool/zpool_main.c index af76d20f2..04e80efbb 100644 --- a/cmd/zpool/zpool_main.c +++ b/cmd/zpool/zpool_main.c @@ -8599,37 +8599,17 @@ status_callback(zpool_handle_t *zhp, void *data) if (nvlist_lookup_uint64(config, ZPOOL_CONFIG_ERRCOUNT, &nerr) == 0) { - nvlist_t *nverrlist = NULL; - - /* - * If the approximate error count is small, get a - * precise count by fetching the entire log and - * uniquifying the results. - */ - if (nerr > 0 && nerr < 100 && !cbp->cb_verbose && - zpool_get_errlog(zhp, &nverrlist) == 0) { - nvpair_t *elem; - - elem = NULL; - nerr = 0; - while ((elem = nvlist_next_nvpair(nverrlist, - elem)) != NULL) { - nerr++; - } - } - nvlist_free(nverrlist); - (void) printf("\n"); - - if (nerr == 0) - (void) printf(gettext("errors: No known data " - "errors\n")); - else if (!cbp->cb_verbose) + if (nerr == 0) { + (void) printf(gettext( + "errors: No known data errors\n")); + } else if (!cbp->cb_verbose) { (void) printf(gettext("errors: %llu data " "errors, use '-v' for a list\n"), (u_longlong_t)nerr); - else + } else { print_error_log(zhp); + } } if (cbp->cb_dedup_stats) diff --git a/cmd/ztest.c b/cmd/ztest.c index 605ce5cea..4a031dac2 100644 --- a/cmd/ztest.c +++ b/cmd/ztest.c @@ -6313,7 +6313,7 @@ ztest_scrub_impl(spa_t *spa) while (dsl_scan_scrubbing(spa_get_dsl(spa))) txg_wait_synced(spa_get_dsl(spa), 0); - if (spa_get_errlog_size(spa) > 0) + if (spa_approx_errlog_size(spa) > 0) return (ECKSUM); ztest_pool_scrubbed = B_TRUE; |