author: Brian Behlendorf <[email protected]> 2022-07-14 10:21:29 -0700
committer: GitHub <[email protected]> 2022-07-14 10:21:29 -0700
commit: 3920d7f3250007f7591c34060c0afbba6f5f174a
tree: 42dc93bd1cbead0a3419297b8e066d5389aad895 /tests/test-runner/bin
parent: 6c3c5fcfbe27d9193cd131753cc7e47ee2784621
Scrub mirror children without BPs
When scrubbing a raidz/draid pool which contains a replacing or sparing mirror with multiple online children, only one child will be read. This is not normally a serious concern because the DTL records are used to determine where a good copy of the data is. As long as the data can be read from one child, the mirror vdev will use it to repair gaps in any of its children. Furthermore, even if the data which was read is corrupt, the raidz code will detect this and issue its own repair I/O to correct the damage in the mirror vdev.

However, in the scenario where the DTL is wrong due to silent data corruption (say, due to overwriting one child) and the scrub happens to read from a child with good data, the other damaged mirror child will be neither detected nor repaired.

While this is possible for both raidz and draid vdevs, it's most pronounced when using draid. This is because by default the zed will sequentially rebuild a draid pool to a distributed spare, and the distributed spare half of the mirror is always preferred since it delivers better performance. This means the damaged half of the mirror will go undetected even after scrubbing.

For system administrators this behavior is non-intuitive, and in a worst case scenario it could result in the only good copy of the data being unknowingly detached from the mirror.

This change resolves the issue by reading all replacing/sparing mirror children when scrubbing. When the BP isn't available for verification, the data buffers from each child are compared. They must all be identical. If they are not, there's silent damage and an error is returned to prompt the top-level vdev to issue a repair I/O to rewrite the data on all of the mirror children. Since we can't tell which child was wrong, a checksum error is logged against the replacing or sparing mirror vdev.

Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #13555
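
The buffer comparison described in the commit message can be illustrated with a minimal Python sketch. This is not the ZFS C implementation, which this page does not show. The names `verify_mirror_children` and `CHECKSUM_ERROR` are hypothetical and chosen only for illustration. The sketch assumes each child's data has already been read into a `bytes` buffer and that no block pointer checksum is available to verify against.

```python
from typing import Optional, Sequence

# Hypothetical stand-in for a checksum-error status; not an actual ZFS value.
CHECKSUM_ERROR = "checksum-error"


def verify_mirror_children(buffers: Sequence[bytes]) -> Optional[str]:
    """Compare the data read from every mirror child.

    Returns None when all buffers are identical, otherwise CHECKSUM_ERROR.
    Without a checksum to verify against, a mismatch cannot be attributed
    to one child, so the error applies to the mirror as a whole.
    """
    if not buffers:
        return None  # nothing was read, so there is nothing to compare
    first = buffers[0]
    for buf in buffers[1:]:
        if buf != first:
            return CHECKSUM_ERROR
    return None


good = b"\x01\x02\x03\x04"
damaged = b"\x01\x02\xff\x04"
print(verify_mirror_children([good, good]))     # children agree
print(verify_mirror_children([good, damaged]))  # one child differs
```

In this sketch a mismatch only signals that some child is damaged. Per the commit message, the repair itself is left to the top-level vdev, which rewrites the data on all of the mirror children.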
Diffstat (limited to 'tests/test-runner/bin')
-rwxr-xr-x tests/test-runner/bin/zts-report.py.in | 2
1 file changed, 0 insertions, 2 deletions
diff --git a/tests/test-runner/bin/zts-report.py.in b/tests/test-runner/bin/zts-report.py.in
index 559e98dd0..71b0cc8d6 100755
--- a/tests/test-runner/bin/zts-report.py.in
+++ b/tests/test-runner/bin/zts-report.py.in
@@ -244,8 +244,6 @@ maybe = {
'pyzfs/pyzfs_unittest': ['SKIP', python_deps_reason],
'pool_checkpoint/checkpoint_discard_busy': ['FAIL', '11946'],
'projectquota/setup': ['SKIP', exec_reason],
- 'redundancy/redundancy_004_neg': ['FAIL', '7290'],
- 'redundancy/redundancy_draid_spare3': ['SKIP', known_reason],
'removal/removal_condense_export': ['FAIL', known_reason],
'reservation/reservation_008_pos': ['FAIL', '7741'],
'reservation/reservation_018_pos': ['FAIL', '5642'],