author Brian Behlendorf <[email protected]> 2022-06-23 10:36:28 -0700
committer GitHub <[email protected]> 2022-06-23 10:36:28 -0700
commit ad8b9f940c1e39a38af61934737c1e4cf8ab5c08
tree 45d0e726b635f673ed5dd70a20cdfaea0dc628f0 /tests/test-runner/bin
parent deb1213098e2dc10e6eee5e5c57bb40584e096a6
Scrub mirror children without BPs
When scrubbing a raidz/draid pool which contains a replacing or sparing mirror with multiple online children, only one child will be read. This is not normally a serious concern because the DTL records are used to determine where a good copy of the data is. As long as the data can be read from one child, the mirror vdev will use it to repair gaps in any of its children. Furthermore, even if the data which was read is corrupt, the raidz code will detect this and issue its own repair I/O to correct the damage in the mirror vdev.

However, in the scenario where the DTL is wrong due to silent data corruption (say, due to overwriting one child) and the scrub happens to read from a child with good data, the other damaged mirror child will be neither detected nor repaired.

While this is possible for both raidz and draid vdevs, it's most pronounced when using draid. This is because by default the zed will sequentially rebuild a draid pool to a distributed spare, and the distributed spare half of the mirror is always preferred since it delivers better performance. This means the damaged half of the mirror will go undetected even after scrubbing.

For system administrators this behavior is non-intuitive, and in a worst-case scenario it could result in the only good copy of the data being unknowingly detached from the mirror.

This change resolves the issue by reading all replacing/sparing mirror children when scrubbing. When the BP isn't available for verification, the data buffers from each child are compared. They must all be identical; if not, there's silent damage and an error is returned to prompt the top-level vdev to issue a repair I/O to rewrite the data on all of the mirror children. Since we can't tell which child was wrong, a checksum error is logged against the replacing or sparing mirror vdev.

Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #13555
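
The buffer comparison described above can be illustrated with a minimal Python sketch. This is not the code from the change: the real implementation lives in ZFS's C sources, which this diff does not show. The names `MirrorChecksumError` and `scrub_read_without_bp` are hypothetical and exist only for illustration.

```python
from typing import Sequence


class MirrorChecksumError(Exception):
    """Hypothetical error: the mirror children disagree (silent damage)."""


def scrub_read_without_bp(child_buffers: Sequence[bytes]) -> bytes:
    """Compare the data read from every mirror child when no BP is available.

    All buffers must be byte-for-byte identical. If any differ, we cannot
    tell which child is wrong, so the error is raised against the mirror
    as a whole, prompting the caller to rewrite every child.
    """
    if not child_buffers:
        raise ValueError("at least one child buffer is required")

    reference = child_buffers[0]
    for buf in child_buffers[1:]:
        if buf != reference:
            # Assumption: the caller (standing in for the top-level vdev)
            # reacts to this error by issuing a repair write to all children.
            raise MirrorChecksumError("mirror children returned different data")

    # Every child agreed, so any one buffer can be returned.
    return reference
```

In this sketch a mismatch is reported against the mirror rather than against a specific child, mirroring the commit message's point that without a BP there is no way to know which copy is the good one.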
Diffstat (limited to 'tests/test-runner/bin')
-rwxr-xr-x tests/test-runner/bin/zts-report.py.in 3
1 files changed, 0 insertions, 3 deletions
diff --git a/tests/test-runner/bin/zts-report.py.in b/tests/test-runner/bin/zts-report.py.in
index ddb9bb7ee..bf7cf22b6 100755
--- a/tests/test-runner/bin/zts-report.py.in
+++ b/tests/test-runner/bin/zts-report.py.in
@@ -226,9 +226,6 @@ maybe = {
'pyzfs/pyzfs_unittest': ['SKIP', python_deps_reason],
'pool_checkpoint/checkpoint_discard_busy': ['FAIL', 11946],
'projectquota/setup': ['SKIP', exec_reason],
- 'redundancy/redundancy_004_neg': ['FAIL', 7290],
- 'redundancy/redundancy_draid_spare1': ['FAIL', known_reason],
- 'redundancy/redundancy_draid_spare3': ['FAIL', known_reason],
'removal/removal_condense_export': ['FAIL', known_reason],
'reservation/reservation_008_pos': ['FAIL', 7741],
'reservation/reservation_018_pos': ['FAIL', 5642],