Commit log
For draid vdevs it was possible to initiate both a
sequential and a healing resilver at the same time.
This fixes the following two scenarios.
1) There's a window where a sequential rebuild can
be started via ZED even if a healing resilver has been
scheduled.
- This is fixed by adding an additional check in
spa_vdev_attach() for any scheduled resilver and returning
an appropriate error code when a resilver is already in
progress (see the sketch below).
2) It was possible for zpool clear to start a healing
resilver when it wasn't needed at all. This occurs because
during vdev_open() the device is presumed to be healthy
until it is validated by vdev_validate(), at which point it
may be set unavailable. However, by then an async resilver
will have already been requested if the DTL isn't empty.
- This is fixed by cancelling the SPA_ASYNC_RESILVER
request immediately at the end of vdev_reopen() when a resilver
is unneeded.
Finally, a test case was added to ZTS for verification.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Dipak Ghosh <[email protected]>
Signed-off-by: Akash B <[email protected]>
Closes #14881
Closes #14892
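For illustration, a minimal stand-alone sketch of the first fix: refuse
to start a sequential rebuild while any resilver is scheduled or
running. The types, fields, and function names here are simplified
placeholders, not the actual spa_vdev_attach() code.

    #include <errno.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Hypothetical, simplified pool state. */
    struct pool {
        bool healing_resilver_scheduled;
        bool rebuild_in_progress;
    };

    /* Reject a new rebuild if any resilver is pending or running. */
    static int start_sequential_rebuild(struct pool *p)
    {
        if (p->healing_resilver_scheduled || p->rebuild_in_progress)
            return (EBUSY); /* stands in for the real error code */
        p->rebuild_in_progress = true;
        return (0);
    }

    int main(void)
    {
        struct pool p = { .healing_resilver_scheduled = true };
        printf("rebuild request: %d (EBUSY is %d)\n",
            start_sequential_rebuild(&p), EBUSY);
        return (0);
    }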
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Youzhong Yang <[email protected]>
Closes #14891
Commit 555ef90 did some general code refactoring for
dmu_buf_will_not_fill() and dmu_buf_will_fill(). However, the db_mtx was
not held when updating db->db_state in those code blocks. The rest of
the dbuf code always holds the db_mtx when updating db_state. This is
important because cv_wait() on db_changed is used to wait for db_state
changes.
Update dmu_buf_will_not_fill() and dmu_buf_will_fill() to hold the
db_mtx when updating db_state (see the sketch below).
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Brian Atkinson <[email protected]>
Closes #14875
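A minimal user-space analogy of the locking rule described above, using
POSIX threads in place of the kernel mutex/condvar primitives; the dbuf
structure is a hypothetical reduction to a single state field.

    #include <pthread.h>

    enum db_state { DB_UNCACHED, DB_FILL, DB_CACHED, DB_NOFILL };

    /* Hypothetical, reduced dbuf: state guarded by db_mtx, with a
     * condvar that waiters use to observe state changes. */
    struct dbuf {
        pthread_mutex_t db_mtx;
        pthread_cond_t  db_changed;
        enum db_state   db_state;
    };

    /* Every db_state update happens under db_mtx and signals db_changed,
     * so a waiter sleeping on the condvar cannot miss the transition. */
    void dbuf_set_state(struct dbuf *db, enum db_state s)
    {
        pthread_mutex_lock(&db->db_mtx);
        db->db_state = s;
        pthread_cond_broadcast(&db->db_changed);
        pthread_mutex_unlock(&db->db_mtx);
    }

    void dbuf_wait_cached(struct dbuf *db)
    {
        pthread_mutex_lock(&db->db_mtx);
        while (db->db_state != DB_CACHED)
            pthread_cond_wait(&db->db_changed, &db->db_mtx);
        pthread_mutex_unlock(&db->db_mtx);
    }

    int main(void)
    {
        struct dbuf db = { PTHREAD_MUTEX_INITIALIZER,
            PTHREAD_COND_INITIALIZER, DB_UNCACHED };
        dbuf_set_state(&db, DB_CACHED);
        dbuf_wait_cached(&db);  /* returns at once: already DB_CACHED */
        return (0);
    }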
Before allowing the ZED to mark a vdev as REMOVED due to a
hotplug event, confirm that it is non-responsive with a probe.
Any device which can be successfully probed should be left
ONLINE to prevent a healthy pool from being incorrectly
SUSPENDED. This may occur in at least the following two
scenarios.
1) Drive expansion (zpool online -e) in VMware environments.
If, during the partition resize operation, a partition is
removed and re-created, then udev will send a remove event.
2) Re-scanning the namespaces of an NVMe device (nvme ns-rescan)
may result in a udev remove and add event being delivered.
Finally, update the ZED to only kick in a spare when the
removal was successful.
Reviewed-by: Ameer Hamza <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #14859
Closes #14861
Added a '-e' flag to zpool scrub to scrub only blocks in the error log.
A user can pause, resume and cancel the error scrub by passing the
additional command line arguments -p and -s, just like a regular scrub.
This involves adding a new flag, creating new libzfs interfaces, a new
ioctl, and the actual iteration and read-issuing logic. Error scrubbing
is executed in multiple txgs to make sure pool performance is not
affected.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Co-authored-by: TulsiJain [email protected]
Signed-off-by: George Amanakis <[email protected]>
Closes #8995
Closes #12355
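For reference, starting the error scrub described above would look like
this (the pool name is a placeholder); per the text above, -p and -s
then pause or cancel it just as for a regular scrub:
% zpool scrub -e tank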
zpool initialize works well for touching every free byte... once.
But if we want to do it again, we're currently out of luck because
the initialization state persists.
So let's add zpool initialize -u to clear that state.
Co-authored-by: Rich Ercolani <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #12451
Closes #14873
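For reference, clearing the initialization state so the pool can be
initialized again from scratch (pool name is a placeholder):
% zpool initialize -u tank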
8eae2d214cfa53862833eeeda9a5c1e9d5ded47d caused Coverity to begin
complaining about "Improper use of negative value" in two places in
spa_sync_props() because Coverity correctly inferred from `prop ==
ZPOOL_PROP_INVAL` that prop could be -1 while both zpool_prop_to_name()
and zpool_prop_get_type() use it as an array index, which is undefined
behavior.
Assuming that the system does not panic from an attempt to read invalid
memory, the case statement for ZPOOL_PROP_INVAL will ensure that only
user properties reach this code when prop is ZPOOL_PROP_INVAL, such
that execution will continue safely. However, if we are unlucky enough
to read invalid memory, then the system will panic.
This issue predates the patch that caused Coverity to begin complaining.
Thankfully, our userland tools do not pass nonsense to us, so this bug
should not be triggered unless a future userland tool attempts to set a
property that we do not understand.
Reported-by: Coverity (CID-1561129)
Reported-by: Coverity (CID-1561130)
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Amanakis <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14860
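A stand-alone sketch of the hazard and the guard: never use a sentinel
value of -1 as an array index. The enum, array, and helper below are
simplified placeholders, not the actual zpool property code.

    #include <stdio.h>

    typedef enum { PROP_INVAL = -1, PROP_COMMENT = 0, PROP_COUNT } prop_t;

    static const char *prop_names[PROP_COUNT] = { "comment" };

    static const char *prop_to_name(prop_t prop)
    {
        /* prop may be PROP_INVAL (-1) for user properties: guard
         * before indexing, instead of reading prop_names[-1]. */
        if (prop == PROP_INVAL || prop >= PROP_COUNT)
            return (NULL);  /* caller treats this as a user property */
        return (prop_names[prop]);
    }

    int main(void)
    {
        printf("%s\n", prop_to_name(PROP_COMMENT));        /* comment */
        printf("%p\n", (void *)prop_to_name(PROP_INVAL));  /* null    */
        return (0);
    }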
6839ec6f1098c28ff7b772f1b31b832d05e6b567 placed code in
spa_remove_healed_errors() that uses a pointer after the kmem_free()
call that frees it.
Reported-by: Coverity (CID-1562375)
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Amanakis <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14860
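A stand-alone sketch of the bug class and its usual fix: read everything
you need from an object before freeing it, and never dereference the
freed pointer afterwards. Names are placeholders, not the
spa_remove_healed_errors() code.

    #include <stdlib.h>

    struct node {
        struct node *next;
    };

    void free_list(struct node *head)
    {
        struct node *cur = head;
        while (cur != NULL) {
            struct node *next = cur->next; /* saved before the free */
            free(cur);
            cur = next;  /* use the saved copy, not cur->next */
        }
    }

    int main(void)
    {
        struct node *a = calloc(1, sizeof (*a));
        struct node *b = calloc(1, sizeof (*b));
        if (a == NULL || b == NULL)
            return (1);
        a->next = b;
        free_list(a);
        return (0);
    }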
There is no sense in keeping that memory allocated during the flush.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Prakash Surya <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14855
Should not cause functional changes.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14854
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
The dmu_buf_is_dirty() call doesn't make sense here for two reasons:
1. txg is 0 for an unassigned tx, so it was a no-op.
2. It is equivalent to checking if we have dirty records, and we are
already doing that a few lines earlier.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
I don't know an easy way to shrink down the dbuf size, so just deny
block cloning into dbufs that don't match our BP's size.
This fixes the following situation (a sketch of the guard follows
below):
1. Create a small file, e.g. 1kB of random bytes. Its dbuf will be 1kB.
2. Create a larger file, e.g. 2kB of random bytes. Its dbuf will be 2kB.
3. Truncate the large file to 0. Its dbuf will remain 2kB.
4. Clone the small file into the large file. The small file's BP lsize
is 1kB, but the large file's dbuf is 2kB.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
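A stand-alone sketch of the guard: refuse the clone when the
destination buffer's size differs from the source block pointer's
logical size. The structures are hypothetical simplifications, not the
actual dbuf or blkptr types.

    #include <errno.h>
    #include <stddef.h>

    struct buf { size_t size; };   /* destination dbuf, e.g. 2kB */
    struct bp  { size_t lsize; };  /* source block's logical size, 1kB */

    /* Deny cloning into a buffer whose size doesn't match the BP. */
    static int clone_block(struct buf *dst, const struct bp *src)
    {
        if (dst->size != src->lsize)
            return (EINVAL);  /* sizes must match; caller handles it */
        /* ... perform the clone ... */
        return (0);
    }

    int main(void)
    {
        struct buf dst = { 2048 };
        struct bp  src = { 1024 };
        return (clone_block(&dst, &src) == EINVAL ? 0 : 1);
    }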
Reimplement some of the block cloning vs dbuf logic, mostly to fix a
situation where we clone a block and, in the same transaction group,
want to partially overwrite the clone.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14825
At least for RAIDZ, zio_shrink() does not reduce the zio size, but a
reduced wsz in that case likely results in writing uninitialized memory.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14853
Protect zvol_cdev_read with zv_suspend_lock to prevent concurrent
release of the dnode, avoiding a panic when a snapshot is rolled back
in parallel with an ongoing zvol read operation.
Reviewed-by: Chunwei Chen <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ameer Hamza <[email protected]>
Closes #14839
Provides an interface which callers can use to declare a write when
the exact starting offset is not yet known. Since the full range
being updated is not available, only the first L0 block at the
provided offset will be prefetched.
Reviewed-by: Olaf Faaland <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #14819
l2arc_evict() unnecessarily adjusts the size of buffers to be written
to L2ARC; l2arc_write_size() is called right before l2arc_evict() and
already performs those adjustments.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14828
We only need to know whether the ZIO has any parent there. We do not
care if it has more than one, but using zio_unique_parent() == NULL
asserts that there is at most one.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14823
In case check_filesystem() does not error out and does not report
an error, remove that error block from the error lists and logs
without requiring a scrub. This can happen when the original file and
all snapshots/clones referencing it have been removed.
Otherwise zpool status will still report that "Permanent errors have
been detected..." without actually listing any of them.
To implement this change, the functions introduced for corrective
receive were modified to take the head_errlog feature into account.
Before this change:
=============================
pool: test
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
config:
NAME STATE READ WRITE CKSUM
test ONLINE 0 0 0
/home/user/vdev_a ONLINE 0 0 2
errors: Permanent errors have been detected in the following files:
=============================
After this change:
=============================
pool: test
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are
unaffected.
action: Determine if the device needs to be replaced, and clear the
errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
config:
NAME STATE READ WRITE CKSUM
test ONLINE 0 0 0
/home/user/vdev_a ONLINE 0 0 2
errors: No known data errors
=============================
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14813
For the head_errlog feature use dsl_dataset_hold_obj_flags() instead of
dsl_dataset_hold_obj() in order to enable access to the encryption keys
(if loaded). This enables reporting of errors in encrypted filesystems
which are not mounted but have their keys loaded.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14837
If a block pointer is corrupted (but the block containing it checksums
correctly, e.g. due to a bug that overwrites random memory), we can
often detect it before the block is read, with the `zfs_blkptr_verify()`
function, which is used in `arc_read()`, `zio_free()`, etc.
However, such corruption is not typically recoverable. To recover from
it we would need to detect the memory error before the block pointer is
written to disk.
This PR verifies BPs that are contained in indirect blocks and dnodes
before they are written to disk, in `dbuf_write_ready()`. This way,
we'll get a panic before the on-disk data is corrupted. This will help
us to diagnose what's causing the corruption, as well as being much
easier to recover from.
To minimize performance impact, only checks that can be done without
holding the spa_config_lock are performed.
Additionally, when corruption is detected, the raw words of the block
pointer are logged. (Note that `dprintf_bp()` is a no-op by default,
but if enabled it is not safe to use with invalid block pointers.)
Reviewed-by: Rich Ercolani <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Paul Zuchowski <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #14817
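A reduced sketch of the approach: run cheap, lock-free sanity checks on
each embedded pointer right before the containing block is written, and
fail loudly instead of persisting garbage. The structures and checks
are hypothetical stand-ins, not the real zfs_blkptr_verify() logic.

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical, simplified "block pointer": offset plus size. */
    struct blkref {
        uint64_t offset;
        uint64_t size;
    };

    /* Only checks that need no locks, mirroring the intent of avoiding
     * spa_config_lock in the write path. */
    static bool blkref_verify(const struct blkref *bp, uint64_t dev_size)
    {
        if (bp->size == 0 || bp->size > dev_size)
            return (false);
        if (bp->offset > dev_size - bp->size)  /* overflow-safe bound */
            return (false);
        return (true);
    }

    /* Called just before the containing metadata buffer goes to disk. */
    static void write_ready(const struct blkref *bps, int n,
        uint64_t dev_size)
    {
        for (int i = 0; i < n; i++)
            assert(blkref_verify(&bps[i], dev_size));
    }

    int main(void)
    {
        struct blkref bps[2] = { { 0, 512 }, { 4096, 1024 } };
        write_ready(bps, 2, 1ULL << 20);  /* all valid: nothing fires */
        return (0);
    }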
- There is no reason to assert that an added gang is not empty. It
may be weird to add an empty gang, but it is legal.
- When moving the chain list from the added gang, clear its size, or
it will trigger an assertion in abd_verify() when that gang is freed.
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14816
On kernel module unload, free all zfsdev state structures, except for
zfsdev_state_listhead, which is statically allocated.
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14824
spa_import() relies on the pool config fetched by spa_tryimport() for
spare/cache devices. Import flags are not passed to spa_tryimport(),
which makes it return early due to a missing log device, so it never
gets around to retrieving the cache and spare devices. Passing
ZFS_IMPORT_MISSING_LOG to spa_tryimport() makes it fetch the correct
configuration regardless of the missing log device.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ameer Hamza <[email protected]>
Closes #14794
Integrate check_clones() into check_filesystem() and use a list
instead of iterating recursively over the clones, thus eliminating the
risk of a stack overflow (see the sketch below).
Also use kmem_zalloc() to allocate large structures in
process_error_log(), reducing its stack size from ~700 to ~128 bytes.
Reviewed-by: Tino Reichardt <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #14744
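A stand-alone sketch of the stack-safety technique named above: replace
recursion over clones with an explicit, heap-allocated work list, so
traversal depth no longer consumes kernel stack. All names are
placeholders, not the check_filesystem()/check_clones() code.

    #include <stdlib.h>

    struct clone {
        struct clone *children;  /* first child clone */
        struct clone *sibling;   /* next clone at the same level */
    };

    struct work {
        struct clone *c;
        struct work *next;
    };

    /* Visit every clone without recursion: pop one, push its children. */
    static int visit_clones(struct clone *root, void (*visit)(struct clone *))
    {
        struct work *head = malloc(sizeof (*head));
        if (head == NULL)
            return (-1);
        head->c = root;
        head->next = NULL;
        while (head != NULL) {
            struct work *w = head;
            head = head->next;
            visit(w->c);
            for (struct clone *ch = w->c->children; ch; ch = ch->sibling) {
                struct work *nw = malloc(sizeof (*nw));
                if (nw == NULL)
                    return (-1);  /* sketch only: real code would unwind */
                nw->c = ch;
                nw->next = head;
                head = nw;
            }
            free(w);
        }
        return (0);
    }

    static void noop(struct clone *c) { (void)c; }

    int main(void)
    {
        struct clone leaf = { NULL, NULL };
        struct clone root = { &leaf, NULL };
        return (visit_clones(&root, noop));
    }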
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #14806
This is a temporary measure until a better fix is sorted out.
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Rich Ercolani <[email protected]>
Signed-off-by: Mateusz Guzik <[email protected]>
Sponsored by: Rubicon Communications, LLC ("Netgate")
Closes #14785
Closes #14808
Currently, when layering the ABD buffer of each split block on top of
an indirect vdev's ZIO ABD, we don't specify the split block's size.
This results in those ABDs being incorrectly sized by inheriting
the size of their parent ABD, which is larger than what each split
block needs.
The above behavior isn't causing any bugs currently, but it can lead
to unexpected ABD sizes for people analyzing and/or working on
the ZIO codepath. This patch fixes this behavior by properly setting
the ABD size for split block ZIOs (see the sketch below).
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Serapheim Dimitropoulos <[email protected]>
Closes #14804
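A small sketch of the sizing issue: when carving a child view out of a
parent buffer, pass the child's own size explicitly instead of letting
it inherit whatever remains of the parent. The 'view' type and helpers
are hypothetical, not the ABD API.

    #include <stddef.h>

    struct view {
        const char *data;
        size_t size;
    };

    /* Wrong: the child inherits the whole remaining parent size. */
    struct view child_view_implicit(const char *parent, size_t parent_size,
        size_t off)
    {
        struct view v = { parent + off, parent_size - off };
        return (v);
    }

    /* Right: the caller states exactly how large the split block is. */
    struct view child_view_sized(const char *parent, size_t off,
        size_t child_size)
    {
        struct view v = { parent + off, child_size };
        return (v);
    }

    int main(void)
    {
        static const char parent[8192];
        struct view v = child_view_sized(parent, 4096, 512);
        return (v.size == 512 ? 0 : 1);
    }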
FreeBSD/powerpc64 is all ELFv2 since FreeBSD 13, even big endian. The
existing sha256 and sha512 asm code assumes that BE is all ELFv1, and LE
is ELFv2. Minor changes to add ELFv2 on the BE side get this working
correctly on FreeBSD with the latest OpenZFS import.
Reviewed-by: Tino Reichardt <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Justin Hibbits <[email protected]>
Closes #14779
TX_COMMIT has no on-disk representation and does not produce any more
dirty data. It should not wait for anything, and even just skipping
the checks when not waiting gives an improvement noticeable in the
profiler.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Prakash Surya <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14798
The x18 register isn't usable within FreeBSD kernel space, so we
have to fix the BLAKE3 aarch64 assembly to not use it.
The source files are here: https://github.com/mcmilk/BLAKE3-tests
Reviewed-by: Kyle Evans <[email protected]>
Signed-off-by: Tino Reichardt <[email protected]>
Closes #14728
Resolve a missed checkstyle warning.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Mateusz Guzik <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #14799
Gang ABDs without children are legal, and they do have zero size.
For other ABD types a zero size doesn't make much sense and likely
does not work correctly now.
Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14795
Without the change:
/.zfs
/.zfs/snapshot
find: /.zfs: Invalid argument
Signed-off-by: Mateusz Guzik <[email protected]>
Closes #14774
Signed-off-by: Mateusz Guzik <[email protected]>
Closes #14774
This reverts commit 4c856fb333ac57d9b4a6ddd44407fd022a702f00 to
resolve a newly introduced deadlock which is, in practice, more
disruptive than the issue the reverted commit intended to address.
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Mark Maybee <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #14775
Closes #14790
Add loongarch64 definitions & lua module setjmp asm
LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V.
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Han Gao <[email protected]>
Signed-off-by: WANG Xuerui <[email protected]>
Closes #13422
It became illegal to not have them as of
5f6df177758b9dff88e4b6069aeb2359e8b0c493 ("vfs: validate that vop
vectors provide all or none fplookup vops") upstream.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Mateusz Guzik <[email protected]>
Closes #14788
Not complete, but it already shaves off some locking.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Mateusz Guzik <[email protected]>
Sponsored by: Rubicon Communications, LLC ("Netgate")
Closes #14723
The API contract requires VOPs to handle EXDEV internally, worst case
by falling back to the generic copy routine (see the sketch below).
This broke with the recent changes.
While here, whack the custom loop that locked the two vnodes in favor
of vn_lock_pair(), which provides the same functionality internally;
the write start/finish around it plays no role, so it got eliminated.
One difference is that vn_lock_pair() always takes an exclusive lock on
both vnodes. I did not patch around it because the current code takes
an exclusive lock on the target vnode. ZFS supports shared locking for
writes, so this serializes different calls to the routine as is,
despite the range locking inside. At the same time you may notice the
source vnode can get some traffic if only shared-locked, thus once more
this goes the safer route of exclusive locking. Note this should be
patched to use shared locking for both once the feature is considered
stable.
Technically the switch to vn_lock_pair() should be a separate change,
but it would only introduce churn immediately whacked by the rest of
the patch.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Mateusz Guzik <[email protected]>
Sponsored by: Rubicon Communications, LLC ("Netgate")
Closes #14723
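A reduced sketch of the contract described above: the filesystem's copy
routine must catch EXDEV itself and fall back to a generic byte copy,
never returning EXDEV to its caller. The functions below are
placeholders, not the FreeBSD VOP interface.

    #include <errno.h>
    #include <stddef.h>

    /* Placeholder fast path: fails with EXDEV when the two files cannot
     * be cloned (e.g. they live in different pools). */
    static int fast_copy_range(int src, int dst, size_t len)
    {
        (void)src; (void)dst; (void)len;
        return (EXDEV);
    }

    /* Placeholder generic fallback that always succeeds. */
    static int generic_copy_range(int src, int dst, size_t len)
    {
        (void)src; (void)dst; (void)len;
        return (0);
    }

    int copy_file_range_op(int src, int dst, size_t len)
    {
        int error = fast_copy_range(src, dst, len);
        if (error == EXDEV)  /* handle EXDEV internally, per the contract */
            error = generic_copy_range(src, dst, len);
        return (error);
    }

    int main(void)
    {
        return (copy_file_range_op(3, 4, 100));  /* falls back, returns 0 */
    }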
Noticed while attempting to change FreeBSD's boolean_t into an actual
bool: in include/sys/zfs_ioctl_impl.h, zfs_vfs_held() is declared to
return a boolean_t, but in module/os/freebsd/zfs/zfs_ioctl_os.c it is
defined to return an int. Make the definition match the declaration.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Dimitry Andric <[email protected]>
Closes #14776
Usage:
zpool set org.freebsd:comment="this is my pool" poolname
Tests are based on zfs_set's user property tests.
Also stop truncating property values at MAXNAMELEN, use ZFS_MAXPROPLEN.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Mateusz Piotrowski <[email protected]>
Sponsored-by: Beckhoff Automation GmbH & Co. KG.
Sponsored-by: Klara Inc.
Closes #11680
Building with Clang on Linux generates a warning that err could be
uninitialized if mnt_ns is a NULL pointer. However, mnt_ns should never
be NULL, so there is no need to put this behind an if statement.
Taking it outside of the if statement means that err is still always
initialized, but now in a way the compiler can recognize, so the
warning goes away (see the sketch below).
Sponsored-By: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Youzhong Yang <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14738
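A reduced before/after illustration of the change: assigning err
unconditionally makes it obvious to the compiler that the variable is
always initialized. The function and variable names are placeholders,
not the actual mnt_ns handling code.

    #include <stddef.h>

    static int permission_check(const void *ns)
    {
        (void)ns;
        return (0);
    }

    /* Before: err is only written inside the branch, so the compiler
     * warns it may be used uninitialized, even though ns is never NULL
     * in practice. */
    int check_before(const void *ns)
    {
        int err;
        if (ns != NULL)
            err = permission_check(ns);
        return (err);  /* the warned line */
    }

    /* After: the assignment is unconditional, so err is provably set. */
    int check_after(const void *ns)
    {
        return (permission_check(ns));
    }

    int main(void)
    {
        return (check_after(NULL));
    }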
Create a ZAP for the root vdev and add it to the AVZ. This is not
backwards compatible with older pools due to an assertion in spa_sync()
that verifies the number of ZAPs of all vdevs matches the number of
ZAPs in the AVZ.
Granted, the assertion only applies to DEBUG builds; still, a feature
flag is introduced to avoid the assertion: com.klarasystems:vdev_zaps_v2.
Notably, this allows getting/setting properties on the root vdev:
% zpool set user:prop=value <pool> root-0
Before this commit, it was already possible to get/set properties on
top-level vdevs with the syntax <type>-<vdev_id> (e.g. mirror-0):
% zpool set user:prop=value <pool> mirror-0
This syntax also applies to the root vdev, as it is of type 'root'
with a vdev_id of 0: root-0. The keyword 'root' works as an alias for
'root-0'.
The following tests have been added:
- zpool get all properties from root vdev
- zpool set a property on root vdev
- verify root vdev ZAP is created
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Wing <[email protected]>
Sponsored-by: Seagate Technology
Submitted-by: Klara, Inc.
Closes #14405
At our site we have seen cases when multi-modifier protection is enabled
(multihost=on) on our pool and the pool gets suspended due to a single
disk that is failing and responding very slowly. Our pools have 90 disks
in them and we expect disks to fail. The current version of MMP requires
that we wait for other writers before moving on. When a disk is
responding very slowly, we observed that waiting here was bad enough to
cause the pool to suspend. This change allows the MMP thread to bypass
waiting for other threads and reduces the chances the pool gets
suspended.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Herb Wartens <[email protected]>
Closes #14659
A spare vdev should detach from the pool when a disk is reinserted.
However, spare detachment depends on the completion of resilvering,
and if a resilver is not scheduled, the spare vdev stays attached to
the pool until the next resilvering. When a zfs pool contains
several disks (25+ mirror), resilvering does not always happen when
a disk is reinserted. In this patch, the spare vdev is manually
detached from the pool when resilvering does not occur; this has
been tested on both Linux and FreeBSD.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ameer Hamza <[email protected]>
Closes #14722