aboutsummaryrefslogtreecommitdiffstats
path: root/cmd/zpool
Commit message (Collapse)AuthorAgeFilesLines
* zpool: Add slot power control, print power statusTony Hutter2023-12-215-29/+486
| | | | | | | | | | | | | | | | | | | | | Add `zpool` flags to control the slot power to drives. This assumes your SAS or NVMe enclosure supports slot power control via sysfs. The new `--power` flag is added to `zpool offline|online|clear`: zpool offline --power <pool> <device> Turn off device slot power zpool online --power <pool> <device> Turn on device slot power zpool clear --power <pool> [device] Turn on device slot power If the ZPOOL_AUTO_POWER_ON_SLOT env var is set, then the '--power' option is automatically implied for `zpool online` and `zpool clear` and does not need to be passed. zpool status also gets a --power option to print the slot power status. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mart Frauenlob <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #15662
* import: ignore return on hostid lookupsRob N2023-12-071-1/+2
| | | | | | | | | | | Just silencing a warning. Its totally fine for a hostid to not be there. Reported-by: Coverity (CID-1573336) Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ameer Hamza <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes #15650
* zpool: flush output before sleepingRob N2023-12-051-6/+5
| | | | | | | | | | | | | | | | Several zpool commands (status, list, iostat) have modes that present some information, sleep a while, present the current state, sleep, etc. Some of those had ways to invoke them that when piped would appear to do nothing for a while, because non-terminals are block-buffered, not line-buffered, by default. Fix this by forcing a flush before sleeping. In particular, all of these buffered: - zpool status <pool> <interval> - zpool iostat -y<m> <pool> <interval> - zpool list <pool> <interval> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes #15593
* ZTS: Test for all known zpool feature setsUmer Saleem2023-11-091-1/+0
| | | | | | | | | | | | | | | | | | zpool_create_features_007_pos only tested for compat-2020 feature set. It would be useful to test for all known features sets. If any additional feature is found enabled that is not present in compatibility list or feature set, it should be caught and reported earlier. This commit also removes encryption from openzfsonosx-1.8.1 compatibility list. Encryption enables bookmark_v2, since it is a dependency of encryption, but not listed in openzfsonoxx-1.8.1 compatibility list. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #15505
* RAID-Z expansion featureDon Brady2023-11-081-15/+133
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This feature allows disks to be added one at a time to a RAID-Z group, expanding its capacity incrementally. This feature is especially useful for small pools (typically with only one RAID-Z group), where there isn't sufficient hardware to add capacity by adding a whole new RAID-Z group (typically doubling the number of disks). == Initiating expansion == A new device (disk) can be attached to an existing RAIDZ vdev, by running `zpool attach POOL raidzP-N NEW_DEVICE`, e.g. `zpool attach tank raidz2-0 sda`. The new device will become part of the RAIDZ group. A "raidz expansion" will be initiated, and the new device will contribute additional space to the RAIDZ group once the expansion completes. The `feature@raidz_expansion` on-disk feature flag must be `enabled` to initiate an expansion, and it remains `active` for the life of the pool. In other words, pools with expanded RAIDZ vdevs can not be imported by older releases of the ZFS software. == During expansion == The expansion entails reading all allocated space from existing disks in the RAIDZ group, and rewriting it to the new disks in the RAIDZ group (including the newly added device). The expansion progress can be monitored with `zpool status`. Data redundancy is maintained during (and after) the expansion. If a disk fails while the expansion is in progress, the expansion pauses until the health of the RAIDZ vdev is restored (e.g. by replacing the failed disk and waiting for reconstruction to complete). The pool remains accessible during expansion. Following a reboot or export/import, the expansion resumes where it left off. == After expansion == When the expansion completes, the additional space is available for use, and is reflected in the `available` zfs property (as seen in `zfs list`, `df`, etc). Expansion does not change the number of failures that can be tolerated without data loss (e.g. a RAIDZ2 is still a RAIDZ2 even after expansion). A RAIDZ vdev can be expanded multiple times. After the expansion completes, old blocks remain with their old data-to-parity ratio (e.g. 5-wide RAIDZ2, has 3 data to 2 parity), but distributed among the larger set of disks. New blocks will be written with the new data-to-parity ratio (e.g. a 5-wide RAIDZ2 which has been expanded once to 6-wide, has 4 data to 2 parity). However, the RAIDZ vdev's "assumed parity ratio" does not change, so slightly less space than is expected may be reported for newly-written blocks, according to `zfs list`, `df`, `ls -s`, and similar tools. Sponsored-by: The FreeBSD Foundation Sponsored-by: iXsystems, Inc. Sponsored-by: vStack Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Authored-by: Matthew Ahrens <[email protected]> Contributions-by: Fedor Uporov <[email protected]> Contributions-by: Stuart Maybee <[email protected]> Contributions-by: Thorsten Behrens <[email protected]> Contributions-by: Fmstrat <[email protected]> Contributions-by: Don Brady <[email protected]> Signed-off-by: Don Brady <[email protected]> Closes #15022
* Remove obsolete_counts from grub2 compatibility listUmer Saleem2023-11-071-1/+0
| | | | | | | | | | | | | | | | | | PR#15459 add all read-only compatible zpool features to grub2 compatibility list. 'obsolete_counts' is a read-only features that depends on 'device_removal' feature which is not read-only and is marked as ZFEATURE_FLAG_MOS. Creating a pool with grub2 compatibility enables 'device_removal' feature as well, which is not desired. This commit removes the 'obsolete_counts' feature from grub2 compatibility list, as GRUB only supports read-only compatible features. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #15499
* zed: misc vdev_enc_sysfs_path fixesTony Hutter2023-11-072-2/+14
| | | | | | | | | | | | | | | | | | There have been rare cases where the VDEV_ENC_SYSFS_PATH value that zed gets passed is stale. To mitigate this, dynamically check the sysfs path at the time of zed event processing, and use the dynamic value if possible. Note that there will be other times when we can not dynamically detect the sysfs path (like if a disk disappears) and have to rely on the old value for things like turning on the fault LED. That is to say, we can't just blindly use the dynamic path in every case. Also: - Add enclosure sysfs entry when running 'zpool add' - Fix 'slot' and 'enc' zpool.d scripts for nvme Reviewed-by: Don Brady <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #15462
* Add all read-only compatible zpool features to grub2 compatibilityUmer Saleem2023-10-311-0/+10
| | | | | | | | | | | GRUB opens the boot pool in read-only mode. All read-only compatible features for zpool can be enabled and added to grub2 compatibility, as GRUB does not open the boot-pool for write. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #15459
* import: require force when cachefile hostid doesn't match on-diskRob Norris2023-10-061-4/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, if a cachefile is passed to zpool import, the cached config is mostly offered as-is to ZFS_IOC_POOL_TRYIMPORT->spa_tryimport(), and the results are taken as the canonical pool config and handed back to ZFS_IOC_POOL_IMPORT. In the course of its operation, spa_load() will inspect the pool and build a new config from what it finds on disk. However, it then regenerates a new config ready to import, and so rightly sets the hostid and hostname for the local host in the config it returns. Because of this, the "require force" checks always decide the pool is exported and last touched by the local host, even if this is not true, which is possible in a HA environment when MMP is not enabled. The pool may be imported on another head, but the import checks still pass here, so the pool ends up imported on both. (This doesn't happen when a cachefile isn't used, because the pool config is discovered in userspace in zpool_find_import(), and that does find the on-disk hostid and hostname correctly). Since the systemd zfs-import-cache.service unit uses cachefile imports, this can lead to a system returning after a crash with a "valid" cachefile on disk and automatically, quietly, importing a pool that has already been taken up by a secondary head. This commit causes the on-disk hostid and hostname to be included in the ZPOOL_CONFIG_LOAD_INFO item in the returned config, and then changes the "force" checks for zpool import to use them if present. This method should give no change in behaviour for old userspace on new kernels (they won't know to look for the new config items) and for new userspace on old kernels (the won't find the new config items). Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes #15290
* Add zfs_prepare_disk script for disk firmware installTony Hutter2023-09-213-33/+47
| | | | | | | | | | Have libzfs call a special `zfs_prepare_disk` script before a disk is included into the pool. The user can edit this script to add things like a disk firmware update or a disk health check. Use of the script is totally optional. See the zfs_prepare_disk manpage for full details. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #15243
* cmd: add 'help' subcommand to zpool and zfsRob N2023-09-191-0/+31
| | | | | | | | | | | | | | | | 'program help subcommand' is a reasonably common pattern for multifunction command-line programs. This commit adds support for that style to the zpool and zfs commands. When run as 'zpool help [<topic>]' or 'zfs help [<topic>]', executes the 'man' program on the PATH with the most likely manpage name for the requested topic: "zpool-<topic>" or "zfs-<topic>" for subcommands, or "zpool<topic>" or "zfs<topic>" for the "concepts" and "props" topics. If no topic is supplied, uses the top "zpool" or "zfs" pages. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Kay Pedersen <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes #15288
* Relax error reporting in zpool import and zpool splitUmer Saleem2023-09-011-11/+23
| | | | | | | | | | | | | | | | | For zpool import and zpool split, zpool_enable_datasets is called to mount and share all datasets in a pool. If there is an error while mounting or sharing any dataset in the pool, the status of import or split is reported as failure. However, the changes do show up in zpool list. This commit updates the error reporting in zpool import and zpool split path. More descriptive messages are shown to user in case there is an error during mount or share. Errors in mount or share do not effect the overall status of zpool import and zpool split. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #15216
* Do not report bytes skipped by scan as issued.Alexander Motin2023-06-301-39/+55
| | | | | | | | | | | | | | | | | | | | | Scan process may skip blocks based on their birth time, DVA, etc. Traditionally those blocks were accounted as issued, that caused reporting of hugely over-inflated numbers, having nothing to do with actual disk I/O. This change utilizes never used field in struct dsl_scan_phys to account such skipped bytes, allowing to report how much data were actually scrubbed/resilvered and what is the actual I/O speed. While formally it is an on-disk format change, it should be compatible both ways, so should not need a feature flag. This should partially address the same issue as c85ac731a0e, but from a different perspective, complementing it. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Akash B <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes #15007
* Add compatibility symlinks for FreeBSD 12.{3,4} and 13.{0,1,2}Mike Swanson2023-05-261-0/+5
| | | | | | Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Mike Swanson <[email protected]> Closes #14902
* Adding new read-only compatible zpool features to compatibility.d/grub2Colm2023-05-261-0/+2
| | | | | | | | | | | | | | | | | GRUB2 is compatible with all "read-only compatible" features, so it is safe to add new features of this type to the grub2 compatibility list. We generally want to include all compatible features, to minimize the differences between grub2-compatible pools and no-compatibility pools. Adding new properties `livelist` and `zpool_checkpoint` accordingly. Also adding them to the man page which references this file as an example, for consistency. Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Colm Buckley <[email protected]> Closes #14893
* Update compatibility.d filesBrian Behlendorf2023-05-252-1/+45
| | | | | | | | | | | | | Add an openzfs-2.2 compatibility file for the next release. Edon-R support has been enabled for FreeBSD removing the need for different FreeBSD and Linux files. Symlinks for the -linux and -freebsd names are created for any scripts expecting that convention. Additionally, a symlink for ubunutu-22.04 was added. Signed-off-by: Brian Behlendorf <[email protected]> Closes #14833
* Teach zpool scrub to scrub only blocks in error logGeorge Amanakis2023-05-181-10/+101
| | | | | | | | | | | | | | | | Added a flag '-e' in zpool scrub to scrub only blocks in error log. A user can pause, resume and cancel the error scrub by passing additional command line arguments -p -s just like a regular scrub. This involves adding a new flag, creating new libzfs interfaces, a new ioctl, and the actual iteration and read-issuing logic. Error scrubbing is executed in multiple txg to make sure pool performance is not affected. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Co-authored-by: TulsiJain [email protected] Signed-off-by: George Amanakis <[email protected]> Closes #8995 Closes #12355
* Add the ability to uninitializeBrian Behlendorf2023-05-181-5/+17
| | | | | | | | | | | | zpool initialize functions well for touching every free byte...once. But if we want to do it again, we're currently out of luck. So let's add zpool initialize -u to clear it. Co-authored-by: Rich Ercolani <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #12451 Closes #14873
* Add support for zpool user propertiesAllan Jude2023-04-211-2/+16
| | | | | | | | | | | | | | | | Usage: zpool set org.freebsd:comment="this is my pool" poolname Tests are based on zfs_set's user property tests. Also stop truncating property values at MAXNAMELEN, use ZFS_MAXPROPLEN. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Allan Jude <[email protected]> Signed-off-by: Mateusz Piotrowski <[email protected]> Sponsored-by: Beckhoff Automation GmbH & Co. KG. Sponsored-by: Klara Inc. Closes #11680
* Create zap for root vdevrob-wing2023-04-201-22/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And add it to the AVZ, this is not backwards compatible with older pools due to an assertion in spa_sync() that verifies the number of ZAPs of all vdevs matches the number of ZAPs in the AVZ. Granted, the assertion only applies to #DEBUG builds - still, a feature flag is introduced to avoid the assertion, com.klarasystems:vdev_zaps_v2 Notably, this allows to get/set properties on the root vdev: % zpool set user:prop=value <pool> root-0 Before this commit, it was already possible to get/set properties on top-level vdevs with the syntax <type>-<vdev_id> (e.g. mirror-0): % zpool set user:prop=value <pool> mirror-0 This syntax also applies to the root vdev as it is is of type 'root' with a vdev_id of 0, root-0. The keyword 'root' as an alias for 'root-0'. The following tests have been added: - zpool get all properties from root vdev - zpool set a property on root vdev - verify root vdev ZAP is created Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Wing <[email protected]> Sponsored-by: Seagate Technology Submitted-by: Klara, Inc. Closes #14405
* Values printed by zpool-iostat(8) should be right-alignedLow-power2023-04-181-4/+6
| | | | | | | | | This inappropriate left-alignment was introduced in 7bb7b1f. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Signed-off-by: WHR <[email protected]> Closes #14751
* libzfs: add v2 iterator interfacesRob N2023-04-101-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | f6a0dac84 modified the zfs_iter_* functions to take a new "flags" parameter, and introduced a variety of flags to ask the kernel to limit the results in various ways, reducing the amount of work the caller needed to do to filter out things they didn't need. Unfortunately this change broke the ABI for existing clients (read: older versions of the `zfs` program), and was reverted 399b98198. dc95911d2 reintroduced the original patch, with the understanding that a backwards-compatible fix would be made before the 2.2 release branch was tagged. This commit is that fix. This introduces zfs_iter_*_v2 functions that have the new flags argument, and reverts the existing functions to not have the flags parameter, as they were before. The old functions are now reimplemented in terms of the new, with flags set to 0. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Wilson <[email protected]> Original-patch-by: George Wilson <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: Klara, Inc. Closes #14597
* Storage device expansion "silently" fails on degraded vdevPaul Dagnelie2023-04-061-0/+22
| | | | | | | | | | | | When a vdev is degraded or faulted, we refuse to expand it when doing online -e. However, we also don't actually cause the online command to fail, even though the disk didn't expand. This is confusing and misleading, and can result in violated expectations. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes 14145
* Colorize zpool iostat outputTino Reichardt2023-03-241-1/+34
| | | | | | | | | | | | | | | | | | | Use a bold header and colorize the space suffixes in iostat by order of magnitude like this: - K is green - M is yellow - G is red - T is lightblue - P is magenta - E is cyan - 0 space is colored gray Reviewed-by: WHR <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ethan Coe-Renner <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #14621 Closes #14459
* nvpair: Constify string functionsRichard Yao2023-03-143-38/+40
| | | | | | | | | | | | | | After addressing coverity complaints involving `nvpair_name()`, the compiler started complaining about dropping const. This lead to a rabbit hole where not only `nvpair_name()` needed to be constified, but also `nvpair_value_string()`, `fnvpair_value_string()` and a few other static functions, plus variable pointers throughout the code. The result became a fairly big change, so it has been split out into its own patch. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14612
* Cleanup: Remove constant comparisons reported by CodeQLRichard Yao2023-03-081-9/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CodeQL's cpp/constant-comparison query from its security-and-extended query set reported 4 instances where we have comparions that always evaluate the same way. In `draid_config_by_type()`, we have an early `if (nparity == 0)` check that returns `EINVAL`, making a later `if (nparity == 0 || nparity > VDEV_DRAID_MAXPARITY)` partially redundant. The later check prints an error message when parity is 0, but the early check does not. This is not useful feedback, so we move the later check to the place where the early check runs to replace the early check. In `perform_thread_merge()`, we return when `num_threads == 0`. After that block, we do `if (num_threads > 0) {`, which will always be true. We remove the `if` statement. In `sa_modify_attrs()`, we have a loop condition that is `k != 2`, but at the end of the loop, we have `if (k == 0 && hdl->sa_spill)` followed by an else that does a break. The result is that k != 2 will never be evaluated when it is false. We drop the comparison. In `zap_leaf_array_read()`, we have a for loop condition that is `i < ZAP_LEAF_ARRAY_BYTES && len > 0`. However, that loop itself is in a loop that is `while (len > 0)` and while the value of len is decremented inside the loop, when `len == 0`, it will return, such that `len > 0` inside the loop condition will always be true. We drop that part of the condition. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14575
* Fix TOCTOU race in zpool_do_labelclear()Richard Yao2023-03-081-14/+20
| | | | | | | | | | | | | | | | | | Coverity reported a TOCTOU race in `zpool_do_labelclear()`. This is not believed to be a real security issue, but fixing it reduces the number of syscalls we do and will prevent other static analyzers from complaining about this. The code is expected to be equivalent. However, under rare circumstances, such as ELOOP, ENAMETOOLONG, ENOMEM, ENOTDIR and EOVERFLOW, we will display the error message that we currently display for the `open()` syscall rather than the one that we currently display for the `stat()` syscall. This is considered to be an improvement. Reported-by: Coverity (CID-1524188) Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14575
* Improve resilver ETAsBrian Behlendorf2023-01-251-11/+22
| | | | | | | | | | | | | | | | | | | | | | When resilvering the estimated time remaining is calculated using the average issue rate over the current pass. Where the current pass starts when a scan was started, or restarted, if the pool was exported/imported. For dRAID pools in particular this can result in wildly optimistic estimates since the issue rate will be very high while scanning when non-degraded regions of the pool are scanned. Once repair I/O starts being issued performance drops to a realistic number but the estimated performance is still significantly skewed. To address this we redefine a pass such that it starts after a scanning phase completes so the issue rate is more reflective of recent performance. Additionally, the zfs_scan_report_txgs module option can be set to reset the pass statistics more often. Reviewed-by: Akash B <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #14410
* zpool-set: print error message when pool or vdev is not validRob Wing2023-01-171-19/+17
| | | | | | | | | | Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Wing <[email protected]> Sponsored-by: Seagate Technology Submitted-by: Klara, Inc. Closes #14310
* zpool-set: update usage textRob Wing2023-01-171-1/+2
| | | | | | | | | | Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Wing <[email protected]> Sponsored-by: Seagate Technology Submitted-by: Klara, Inc. Closes #14310
* zpool: do guid-based comparison in is_vdev_cb()rob-wing2023-01-111-11/+4
| | | | | | | | | | | | | | is_vdev_cb() uses string comparison to find a matching vdev and will fallback to comparing the guid via a string. These changes drop the string comparison and compare the guids instead. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Rob Wing <[email protected]> Co-authored-by: Rob Wing <[email protected]> Sponsored-by: Seagate Technology Submitted-by: Klara, Inc. Closes #14311
* deadlock between spa_errlog_lock and dp_config_rwlockMatthew Ahrens2022-12-221-26/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a lock order inversion deadlock between `spa_errlog_lock` and `dp_config_rwlock`: A thread in `spa_delete_dataset_errlog()` is running from a sync task. It is holding the `dp_config_rwlock` for writer (see `dsl_sync_task_sync()`), and waiting for the `spa_errlog_lock`. A thread in `dsl_pool_config_enter()` is holding the `spa_errlog_lock` (see `spa_get_errlog_size()`) and waiting for the `dp_config_rwlock` (as reader). Note that this was introduced by #12812. This commit address this by defining the lock ordering to be dp_config_rwlock first, then spa_errlog_lock / spa_errlist_lock. spa_get_errlog() and spa_get_errlog_size() can acquire the locks in this order, and then process_error_block() and get_head_and_birth_txg() can verify that the dp_config_rwlock is already held. Additionally, a buffer overrun in `spa_get_errlog()` is corrected. Many code paths didn't check if `*count` got to zero, instead continuing to overwrite past the beginning of the userspace buffer at `uaddr`. Tested by having some errors in the pool (via `zinject -t data /path/to/file`), one thread running `zpool iostat 0.001`, and another thread runs `zfs destroy` (in a loop, although it hits the first time). This reproduces the problem easily without the fix, and works with the fix. Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: George Amanakis <[email protected]> Reviewed-by: George Wilson <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #14239 Closes #14289
* zfs list: Allow more fields in ZFS_ITER_SIMPLE modeAllan Jude2022-12-131-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the fields to be listed and sorted by are constrained to those populated by dsl_dataset_fast_stat(), then zfs list is much faster, as it does not need to open each objset and reads its properties. A previous optimization by Pawel Dawidek (0cee24064a79f9c01fc4521543c37acea538405f) took advantage of this to make listing snapshot names sorted only by name much faster. However, it was limited to `-o name -s name`, this work extends this optimization to work with: - name - guid - createtxg - numclones - inconsistent - redacted - origin and could be further extended to any other properties supported by dsl_dataset_fast_stat() or similar, that do not require extra locking or reading from disk. This was committed before (9a9e2e343dfa2af28bf7910de77ae73aa006de62), but was reverted due to a regression when used with an older kernel. If the kernel does not populate zc->zc_objset_stats, we now fallback to getting the properties via the slower interface, to avoid problems with newer userland and older kernels. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Allan Jude <[email protected]> Closes #14110
* Fix potential buffer overflow in zpool commandRichard Yao2022-12-081-1/+7
| | | | | | | | | | | | | | | The ZPOOL_SCRIPTS_PATH environment variable can be passed here. This allows for arbitrarily long strings to be passed to sprintf(), which can overflow the buffer. I missed this in my earlier audit of the codebase. CodeQL's cpp/unbounded-write check caught this. Reviewed-by: Damian Szuberski <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14264
* Fix GCC 12 compilation errorsszubersk2022-11-301-1/+1
| | | | | | | | | Squelch false positives reported by GCC 12 with UBSan. Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: szubersk <[email protected]> Closes #14150
* Expose libzutil error info in libpc_handle_tUmer Saleem2022-10-041-2/+7
| | | | | | | | | | | | | | | | | | | | In libzutil, for zpool_search_import and zpool_find_config, we use libpc_handle_t internally, which does not maintain error code and it is not exposed in the interface. Due to this, the error information is not propagated to the caller. Instead, an error message is printed on stderr. This commit adds lpc_error field in libpc_handle_t and exposes it in the interface, which can be used by the users of libzutil to get the appropriate error information and handle it accordingly. Users of the API can also control if they want to print the error message on stderr. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13969
* Fix userland resource leaksRichard Yao2022-09-231-0/+1
| | | | | | | | | | | | Coverity caught these. With the exception of the file descriptor leak in tests/zfs-tests/cmd/draid.c, they are all memory leaks. Also, there is a piece of dead code in zfs_get_enclosure_sysfs_path(). We delete it as cleanup. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13921
* zpool: Don't print "repairing" on force faulted drivesTony Hutter2022-09-231-2/+9
| | | | | | | | | | | If you force fault a drive that's resilvering, it's scan stats can get frozen in time, giving the false impression that it's being resilvered. This commit checks the vdev state to see if the vdev is healthy before reporting "resilvering" or "repairing" in zpool status. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #13927 Closes #13930
* Fix column width in 'zpool iostat -v' and 'zpool list -v'Samuel2022-09-061-4/+4
| | | | | | | | | This commit fixes a minor spacing issue caused when enumerating vdev names, which originated from #13031 Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Akash B <[email protected]> Signed-off-by: Samuel Wycliffe <[email protected]> Closes #13811
* zpool: fix redundancy check after vdev removalStéphane Lesimple2022-08-041-2/+7
| | | | | | | | | | | | | | The presence of indirect vdevs was confusing get_redundancy(), which considered a pool with e.g. only mirror top-level vdevs and at least one indirect vdev (due to the removal of a previous vdev) as already having a broken redundancy, which is not the case. This lead to the possibility of compromising the redundancy of a pool by adding mismatched vdevs without requiring the use of `-f`, and with no visible notice or warning. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Stéphane Lesimple <[email protected]> Closes #13705 Closes #13711
* Replace dead opensolaris.org license linkTino Reichardt2022-07-117-7/+7
| | | | | | | | | The commit replaces all findings of the link: http://www.opensolaris.org/os/licensing with this one: https://opensource.org/licenses/CDDL-1.0 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #13619
* Enable -Wwrite-stringsнаб2022-06-293-46/+42
| | | | | | | | Also, fix leak from ztest_global_vars_to_zdb_args() Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13348
* Replace ZPROP_INVAL with ZPROP_USERPROP where it means a user propertyAllan Jude2022-06-141-2/+2
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Allan Jude <[email protected]> Sponsored-by: Klara Inc. Closes #12676
* Replace EXTRA_DIST with dist_noinst_DATABrian Behlendorf2022-05-261-1/+1
| | | | | | | | | | | | | | | The EXTRA_DIST variable is ignored when used in the FALSE conditional of a Makefile.am. This results in the `make dist` target omitting these files from the generated tarball unless CONFIG_USER is defined. This issue can be avoided by switching to use the dist_noinst_DATA variable which is handled as expected by autoconf. This change also adds support for --with-config=dist as an alias for --with-config=srpm and updates the GitHub workflows to use it. Reviewed-by: Ahelenia Ziemiańska <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13459 Closes #13505
* libzfs: return (allocated) strings instead of filling buffersнаб2022-05-181-5/+1
| | | | | | | | | This also expands the zfs version output from 127 characters to However Many Are Actually Set Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13330
* zpool: max_width: monomorphise subtype iterationнаб2022-05-161-31/+12
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13413
* Replace libzfs sharing _nfs() and _smb() APIs with protocol listsнаб2022-05-121-2/+2
| | | | | | | | | | With the additional benefit of removing all the _all() functions and treating a NULL list as "all" ‒ the remaining all function is for all /datasets/, which is consistent with the rest of the API Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13165
* autoconf: use include directives instead of recursing down cmdнаб2022-05-102-97/+95
| | | | | | | | | | | | | | | | | No installation diff, dist lost -zfs-2.1.99/cmd/fsck_zfs/fsck.zfs which was distributed erroneously, since it's generated Also clean gitrev on clean Also add -e 'any possible bashisms' to default checkbashisms flags, and fully parallelise it and shellcheck, and it works out-of-tree, too Also align the Release in the dist META file correctly Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13316
* autoconf: use include directives instead of recursing down libнаб2022-05-101-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As a bonus, this also adds zfs-mount-generator (previously undescended down) and libzstd (not included) to CppCheck As a bonus bonus, abigail rules work out-of-tree, too Against current trunk: $ diff -U0 ./destdir.listing ~/store/code/zfs/destdir.listing -destdir/usr/local/include/libspl/sscanf.h $ diff --color -U0 ./zfs-2.1.99.tar.gz.listing ../oot/zfs-2.1.99.tar.gz.listing | grep -v @@ | grep -v /Makefile -zfs-2.1.99/config/Abigail.am -zfs-2.1.99/lib/libspl/include/util/ -zfs-2.1.99/lib/libspl/include/util/sscanf.h $ diff --color -U0 ./zfs-2.1.99.tar.gz.listing ../oot/zfs-2.1.99.tar.gz.listing | grep -v @@ | grep /Makefile -zfs-2.1.99/lib/libavl/Makefile.in -zfs-2.1.99/lib/libefi/Makefile.in -zfs-2.1.99/lib/libicp/Makefile.in -zfs-2.1.99/lib/libnvpair/Makefile.in -zfs-2.1.99/lib/libshare/Makefile.in -zfs-2.1.99/lib/libspl/include/Makefile.in -zfs-2.1.99/lib/libspl/include/os/freebsd/Makefile.am -zfs-2.1.99/lib/libspl/include/os/freebsd/Makefile.in -zfs-2.1.99/lib/libspl/include/os/freebsd/sys/Makefile.am -zfs-2.1.99/lib/libspl/include/os/freebsd/sys/Makefile.in -zfs-2.1.99/lib/libspl/include/os/linux/Makefile.am -zfs-2.1.99/lib/libspl/include/os/linux/Makefile.in -zfs-2.1.99/lib/libspl/include/os/linux/sys/Makefile.am -zfs-2.1.99/lib/libspl/include/os/linux/sys/Makefile.in -zfs-2.1.99/lib/libspl/include/os/Makefile.am -zfs-2.1.99/lib/libspl/include/os/Makefile.in -zfs-2.1.99/lib/libspl/include/rpc/Makefile.am -zfs-2.1.99/lib/libspl/include/rpc/Makefile.in -zfs-2.1.99/lib/libspl/include/sys/dktp/Makefile.am -zfs-2.1.99/lib/libspl/include/sys/dktp/Makefile.in -zfs-2.1.99/lib/libspl/include/sys/Makefile.am -zfs-2.1.99/lib/libspl/include/sys/Makefile.in -zfs-2.1.99/lib/libspl/include/util/Makefile.am -zfs-2.1.99/lib/libspl/include/util/Makefile.in -zfs-2.1.99/lib/libspl/Makefile.in -zfs-2.1.99/lib/libtpool/Makefile.in -zfs-2.1.99/lib/libunicode/Makefile.in -zfs-2.1.99/lib/libuutil/Makefile.in -zfs-2.1.99/lib/libzfsbootenv/Makefile.in -zfs-2.1.99/lib/libzfs_core/Makefile.in -zfs-2.1.99/lib/libzfs/Makefile.in -zfs-2.1.99/lib/libzpool/Makefile.in -zfs-2.1.99/lib/libzstd/Makefile.in -zfs-2.1.99/lib/libzutil/Makefile.in -zfs-2.1.99/lib/Makefile.in Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13316
* Workaround broken VDEV_UPATHRich Ercolani2022-05-103-10/+34
| | | | | | | | | | | Sometimes, for reasons I haven't looked into yet, VDEV_UPATH gets set to /dev/(null), breaking all these scripts. It'd be nice to have a fallback case to avoid total failure. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #13436