aboutsummaryrefslogtreecommitdiffstats
path: root/lib/libzfs/libzfs.abi
Commit message (Collapse)AuthorAgeFilesLines
* Add slow disk diagnosis to ZEDDon Brady2024-02-081-1/+3
| | | | | | | | | | | | | | | | | | | | | Slow disk response times can be indicative of a failing drive. ZFS currently tracks slow I/Os (slower than zio_slow_io_ms) and generates events (ereport.fs.zfs.delay). However, no action is taken by ZED, like is done for checksum or I/O errors. This change adds slow disk diagnosis to ZED which is opt-in using new VDEV properties: VDEV_PROP_SLOW_IO_N VDEV_PROP_SLOW_IO_T If multiple VDEVs in a pool are undergoing slow I/Os, then it skips the zpool_vdev_degrade(). Sponsored-By: OpenDrives Inc. Sponsored-By: Klara Inc. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Rob Wing <[email protected]> Signed-off-by: Don Brady <[email protected]> Closes #15469
* zpool: Add slot power control, print power statusTony Hutter2023-12-211-17/+80
| | | | | | | | | | | | | | | | | | | | | Add `zpool` flags to control the slot power to drives. This assumes your SAS or NVMe enclosure supports slot power control via sysfs. The new `--power` flag is added to `zpool offline|online|clear`: zpool offline --power <pool> <device> Turn off device slot power zpool online --power <pool> <device> Turn on device slot power zpool clear --power <pool> [device] Turn on device slot power If the ZPOOL_AUTO_POWER_ON_SLOT env var is set, then the '--power' option is automatically implied for `zpool online` and `zpool clear` and does not need to be passed. zpool status also gets a --power option to print the slot power status. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mart Frauenlob <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #15662
* RAID-Z expansion featureDon Brady2023-11-081-112/+259
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This feature allows disks to be added one at a time to a RAID-Z group, expanding its capacity incrementally. This feature is especially useful for small pools (typically with only one RAID-Z group), where there isn't sufficient hardware to add capacity by adding a whole new RAID-Z group (typically doubling the number of disks). == Initiating expansion == A new device (disk) can be attached to an existing RAIDZ vdev, by running `zpool attach POOL raidzP-N NEW_DEVICE`, e.g. `zpool attach tank raidz2-0 sda`. The new device will become part of the RAIDZ group. A "raidz expansion" will be initiated, and the new device will contribute additional space to the RAIDZ group once the expansion completes. The `feature@raidz_expansion` on-disk feature flag must be `enabled` to initiate an expansion, and it remains `active` for the life of the pool. In other words, pools with expanded RAIDZ vdevs can not be imported by older releases of the ZFS software. == During expansion == The expansion entails reading all allocated space from existing disks in the RAIDZ group, and rewriting it to the new disks in the RAIDZ group (including the newly added device). The expansion progress can be monitored with `zpool status`. Data redundancy is maintained during (and after) the expansion. If a disk fails while the expansion is in progress, the expansion pauses until the health of the RAIDZ vdev is restored (e.g. by replacing the failed disk and waiting for reconstruction to complete). The pool remains accessible during expansion. Following a reboot or export/import, the expansion resumes where it left off. == After expansion == When the expansion completes, the additional space is available for use, and is reflected in the `available` zfs property (as seen in `zfs list`, `df`, etc). Expansion does not change the number of failures that can be tolerated without data loss (e.g. a RAIDZ2 is still a RAIDZ2 even after expansion). A RAIDZ vdev can be expanded multiple times. After the expansion completes, old blocks remain with their old data-to-parity ratio (e.g. 5-wide RAIDZ2, has 3 data to 2 parity), but distributed among the larger set of disks. New blocks will be written with the new data-to-parity ratio (e.g. a 5-wide RAIDZ2 which has been expanded once to 6-wide, has 4 data to 2 parity). However, the RAIDZ vdev's "assumed parity ratio" does not change, so slightly less space than is expected may be reported for newly-written blocks, according to `zfs list`, `df`, `ls -s`, and similar tools. Sponsored-by: The FreeBSD Foundation Sponsored-by: iXsystems, Inc. Sponsored-by: vStack Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Authored-by: Matthew Ahrens <[email protected]> Contributions-by: Fedor Uporov <[email protected]> Contributions-by: Stuart Maybee <[email protected]> Contributions-by: Thorsten Behrens <[email protected]> Contributions-by: Fmstrat <[email protected]> Contributions-by: Don Brady <[email protected]> Signed-off-by: Don Brady <[email protected]> Closes #15022
* zed: misc vdev_enc_sysfs_path fixesTony Hutter2023-11-071-0/+7
| | | | | | | | | | | | | | | | | | There have been rare cases where the VDEV_ENC_SYSFS_PATH value that zed gets passed is stale. To mitigate this, dynamically check the sysfs path at the time of zed event processing, and use the dynamic value if possible. Note that there will be other times when we can not dynamically detect the sysfs path (like if a disk disappears) and have to rely on the old value for things like turning on the fault LED. That is to say, we can't just blindly use the dynamic path in every case. Also: - Add enclosure sysfs entry when running 'zpool add' - Fix 'slot' and 'enc' zpool.d scripts for nvme Reviewed-by: Don Brady <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #15462
* zvol: Implement zvol threading as a PropertyAmeer Hamza2023-10-311-1/+2
| | | | | | | | | | | | Currently, zvol threading can be switched through the zvol_request_sync module parameter system-wide. By making it a zvol property, zvol threading can be switched per zvol. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #15409
* Add prefetch property Brian Behlendorf2023-10-241-1/+2
| | | | | | | | | | | | | | | | | | | | | | ZFS prefetch is currently governed by the zfs_prefetch_disable tunable. However, this is a module-wide settings - if a specific dataset benefits from prefetch, while others have issue with it, an optimal solution does not exists. This commit introduce the "prefetch" tri-state property, which enable granular control (at dataset/volume level) for prefetching. This patch does not remove the zfs_prefetch_disable, which remains a system-wide switch for enable/disable prefetch. However, to avoid duplication, it would be preferable to deprecate and then remove the module tunable. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Ameer Hamza <[email protected]> Signed-off-by: Gionatan Danti <[email protected]> Co-authored-by: Gionatan Danti <[email protected]> Closes #15237 Closes #15436
* Add '-u' - nomount flag for zfs setUmer Saleem2023-10-021-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | This commit adds '-u' flag for zfs set operation. With this flag, mountpoint, sharenfs and sharesmb properties can be updated without actually mounting or sharing the dataset. Previously, if dataset was unmounted, and mountpoint property was updated, dataset was not mounted after the update. This behavior is changed in #15240. We mount the dataset whenever mountpoint property is updated, regardless if it's mounted or not. To provide the user with option to keep the dataset unmounted and still update the mountpoint without mounting the dataset, '-u' flag can be used. If any of mountpoint, sharenfs or sharesmb properties are updated with '-u' flag, the property is set to desired value but the operation to (re/un)mount and/or (re/un)share the dataset is not performed and dataset remains as it was before. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #15322
* Add zfs_prepare_disk script for disk firmware installTony Hutter2023-09-211-0/+4
| | | | | | | | | | Have libzfs call a special `zfs_prepare_disk` script before a disk is included into the pool. The user can edit this script to add things like a disk firmware update or a disk health check. Use of the script is totally optional. See the zfs_prepare_disk manpage for full details. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #15243
* Increase limit of redaction list by using spill blockPaul Dagnelie2023-08-261-4/+5
| | | | | | | | | | | | | | | | | | | | Currently redaction bookmarks and their associated redaction lists have a relatively low limit of 36 redaction snapshots. This is imposed by the number of snapshot GUIDs that fit in the bonus buffer of the redaction list object. While this is more than enough for most use cases, there are some limited cases where larger numbers would be useful to support. We tweak the redaction list creation code to use a spill block if the number of redaction snapshots is above the amount that would fit in the bonus buffer. We also make a small change to allow spill blocks to be use for types of data besides SA. In order to fully leverage this logic, we also change the redaction code to use vmem_alloc, to handle extremely large allocations if needed. Finally, small tweaks were made to the zfs commands and the test suite. Reviewed-by: Matthew Ahrens <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes #15018
* Teach zpool scrub to scrub only blocks in error logGeorge Amanakis2023-05-181-1/+2
| | | | | | | | | | | | | | | | Added a flag '-e' in zpool scrub to scrub only blocks in error log. A user can pause, resume and cancel the error scrub by passing additional command line arguments -p -s just like a regular scrub. This involves adding a new flag, creating new libzfs interfaces, a new ioctl, and the actual iteration and read-issuing logic. Error scrubbing is executed in multiple txg to make sure pool performance is not affected. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Co-authored-by: TulsiJain [email protected] Signed-off-by: George Amanakis <[email protected]> Closes #8995 Closes #12355
* Add the ability to uninitializeBrian Behlendorf2023-05-181-1/+2
| | | | | | | | | | | | zpool initialize functions well for touching every free byte...once. But if we want to do it again, we're currently out of luck. So let's add zpool initialize -u to clear it. Co-authored-by: Rich Ercolani <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #12451 Closes #14873
* Add support for zpool user propertiesAllan Jude2023-04-211-61/+70
| | | | | | | | | | | | | | | | Usage: zpool set org.freebsd:comment="this is my pool" poolname Tests are based on zfs_set's user property tests. Also stop truncating property values at MAXNAMELEN, use ZFS_MAXPROPLEN. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Allan Jude <[email protected]> Signed-off-by: Mateusz Piotrowski <[email protected]> Sponsored-by: Beckhoff Automation GmbH & Co. KG. Sponsored-by: Klara Inc. Closes #11680
* Create zap for root vdevrob-wing2023-04-201-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And add it to the AVZ, this is not backwards compatible with older pools due to an assertion in spa_sync() that verifies the number of ZAPs of all vdevs matches the number of ZAPs in the AVZ. Granted, the assertion only applies to #DEBUG builds - still, a feature flag is introduced to avoid the assertion, com.klarasystems:vdev_zaps_v2 Notably, this allows to get/set properties on the root vdev: % zpool set user:prop=value <pool> root-0 Before this commit, it was already possible to get/set properties on top-level vdevs with the syntax <type>-<vdev_id> (e.g. mirror-0): % zpool set user:prop=value <pool> mirror-0 This syntax also applies to the root vdev as it is is of type 'root' with a vdev_id of 0, root-0. The keyword 'root' as an alias for 'root-0'. The following tests have been added: - zpool get all properties from root vdev - zpool set a property on root vdev - verify root vdev ZAP is created Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Wing <[email protected]> Sponsored-by: Seagate Technology Submitted-by: Klara, Inc. Closes #14405
* libzfs: add v2 iterator interfacesRob N2023-04-101-23/+79
| | | | | | | | | | | | | | | | | | | | | | | | | f6a0dac84 modified the zfs_iter_* functions to take a new "flags" parameter, and introduced a variety of flags to ask the kernel to limit the results in various ways, reducing the amount of work the caller needed to do to filter out things they didn't need. Unfortunately this change broke the ABI for existing clients (read: older versions of the `zfs` program), and was reverted 399b98198. dc95911d2 reintroduced the original patch, with the understanding that a backwards-compatible fix would be made before the 2.2 release branch was tagged. This commit is that fix. This introduces zfs_iter_*_v2 functions that have the new flags argument, and reverts the existing functions to not have the flags parameter, as they were before. The old functions are now reimplemented in terms of the new, with flags set to 0. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Wilson <[email protected]> Original-patch-by: George Wilson <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: Klara, Inc. Closes #14597
* Fix "Add colored output to zfs list"Tino Reichardt2023-04-051-0/+4
| | | | | | | | | | | Running `zfs list -o avail rpool` resulted in a core dump. This commit will fix this. Run the needed overhead only, when `use_color()` is true. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Wilson <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #14712
* nvpair: Constify string functionsRichard Yao2023-03-141-14/+20
| | | | | | | | | | | | | | After addressing coverity complaints involving `nvpair_name()`, the compiler started complaining about dropping const. This lead to a rabbit hole where not only `nvpair_name()` needed to be constified, but also `nvpair_value_string()`, `fnvpair_value_string()` and a few other static functions, plus variable pointers throughout the code. The result became a fairly big change, so it has been split out into its own patch. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14612
* zcommon: Refactor FPU state handling in fletcher4Attila Fülöp2023-03-141-8/+11
| | | | | | | | | | | | | | | | | | Currently calls to kfpu_begin() and kfpu_end() are split between the init() and fini() functions of the particular SIMD implementation. This was done in #14247 as an optimization measure for the ABD adapter. Unfortunately the split complicates FPU handling on platforms that use a local FPU state buffer, like Windows and macOS. To ease porting, we introduce a boolean struct member in fletcher_4_ops_t, indicating use of the FPU, and move the FPU state handling from the SIMD implementations to the call sites. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #14600
* ABI files: sync with new Ubuntu release in CIGeorge Melikov2023-03-101-1841/+5265
| | | | | | | | | | We may try to build ZFS inside container too, but let's just sync them for now. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Richard Yao <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #14605
* Implementation of block cloning for ZFSPawel Jakub Dawidek2023-03-101-5/+9
| | | | | | | | | | | | | | | Block Cloning allows to manually clone a file (or a subset of its blocks) into another (or the same) file by just creating additional references to the data blocks without copying the data itself. Those references are kept in the Block Reference Tables (BRTs). The whole design of block cloning is documented in module/zfs/brt.c. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Christian Schwarz <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Rich Ercolani <[email protected]> Signed-off-by: Pawel Jakub Dawidek <[email protected]> Closes #13392
* Configure zed's diagnosis engine with vdev propertiesrob-wing2023-01-231-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce four new vdev properties: checksum_n checksum_t io_n io_t These properties can be used for configuring the thresholds of zed's diagnosis engine and are interpeted as <N> events in T <seconds>. When this property is set to a non-default value on a top-level vdev, those thresholds will also apply to its leaf vdevs. This behavior can be overridden by explicitly setting the property on the leaf vdev. Note that, these properties do not persist across vdev replacement. For this reason, it is advisable to set the property on the top-level vdev instead of the leaf vdev. The default values for zed's diagnosis engine (10 events, 600 seconds) remains unchanged. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Rob Wing <[email protected]> Sponsored-by: Seagate Technology LLC Closes #13805
* Use setproctitle to report progress of zfs sendAmeer Hamza2023-01-171-0/+13
| | | | | | | | | | | | | | | | | This allows parsing of zfs send progress by checking the process title. Doing so requires some changes to the send code in libzfs_sendrecv.c; primarily these changes move some of the accounting around, to allow for the code to be verbose as normal, or set the process title. Unlike BSD, setproctitle() isn't standard in Linux; thus, borrowed it from libbsd with slight modifications. Authored-by: Sean Eric Fagan <[email protected]> Co-authored-by: Ryan Moeller <[email protected]> Co-authored-by: Ameer Hamza <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #14376
* zfs list: Allow more fields in ZFS_ITER_SIMPLE modeAllan Jude2022-12-131-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the fields to be listed and sorted by are constrained to those populated by dsl_dataset_fast_stat(), then zfs list is much faster, as it does not need to open each objset and reads its properties. A previous optimization by Pawel Dawidek (0cee24064a79f9c01fc4521543c37acea538405f) took advantage of this to make listing snapshot names sorted only by name much faster. However, it was limited to `-o name -s name`, this work extends this optimization to work with: - name - guid - createtxg - numclones - inconsistent - redacted - origin and could be further extended to any other properties supported by dsl_dataset_fast_stat() or similar, that do not require extra locking or reading from disk. This was committed before (9a9e2e343dfa2af28bf7910de77ae73aa006de62), but was reverted due to a regression when used with an older kernel. If the kernel does not populate zc->zc_objset_stats, we now fallback to getting the properties via the slower interface, to avoid problems with newer userland and older kernels. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Allan Jude <[email protected]> Closes #14110
* Allow to control failfastMariusz Zaborski2022-11-101-1/+2
| | | | | | | | | | | | | | | | | | | | | Linux defaults to setting "failfast" on BIOs, so that the OS will not retry IOs that fail, and instead report the error to ZFS. In some cases, such as errors reported by the HBA driver, not the device itself, we would wish to retry rather than generating vdev errors in ZFS. This new property allows that. This introduces a per vdev option to disable the failfast option. This also introduces a global module parameter to define the failfast mask value. Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Allan Jude <[email protected]> Signed-off-by: Allan Jude <[email protected]> Signed-off-by: Mariusz Zaborski <[email protected]> Sponsored-by: Seagate Technology LLC Submitted-by: Klara, Inc. Closes #14056
* Add membar_sync abi changeUmer Saleem2022-10-041-0/+1
| | | | | | | | | | It appears membar_sync was not present in libzfs.abi with other membar_* functions. This commit updates libzfs.abi for membar_sync. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13969
* Expose libzutil error info in libpc_handle_tUmer Saleem2022-10-041-4/+28
| | | | | | | | | | | | | | | | | | | | In libzutil, for zpool_search_import and zpool_find_config, we use libpc_handle_t internally, which does not maintain error code and it is not exposed in the interface. Due to this, the error information is not propagated to the caller. Instead, an error message is printed on stderr. This commit adds lpc_error field in libpc_handle_t and exposes it in the interface, which can be used by the users of libzutil to get the appropriate error information and handle it accordingly. Users of the API can also control if they want to print the error message on stderr. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13969
* zed: mark disks as REMOVED when they are removedAmeer Hamza2022-09-281-0/+6
| | | | | | | | | | | | | ZED does not take any action for disk removal events if there is no spare VDEV available. Added zpool_vdev_remove_wanted() in libzfs and vdev_remove_wanted() in vdev.c to remove the VDEV through ZED on removal event. This means that if you are running zed and remove a disk, it will be properly marked as REMOVED. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #13797
* Make zfs-share service resilient to stale exportsDon Brady2022-09-091-16/+39
| | | | | | | | | | | | | | | | The are a few cases where stale entries in /etc/exports.d/zfs.exports will cause the nfs-server service to fail when starting up. Since the nfs-server startup consumes /etc/exports.d/zfs.exports, the zfs-share service (which rebuilds the list of zfs exports) should run before the nfs-server service. To make the zfs-share service resilient to stale exports, this change truncates the zfs config file as part of the zfs share -a operation. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Don Brady <[email protected]> Closes #13775
* libzfs: Remove unused zpool_get_physpath()Ryan Moeller2022-08-041-7/+0
| | | | | | | | | | This is an oddly specific function that has never had any consumers in the history of this repo. Get rid of it and the pile of helper functions that exist for it. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #13724
* Add snapshots_changed as propertyUmer Saleem2022-08-021-1/+2
| | | | | | | | | | | | | | | | | | | Make dd_snap_cmtime property persistent across mount and unmount operations by storing in ZAP and restore the value from ZAP on hold into dd_snap_cmtime instead of updating it. Expose dd_snap_cmtime as 'snapshots_changed' property that provides a mechanism to quickly determine whether snapshot list for dataset has changed without having to mount a dataset or iterate the snapshot list. It specifies the time at which a snapshot for a dataset was last created or deleted. This allows us to be more efficient how often we query snapshots. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13635
* Add Linux namespace delegation supportWill Andrews2022-06-101-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows ZFS datasets to be delegated to a user/mount namespace Within that namespace, only the delegated datasets are visible Works very similarly to Zones/Jailes on other ZFS OSes As a user: ``` $ unshare -Um $ zfs list no datasets available $ echo $$ 1234 ``` As root: ``` # zfs list NAME ZONED MOUNTPOINT containers off /containers containers/host off /containers/host containers/host/child off /containers/host/child containers/host/child/gchild off /containers/host/child/gchild containers/unpriv on /unpriv containers/unpriv/child on /unpriv/child containers/unpriv/child/gchild on /unpriv/child/gchild # zfs zone /proc/1234/ns/user containers/unpriv ``` Back to the user namespace: ``` $ zfs list NAME USED AVAIL REFER MOUNTPOINT containers 129M 47.8G 24K /containers containers/unpriv 128M 47.8G 24K /unpriv containers/unpriv/child 128M 47.8G 128M /unpriv/child ``` Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Will Andrews <[email protected]> Signed-off-by: Allan Jude <[email protected]> Signed-off-by: Mateusz Piotrowski <[email protected]> Co-authored-by: Allan Jude <[email protected]> Co-authored-by: Mateusz Piotrowski <[email protected]> Sponsored-by: Buddy <https://buddy.works> Closes #12263
* Introduce BLAKE3 checksums as an OpenZFS featureTino Reichardt2022-06-081-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds BLAKE3 checksums to OpenZFS, it has similar performance to Edon-R, but without the caveats around the latter. Homepage of BLAKE3: https://github.com/BLAKE3-team/BLAKE3 Wikipedia: https://en.wikipedia.org/wiki/BLAKE_(hash_function)#BLAKE3 Short description of Wikipedia: BLAKE3 is a cryptographic hash function based on Bao and BLAKE2, created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox-O'Hearn. It was announced on January 9, 2020, at Real World Crypto. BLAKE3 is a single algorithm with many desirable features (parallelism, XOF, KDF, PRF and MAC), in contrast to BLAKE and BLAKE2, which are algorithm families with multiple variants. BLAKE3 has a binary tree structure, so it supports a practically unlimited degree of parallelism (both SIMD and multithreading) given enough input. The official Rust and C implementations are dual-licensed as public domain (CC0) and the Apache License. Along with adding the BLAKE3 hash into the OpenZFS infrastructure a new benchmarking file called chksum_bench was introduced. When read it reports the speed of the available checksum functions. On Linux: cat /proc/spl/kstat/zfs/chksum_bench On FreeBSD: sysctl kstat.zfs.misc.chksum_bench This is an example output of an i3-1005G1 test system with Debian 11: implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 1196 1602 1761 1749 1762 1759 1751 skein-generic 546 591 608 615 619 612 616 sha256-generic 240 300 316 314 304 285 276 sha512-generic 353 441 467 476 472 467 426 blake3-generic 308 313 313 313 312 313 312 blake3-sse2 402 1289 1423 1446 1432 1458 1413 blake3-sse41 427 1470 1625 1704 1679 1607 1629 blake3-avx2 428 1920 3095 3343 3356 3318 3204 blake3-avx512 473 2687 4905 5836 5844 5643 5374 Output on Debian 5.10.0-10-amd64 system: (Ryzen 7 5800X) implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 1840 2458 2665 2719 2711 2723 2693 skein-generic 870 966 996 992 1003 1005 1009 sha256-generic 415 442 453 455 457 457 457 sha512-generic 608 690 711 718 719 720 721 blake3-generic 301 313 311 309 309 310 310 blake3-sse2 343 1865 2124 2188 2180 2181 2186 blake3-sse41 364 2091 2396 2509 2463 2482 2488 blake3-avx2 365 2590 4399 4971 4915 4802 4764 Output on Debian 5.10.0-9-powerpc64le system: (POWER 9) implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 1213 1703 1889 1918 1957 1902 1907 skein-generic 434 492 520 522 511 525 525 sha256-generic 167 183 187 188 188 187 188 sha512-generic 186 216 222 221 225 224 224 blake3-generic 153 152 154 153 151 153 153 blake3-sse2 391 1170 1366 1406 1428 1426 1414 blake3-sse41 352 1049 1212 1174 1262 1258 1259 Output on Debian 5.10.0-11-arm64 system: (Pi400) implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 487 603 629 639 643 641 641 skein-generic 271 299 303 308 309 309 307 sha256-generic 117 127 128 130 130 129 130 sha512-generic 145 165 170 172 173 174 175 blake3-generic 81 29 71 89 89 89 89 blake3-sse2 112 323 368 379 380 371 374 blake3-sse41 101 315 357 368 369 364 360 Structurally, the new code is mainly split into these parts: - 1x cross platform generic c variant: blake3_generic.c - 4x assembly for X86-64 (SSE2, SSE4.1, AVX2, AVX512) - 2x assembly for ARMv8 (NEON converted from SSE2) - 2x assembly for PPC64-LE (POWER8 converted from SSE2) - one file for switching between the implementations Note the PPC64 assembly requires the VSX instruction set and the kfpu_begin() / kfpu_end() calls on PowerPC were updated accordingly. Reviewed-by: Felix Dörre <[email protected]> Reviewed-by: Ahelenia Ziemiańska <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Co-authored-by: Rich Ercolani <[email protected]> Closes #10058 Closes #12918
* libzfs: return (allocated) strings instead of filling buffersнаб2022-05-181-1384/+1380
| | | | | | | | | This also expands the zfs version output from 127 characters to However Many Are Actually Set Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13330
* libzfs: constify zfs_strip_partition(), zfs_strip_path()наб2022-05-161-3/+3
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13413
* Replace libzfs sharing _nfs() and _smb() APIs with protocol listsнаб2022-05-121-99/+90
| | | | | | | | | | With the additional benefit of removing all the _all() functions and treating a NULL list as "all" ‒ the remaining all function is for all /datasets/, which is consistent with the rest of the API Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13165
* Improve zpool status output, list all affected datasetsGeorge Amanakis2022-04-251-5/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, determining which datasets are affected by corruption is a manual process. The primary difficulty in reporting the list of affected snapshots is that since the error was initially found, the snapshot where the error originally occurred in, may have been deleted. To solve this issue, we add the ID of the head dataset of the original snapshot which the error was detected in, to the stored error report. Then any time a filesystem is deleted, the errors associated with it are deleted as well. Any time a clone promote occurs, we modify reports associated with the original head to refer to the new head. The stored error reports are identified by this head ID, the birth time of the block which the error occurred in, as well as some information about the error itself are also stored. Once this information is stored, we can find the set of datasets affected by an error by walking back the list of snapshots in the given head until we find one with the appropriate birth txg, and then traverse through the snapshots of the clone family, terminating a branch if the block was replaced in a given snapshot. Then we report this information back to libzfs, and to the zpool status command, where it is displayed as follows: pool: test state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A scan: scrub repaired 0B in 00:00:00 with 800 errors on Fri Dec 3 08:27:57 2021 config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 sdb ONLINE 0 0 1.58K errors: Permanent errors have been detected in the following files: test@1:/test.0.0 /test/test.0.0 /test/1clone/test.0.0 A new feature flag is introduced to mark the presence of this change, as well as promotion and backwards compatibility logic. This is an updated version of #9175. Rebase required fixing the tests, updating the ABI of libzfs, updating the man pages, fixing bugs, fixing the error returns, and updating the old on-disk error logs to the new format when activating the feature. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Co-authored-by: TulsiJain <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #9175 Closes #12812
* Flex non-pretty-printed properties and raw-/pretty-print remaining onesнаб2022-03-041-48/+37
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before: nabijaczleweli@tarta:~/store/code/zfs$ /sbin/zpool list -Td -o name,size,alloc,free,ckpoint,expandsz,guid,load_guid,frag,cap,dedup,health,altroot,guid,dedupditto,load_guid,maxblocksize,maxdnodesize 2>/dev/null Sun 20 Feb 03:57:44 CET 2022 NAME SIZE ALLOC FREE CKPOINT EXPANDSZ GUID LOAD_GUID FRAG CAP DEDUP HEALTH ALTROOT GUID DEDUPDITTO LOAD_GUID MAXBLOCKSIZE MAXDNODESIZE filling 25.5T 6.52T 18.9T - 64M 11512889483096932869 11656109927366648364 1% 25% 1.00x ONLINE - 11512889483096932869 0 11656109927366648364 1048576 16384 tarta-boot 240M 50.6M 189M - - 2372068846917849656 7752280792179633787 12% 21% 1.00x ONLINE - 2372068846917849656 0 7752280792179633787 1048576 512 tarta-zoot 55.5G 6.42G 49.1G - - 12971868889665384604 8622632123393589527 17% 11% 1.00x ONLINE - 12971868889665384604 0 8622632123393589527 1048576 16384 nabijaczleweli@tarta:~/store/code/zfs$ /sbin/zfs list -o name,guid,keyguid,ivsetguid,createtxg,objsetid,pbkdf2iters,refratio -r tarta-zoot NAME GUID KEYGUID IVSETGUID CREATETXG OBJSETID PBKDF2ITERS REFRATIO tarta-zoot 1110930838977259561 659P - 1 54 0 1.03x tarta-zoot/PAGEFILE.SYS 2202570496672997800 3.20E - 2163 1539 0 1.07x tarta-zoot/dupa 16941280502417785695 9.81E - 2274707 1322 1000000000000 1.00x tarta-zoot/etc 17029963068508333530 12.9E - 3663 1087 0 1.52x tarta-zoot/home 3508163802370032575 8.50E - 3664 294 0 1.00x tarta-zoot/home/misio 7283672744014848555 13.0E - 3665 302 0 2.28x tarta-zoot/home/nabijaczleweli 12286744508078616303 5.15E - 3666 200 0 2.05x tarta-zoot/home/nabijaczleweli/tftp 13551632689932817643 5.16E - 3667 1095 0 1.00x tarta-zoot/home/root 5203106193060067946 15.4E - 3668 698 0 2.86x tarta-zoot/home/shared-config 8866040021005142194 14.5E - 3670 2069 0 1.20x tarta-zoot/home/tymek 9472751824283011822 4.56E - 3671 1202 0 1.32x tarta-zoot/oldboot 10460192444135730377 13.8E - 2268398 1232 0 1.01x tarta-zoot/opt 9945621324983170410 5.84E - 3672 1210 0 1.00x tarta-zoot/opt/icecc 13178238931846132425 9.04E - 3673 1103 0 2.83x tarta-zoot/opt/swtpm 10172962421514870859 4.13E - 825669 145132 0 1.87x tarta-zoot/srv 217179989022738337 3.90E - 3674 2469 0 1.00x tarta-zoot/usr 12214213243060765090 15.0E - 3675 2477 0 2.58x tarta-zoot/usr/local 7542700368693813134 941P - 3676 2484 0 2.33x tarta-zoot/var 13414177124447929530 10.2E - 3677 2492 0 1.57x tarta-zoot/var/lib 6969944550407159241 5.28E - 3678 2499 0 2.34x tarta-zoot/var/tmp 6399468088048343912 1.34E - 3679 1218 0 3.95x After: nabijaczleweli@tarta:~/store/code/zfs$ cmd/zpool/zpool list -Td -o name,size,alloc,free,ckpoint,expandsz,guid,load_guid,frag,cap,dedup,health,altroot,guid,dedupditto,load_guid,maxblocksize,maxdnodesize 2>/dev/null Sun 20 Feb 03:57:42 CET 2022 NAME SIZE ALLOC FREE CKPOINT EXPANDSZ GUID LOAD_GUID FRAG CAP DEDUP HEALTH ALTROOT GUID DEDUPDITTO LOAD_GUID MAXBLOCKSIZE MAXDNODESIZE filling 25.5T 6.52T 18.9T - 64M 11512889483096932869 11656109927366648364 1% 25% 1.00x ONLINE - 11512889483096932869 0 11656109927366648364 1M 16K tarta-boot 240M 50.6M 189M - - 2372068846917849656 7752280792179633787 12% 21% 1.00x ONLINE - 2372068846917849656 0 7752280792179633787 1M 512 tarta-zoot 55.5G 6.42G 49.1G - - 12971868889665384604 8622632123393589527 17% 11% 1.00x ONLINE - 12971868889665384604 0 8622632123393589527 1M 16K nabijaczleweli@tarta:~/store/code/zfs$ cmd/zfs/zfs list -o name,guid,keyguid,ivsetguid,createtxg,objsetid,pbkdf2iters,refratio -r tarta-zoot NAME GUID KEYGUID IVSETGUID CREATETXG OBJSETID PBKDF2ITERS REFRATIO tarta-zoot 1110930838977259561 741529699813639505 - 1 54 0 1.03x tarta-zoot/PAGEFILE.SYS 2202570496672997800 3689529982640017884 - 2163 1539 0 1.07x tarta-zoot/dupa 16941280502417785695 11312442953423259518 - 2274707 1322 1000000000000 1.00x tarta-zoot/etc 17029963068508333530 14852574366795347233 - 3663 1087 0 1.52x tarta-zoot/home 3508163802370032575 9802810070759776956 - 3664 294 0 1.00x tarta-zoot/home/misio 7283672744014848555 14983161489316798151 - 3665 302 0 2.28x tarta-zoot/home/nabijaczleweli 12286744508078616303 5937870537299886218 - 3666 200 0 2.05x tarta-zoot/home/nabijaczleweli/tftp 13551632689932817643 5950522828900813054 - 3667 1095 0 1.00x tarta-zoot/home/root 5203106193060067946 17718025091255443518 - 3668 698 0 2.86x tarta-zoot/home/shared-config 8866040021005142194 16716354482778968577 - 3670 2069 0 1.20x tarta-zoot/home/tymek 9472751824283011822 5251854710505749954 - 3671 1202 0 1.32x tarta-zoot/oldboot 10460192444135730377 15894065034622168157 - 2268398 1232 0 1.01x tarta-zoot/opt 9945621324983170410 6737735639539098405 - 3672 1210 0 1.00x tarta-zoot/opt/icecc 13178238931846132425 10425145983015238428 - 3673 1103 0 2.83x tarta-zoot/opt/swtpm 10172962421514870859 4764783754852521469 - 825669 145132 0 1.87x tarta-zoot/srv 217179989022738337 4492810461439647259 - 3674 2469 0 1.00x tarta-zoot/usr 12214213243060765090 17306702395865262834 - 3675 2477 0 2.58x tarta-zoot/usr/local 7542700368693813134 1059954157997659784 - 3676 2484 0 2.33x tarta-zoot/var 13414177124447929530 11764397504176937123 - 3677 2492 0 1.57x tarta-zoot/var/lib 6969944550407159241 6084753728494937404 - 3678 2499 0 2.34x tarta-zoot/var/tmp 6399468088048343912 1548692824635344277 - 3679 1218 0 3.95x Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13122 Closes #13125
* log xattr=sa create/remove/update to ZILJitendra Patidar2022-02-221-4/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As such, there are no specific synchronous semantics defined for the xattrs. But for xattr=on, it does log to ZIL and zil_commit() is done, if sync=always is set on dataset. This provides sync semantics for xattr=on with sync=always set on dataset. For the xattr=sa implementation, it doesn't log to ZIL, so, even with sync=always, xattrs are not guaranteed to be synced before xattr call returns to caller. So, xattr can be lost if system crash happens, before txg carrying xattr transaction is synced. This change adds xattr=sa logging to ZIL on xattr create/remove/update and xattrs are synced to ZIL (zil_commit() done) for sync=always. This makes xattr=sa behavior similar to xattr=on. Implementation notes: The actual logging is fairly straight-forward and does not warrant additional explanation. However, it has been 14 years since we last added new TX types to the ZIL [1], hence this is the first time we do it after the introduction of zpool features. Therefore, here is an overview of the feature activation and deactivation workflow: 1. The feature must be enabled. Otherwise, we don't log the new record type. This ensures compatibility with older software. 2. The feature is activated per-dataset, since the ZIL is per-dataset. 3. If the feature is enabled and dataset is not for zvol, any append to the ZIL chain will activate the feature for the dataset. Likewise for starting a new ZIL chain. 4. A dataset that doesn't have a ZIL chain has the feature deactivated. We ensure (3) by activating on the first zil_commit() after the feature was enabled. Since activating the features requires waiting for txg sync, the first zil_commit() after enabling the feature will be slower than usual. The downside is that this is really a conservative approximation: even if we never append a 'TX_SETSAXATTR' to the ZIL chain, we pay the penalty for feature activation. The upside is that the user is in control of when we pay the penalty, i.e., upon enabling the feature. We ensure (4) by hooking into zil_sync(), where ZIL destroy actually happens. One more piece on feature activation, since it's spread across multiple functions: zil_commit() zil_process_commit_list() if lwb == NULL // first zil_commit since zil_open zil_create() if no log block pointer in ZIL header: if feature enabled and not active: // CASE 1 enable, COALESCE txg wait with dmu_tx that allocated the log block else // log block was allocated earlier than this zil_open if feature enabled and not active: // CASE 2 enable, EXPLICIT txg wait else // already have an in-DRAM LWB if feature enabled and not active: // this happens when we enable the feature after zil_create // CASE 3 enable, EXPLICIT txg wait [1] https://github.com/illumos/illumos-gate/commit/da6c28aaf62fa55f0fdb8004aa40f88f23bf53f0 Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Christian Schwarz <[email protected]> Reviewed-by: Ahelenia Ziemiańska <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Jitendra Patidar <[email protected]> Closes #8768 Closes #9078
* Add `--enable-asan` and `--enable-ubsan` switchesDamian Szuberski2022-02-031-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | `configure` now accepts `--enable-asan` and `--enable-ubsan` switches which results in passing `-fsanitize=address` and `-fsanitize=undefined`, respectively, to the compiler. Those flags are enabled in GitHub workflows for ZTS and zloop. Errors reported by both instrumentations are corrected, except for: - Memory leak reporting is (temporarily) suppressed. The cost of fixing them is relatively high compared to the gains. - Checksum computing functions in `module/zcommon/zfs_fletcher*` have UBSan errors suppressed. It is completely impractical to enforce 64-byte payload alignment there due to performance impact. - There's no ASan heap poisoning in `module/zstd/lib/zstd.c`. A custom memory allocator is used there rendering that measure unfeasible. - Memory leaks detection has to be suppressed for `cmd/zvol_id`. `zvol_id` is run by udev with the help of `ptrace(2)`. Tracing is incompatible with memory leaks detection. Reviewed-by: Ahelenia Ziemiańska <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: szubersk <[email protected]> Closes #12928
* libefi: remove efi_type()наб2022-01-181-5/+0
| | | | | | | | | | | All it is right now is some #if 0ed Solaris code that returns ENOSYS, and is only applicable for the Solaris blockdev layer. In the Illumos gate, there's a single user: rmformat(1); I recommend a read of the manual as a blast from the past, but, well Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Issue #12844 Closes #12969
* module/*.ko: prune .data, global .rodataнаб2022-01-141-94/+92
| | | | | | | | | | | | Evaluated every variable that lives in .data (and globals in .rodata) in the kernel modules, and constified/eliminated/localised them appropriately. This means that all read-only data is now actually read-only data, and, if possible, at file scope. A lot of previously- global-symbols became inlinable (and inlined!) constants. Probably not in a big Wowee Performance Moment, but hey. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12899
* Revert "zfs list: Allow more fields in ZFS_ITER_SIMPLE mode"Paul Dagnelie2022-01-061-7/+1
| | | | | | | | | This reverts commit f6a0dac84af2fba9c306a3a307ea7aafcbe32d2b. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes #12938
* zcommon: pre-iterate over sysfs instead of statting every featureнаб2021-12-161-0/+16
| | | | | | | | | | | | | | | | | | | | | If sufficient memory (<2K, realistically) is available, libzfs_init() can be significantly shorted by iterating over the correct sysfs directory before registrations, we can turn 168 stats into 15/18 syscalls (3 opens (6 if built in), 3 fstats, 6 getdentses, and 3 closes), a tenfoldish reduction; this is probably a bit faster, too. The list is always optional, and registration functions (and one-off users) can simply pass NULL, which will fall back to the previous mechanism Also, don't allocate in zfs_mod_supported_impl, and use use access() instead of stat(), since existence is really what we care about Also, fix pre-prop-checking compat in fallback for built-in ZFS Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12089
* zfs list: Allow more fields in ZFS_ITER_SIMPLE modeAllan Jude2021-12-161-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the fields to be listed and sorted by are constrained to those populated by dsl_dataset_fast_stat(), then zfs list is much faster, as it does not need to open each objset and reads its properties. A previous optimization by Pawel Dawidek (0cee24064a79f9c01fc4521543c37acea538405f) took advantage of this to make listing snapshot names sorted only by name much faster. However, it was limited to `-o name -s name`, this work extends this optimization to work with: - name - guid - createtxg - numclones - inconsistent - redacted - origin and could be further extended to any other properties supported by dsl_dataset_fast_stat() or similar, that do not require extra locking or reading from disk. Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Pawel Jakub Dawidek <[email protected]> Signed-off-by: Allan Jude <[email protected]> Closes #11080
* Add `const` to nvlist functions to properly expose their real behaviorPaul Dagnelie2021-12-061-84/+87
| | | | | | | | Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes #12728
* Vdev Properties FeatureAllan Jude2021-11-301-1/+198
| | | | | | | | | | | | | | | | | | | | | | | | Add properties, similar to pool properties, to each vdev. This makes use of the existing per-vdev ZAP that was added as part of device evacuation/removal. A large number of read-only properties are exposed, many of the members of struct vdev_t, that provide useful statistics. Adds support for read-only "removing" vdev property. Adds the "allocating" property that defaults to "on" and can be set to "off" to prevent future allocations from that top-level vdev. Supports user-defined vdev properties. Includes support for properties.vdev in SYSFS. Co-authored-by: Allan Jude <[email protected]> Co-authored-by: Mark Maybee <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Signed-off-by: Allan Jude <[email protected]> Closes #11711
* Upgrade to libabigail 2.0.0Dimitri John Ledkov2021-11-091-2529/+2216
| | | | | | | | | Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Dimitri John Ledkov <[email protected]> Closes #12722 Closes #12739
* Simplify and document OpenZFS library dependenciesBrian Behlendorf2021-10-071-17/+1786
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For those not already familiar with the code base it can be a challenge to understand how the libraries are laid out. This has sometimes resulted in functionality being added in the wrong place. To help avoid that in the future this commit documents the high-level dependencies for easy reference in lib/Makefile.am. It also simplifies a few things. - Switched libzpool dependency on libzfs_core to libzutil. This change makes it clear libzpool should never depend on the ioctl() functionality provided by libzfs_core. - Moved zfs_ioctl_fd() from libzutil to libzfs_core and renamed it lzc_ioctl_fd(). Normal access to the kmods should all be funneled through the libzfs_core library. The sole exception is the pool_active() which was updated to not use lzc_ioctl_fd() to remove the libzfs_core dependency. - Removed libzfs_core dependency on libzutil. - Removed the lib/libzfs/os/freebsd/libzfs_ioctl_compat.c source file which was all dead code. - Removed libzfs_core dependency from mkbusy and ctime test utilities. It was only needed for some trivial wrapper functions and that code is easy to replicate to shed the unneeded dependency. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Don Brady <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12602
* Upstream: unmount snapshots before destroying them on macOSJorgen Lundman2021-09-201-0/+6
| | | | | | | | | | | | | Add function zfs_destroy_snaps_nvl_os() call. The main issue is that macOS needs to unmount any mounted snapshots before they can be destroyed. Other platforms can handle this in the kernel, but sending a storm of zed events to unmount seems undesirable when we can do it in userland to start with. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Jorgen Lundman <[email protected]> Co-authored-by: ilovezfs <[email protected]> Closes #12550
* Update ABI files via new libabigail versionGeorge Melikov2021-09-021-3960/+2703
| | | | | | | Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #12529
* Add zpool_disable_datasets_os() / zfs_unmount_os()Jorgen Lundman2021-08-311-0/+17
| | | | | | | | | | | | | zpool_disable_datasets_os(): macOS needs to do a bunch of work to kick everything off zvols. zfs_unmount_os(): This allows us to unmount any zvols that may be mounted. Like with zfs destroy foo/vol Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Kennedy <[email protected]> Signed-off-by: Jorgen Lundman <[email protected]> Closes #12436