Commit message | Author | Age | Files | Lines
* ZTS: Improve cleanup in zpool tests | Ryan Moeller | 2021-03-07 | 2 | -33/+43
  * Restore original kern.corefile value after the test.
  * Don't leave behind a frozen pool.
  * Clean up leftover vdev files.
  * Make zpool_002_pos and zpool_003_pos consistent in their handling of core files while here.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #11694
* Clarify compressed zfs send/recv behavior | manfromafar | 2021-03-07 | 2 | -1/+16
  Docs for send and receive do not explain the behavior when sending a compressed stream and then receiving it on a host that overrides compression with -o compress=value. The data from the send stream is written as it was sent, in its compressed form, but the compression algorithm set on the receiver is the overridden value, which causes some confusion as to which algorithm was actually used. Updated the man docs to clarify the behavior.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed By: Allan Jude <[email protected]>
  Signed-off-by: manfromafar <[email protected]>
  Closes #11690
* Intentionally allow ZFS_READONLY in zfs_write | Ryan Moeller | 2021-03-07 | 2 | -7/+25
  ZFS_READONLY represents the "DOS R/O" attribute. When that flag is set, we should behave as if write access were not granted by anything in the ACL. In particular: We _must_ allow writes after opening the file r/w, then setting the DOS R/O attribute, and writing some more. (Similar to how you can write after fchmod(fd, 0444).)
  Restore these semantics which were lost on FreeBSD when refactoring zfs_write. To my knowledge Linux does not actually expose this flag, but we'll need it to eventually so I've added the supporting checks.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #11693
* Suppress cppcheck invalidSyntax warnings | Brian Behlendorf | 2021-03-05 | 2 | -0/+3
  For some reason cppcheck 1.90 is generating an invalidSyntax warning when the BF64_SET macro is used in the zstream source. The same warning is not reported by cppcheck 2.3, nor is there any evident problem with the expanded macro. This appears to be an issue with this version of cppcheck. This commit annotates the source to suppress the warning.
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #11700
* Initialize ZIL buffers | Brian Behlendorf | 2021-03-05 | 1 | -0/+1
  When populating a ZIL destination buffer ensure it is always zeroed before its contents are constructed.
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Tom Caputi <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #11687
* Fix abd_get_offset_struct() may allocate new abd | Jorgen Lundman | 2021-03-05 | 1 | -1/+5
  Even when an abd is supplied to abd_get_offset_struct(), the call to abd_get_offset_impl() can allocate a different abd. Make sure to call abd_fini_struct() on the abd that is not used.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Jorgen Lundman <[email protected]>
  Closes #11683
* FreeBSD module --enable-debug --enable-invariants | Ryan Moeller | 2021-03-05 | 3 | -0/+42
  Wire up the --enable-debug flag for configure to the FreeBSD module build.
  Add --enable-invariants. The running FreeBSD kernel config is used to detect whether to enable INVARIANTS if not explicitly specified with --enable-invariants or --disable-invariants.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #11678
* zpool: use tab to indent continuation from removal status | Thomas Lamprecht | 2021-03-05 | 1 | -4/+4
  Bring the output of the removal status in line with the other "fields" that zpool status outputs, which allows a parser to more easily detect this as a continuation of the 'remove:' output.
  Before:
    remove: Removal of vdev 0 copied 282G in 0h9m, completed on [...]
        776K memory used for removed device mappings
  Now:
    remove: Removal of vdev 0 copied 282G in 0h9m, completed on [...]
    	776K memory used for removed device mappings
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Thomas Lamprecht <[email protected]>
  Closes #11674
* Don't bomb out when using keylocation=file:// | James Wah | 2021-03-03 | 1 | -3/+7
  Avoid following the error path when the operation in fact succeeded.
  Reviewed-by: Ryan Moeller <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: James Wah <[email protected]>
  Closes #11651
* linux: zvol: avoid heap allocation for zvol_request_sync=1 | Christian Schwarz | 2021-03-03 | 1 | -29/+64
  The spl_kmem_alloc showed up in some flamegraphs in a single-threaded 4k sync write workload at 85k IOPS on an Intel(R) Xeon(R) Silver 4215 CPU @ 2.50GHz. Certainly not a huge win but I believe the change is clean and easy to maintain down the road.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Matthew Ahrens <[email protected]>
  Signed-off-by: Christian Schwarz <[email protected]>
  Closes #11666
* Add "zstd-fast" to help options for "compression" property | Jake Howard | 2021-03-03 | 1 | -1/+1
  This value does work as expected, and is documented in the manpage.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Jake Howard <[email protected]>
  Closes #11670
* Cancel TRIM / initialize on FAULTED non-writeable vdevs | nssrikanth | 2021-03-02 | 7 | -6/+144
  When a device which is actively trimming or initializing becomes FAULTED, and therefore no longer writable, cancel the active TRIM or initialization. When the device is merely taken offline with `zpool offline` then stop the operation but do not cancel it. When the device is brought back online the operation will be resumed if possible.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Co-authored-by: Brian Behlendorf <[email protected]>
  Co-authored-by: Vipin Kumar Verma <[email protected]>
  Signed-off-by: Srikanth N S <[email protected]>
  Closes #11588
* Fix assert in FreeBSD-specific dmu_read_pages | Andriy Gapon | 2021-02-27 | 1 | -1/+1
  The function has three similar pieces of code: for read-behind pages, requested pages and read-ahead pages. All three pieces had an assert to ensure that the page is not mapped. Later the assert was relaxed to require that the page is not mapped for writing. But that was done in two places out of three. This change fixes the third piece, read-ahead.
  Reviewed-by: Ryan Moeller <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Andriy Gapon <[email protected]>
  Closes #11654
* ZTS: zpool_trim_start_and_cancel_pos.ksh | Brian Behlendorf | 2021-02-27 | 1 | -11/+11
  Several of the TRIM tests were based off the initialize tests and then adapted for TRIM. The zpool_trim_start_and_cancel_pos.ksh test was intended to be one such test but it was overlooked and never actually adapted. Update it accordingly.
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Ryan Moeller <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #11649
* Add missing checks for unsupported features | Martin Matuška | 2021-02-27 | 4 | -0/+9
  After 35ec517 it has become possible to import ZFS pools with an active org.illumos:edonr feature on FreeBSD, leading to a panic. In addition, "zpool status" reported all pools without edonr as upgradable and "zpool upgrade -v" reported edonr in the list of upgradable features.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Ryan Moeller <[email protected]>
  Signed-off-by: Martin Matuska <[email protected]>
  Closes #11653
* Linux 5.12 compat: replace bio_*_io_acct with disk_*_io_acct | Coleman Kane | 2021-02-24 | 2 | -24/+53
  The bio_*_io_acct functions became GPL exports, which causes the kernel modules to refuse to compile. This replaces the code with calls to the disk_*_io_acct interfaces, which are not GPL exports. This change was added in kernel commit 99dfc43ecbf67f12a06512918aaba61d55863efc.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #11639
* Linux 5.12 compat: bio->bi_disk member moved | Coleman Kane | 2021-02-24 | 3 | -0/+37
  The struct bio member bi_disk was moved underneath a new member named bi_bdev. So all attempts to reference bio->bi_disk need to now become bio->bi_bdev->bd_disk.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #11639
* Fix vdev_rebuild_thread deadlock | Brian Behlendorf | 2021-02-24 | 1 | -1/+1
  The metaslab_disable() call may block waiting for a txg sync. Therefore it's important that vdev_rebuild_thread release the SCL_CONFIG read lock it is holding before this call. Failure to do so can result in the txg_sync thread getting blocked waiting for this lock which results in a deadlock.
  Reviewed-by: Mark Maybee <[email protected]>
  Reviewed-by: Srikanth N S <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #11647
* Fix overly broad locking in spa_vdev_config_exit() | Brian Behlendorf | 2021-02-24 | 1 | -2/+2
  Calling vdev_free() only requires that we acquire the spa config SCL_STATE_ALL locks, not the SCL_ALL locks. In particular, we need to avoid taking the SCL_CONFIG lock (included in SCL_ALL) as a writer since this can lead to a deadlock. The txg_sync_thread() may block in spa_txg_history_init_io() when taking the SCL_CONFIG lock as a reader when it detects there's a pending writer.
  Reviewed-by: Igor Kozhukhov <[email protected]>
  Reviewed-by: Mark Maybee <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #11585
* vdev_id: Fix partition regular expression | Tony Hutter | 2021-02-24 | 1 | -3/+9
  Given a DM device name, the old vdev_id script would extract any text after a 'p' as the partition number. It then appends "-part" + the partition number to the name, giving a by-vdev name like "L0-part5". This works fine if the DM name is like 'dm-2p5', but doesn't work if the DM name is a multipath name like "mpatha". In those cases it incorrectly matches the 'p' in "mpatha", giving by-vdev names like "L0-partatha".
  This patch fixes the issue by making the partition regex match stricter.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tony Hutter <[email protected]>
  Closes #11637
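The failure mode described in the vdev_id commit can be reproduced with a small sketch. Both patterns below are hypothetical and written only to illustrate the difference between a loose and a strict match; they are not the expressions used in the vdev_id script, and `part_suffix` is an illustrative helper name.

```python
import re

# Hypothetical patterns for illustration; not the ones used by vdev_id.
LOOSE = re.compile(r"p(.*)$")        # everything after the first 'p'
STRICT = re.compile(r"p([0-9]+)$")   # 'p' followed only by digits at the end


def part_suffix(dm_name, pattern):
    """Return "-part<N>" if the pattern finds a partition number, else ""."""
    match = pattern.search(dm_name)
    return "-part" + match.group(1) if match else ""


for name in ("dm-2p5", "mpatha"):
    print(name, "L0" + part_suffix(name, LOOSE), "L0" + part_suffix(name, STRICT))
```

With the loose pattern, "mpatha" yields the bogus suffix from the commit message, while the strict pattern finds no partition at all and leaves the name unchanged.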
* Linux: increase max nvlist_src size | Brian Behlendorf | 2021-02-24 | 2 | -2/+2
  On Linux increase the maximum allowed size of the src nvlist which can be passed to the /dev/zfs ioctl. Originally, this was set to a maximum of KMALLOC_MAX_SIZE (4M) because it was kmalloc'd. Since that time it's been converted to a vmalloc so that's no longer a hard limit, and it's desirable for `zfs send/recv` to allow larger nvlists so more snapshots can be sent at once.
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #6572
  Closes #11638
* Add upper bound for slop space calculation | Prakash Surya | 2021-02-24 | 2 | -14/+22
  This change modifies how we determine how much slop space to use in the pool, such that it now has an upper limit. The default upper limit is 128G, but is configurable via a tunable.
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Prakash Surya <[email protected]>
  Closes #11023
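As a rough sketch of what an upper bound on slop space means, the function below clamps a size-proportional reservation. Only the 128G default ceiling comes from the commit message; the shift of 5 (1/32 of the pool), the 128 MiB floor, and all names are assumptions made for illustration and may not match the actual implementation.

```python
MIB = 1 << 20
GIB = 1 << 30
TIB = 1 << 40


def slop_space(pool_size, slop_shift=5, min_slop=128 * MIB, max_slop=128 * GIB):
    """Clamp a proportional reservation between a floor and a ceiling.

    slop_shift and min_slop are assumed values; max_slop mirrors the
    128G default mentioned in the commit message.
    """
    proportional = pool_size >> slop_shift
    return min(max(proportional, min_slop), max_slop)


print(slop_space(1 * TIB) // GIB)    # proportional share, below the cap
print(slop_space(16 * TIB) // GIB)   # proportional share would exceed the cap
```

Under these assumptions a 16 TiB pool would reserve 512 GiB without the cap, which is the kind of over-reservation an upper limit avoids on large pools.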
* Wrap bare EINVAL returns with SET_ERROR | Ryan Moeller | 2021-02-24 | 1 | -2/+2
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #11636
* Force symlink creation for zpool.d compat links | Ryan Moeller | 2021-02-24 | 1 | -1/+1
  gmake install fails when zpool.d compat links already exist. Force the symlinks to be recreated if already present.
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #11633
* send_iterate_snap : doall send without fromsnap | Cedric Maunoury | 2021-02-24 | 10 | -2/+182
  The behavior of a NULL fromsnap was inadvertently changed for a doall send when the send/recv logic in libzfs was updated. Restore the previous behavior by correcting send_iterate_snap() to include all the snapshots in the nvlist for this case.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Cedric Maunoury <[email protected]>
  Closes #11608
* Fix error message when zfs modules are already unloaded | Adam D. Moss | 2021-02-20 | 1 | -1/+1
  Using zfs-sh -u on Linux will fail with an inaccurate message when the zfs modules are already unloaded. Deal with the case where a module is already unloaded; its USE_COUNT will be the empty string.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Adam Moss <[email protected]>
  Closes #11627
* vdev_ops: don't try to call vdev_op_hold or vdev_op_rele when NULL | fbynite | 2021-02-20 | 1 | -2/+2
  This prevents a panic after a SLOG add/removal on the root pool followed by a zpool scrub.
  When a SLOG is removed, a hole takes its place - the vdev_ops for a hole is vdev_hole_ops, which defines the handler functions of vdev_op_hold and vdev_op_rele as NULL.
  This bug has been reported in illumos and FreeBSD, though with a different trigger in the FreeBSD report.
  Credit for this patch goes to Patrick Mooney <[email protected]>
  Obtained from: illumos-gate commit: c65bd18728f34725
  External-issue: https://www.illumos.org/issues/12981
  External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252396
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Rob Wing <[email protected]>
  Closes #11623
* Better zfs_get_enclosure_sysfs_path() enclosure support | Tony Hutter | 2021-02-20 | 1 | -110/+142
  A multipathed disk will have several 'underlying' paths to the disk. For example, multipath disk 'dm-0' may be made up of paths: /dev/{sda,sdb,sdc,sdd}. On many enclosures those underlying sysfs paths will have a symlink back to their enclosure device entry (like 'enclosure_device0/slot1'). This is used by the statechange-led.sh script to set/clear the fault LED for a disk, and by 'zpool status -c'.
  However, on some enclosures, those underlying paths may not all have symlinks back to the enclosure device. Only two out of four of them might, for example.
  This patch updates zfs_get_enclosure_sysfs_path() to favor returning paths that have symlinks back to their enclosure devices, rather than just returning the first path.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tony Hutter <[email protected]>
  Closes #11617
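The selection policy described above (prefer an underlying path that links back to its enclosure, otherwise fall back to the first path) can be sketched as follows. This is an illustration of the policy only, not the C implementation; `pick_path` and the `has_enclosure_link` predicate are hypothetical names standing in for a sysfs symlink check.

```python
def pick_path(paths, has_enclosure_link):
    """Prefer the first path with an enclosure symlink; else the first path.

    has_enclosure_link is a hypothetical predicate standing in for a
    check of the path's sysfs entry.
    """
    for path in paths:
        if has_enclosure_link(path):
            return path
    return paths[0] if paths else None


# Example: only two of the four underlying paths carry the symlink.
linked = {"sdc", "sdd"}
print(pick_path(["sda", "sdb", "sdc", "sdd"], lambda p: p in linked))
```

Falling back to the first path keeps the earlier behavior for enclosures where no underlying path carries the symlink.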
* Cleaning up uio headers | Brian Atkinson | 2021-02-20 | 9 | -63/+40
  Make uio_impl.h the common header interface between Linux and FreeBSD so both OSes can share a common header file. This also helps reduce duplication of the zfs_uio_t code for each OS.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Brian Atkinson <[email protected]>
  Closes #11622
* ztest: propagate -o to the zdb child process | Christian Schwarz | 2021-02-19 | 1 | -23/+79
  I think this is the behavior that most users expect.
  Future work: have a separate flag, e.g., -O, to specify separate set_global_vars for the zdb child than for the ztest children.
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Pavel Zakharov <[email protected]>
  Signed-off-by: Christian Schwarz <[email protected]>
  Closes #11602
* ztest: fix -o by calling set_global_var in child processes | Christian Schwarz | 2021-02-19 | 1 | -2/+51
  Without set_global_var() in the child processes the -o option provides little use. Before this change set_global_var() was called as a side-effect of getopt processing which only happens for the parent ztest process.
  This change limits the set of options that can be set and makes them available to the child through ztest_shared_opts_t.
  Future work: support arbitrary option count and length.
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Pavel Zakharov <[email protected]>
  Signed-off-by: Christian Schwarz <[email protected]>
  Closes #11602
* libzpool: set_global_var: refactor to not modify 'arg' | Christian Schwarz | 2021-02-19 | 2 | -19/+55
  Also fixes leak of the dlopen handle in the error case.
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Pavel Zakharov <[email protected]>
  Signed-off-by: Christian Schwarz <[email protected]>
  Closes #11602
* libzpool: set_global_var: fix endianness handling (fixes zdb -o) | Christian Schwarz | 2021-02-19 | 1 | -1/+1
  Without this patch I get the error
    Setting global variables is only supported on little-endian systems
  when using `zdb -o` on my amd64 machine.
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Pavel Zakharov <[email protected]>
  Signed-off-by: Christian Schwarz <[email protected]>
  Closes #11602
* Restore FreeBSD resource usage accounting | Ryan Moeller | 2021-02-19 | 9 | -0/+139
  Add zfs_racct_* interfaces for platform-dependent read/write accounting.
  Reviewed-by: Alexander Motin <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #11613
* Checksum errors may not be counted | Don Brady | 2021-02-19 | 12 | -42/+214
  Fix regression seen in issue #11545 where checksum errors were not being counted or showing up in a zpool event.
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Don Brady <[email protected]>
  Closes #11609
* FreeBSD: disable the use of hardware crypto offload drivers for now | Mark Johnston | 2021-02-18 | 1 | -2/+13
  First, the crypto request completion handler contains a bug in that it fails to reset fs_done correctly after the request is completed. This is only a problem for asynchronous drivers.
  Second, some hardware drivers have input constraints which ZFS does not satisfy. For instance, ccp(4) apparently requires the AAD length for AES-GCM to be a multiple of the cipher block size, and with qat(4) the AES-GCM AAD length may not be longer than 240 bytes. FreeBSD's generic crypto framework doesn't have a mechanism to automatically fall back to a software implementation if a hardware driver cannot process a request, and ZFS does not tolerate such errors.
  The plan is to implement such a fallback mechanism, but with FreeBSD 13.0 approaching we should simply disable the use of hardware drivers for now.
  Reviewed-by: Ryan Moeller <[email protected]>
  Reviewed-by: Alexander Motin <[email protected]>
  Signed-off-by: Mark Johnston <[email protected]>
  Closes #11612
* Fix report_mount_progress never calling set_progress_header | Andriy Gapon | 2021-02-18 | 1 | -3/+0
  That happens because of an off-by-one mistake. share_mount_one_cb() calls report_mount_progress(current=sm_done) after having incremented sm_done by one. Then report_mount_progress() increments the parameter again. It appears that that logic became obsolete after commit a10d50f999511, parallel zfs mount.
  On FreeBSD I observe that zfs mount -a -v prints, for example,
    (null): (201/248)
  That happens because set_progress_header() is never called.
  With this change the output becomes correct:
    Mounting ZFS filesystems: (209/248)
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Andriy Gapon <[email protected]>
  Closes #11607
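The off-by-one can be modeled in a few lines. The sketch below is a simplified simulation, not the libzfs code: it assumes the header is set only when the reported counter equals 1, which the commit message implies but does not state, and `header_calls` is a hypothetical name.

```python
def header_calls(total, extra_increment):
    """Count how often the progress header would be set over `total` mounts."""
    calls = 0
    done = 0
    for _ in range(total):
        done += 1                  # caller increments before reporting
        current = done
        if extra_increment:
            current += 1           # redundant increment inside the reporter
        if current == 1:           # assumed condition for setting the header
            calls += 1
    return calls


print(header_calls(248, extra_increment=True))
print(header_calls(248, extra_increment=False))
```

With the extra increment the counter starts at 2, so the header condition never fires, which matches the "(null):" prefix seen in the output above.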
* Remove unused abd_alloc_scatter_offset_chunkcnt | Ryan Libby | 2021-02-17 | 1 | -19/+0
  Remove a function that became unused after the refactoring in e2af2acce3436acdb2b35fdc7c9de1a30ea85514.
  Reviewed-by: Ryan Moeller <[email protected]>
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Libby <[email protected]>
  Closes #11614
* Add "compatibility" property for zpool feature sets | Colm | 2021-02-17 | 48 | -2823/+5490
  Property to allow sets of features to be specified; for compatibility with specific versions / releases / external systems. Influences the behavior of 'zpool upgrade' and 'zpool create'. Initial man page changes and test cases included.
  Brief synopsis:
    zpool create -o compatibility=off|legacy|file[,file...] pool vdev...
    compatibility = off : disable compatibility mode (enable all features)
    compatibility = legacy : request that no features be enabled
    compatibility = file[,file...] : read features from specified files.
  Only features present in *all* files will be enabled on the resulting pool. Filenames may be absolute, or relative to /etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d (/etc checked first).
  Only affects zpool create, zpool upgrade and zpool status.
  ABI changes in libzfs:
    * New function "zpool_load_compat" to load and parse compat sets.
    * Add "zpool_compat_status_t" typedef for compatibility parse status.
    * Add ZPOOL_PROP_COMPATIBILITY to the pool properties enum
    * Add ZPOOL_STATUS_COMPATIBILITY_ERR to the pool status enum
  An initial set of base compatibility sets are included in cmd/zpool/compatibility.d, and the Makefile for cmd/zpool is modified to install these in $pkgdatadir/compatibility.d and to create symbolic links to a reasonable set of aliases.
  Reviewed-by: ericloewe
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Colm Buckley <[email protected]>
  Closes #11468
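The rule that only features present in all listed files are enabled is a set intersection. The sketch below works on in-memory lists rather than real compatibility files, and `enabled_features` is a hypothetical helper, not the libzfs zpool_load_compat function; the feature names are only sample data.

```python
def enabled_features(feature_lists):
    """Return the features common to every list (empty set if none given)."""
    sets = [set(features) for features in feature_lists]
    return set.intersection(*sets) if sets else set()


# Sample data standing in for the contents of two compatibility files.
file_a = ["async_destroy", "lz4_compress", "zstd_compress"]
file_b = ["async_destroy", "lz4_compress"]
print(sorted(enabled_features([file_a, file_b])))
```

Listing several files therefore narrows the feature set, which is what makes the property useful for pools shared between systems with different feature support.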
* FreeBSD: disable edonr in zfs_mod_supported_feature() | Brian Behlendorf | 2021-02-17 | 3 | -7/+16
  Rather than conditionally compiling out the edonr code for FreeBSD update zfs_mod_supported_feature() to indicate this feature is unsupported. This ensures that all spa features are defined on every platform, even if they are not supported.
  Reviewed-by: Ryan Moeller <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #11605
  Issue #11468
* Support uClibc for the tests compilations | José Luis Salvador Rufo | 2021-02-16 | 2 | -6/+4
  Two issues prevent ZFS from being compiled using uClibc: `backtrace()`, and `program_invocation_short_name` as a `const`.
  This patch adds uClibc to the conditionals in the same way as is already done for Glibc for `backtrace()`, and removes the external param `program_invocation_short_name` because it is only used here in the whole project.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: José Luis Salvador Rufo <[email protected]>
  Closes #11600
* Make inline ABD predicates compatible with C++ | Ryan Moeller | 2021-02-15 | 1 | -3/+3
  FreeBSD's zfsd fails to build after e2af2acce3 due to strict type checking errors from the implicit conversion between bool and boolean_t in the inline predicate definitions in abd.h.
  Use conditionals to return the correct value type from these functions.
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Igor Kozhukhov <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #11592
* Linux 5.11 compat: META | Brian Behlendorf | 2021-02-10 | 1 | -1/+1
  Increase the Linux-Maximum version in the META file to 5.11. All of the required compatibility patches have been merged.
  Reviewed-by: George Melikov <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #11586
* vdev_id: Support daisy-chained JBODs in multipath mode | Arshad Hussain | 2021-02-09 | 1 | -115/+283
  Within function sas_handler(), userspace commands like '/usr/sbin/multipath' have been replaced with sourcing device details from within sysfs, which removed a significant amount of overhead and processing time. Multiple JBOD enclosures and their order are sourced from the bsg driver (/sys/class/enclosure) to isolate chassis top-level expanders, which are then dynamically indexed based on the host channel of the multipath subordinate disk member device being processed. Additionally added a "mixed" mode for slot identification, for environments where a ZFS server system may contain SAS disk slots with no expander (direct connect to HBA) while an attached external JBOD with an expander uses a different slot identification method.

  How Has This Been Tested?
  ~~~~~~~~~~~~~~~~~~~~~~~~~
  Testing was performed on an AMD EPYC based dual-server high-availability multipath environment with multiple HBAs per ZFS server and four SAS JBODs. The two primary JBODs were multipath/cross-connected between the two ZFS-HA servers. The secondary JBODs were daisy-chained off of the primary JBODs using aligned SAS expander channels (JBOD-0 expanderA--->JBOD-1 expanderA, JBOD-0 expanderB--->JBOD-1 expanderB, etc). Pools were created, exported and re-imported, imported globally with 'zpool import -a -d /dev/disk/by-vdev'. Low level udev debug outputs were traced to isolate and resolve errors.

  Result:
  ~~~~~~~
  Initial testing of a previous version of this change showed how the cost of relying on userspace utilities like '/usr/sbin/multipath' and '/usr/bin/lsscsi' was exacerbated by increasing numbers of disks and JBODs. With four 60-disk SAS JBODs and 240 disks the time to process a udevadm trigger was 3 minutes 30 seconds, during which nearly all CPU cores were above 80% utilization. By switching from userspace utilities to sysfs in this version, the udevadm trigger processing time was reduced to 12.2 seconds with negligible CPU load.

  This patch also fixes a few shellcheck complaints.
  Reviewed-by: Gabriel A. Devenyi <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Co-authored-by: Jeff Johnson <[email protected]>
  Signed-off-by: Jeff Johnson <[email protected]>
  Signed-off-by: Arshad Hussain <[email protected]>
  Closes #11526
* Rename zfs_inode_update to zfs_znode_update_vfs | khng300 | 2021-02-09 | 8 | -35/+31
  zfs_znode_update_vfs is a more platform-agnostic name than zfs_inode_update. Besides that, the function's prototype is moved to include/sys/zfs_znode.h as the function is also used in common code.
  Reviewed-by: Ryan Moeller <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ka Ho Ng <[email protected]>
  Sponsored by: The FreeBSD Foundation
  Closes #11580
* Add an assert to clarify code | Kleber Tarcísio | 2021-02-09 | 2 | -2/+6
  The first time through the loop prevdb and prevhdl are NULL. They are then both set, but only prevdb is checked. Add an ASSERT to make it clear that prevhdl must be set when prevdb is.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Kleber <[email protected]>
  Closes #10754
  Closes #11575
* Set file mode during zfs_write | Antonio Russo | 2021-02-08 | 1 | -0/+1
  3d40b65 refactored zfs_vnops.c, which shared much code verbatim between Linux and BSD. After a successful write, the suid/sgid bits are reset, and the mode to be written is stored in newmode. On Linux, this was propagated to both the in-memory inode and znode, which is then updated with sa_update.
  3d40b65 accidentally removed the initialization of newmode, which happened to occur on the same line as the inode update (which has been moved out of the function).
  The uninitialized newmode can be saved to disk, leading to a crash on stat() of that file, in addition to a merely incorrect file mode.
  Reviewed-by: Ryan Moeller <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Antonio Russo <[email protected]>
  Closes #11474
  Closes #11576
* zfs-import-{cache,scan}: change condition to FileNotEmpty | наб | 2021-02-05 | 2 | -2/+2
  When all pools are exported ZFS will generate an empty cache file. This will cause the import service to fail, which is sub-optimal, since this means that dracut fails, and it is necessary to run `zpool import -a` to boot, delete the file, and regenerate+reinstall the initrd.
  This resolves the issue by treating zero-length cache files the same as a missing cache file. This aligns the behavior with that of the `zpool` command itself.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #11568
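The semantics the commit relies on, where a zero-length cache file counts the same as a missing one, can be expressed as a small predicate. This is a sketch of the check only, not the systemd unit change itself, and `cache_usable` is a hypothetical name.

```python
import os


def cache_usable(path):
    """True only if path is an existing regular file with non-zero size."""
    try:
        return os.path.isfile(path) and os.path.getsize(path) > 0
    except OSError:
        # The file may vanish between the two checks; treat that as missing.
        return False
```

A service gated on such a check is skipped rather than failed when the cache file is empty, which is the outcome the commit message asks for.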
* Fixed issue with processing of EC_dev_remove event | nssrikanth | 2021-02-05 | 1 | -6/+13
  The pool guid and vdev guid received by zfs_agent_post_event(), which calls zfs_retire_recv(), are normally non-zero. However, later in this same method they may be unconditionally reset to zero by the code which is intended to handle multipath, spare and l2arc vdevs. This will result in the EC_dev_remove event not being handled.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Co-authored-by: Vipin Kumar Verma <[email protected]>
  Signed-off-by: Srikanth N S <[email protected]>
  Closes #11564
* zfs-list.8: clarify listing snapshots | Brian Behlendorf | 2021-02-04 | 1 | -3/+8
  Clarify how to include snapshots in the `zfs list` output by referencing the full name of the `listsnapshots` pool property, and the `zfs list -t snapshot` option.
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #11562
  Closes #11565