path: root/module
Commit message  Author  Age  Files  Lines
* Fix use-after-free bugs in icp codeRichard Yao2022-09-152-2/+2
| | | | | | | | | | These were reported by Coverity as "Read from pointer after free" bugs. Presumably, it did not report it as a use-after-free bug because it does not understand the inline assembly that implements the atomic instruction. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13881
* FreeBSD: Fix integer conversion for vnlru_free{,_vfsops}()Richard Yao2022-09-141-0/+6
| | | | | | | | | | | | | | | | When reviewing #13875, I noticed that our FreeBSD code has an issue where it converts from `int64_t` to `int` when calling `vnlru_free{,_vfsops}()`. The result is that if the int64_t is `1 << 36`, the int will be 0, since the low bits are 0. Even when some low bits are set, a value such as `((1 << 36) + 1)` would truncate to 1, which is wrong. There is protection against this on 32-bit platforms, but on 64-bit platforms, there is no check to protect us, so we add a check. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13882
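The truncation described above is easy to reproduce in plain C. The following is a minimal sketch, not the actual patch: `clamp_to_int()` and `truncate_to_int()` are hypothetical helper names, and saturating to `INT_MAX` is an assumed policy for illustration.

```c
#include <limits.h>
#include <stdint.h>

/*
 * Hypothetical helper: saturate a 64-bit count to the int range before
 * passing it to an interface that only accepts int.
 */
static int
clamp_to_int(int64_t v)
{
	if (v > INT_MAX)
		return (INT_MAX);
	if (v < INT_MIN)
		return (INT_MIN);
	return ((int)v);
}

/*
 * What an unchecked conversion effectively does for the values in the
 * commit message: only the low 32 bits survive.
 */
static int
truncate_to_int(int64_t v)
{
	return ((int)(uint32_t)(uint64_t)v);
}
```

With these helpers, `truncate_to_int((int64_t)1 << 36)` yields 0, while `clamp_to_int()` on the same value yields `INT_MAX`.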
* Add assertion to dsl_dataset_set_compression_syncRichard Yao2022-09-141-0/+1
| | | | | | | | | | | | Coverity pointed out that if we somehow receive SPA_FEATURE_NONE, we will use a negative number as an array index. A defensive assertion seems appropriate. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Neal Gompa <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13872
* Fix theoretical "use-after-free" in dbuf_prefetch_indirect_done()Richard Yao2022-09-131-2/+6
| | | | | | | | | | | | | | | | | | | Coverity complains about a "use-after-free" bug in `dbuf_prefetch_indirect_done()` because we use a pointer value after freeing its buffer. The pointer is used for refcounting in ARC (as the reference holder). There is a theoretical situation where the pointer would be reused in a way that causes the refcounting to collide, so we change the order in which we call arc_buf_destroy() and dbuf_prefetch_fini() to match the rest of the function. This prevents the theoretical situation from being a possibility. Also, we have a few return statements with a value, despite this being a void function. We clean those up while we are making changes here. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Neal Gompa <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13869
* Cleanup: Make memory barrier definitions consistent across kernelsRichard Yao2022-09-131-1/+1
| | | | | | | | | | | | | | | | | We inherited membar_consumer() and membar_producer() from OpenSolaris, but we had replaced membar_consumer() with Linux's smp_rmb() in zfs_ioctl.c. The FreeBSD SPL consequently implemented a shim for the Linux-only smp_rmb(). We reinstate membar_consumer() in platform independent code and fix the FreeBSD SPL to implement membar_consumer() in a way analogous to Linux. Reviewed-by: Konstantin Belousov <[email protected]> Reviewed-by: Mateusz Guzik <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Neal Gompa <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13843
* Cleanup dead spa_boot codeRichard Yao2022-09-137-56/+0
| | | | | | | | | | Unused code detected by Coverity. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Neal Gompa <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13868
* vdev_draid_lookup_map() should not iterate outside draid_mapsRichard Yao2022-09-121-1/+1
| | | | | | | | | Coverity reported this as an out-of-bounds read. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Neal Gompa <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13865
* Fix use-after-free in btree codeRichard Yao2022-09-121-2/+2
| | | | | | | | | | Coverity static analysis found these. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Neal Gompa <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #10989 Closes #13861
* Cleanup: Use OpenSolaris functions to call schedulerRichard Yao2022-09-124-7/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In our codebase, `cond_resched()` and `schedule()` are Linux kernel functions that have replaced the OpenSolaris `kpreempt()` function to such an extent that `kpreempt()` in zfs_context.h was broken. Nobody noticed because we did not actually use it. The header had defined `kpreempt()` as `yield()`, which works on OpenSolaris and Illumos where `sched_yield()` is a wrapper for `yield()`, but that does not work on any other platform. The FreeBSD platform specific code implemented shims for these, but the shim for `schedule()` forced us to wait, which is different than merely rescheduling to another thread as the original Linux code does, while the shim for `cond_resched()` had the same definition as its kernel kpreempt() shim. After studying this, I have concluded that we should reintroduce the kpreempt() function in platform independent code with the following definitions:
- In the Linux kernel: kpreempt(unused) -> cond_resched()
- In the FreeBSD kernel: kpreempt(unused) -> kern_yield(PRI_USER)
- In userspace: kpreempt(unused) -> sched_yield()
In userspace, nothing changes from this cleanup. In the kernels, the function `fm_fini()` will now call `kern_yield(PRI_USER)` on FreeBSD and `cond_resched()` on Linux. This is instead of `pause("schedule", 1)` on FreeBSD and `schedule()` on Linux. This makes our behavior consistent across platforms. Note that Linux's SPL continues to use `cond_resched()` and `schedule()`. However, those functions have been removed from both the FreeBSD code and userspace code. This should have the benefit of making it slightly easier to port the code to new platforms by making how things should be mapped less confusing. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Neal Gompa <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13845
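The userspace mapping listed in this commit message can be sketched as a one-line macro. This is an illustration only: the `drain()` loop is a hypothetical caller, and the kernel mappings appear only in the comment because they cannot be built outside their kernels.

```c
#include <sched.h>

/*
 * Userspace mapping from the commit message; the argument is ignored.
 * The kernel equivalents would be cond_resched() on Linux and
 * kern_yield(PRI_USER) on FreeBSD.
 */
#define	kpreempt(unused)	sched_yield()

/* Hypothetical caller: give up the CPU between units of work. */
static int
drain(int pending)
{
	int done = 0;

	while (pending-- > 0) {
		done++;			/* stand-in for real work */
		(void) kpreempt(0);	/* let other threads run */
	}
	return (done);
}
```

Keeping a single `kpreempt()` name in shared code means a new platform only needs to supply one definition.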
* FreeBSD: Replace legacy make_dev() interface usageRyan Moeller2022-09-081-3/+10
| | | | | | | | | | | | | The function make_dev_s() was introduced to replace make_dev() in FreeBSD 11.0. It allows further specification of properties and flags and returns an error code on failure. Using this we can fail loading the module more gracefully than a panic in situations such as when a device named zfs already exists. We already use it for zvols. Use make_dev_s() for /dev/zfs. Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #13854
* Improve too large physical ashift handlingAlexander Motin2022-09-085-11/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When iterating through children physical ashifts for vdev, prefer ones above the maximum logical ashift, that we can actually use, but within the administrator defined maximum. When selecting top-level vdev ashift, do not set it to the defined maximum in case physical ashift is even higher, but just ignore one. Using the maximum does not prevent misaligned writes, but reduces space efficiency. Since ZFS tries to write data sequentially and aggregates the writes, in many cases large misaligned writes may not be as bad as the space penalty otherwise. Allow internal physical ashifts for vdevs higher than SHIFT_MAX. Maybe one day the allocator or aggregation could benefit from that. Reduce zfs_vdev_max_auto_ashift default from 16 (64KB) to 14 (16KB), so that ZFS may still use bigger ashifts up to SHIFT_MAX (64KB), but only if it really has to or is explicitly told to, and not as an "optimization". There are some read-intensive NVMe SSDs that report a Preferred Write Alignment of 64KB, and an attempt to build RAIDZ2 from those leads to a space inefficiency that can't be justified. Instead these changes make ZFS fall back to logical ashift of 12 (4KB) by default and only warn the user that it may be suboptimal for performance. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes #13798
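The selection rule described above can be approximated in a few lines. This is a simplified sketch under stated assumptions, not the vdev code: `choose_ashift()` is a hypothetical function, and it models only the "use the physical ashift if it is within the automatic maximum, otherwise fall back to the logical ashift" behavior.

```c
/*
 * Hypothetical, simplified model of the rule: a physical ashift above
 * the automatic maximum is ignored rather than clamped to the maximum.
 */
static unsigned
choose_ashift(unsigned logical, unsigned physical, unsigned max_auto)
{
	if (physical > logical && physical <= max_auto)
		return (physical);
	return (logical);
}

/* An ashift is a power-of-two exponent: 12 -> 4KB, 14 -> 16KB, 16 -> 64KB. */
static unsigned long
ashift_to_bytes(unsigned ashift)
{
	return (1UL << ashift);
}
```

Under this model, a device reporting a 64KB preferred alignment (ashift 16) with the new default maximum of 14 ends up at the logical ashift of 12.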
* Add Linux posix_fadvise supportFinix19792022-09-081-0/+62
| | | | | | | | | | | | | | | | | The purpose of this PR is to accept fadvise ioctl from userland to do read-ahead on demand. It could dramatically improve sequential read performance especially when primarycache is set to metadata or zfs_prefetch_disable is 1. If the file is mmaped, generic_fadvise is also called for page cache read-ahead besides dmu_prefetch. Only POSIX_FADV_WILLNEED and POSIX_FADV_SEQUENTIAL are supported in this PR currently. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Finix Yan <[email protected]> Closes #13694
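From userland, the read-ahead hint is issued through the standard `posix_fadvise()` call. The sketch below shows one way an application might request it; `advise_willneed()` is a hypothetical wrapper, and nothing here is specific to ZFS.

```c
#define	_POSIX_C_SOURCE	200112L
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: hint that the whole file will be read soon.
 * Returns -1 if the file cannot be opened, otherwise the result of
 * posix_fadvise(), which is 0 on success or an error number.
 */
static int
advise_willneed(const char *path)
{
	int fd, err;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (-1);
	/* offset 0 with len 0 means "from the start to the end of the file" */
	err = posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
	(void) close(fd);
	return (err);
}
```

Note that `posix_fadvise()` reports failure through its return value rather than through `errno`.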
* Linux SPL module init: Handle memory allocation failures correctlyRichard Yao2022-09-085-7/+18
| | | | | | | | | | | | | | | | Upon inspection of our code, I noticed that we assume that __alloc_percpu() cannot fail, and while it probably never has failed in practice, technically, it can fail, so we should handle that. Additionally, we incorrectly assume that `taskq_create()` in spl_kmem_cache_init() cannot fail. The same remark applies to it. Lastly, `spl_init()` failures should always return negative error values, but in some places, we are returning positive 1, which is incorrect. We change those values to their correct error codes. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13847
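The convention at stake, checking every allocation and returning a negative errno on failure, can be illustrated outside the kernel. This is a userspace sketch: `example_init()`, `failing_alloc()`, and the injectable allocator are hypothetical and exist only to make the failure path easy to exercise.

```c
#include <errno.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);

static void *example_buf;

/*
 * Hypothetical init routine: never assume the allocation succeeded, and
 * report failure as a negative error value rather than a positive 1.
 */
static int
example_init(alloc_fn alloc)
{
	example_buf = alloc(64);
	if (example_buf == NULL)
		return (-ENOMEM);
	return (0);
}

/* Allocator that always fails, used to exercise the error path. */
static void *
failing_alloc(size_t n)
{
	(void) n;
	return (NULL);
}
```

Calling `example_init(failing_alloc)` returns `-ENOMEM`, while `example_init(malloc)` returns 0 on a system with memory available.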
* Fix build on FreeBSD/powerpc64*pkubaj2022-09-081-2/+2
| | | | | | | There's no VSX handler on FreeBSD for now. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Piotr Kubaj <[email protected]> Closes #13848
* FreeBSD: add kqfilter support for zvol cdevRob Wing2022-09-061-0/+64
| | | | | | | | | | The only event hooked up is NOTE_ATTRIB, which is triggered when the device is resized. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Wing <[email protected]> Closes #13773
* FreeBSD: add knlist_init_sx() for exclusive locksRob Wing2022-09-062-0/+66
| | | | | | | | | This will be used to implement kqfilter support for zvol cdevs. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Rob Wing <[email protected]> Closes #13773
* Cleanup Raid-Z Typo fixesRichard Yao2022-09-061-3/+3
| | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13834
* Add DD_FIELD string for snapshots_changed propertyUmer Saleem2022-09-021-2/+2
| | | | | | | | | This commit adds DD_FIELD string used in extensified dsl_dir zap object for snapshots_changed property. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13819
* Add zfs.sync.snapshot_renameAndriy Gapon2022-09-022-10/+39
| | | | | | | | | Only the single snapshot rename is provided. The recursive or more complex rename can be scripted. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Andriy Gapon <[email protected]> Closes #13802
* FreeBSD: Organize sysctlsRyan Moeller2022-09-022-229/+370
| | | | | | | | | | | | | | | | | | | | | | | FreeBSD had a few platform-specific ARC tunables in the wrong place:
- Move FreeBSD-specific ARC tunables into the same vfs.zfs.arc node as the rest of the ARC tunables.
- Move the handlers from arc_os.c to sysctl_os.c and add compat sysctls for the legacy names.
While here, some additional clean up:
- Most handlers are specific to a particular variable and don't need a pointer passed through the args.
- Group blocks of related variables, handlers, and sysctl declarations into logical sections.
- Match variable types for temporaries in handlers with the type of the global variable.
- Remove leftover comments.
Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #13756
* Apply arc_shrink_shift to ARC above arc_c_minAlexander Motin2022-09-022-5/+9
| | | | | | | | | | | | It makes sense to free memory in smaller chunks when approaching arc_c_min to let other kernel subsystems free more, since after that point we can't free anything. This also matches behavior on Linux, where only the size above arc_c_min is reported to the shrinker. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes #13794
* FreeBSD: Cleanup dead code from VFSRichard Yao2022-09-022-49/+0
| | | | | | | | | | | | | | The vfs_*_feature() macros turn anything that uses them into dead code, so we can delete all of it. As a side effect, zfs_set_fuid_feature() is now identical in module/os/freebsd/zfs/zfs_vnops_os.c and module/os/linux/zfs/zfs_vnops_os.c. A few other functions are identical too. Future cleanup could move these into a common file. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13832
* Revert "Avoid panic with recordsize > 128k, raw sending and no large_blocks"Brian Behlendorf2022-08-254-55/+20
| | | | | | | | | | This reverts commit 80a650b7bb04bce3aef5e4cfd1d966e3599dafd4. This change inadvertently introduced a regression in ztest where one of the new ASSERTs is triggered in dsl_scan_visitbp(). Reviewed-by: George Amanakis <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #12275 Closes #13799
* Updates for snapshots_changed propertyUmer Saleem2022-08-241-13/+22
| | | | | | | | | | | | | | | | Currently, snapshots_changed property is stored in dd_props_zapobj, due to which the property is assumed to be local. This causes a difference in behavior with respect to other readonly properties. This commit stores the snapshots_changed property in dd_object. Source is not set to local in this case, which makes it consistent with other readonly properties. This commit also updates the date string format to include seconds. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13785
* Fix zpool status in case of unloaded keysGeorge Amanakis2022-08-222-33/+104
| | | | | | | | | | When scrubbing an encrypted filesystem with an unloaded key, still report an error in zpool status. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alek Pinchuk <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #13675 Closes #13717
* Prevent zevent list from consuming all of kernel memoryPaul Dagnelie2022-08-222-4/+17
| | | | | | | | | | | | | | | | | | | | There are a couple changes included here. The first is to introduce a cap on the size the ZED will grow the zevent list to. One million entries is more than enough for most use cases, and if you are overflowing that value, the problem needs to be addressed another way. The value is also tunable, for those who want the limit to be higher or lower. The other change is to add a kernel module parameter that allows snapshot creation/deletion to be exempted from the history logging; for most workloads, having these things logged is valuable, but for some workloads it produces large quantities of log spam and isn't especially helpful. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Issue #13374 Closes #13753
* Enable relatime by defaultGeorge Melikov2022-08-121-1/+1
| | | | | | | | | | Linux sets relatime on mount by default for any file system, but relatime=off in ZFS disables it explicitly. Let's be consistent with other file systems on Linux. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #13614
* Add comment on acb_zio_dummyChristian Schwarz2022-08-081-0/+17
| | | | | | | | Thanks to George Wilson for clarifying this on Slack. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Wilson <[email protected]> Signed-off-by: Christian Schwarz <[email protected]> Closes #13698
* Linux 5.20 compat: blk_cleanup_disk()Brian Behlendorf2022-08-041-0/+4
| | | | | | | | | As of the Linux 5.20 kernel blk_cleanup_disk() has been removed, all callers should use put_disk(). Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13728
* Linux 5.20 compat: bdevname()Brian Behlendorf2022-08-041-1/+11
| | | | | | | | | As of the Linux 5.20 kernel bdevname() has been removed, all callers should use snprintf() and the "%pg" format specifier. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13728
* Don't double-zero buffers in fault management nvlistsPaul Dagnelie2022-08-041-1/+1
| | | | | | | | This is a small cleanup for a trivial problem which happened to be noticed while another issue was being investigated. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes #13730
* Add snapshots_changed as propertyUmer Saleem2022-08-024-5/+27
| | | | | | | | | | | | | | | | | | | Make the dd_snap_cmtime property persistent across mount and unmount operations by storing it in ZAP, and restore the value from ZAP on hold into dd_snap_cmtime instead of updating it. Expose dd_snap_cmtime as the 'snapshots_changed' property, which provides a mechanism to quickly determine whether the snapshot list for a dataset has changed without having to mount the dataset or iterate the snapshot list. It specifies the time at which a snapshot for a dataset was last created or deleted. This allows us to be more efficient in how often we query snapshots. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13635
* FreeBSD: Ignore symlink to i386 includesRyan Moeller2022-08-021-0/+1
| | | | | | | | | | A symlink to i386 includes is created in the build dir on amd64 since freebsd/freebsd-src@d07600c563039f252becc29ac7d9a454b6b0600d Tell git to ignore it like the other include links. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #13719
* Skip checksum benchmarks on systems with slow cpuTino Reichardt2022-08-011-13/+34
| | | | | | | | | | | | | | | | | | | | | | | | The checksum benchmarking on module load may take a really long time on embedded systems with a slow cpu. Avoid all benchmarks >= 1MiB on systems where EdonR is slower than 300 MiB/s. This limit is currently hardcoded via the define LIMIT_PERF_MBS. This is the new benchmark output of a slow Intel Atom:
```
implementation     1k   4k  16k  64k 256k   1m   4m  16m
edonr-generic     209  257  268  259  262    0    0    0
skein-generic     129  150  151  150  150    0    0    0
sha256-generic     50   55   56   56   56    0    0    0
sha512-generic     76   86   88   89   88    0    0    0
blake3-generic     63   62   62   62   61    0    0    0
blake3-sse2       114  292  301  307  309    0    0    0
```
Reviewed-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #13695
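The skip rule in this commit message amounts to a single predicate. The sketch below is an assumption about its shape rather than the module's code: `should_benchmark()` is a hypothetical function, and only the `LIMIT_PERF_MBS` name and the 300 MiB/s and 1MiB figures come from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

#define	LIMIT_PERF_MBS	300	/* threshold named in the commit message */

/*
 * Hypothetical predicate: skip block sizes of 1MiB and above when the
 * measured EdonR throughput is below the limit.
 */
static bool
should_benchmark(size_t block_size, unsigned edonr_mbs)
{
	if (block_size >= ((size_t)1 << 20) && edonr_mbs < LIMIT_PERF_MBS)
		return (false);
	return (true);
}
```

For the Atom output shown above, EdonR reaches 262 MiB/s at 256k, so the 1m, 4m, and 16m columns are skipped and reported as 0.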
* Implement a new type of zfs receive: corrective receive (-c)Alek P2022-07-286-47/+492
| | | | | | | | | | | | | | This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Paul Zuchowski <[email protected]> Signed-off-by: Alek Pinchuk <[email protected]> Closes #9372
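The healing step can be modeled with a toy block store. This is a sketch under loud assumptions: `struct block`, `toy_cksum()`, and `corrective_write()` are hypothetical, the checksum is a stand-in for the real ones, and the extra check that the stream data itself matches the expected checksum is an assumption added for safety in the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define	BLKSZ	8

/* Toy on-disk block: payload plus the checksum it is expected to have. */
struct block {
	unsigned char data[BLKSZ];
	uint32_t cksum;
};

/* Stand-in checksum, not one of the real ZFS checksums. */
static uint32_t
toy_cksum(const unsigned char *p, size_t n)
{
	uint32_t s = 0;

	for (size_t i = 0; i < n; i++)
		s = s * 31 + p[i];
	return (s);
}

/*
 * Rewrite the block from the WRITE record only if the on-disk copy fails
 * verification. Returns 1 if the block was healed, 0 otherwise.
 */
static int
corrective_write(struct block *b, const unsigned char *rec)
{
	if (toy_cksum(b->data, BLKSZ) == b->cksum)
		return (0);	/* the read would have succeeded */
	if (toy_cksum(rec, BLKSZ) != b->cksum)
		return (0);	/* assumed check: record does not match either */
	memcpy(b->data, rec, BLKSZ);
	return (1);
}
```

Intact blocks are left untouched, so replaying a stream over a healthy dataset in this model is a no-op.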
* FreeBSD compile fixTino Reichardt2022-07-281-3/+3
| | | | | | | | | | | The file module/os/freebsd/zfs/zfs_ioctl_compat.c fails compiling because of this error: 'static' is not at beginning of declaration This commit fixes the three places within that file. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #13702
* Add createtxg sort support for simple snapshot iteratorAmeer Hamza2022-07-251-0/+2
| | | | | | | | | | | | | | | - When iterating snapshots with name only, e.g., "-o name -s name", libzfs uses simple snapshot iterator and results are displayed in alphabetic order. This PR adds support for faster version of createtxg sort by avoiding nvlist parsing for properties. Flags "-o name -s createtxg" will enable createtxg sort while using simple snapshot iterator. - Added support to read createtxg property directly from zfs handle for filesystem, volume and snapshot types instead of parsing nvlist. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #13577
* Add support for per dataset zil stats and use wmsum countersixhamza2022-07-206-51/+183
| | | | | | | | | | | | | | | | ZIL kstats are reported in an inclusive way, i.e., the same counters are shared to capture all the activities happening in zil. Added support to report zil stats for every dataset individually by combining them with already exposed dataset kstats. Wmsum uses per cpu counters and provides less overhead as compared to atomic operations. Updated zil kstats to use wmsum counters to avoid atomic operations. Reviewed-by: Christian Schwarz <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #13636
* Fix scrub resume from newly created holeAlexander Motin2022-07-202-7/+21
| | | | | | | | | | | | | | | | | | It may happen that the scan bookmark points to a block that was turned into a part of a big hole. In such a case dsl_scan_visitbp() may skip it and dsl_scan_check_resume() will not be called for it. As a result, a new scan suspend won't be possible until the end of the object, which may take hours if the object is a multi-terabyte ZVOL on a slow HDD pool, stretching the TXG to all that time and creating all sorts of problems. This patch changes the resume condition to any greater or equal block, so even if we miss the bookmarked block, the next one we find will delete the bookmark, allowing a new suspend. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #13643
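The change from an exact match to a greater-or-equal match can be shown with a heavily simplified model. In this sketch the bookmark is flattened to a single block id, and `struct scan_state` and `check_resume()` are hypothetical names, so it illustrates only the comparison, not the real bookmark logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, flattened scan state: one block id stands in for the bookmark. */
struct scan_state {
	uint64_t bookmark_blkid;
	bool resuming;
};

/*
 * Returns true if the block was already scanned before the suspend and
 * should be skipped. Any block at or past the bookmark clears the
 * resume state, even if the bookmarked block itself is never visited.
 */
static bool
check_resume(struct scan_state *s, uint64_t blkid)
{
	if (!s->resuming)
		return (false);
	if (blkid >= s->bookmark_blkid) {
		s->resuming = false;
		return (false);
	}
	return (true);
}
```

With an equality test instead of `>=`, a bookmarked block that has become part of a hole would never clear `resuming` in this model.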
* Fix memory allocation for the checksum benchmarkTino Reichardt2022-07-201-6/+17
| | | | | | | | | | | | | | | Allocation via kmem_cache_alloc() is limited to less than 4m for some architectures. This commit limits the benchmarks with the linear abd cache to 1m on all architectures and adds 4m + 16m benchmarks via non-linear abd_alloc(). Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Co-authored-by: Sebastian Gottschall <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #13669 Closes #13670
* Expose ZFS dataset case sensitivity setting via sb_optsixhamza2022-07-141-0/+12
| | | | | | | | Makes the case sensitivity setting visible on Linux in /proc/mounts. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #13607
* Replace dead opensolaris.org license linkTino Reichardt2022-07-11230-230/+230
| | | | | | | | | The commit replaces all occurrences of the link: http://www.opensolaris.org/os/licensing with this one: https://opensource.org/licenses/CDDL-1.0 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #13619
* Linux: Align MODULE_LICENSE macro textBrian Behlendorf2022-07-111-3/+2
| | | | | | | | | Specify the lua and zstd license text in the manner in which the kernel MODULE_LICENSE macro requires it. The now duplicate entries were merged and a comment added to make it clear what they apply to. Reviewed-by: Christian Schwarz <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13641
* Call nvlist_free before returnFinix19792022-07-071-0/+1
| | | | | | | | | | | Fixes a small kernel memory leak which would occur if a pool failed to import because the `DMU_POOL_VDEV_ZAP_MAP` key can't be read from a presumably damaged MOS config. In the case of a missing key there was no leak. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Finix1979 <[email protected]> Closes #13629
* Avoid memory copy when verifying raidz/draid parityAlexander Motin2022-07-051-2/+3
| | | | | | | | | | | | | | | | | | Before this change, for every valid parity column raidz_parity_verify() allocated a new buffer and copied the existing data there, then recalculated the parity and compared the result with the copy. This patch removes the memory copy, simply swapping original buffer pointers with newly allocated empty ones for parity recalculation and comparison. Original buffers with potentially incorrect parity data are then just freed, while new recalculated ones are used for repair. On a pool of 12 4-wide raidz vdevs, storing 1.5TB of 16MB blocks, this change reduces memory traffic during scrub by 17% and total unhalted CPU time by 25%. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #13613
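The pointer swap described above can be sketched with a toy XOR parity. This is an illustration, not the raidz code: `xor_parity()` and `parity_verify()` are hypothetical, plain `malloc()` stands in for ABD allocation, and single XOR parity stands in for the real parity math.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy single parity: XOR of all data columns. */
static void
xor_parity(unsigned char *out, unsigned char *const *data, size_t ncols,
    size_t len)
{
	memset(out, 0, len);
	for (size_t c = 0; c < ncols; c++)
		for (size_t i = 0; i < len; i++)
			out[i] ^= data[c][i];
}

/*
 * Verify without copying: point *parity at a fresh buffer, recompute
 * into it, and compare against the original buffer, which is then
 * freed. Returns 1 on match, 0 on mismatch, -1 if allocation fails.
 * On success or mismatch, *parity holds the recomputed parity.
 */
static int
parity_verify(unsigned char **parity, unsigned char *const *data,
    size_t ncols, size_t len)
{
	unsigned char *orig = *parity;
	unsigned char *fresh = malloc(len);
	int match;

	if (fresh == NULL)
		return (-1);
	*parity = fresh;		/* swap pointers instead of copying */
	xor_parity(*parity, data, ncols, len);
	match = (memcmp(orig, *parity, len) == 0);
	free(orig);			/* possibly bad parity is discarded */
	return (match);
}
```

The recomputed buffer doubles as the repair data, which is why no copy of the original parity is needed.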
* Avoid memory copies during mirror scrubAlexander Motin2022-07-051-70/+61
| | | | | | | | | | | | | | | | | | | | When issuing several scrub reads for a block, we may use the parent ZIO buffer for one of the child ZIOs. If that read completes successfully, then we won't need to copy the data explicitly. If the block has only one copy (typical for root vdev, which is also a mirror inside), then we never need to copy -- succeed or fail as-is. Previous code also copied data from the buffer of every successfully completed child ZIO, but that just does not make any sense. On a healthy N-wide mirror this saves all N+1 (or even more in case of ditto blocks) memory copies for each scrubbed block, allowing the CPU to focus mostly on check-summing. For other vdev types it should save one memory copy per block copy at root vdev. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #13606
* Re-fix -Wwrite-strings on FreeBSDнаб2022-06-301-1/+1
| | | | | | | | Follow up fix for a926aab902ac5c680f4766568d19674b80fb58bb. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13348 Closes #13610
* Fix dnode byteswappingGeorge Amanakis2022-06-291-11/+2
| | | | | | | | | | | If a dnode has a spill pointer, and we use DN_SLOTS_TO_BONUSLEN() then we will possibly include the spill pointer in the len calculation and it will be byteswapped. Then dnode_byteswap() will carry on and swap the spill pointer again. Fix this by using DN_MAX_BONUS_LEN() instead. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #13002 Closes #13015
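The reason a second swap is a bug is that byteswapping is its own inverse: swapping the spill pointer twice leaves it in the original, wrong-endian form. A minimal sketch follows, with a hypothetical `bswap64()` helper standing in for the real byteswap routines.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of a 64-bit value. */
static uint64_t
bswap64(uint64_t x)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (x & 0xff);	/* move the low byte to the top */
		x >>= 8;
	}
	return (r);
}
```

Because `bswap64(bswap64(v)) == v` for every value, any field covered by two overlapping swap ranges ends up unswapped.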
* module: lua: ldo: fix pragma nameнаб2022-06-291-1/+1
| | | | | | | | | | | | | | /home/nabijaczleweli/store/code/zfs/module/lua/ldo.c:175:32: warning: unknown option after ‘#pragma GCC diagnostic’ kind [-Wpragmas] 175 | #pragma GCC diagnostic ignored "-Winfinite-recursion"a | ^~~~~~~~~~~~~~~~~~~~~~ Fixes: a6e8113fed8a508ffda13cf1c4d8da99a4e8133a ("Silence -Winfinite-recursion warning in luaD_throw()") Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13348
* Remaining {=> const} char|void *tagнаб2022-06-2921-83/+91
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13348