aboutsummaryrefslogtreecommitdiffstats
path: root/module/zcommon
Commit message (Collapse)AuthorAgeFilesLines
* set autotrim default to 'off' everywhereYuri Pankov2023-07-201-1/+1
| | | | | | | | | | | | As it turns out having autotrim default to 'on' on FreeBSD never really worked due to mess with defines where userland and kernel module were getting different default values (userland was defaulting to 'off', module was thinking it's 'on'). Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Yuri Pankov <[email protected]> Closes #15079
* Create zap for root vdevrob-wing2023-04-201-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | And add it to the AVZ, this is not backwards compatible with older pools due to an assertion in spa_sync() that verifies the number of ZAPs of all vdevs matches the number of ZAPs in the AVZ. Granted, the assertion only applies to #DEBUG builds - still, a feature flag is introduced to avoid the assertion, com.klarasystems:vdev_zaps_v2 Notably, this allows to get/set properties on the root vdev: % zpool set user:prop=value <pool> root-0 Before this commit, it was already possible to get/set properties on top-level vdevs with the syntax <type>-<vdev_id> (e.g. mirror-0): % zpool set user:prop=value <pool> mirror-0 This syntax also applies to the root vdev as it is is of type 'root' with a vdev_id of 0, root-0. The keyword 'root' as an alias for 'root-0'. The following tests have been added: - zpool get all properties from root vdev - zpool set a property on root vdev - verify root vdev ZAP is created Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Wing <[email protected]> Sponsored-by: Seagate Technology Submitted-by: Klara, Inc. Closes #14405
* Miscellaneous FreBSD compilation bugfixesMartin Matuška2023-04-062-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | Add missing machine/md_var.h to spl/sys/simd_aarch64.h and spl/sys/simd_arm.h In spl/sys/simd_x86.h, PCB_FPUNOSAVE exists only on amd64, use PCB_NPXNOSAVE on i386 In FreeBSD sys/elf_common.h redefines AT_UID and AT_GID on FreeBSD, we need a hack in vnode.h similar to Linux. sys/simd.h needs to be included early. In zfs_freebsd_copy_file_range() we pass a (size_t *)lenp to zfs_clone_range() that expects a (uint64_t *) Allow compiling armv6 world by limiting ARM macros in sha256_impl.c and sha512_impl.c to __ARM_ARCH > 6 Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Pawel Jakub Dawidek <[email protected]> Reviewed-by: Signed-off-by: WHR <[email protected]> Signed-off-by: Martin Matuska <[email protected]> Closes #14674
* Drop lying to the compiler in the fletcher4 code Rich Ercolani2023-03-247-20/+0
| | | | | | | | | | | | | | | | | | | This is probably the uncontroversial part of #13631, which fixes a real problem people are having. There's still things to improve in our code after this is merged, but it should stop the breakage that people have reported, where we lie about a type always being aligned and then pass in stack objects with no alignment requirement and hope for the best. Of course, our SIMD code was written with unaligned accesses, so it doesn't care if we drop this...but some auto-vectorized code that gcc emits sure does, since we told it it can assume they're aligned. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Richard Yao <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #14649
* nvpair: Constify string functionsRichard Yao2023-03-142-4/+4
| | | | | | | | | | | | | | After addressing coverity complaints involving `nvpair_name()`, the compiler started complaining about dropping const. This lead to a rabbit hole where not only `nvpair_name()` needed to be constified, but also `nvpair_value_string()`, `fnvpair_value_string()` and a few other static functions, plus variable pointers throughout the code. The result became a fairly big change, so it has been split out into its own patch. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14612
* zcommon: Refactor FPU state handling in fletcher4Attila Fülöp2023-03-147-8/+31
| | | | | | | | | | | | | | | | | | Currently calls to kfpu_begin() and kfpu_end() are split between the init() and fini() functions of the particular SIMD implementation. This was done in #14247 as an optimization measure for the ABD adapter. Unfortunately the split complicates FPU handling on platforms that use a local FPU state buffer, like Windows and macOS. To ease porting, we introduce a boolean struct member in fletcher_4_ops_t, indicating use of the FPU, and move the FPU state handling from the SIMD implementations to the call sites. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #14600
* Implementation of block cloning for ZFSPawel Jakub Dawidek2023-03-102-0/+15
| | | | | | | | | | | | | | | Block Cloning allows to manually clone a file (or a subset of its blocks) into another (or the same) file by just creating additional references to the data blocks without copying the data itself. Those references are kept in the Block Reference Tables (BRTs). The whole design of block cloning is documented in module/zfs/brt.c. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Christian Schwarz <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Rich Ercolani <[email protected]> Signed-off-by: Pawel Jakub Dawidek <[email protected]> Closes #13392
* Configure zed's diagnosis engine with vdev propertiesrob-wing2023-01-231-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce four new vdev properties: checksum_n checksum_t io_n io_t These properties can be used for configuring the thresholds of zed's diagnosis engine and are interpeted as <N> events in T <seconds>. When this property is set to a non-default value on a top-level vdev, those thresholds will also apply to its leaf vdevs. This behavior can be overridden by explicitly setting the property on the leaf vdev. Note that, these properties do not persist across vdev replacement. For this reason, it is advisable to set the property on the top-level vdev instead of the leaf vdev. The default values for zed's diagnosis engine (10 events, 600 seconds) remains unchanged. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Rob Wing <[email protected]> Sponsored-by: Seagate Technology LLC Closes #13805
* Cleanup of dead code suggested by Clang Static Analyzer (#14380)Richard Yao2023-01-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I recently gained the ability to run Clang's static analyzer on the linux kernel modules via a few hacks. This extended coverage to code that was previously missed since Clang's static analyzer only looked at code that we built in userspace. Running it against the Linux kernel modules built from my local branch produced a total of 72 reports against my local branch. Of those, 50 were reports of logic errors and 22 were reports of dead code. Since we already had cleaned up all of the previous dead code reports, I felt it would be a good next step to clean up these dead code reports. Clang did a further breakdown of the dead code reports into: Dead assignment 15 Dead increment 2 Dead nested assignment 5 The benefit of cleaning these up, especially in the case of dead nested assignment, is that they can expose places where our error handling is incorrect. A number of them were fairly straight forward. However several were not: In vdev_disk_physio_completion(), not only were we not using the return value from the static function vdev_disk_dio_put(), but nothing used it, so I changed it to return void and removed the existing (void) cast in the other area where we call it in addition to no longer storing it to a stack value. In FSE_createDTable(), the function is dead code. Its helper function FSE_freeDTable() is also dead code, as are the CPP definitions in `module/zstd/include/zstd_compat_wrapper.h`. We just delete it all. In zfs_zevent_wait(), we have an optimization opportunity. cv_wait_sig() returns 0 if there are waiting signals and 1 if there are none. The Linux SPL version literally returns `signal_pending(current) ? 0 : 1)` and FreeBSD implements the same semantics, we can just do `!cv_wait_sig()` in place of `signal_pending(current)` to avoid unnecessarily calling it again. zfs_setattr() on FreeBSD version did not have error handling issue because the code was removed entirely from FreeBSD version. The error is from updating the attribute directory's files. After some thought, I decided to propapage errors on it to userspace. In zfs_secpolicy_tmp_snapshot(), we ignore a lack of permission from the first check in favor of checking three other permissions. I assume this is intentional. In zfs_create_fs(), the return value of zap_update() was not checked despite setting an important version number. I see no backward compatibility reason to permit failures, so we add an assertion to catch failures. Interestingly, Linux is still using ASSERT(error == 0) from OpenSolaris while FreeBSD has switched to the improved ASSERT0(error) from illumos, although illumos has yet to adopt it here. ASSERT(error == 0) was used on Linux while ASSERT0(error) was used on FreeBSD since the entire file needs conversion and that should be the subject of another patch. dnode_move()'s issue was caused by us not having implemented POINTER_IS_VALID() on Linux. We have a stub in `include/os/linux/spl/sys/kmem_cache.h` for it, when it really should be in `include/os/linux/spl/sys/kmem.h` to be consistent with Illumos/OpenSolaris. FreeBSD put both `POINTER_IS_VALID()` and `POINTER_INVALIDATE()` in `include/os/freebsd/spl/sys/kmem.h`, so we copy what it did. Whenever a report was in platform-specific code, I checked the FreeBSD version to see if it also applied to FreeBSD, but it was only relevant a few times. Lastly, the patch that enabled Clang's static analyzer to be run on the Linux kernel modules needs more work before it can be put into a PR. I plan to do that in the future as part of the on-going static analysis work that I am doing. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14380
* Micro-optimize fletcher4 calculationsRichard Yao2022-12-056-68/+36
| | | | | | | | | | | | | | | | | | | | | When processing abds, we execute 1 `kfpu_begin()`/`kfpu_end()` pair on every page in the abd. This is wasteful and slows down checksum performance versus what the benchmark claimed. We correct this by moving those calls to the init and fini functions. Also, we always check the buffer length against 0 before calling the non-scalar checksum functions. This means that we do not need to execute the loop condition for the first loop iteration. That allows us to micro-optimize the checksum calculations by switching to do-while loops. Note that we do not apply that micro-optimization to the scalar implementation because there is no check in `fletcher_4_incremental_native()`/`fletcher_4_incremental_byteswap()` against 0 sized buffers being passed. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14247
* Convert some sprintf() calls to kmem_scnprintf()Richard Yao2022-11-281-2/+2
| | | | | | | | | | | | | | | | | | | These `sprintf()` calls are used repeatedly to write to a buffer. There is no protection against overflow other than reviewers explicitly checking to see if the buffers are big enough. However, such issues are easily missed during review and when they are missed, we would rather stop printing rather than have a buffer overflow, so we convert these functions to use `kmem_scnprintf()`. The Linux kernel provides an entire page for module parameters, so we are safe to write up to PAGE_SIZE. Removing `sprintf()` from these functions removes the last instances of `sprintf()` usage in our platform-independent kernel code. This improves XNU kernel compatibility because the XNU kernel does not support (removed support for?) `sprintf()`. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14209
* Allow to control failfastMariusz Zaborski2022-11-101-0/+3
| | | | | | | | | | | | | | | | | | | | | Linux defaults to setting "failfast" on BIOs, so that the OS will not retry IOs that fail, and instead report the error to ZFS. In some cases, such as errors reported by the HBA driver, not the device itself, we would wish to retry rather than generating vdev errors in ZFS. This new property allows that. This introduces a per vdev option to disable the failfast option. This also introduces a global module parameter to define the failfast mask value. Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Allan Jude <[email protected]> Signed-off-by: Allan Jude <[email protected]> Signed-off-by: Mariusz Zaborski <[email protected]> Sponsored-by: Seagate Technology LLC Submitted-by: Klara, Inc. Closes #14056
* dsl_prop_known_index(): check for invalid propDamian Szuberski2022-11-081-0/+26
| | | | | | | | Resolve UBSAN array-index-out-of-bounds error in zprop_desc_t. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: szubersk <[email protected]> Closes #14142 Closes #14147
* Add options to zfs redundant_metadata propertyAkash B2022-10-191-1/+4
| | | | | | | | | | | | | | | | | Currently, additional/extra copies are created for metadata in addition to the redundancy provided by the pool(mirror/raidz/draid), due to this 2 times more space is utilized per inode and this decreases the total number of inodes that can be created in the filesystem. By setting redundant_metadata to none, no additional copies of metadata are created, hence can reduce the space consumed by the additional metadata copies and increase the total number of inodes that can be created in the filesystem. Additionally, this can improve file create performance due to the reduced amount of metadata which needs to be written. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Dipak Ghosh <[email protected]> Signed-off-by: Akash B <[email protected]> Closes #13680
* Enable relatime by defaultGeorge Melikov2022-08-121-1/+1
| | | | | | | | | | Linux sets relatime on mount by default for any file system, but relatime=off in ZFS disables it explicitly. Let's be consistent with other file systems on Linux. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #13614
* Add snapshots_changed as propertyUmer Saleem2022-08-021-0/+5
| | | | | | | | | | | | | | | | | | | Make dd_snap_cmtime property persistent across mount and unmount operations by storing in ZAP and restore the value from ZAP on hold into dd_snap_cmtime instead of updating it. Expose dd_snap_cmtime as 'snapshots_changed' property that provides a mechanism to quickly determine whether snapshot list for dataset has changed without having to mount a dataset or iterate the snapshot list. It specifies the time at which a snapshot for a dataset was last created or deleted. This allows us to be more efficient how often we query snapshots. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13635
* Replace dead opensolaris.org license linkTino Reichardt2022-07-119-9/+9
| | | | | | | | | The commit replaces all findings of the link: http://www.opensolaris.org/os/licensing with this one: https://opensource.org/licenses/CDDL-1.0 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #13619
* Enable -Wwrite-stringsнаб2022-06-291-5/+3
| | | | | | | | Also, fix leak from ztest_global_vars_to_zdb_args() Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13348
* Introduce BLAKE3 checksums as an OpenZFS featureTino Reichardt2022-06-082-12/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds BLAKE3 checksums to OpenZFS, it has similar performance to Edon-R, but without the caveats around the latter. Homepage of BLAKE3: https://github.com/BLAKE3-team/BLAKE3 Wikipedia: https://en.wikipedia.org/wiki/BLAKE_(hash_function)#BLAKE3 Short description of Wikipedia: BLAKE3 is a cryptographic hash function based on Bao and BLAKE2, created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and Zooko Wilcox-O'Hearn. It was announced on January 9, 2020, at Real World Crypto. BLAKE3 is a single algorithm with many desirable features (parallelism, XOF, KDF, PRF and MAC), in contrast to BLAKE and BLAKE2, which are algorithm families with multiple variants. BLAKE3 has a binary tree structure, so it supports a practically unlimited degree of parallelism (both SIMD and multithreading) given enough input. The official Rust and C implementations are dual-licensed as public domain (CC0) and the Apache License. Along with adding the BLAKE3 hash into the OpenZFS infrastructure a new benchmarking file called chksum_bench was introduced. When read it reports the speed of the available checksum functions. On Linux: cat /proc/spl/kstat/zfs/chksum_bench On FreeBSD: sysctl kstat.zfs.misc.chksum_bench This is an example output of an i3-1005G1 test system with Debian 11: implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 1196 1602 1761 1749 1762 1759 1751 skein-generic 546 591 608 615 619 612 616 sha256-generic 240 300 316 314 304 285 276 sha512-generic 353 441 467 476 472 467 426 blake3-generic 308 313 313 313 312 313 312 blake3-sse2 402 1289 1423 1446 1432 1458 1413 blake3-sse41 427 1470 1625 1704 1679 1607 1629 blake3-avx2 428 1920 3095 3343 3356 3318 3204 blake3-avx512 473 2687 4905 5836 5844 5643 5374 Output on Debian 5.10.0-10-amd64 system: (Ryzen 7 5800X) implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 1840 2458 2665 2719 2711 2723 2693 skein-generic 870 966 996 992 1003 1005 1009 sha256-generic 415 442 453 455 457 457 457 sha512-generic 608 690 711 718 719 720 721 blake3-generic 301 313 311 309 309 310 310 blake3-sse2 343 1865 2124 2188 2180 2181 2186 blake3-sse41 364 2091 2396 2509 2463 2482 2488 blake3-avx2 365 2590 4399 4971 4915 4802 4764 Output on Debian 5.10.0-9-powerpc64le system: (POWER 9) implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 1213 1703 1889 1918 1957 1902 1907 skein-generic 434 492 520 522 511 525 525 sha256-generic 167 183 187 188 188 187 188 sha512-generic 186 216 222 221 225 224 224 blake3-generic 153 152 154 153 151 153 153 blake3-sse2 391 1170 1366 1406 1428 1426 1414 blake3-sse41 352 1049 1212 1174 1262 1258 1259 Output on Debian 5.10.0-11-arm64 system: (Pi400) implementation 1k 4k 16k 64k 256k 1m 4m edonr-generic 487 603 629 639 643 641 641 skein-generic 271 299 303 308 309 309 307 sha256-generic 117 127 128 130 130 129 130 sha512-generic 145 165 170 172 173 174 175 blake3-generic 81 29 71 89 89 89 89 blake3-sse2 112 323 368 379 380 371 374 blake3-sse41 101 315 357 368 369 364 360 Structurally, the new code is mainly split into these parts: - 1x cross platform generic c variant: blake3_generic.c - 4x assembly for X86-64 (SSE2, SSE4.1, AVX2, AVX512) - 2x assembly for ARMv8 (NEON converted from SSE2) - 2x assembly for PPC64-LE (POWER8 converted from SSE2) - one file for switching between the implementations Note the PPC64 assembly requires the VSX instruction set and the kfpu_begin() / kfpu_end() calls on PowerPC were updated accordingly. Reviewed-by: Felix Dörre <[email protected]> Reviewed-by: Ahelenia Ziemiańska <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Co-authored-by: Rich Ercolani <[email protected]> Closes #10058 Closes #12918
* Improve zpool status output, list all affected datasetsGeorge Amanakis2022-04-251-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, determining which datasets are affected by corruption is a manual process. The primary difficulty in reporting the list of affected snapshots is that since the error was initially found, the snapshot where the error originally occurred in, may have been deleted. To solve this issue, we add the ID of the head dataset of the original snapshot which the error was detected in, to the stored error report. Then any time a filesystem is deleted, the errors associated with it are deleted as well. Any time a clone promote occurs, we modify reports associated with the original head to refer to the new head. The stored error reports are identified by this head ID, the birth time of the block which the error occurred in, as well as some information about the error itself are also stored. Once this information is stored, we can find the set of datasets affected by an error by walking back the list of snapshots in the given head until we find one with the appropriate birth txg, and then traverse through the snapshots of the clone family, terminating a branch if the block was replaced in a given snapshot. Then we report this information back to libzfs, and to the zpool status command, where it is displayed as follows: pool: test state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A scan: scrub repaired 0B in 00:00:00 with 800 errors on Fri Dec 3 08:27:57 2021 config: NAME STATE READ WRITE CKSUM test ONLINE 0 0 0 sdb ONLINE 0 0 1.58K errors: Permanent errors have been detected in the following files: test@1:/test.0.0 /test/test.0.0 /test/1clone/test.0.0 A new feature flag is introduced to mark the presence of this change, as well as promotion and backwards compatibility logic. This is an updated version of #9175. Rebase required fixing the tests, updating the ABI of libzfs, updating the man pages, fixing bugs, fixing the error returns, and updating the old on-disk error logs to the new format when activating the feature. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Co-authored-by: TulsiJain <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #9175 Closes #12812
* linux: module: weld all but spl.ko into zfs.koнаб2022-04-202-35/+7
| | | | | | | | | | | | | | Originally it was thought it would be useful to split up the kmods by functionality. This would allow external consumers to only load what was needed. However, in practice we've never had a case where this functionality would be needed, and conversely managing multiple kmods can be awkward. Therefore, this change merges all but the spl.ko kmod in to a single zfs.ko kmod. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13274
* Fix string/index variables being unflexed by defaultнаб2022-04-051-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I got the status backward (B_FALSE for fixed, rather than B_TRUE for flex); before: $ zfs get mountpoint tarta-zoot -r NAME PROPERTY VALUE SOURCE tarta-zoot mountpoint / local tarta-zoot/PAGEFILE.SYS mountpoint - - tarta-zoot/etc mountpoint /etc inherited from tarta-zoot tarta-zoot/home mountpoint /home inherited from tarta-zoot tarta-zoot/home/xspon mountpoint /home/xspon inherited from tarta-zoot tarta-zoot/home/nabijaczleweli mountpoint /home/nabijaczleweli inherited from tarta-zoot tarta-zoot/home/nabijaczleweli/tftp mountpoint /home/nabijaczleweli/tftp inherited from tarta-zoot tarta-zoot/home/root mountpoint /root local after: $ zfs get mountpoint tarta-zoot -r NAME PROPERTY VALUE SOURCE tarta-zoot mountpoint / local tarta-zoot/PAGEFILE.SYS mountpoint - - tarta-zoot/etc mountpoint /etc inherited from tarta-zoot tarta-zoot/home mountpoint /home inherited from tarta-zoot tarta-zoot/home/xspon mountpoint /home/xspon inherited from tarta-zoot tarta-zoot/home/nabijaczleweli mountpoint /home/nabijaczleweli inherited from tarta-zoot tarta-zoot/home/nabijaczleweli/tftp mountpoint /home/nabijaczleweli/tftp inherited from tarta-zoot tarta-zoot/home/root mountpoint /root local Fixes: be8e1d81bf2b85f6f76851b7013ac54e9bf488f0 ("Flex non-pretty-printed properties and raw-/pretty-print remaining ones") Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13286
* Forbid b{copy,zero,cmp}(). Don't include <strings.h> for <string.h>наб2022-03-157-7/+7
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12996
* Remove bcopy(), bzero(), bcmp()наб2022-03-157-8/+8
| | | | | | | | | | bcopy() has a confusing argument order and is actually a move, not a copy; they're all deprecated since POSIX.1-2001 and removed in -2008, and we shim them out to mem*() on Linux anyway Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12996
* Linux x86 SIMD: factor out unneeded kernel dependenciesAttila Fülöp2022-03-081-3/+3
| | | | | | | | | | | | | | | | | Cleanup the kernel SIMD code by removing kernel dependencies. - Replace XSTATE_XSAVE with our own XSAVE implementation for all kernels not exporting kernel_fpu{begin,end}(), see #13059 - Replace union fpregs_state by a uint8_t * buffer and get the size of the buffer from the hardware via the CPUID instruction - Replace kernels xgetbv() by our own implementation which was already there for userspace. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #13102
* Flex non-pretty-printed properties and raw-/pretty-print remaining onesнаб2022-03-043-109/+140
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Before: nabijaczleweli@tarta:~/store/code/zfs$ /sbin/zpool list -Td -o name,size,alloc,free,ckpoint,expandsz,guid,load_guid,frag,cap,dedup,health,altroot,guid,dedupditto,load_guid,maxblocksize,maxdnodesize 2>/dev/null Sun 20 Feb 03:57:44 CET 2022 NAME SIZE ALLOC FREE CKPOINT EXPANDSZ GUID LOAD_GUID FRAG CAP DEDUP HEALTH ALTROOT GUID DEDUPDITTO LOAD_GUID MAXBLOCKSIZE MAXDNODESIZE filling 25.5T 6.52T 18.9T - 64M 11512889483096932869 11656109927366648364 1% 25% 1.00x ONLINE - 11512889483096932869 0 11656109927366648364 1048576 16384 tarta-boot 240M 50.6M 189M - - 2372068846917849656 7752280792179633787 12% 21% 1.00x ONLINE - 2372068846917849656 0 7752280792179633787 1048576 512 tarta-zoot 55.5G 6.42G 49.1G - - 12971868889665384604 8622632123393589527 17% 11% 1.00x ONLINE - 12971868889665384604 0 8622632123393589527 1048576 16384 nabijaczleweli@tarta:~/store/code/zfs$ /sbin/zfs list -o name,guid,keyguid,ivsetguid,createtxg,objsetid,pbkdf2iters,refratio -r tarta-zoot NAME GUID KEYGUID IVSETGUID CREATETXG OBJSETID PBKDF2ITERS REFRATIO tarta-zoot 1110930838977259561 659P - 1 54 0 1.03x tarta-zoot/PAGEFILE.SYS 2202570496672997800 3.20E - 2163 1539 0 1.07x tarta-zoot/dupa 16941280502417785695 9.81E - 2274707 1322 1000000000000 1.00x tarta-zoot/etc 17029963068508333530 12.9E - 3663 1087 0 1.52x tarta-zoot/home 3508163802370032575 8.50E - 3664 294 0 1.00x tarta-zoot/home/misio 7283672744014848555 13.0E - 3665 302 0 2.28x tarta-zoot/home/nabijaczleweli 12286744508078616303 5.15E - 3666 200 0 2.05x tarta-zoot/home/nabijaczleweli/tftp 13551632689932817643 5.16E - 3667 1095 0 1.00x tarta-zoot/home/root 5203106193060067946 15.4E - 3668 698 0 2.86x tarta-zoot/home/shared-config 8866040021005142194 14.5E - 3670 2069 0 1.20x tarta-zoot/home/tymek 9472751824283011822 4.56E - 3671 1202 0 1.32x tarta-zoot/oldboot 10460192444135730377 13.8E - 2268398 1232 0 1.01x tarta-zoot/opt 9945621324983170410 5.84E - 3672 1210 0 1.00x tarta-zoot/opt/icecc 13178238931846132425 9.04E - 3673 1103 0 2.83x tarta-zoot/opt/swtpm 10172962421514870859 4.13E - 825669 145132 0 1.87x tarta-zoot/srv 217179989022738337 3.90E - 3674 2469 0 1.00x tarta-zoot/usr 12214213243060765090 15.0E - 3675 2477 0 2.58x tarta-zoot/usr/local 7542700368693813134 941P - 3676 2484 0 2.33x tarta-zoot/var 13414177124447929530 10.2E - 3677 2492 0 1.57x tarta-zoot/var/lib 6969944550407159241 5.28E - 3678 2499 0 2.34x tarta-zoot/var/tmp 6399468088048343912 1.34E - 3679 1218 0 3.95x After: nabijaczleweli@tarta:~/store/code/zfs$ cmd/zpool/zpool list -Td -o name,size,alloc,free,ckpoint,expandsz,guid,load_guid,frag,cap,dedup,health,altroot,guid,dedupditto,load_guid,maxblocksize,maxdnodesize 2>/dev/null Sun 20 Feb 03:57:42 CET 2022 NAME SIZE ALLOC FREE CKPOINT EXPANDSZ GUID LOAD_GUID FRAG CAP DEDUP HEALTH ALTROOT GUID DEDUPDITTO LOAD_GUID MAXBLOCKSIZE MAXDNODESIZE filling 25.5T 6.52T 18.9T - 64M 11512889483096932869 11656109927366648364 1% 25% 1.00x ONLINE - 11512889483096932869 0 11656109927366648364 1M 16K tarta-boot 240M 50.6M 189M - - 2372068846917849656 7752280792179633787 12% 21% 1.00x ONLINE - 2372068846917849656 0 7752280792179633787 1M 512 tarta-zoot 55.5G 6.42G 49.1G - - 12971868889665384604 8622632123393589527 17% 11% 1.00x ONLINE - 12971868889665384604 0 8622632123393589527 1M 16K nabijaczleweli@tarta:~/store/code/zfs$ cmd/zfs/zfs list -o name,guid,keyguid,ivsetguid,createtxg,objsetid,pbkdf2iters,refratio -r tarta-zoot NAME GUID KEYGUID IVSETGUID CREATETXG OBJSETID PBKDF2ITERS REFRATIO tarta-zoot 1110930838977259561 741529699813639505 - 1 54 0 1.03x tarta-zoot/PAGEFILE.SYS 2202570496672997800 3689529982640017884 - 2163 1539 0 1.07x tarta-zoot/dupa 16941280502417785695 11312442953423259518 - 2274707 1322 1000000000000 1.00x tarta-zoot/etc 17029963068508333530 14852574366795347233 - 3663 1087 0 1.52x tarta-zoot/home 3508163802370032575 9802810070759776956 - 3664 294 0 1.00x tarta-zoot/home/misio 7283672744014848555 14983161489316798151 - 3665 302 0 2.28x tarta-zoot/home/nabijaczleweli 12286744508078616303 5937870537299886218 - 3666 200 0 2.05x tarta-zoot/home/nabijaczleweli/tftp 13551632689932817643 5950522828900813054 - 3667 1095 0 1.00x tarta-zoot/home/root 5203106193060067946 17718025091255443518 - 3668 698 0 2.86x tarta-zoot/home/shared-config 8866040021005142194 16716354482778968577 - 3670 2069 0 1.20x tarta-zoot/home/tymek 9472751824283011822 5251854710505749954 - 3671 1202 0 1.32x tarta-zoot/oldboot 10460192444135730377 15894065034622168157 - 2268398 1232 0 1.01x tarta-zoot/opt 9945621324983170410 6737735639539098405 - 3672 1210 0 1.00x tarta-zoot/opt/icecc 13178238931846132425 10425145983015238428 - 3673 1103 0 2.83x tarta-zoot/opt/swtpm 10172962421514870859 4764783754852521469 - 825669 145132 0 1.87x tarta-zoot/srv 217179989022738337 4492810461439647259 - 3674 2469 0 1.00x tarta-zoot/usr 12214213243060765090 17306702395865262834 - 3675 2477 0 2.58x tarta-zoot/usr/local 7542700368693813134 1059954157997659784 - 3676 2484 0 2.33x tarta-zoot/var 13414177124447929530 11764397504176937123 - 3677 2492 0 1.57x tarta-zoot/var/lib 6969944550407159241 6084753728494937404 - 3678 2499 0 2.34x tarta-zoot/var/tmp 6399468088048343912 1548692824635344277 - 3679 1218 0 3.95x Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13122 Closes #13125
* module: zcommon: zprop: common: zprop_width: namespace exceptionsнаб2022-03-041-2/+6
| | | | | | | | | | | | | Before this, /all/ numerical properties 1 (ZFS_PROP_CREATION, ZPOOL_PROP_SIZE, VDEV_PROP_CAPACITY) would be non-fixed and /all/ numerical properties 5 (ZFS_PROP_COMPRESSRATIO, ZPOOL_PROP_HEALTH, VDEV_PROP_PSIZE) would be 8-wide Realistically, this doesn't appear to be much of a problem Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13125
* log xattr=sa create/remove/update to ZILJitendra Patidar2022-02-221-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As such, there are no specific synchronous semantics defined for the xattrs. But for xattr=on, it does log to ZIL and zil_commit() is done, if sync=always is set on dataset. This provides sync semantics for xattr=on with sync=always set on dataset. For the xattr=sa implementation, it doesn't log to ZIL, so, even with sync=always, xattrs are not guaranteed to be synced before xattr call returns to caller. So, xattr can be lost if system crash happens, before txg carrying xattr transaction is synced. This change adds xattr=sa logging to ZIL on xattr create/remove/update and xattrs are synced to ZIL (zil_commit() done) for sync=always. This makes xattr=sa behavior similar to xattr=on. Implementation notes: The actual logging is fairly straight-forward and does not warrant additional explanation. However, it has been 14 years since we last added new TX types to the ZIL [1], hence this is the first time we do it after the introduction of zpool features. Therefore, here is an overview of the feature activation and deactivation workflow: 1. The feature must be enabled. Otherwise, we don't log the new record type. This ensures compatibility with older software. 2. The feature is activated per-dataset, since the ZIL is per-dataset. 3. If the feature is enabled and dataset is not for zvol, any append to the ZIL chain will activate the feature for the dataset. Likewise for starting a new ZIL chain. 4. A dataset that doesn't have a ZIL chain has the feature deactivated. We ensure (3) by activating on the first zil_commit() after the feature was enabled. Since activating the features requires waiting for txg sync, the first zil_commit() after enabling the feature will be slower than usual. The downside is that this is really a conservative approximation: even if we never append a 'TX_SETSAXATTR' to the ZIL chain, we pay the penalty for feature activation. The upside is that the user is in control of when we pay the penalty, i.e., upon enabling the feature. We ensure (4) by hooking into zil_sync(), where ZIL destroy actually happens. One more piece on feature activation, since it's spread across multiple functions: zil_commit() zil_process_commit_list() if lwb == NULL // first zil_commit since zil_open zil_create() if no log block pointer in ZIL header: if feature enabled and not active: // CASE 1 enable, COALESCE txg wait with dmu_tx that allocated the log block else // log block was allocated earlier than this zil_open if feature enabled and not active: // CASE 2 enable, EXPLICIT txg wait else // already have an in-DRAM LWB if feature enabled and not active: // this happens when we enable the feature after zil_create // CASE 3 enable, EXPLICIT txg wait [1] https://github.com/illumos/illumos-gate/commit/da6c28aaf62fa55f0fdb8004aa40f88f23bf53f0 Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Christian Schwarz <[email protected]> Reviewed-by: Ahelenia Ziemiańska <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Jitendra Patidar <[email protected]> Closes #8768 Closes #9078
* Linux 5.16 compat: don't use XSTATE_XSAVE to save FPU stateAttila Fülöp2022-02-091-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Linux 5.16 moved XSTATE_XSAVE and XSTATE_XRESTORE out of our reach, so add our own XSAVE{,OPT,S} code and use it for Linux 5.16. Please note that this differs from previous behavior in that it won't handle exceptions created by XSAVE an XRSTOR. This is sensible for three reasons. - Exceptions during XSAVE and XRSTOR can only occur if the feature is not supported or enabled or the memory operand isn't aligned on a 64 byte boundary. If this happens something else went terribly wrong, and it may be better to stop execution. - Previously we just printed a warning and didn't handle the fault, this is arguable for the above reason. - All other *SAVE instruction also don't handle exceptions, so this at least aligns behavior. Finally add a test to catch such a regression in the future. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #13042 Closes #13059
* Add `--enable-asan` and `--enable-ubsan` switchesDamian Szuberski2022-02-037-0/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | `configure` now accepts `--enable-asan` and `--enable-ubsan` switches which results in passing `-fsanitize=address` and `-fsanitize=undefined`, respectively, to the compiler. Those flags are enabled in GitHub workflows for ZTS and zloop. Errors reported by both instrumentations are corrected, except for: - Memory leak reporting is (temporarily) suppressed. The cost of fixing them is relatively high compared to the gains. - Checksum computing functions in `module/zcommon/zfs_fletcher*` have UBSan errors suppressed. It is completely impractical to enforce 64-byte payload alignment there due to performance impact. - There's no ASan heap poisoning in `module/zstd/lib/zstd.c`. A custom memory allocator is used there rendering that measure unfeasible. - Memory leaks detection has to be suppressed for `cmd/zvol_id`. `zvol_id` is run by udev with the help of `ptrace(2)`. Tracing is incompatible with memory leaks detection. Reviewed-by: Ahelenia Ziemiańska <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: szubersk <[email protected]> Closes #12928
* Clean up CSTYLEDsнаб2022-01-261-3/+1
| | | | | | | | | | | | | | | | | | | | 69 CSTYLED BEGINs remain, appx. 30 of which can be removed if cstyle(1) had a useful policy regarding CALL(ARG1, ARG2, ARG3); above 2 lines. As it stands, it spits out *both* sysctl_os.c: 385: continuation line should be indented by 4 spaces sysctl_os.c: 385: indent by spaces instead of tabs which is very cool Another >10 could be fixed by removing "ulong" &al. handling. I don't foresee anyone actually using it intentionally (does it even exist in modern headers? why did it in the first place?). Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12993
* module/*.ko: prune .data, global .rodataнаб2022-01-144-21/+10
| | | | | | | | | | | | Evaluated every variable that lives in .data (and globals in .rodata) in the kernel modules, and constified/eliminated/localised them appropriately. This means that all read-only data is now actually read-only data, and, if possible, at file scope. A lot of previously- global-symbols became inlinable (and inlined!) constants. Probably not in a big Wowee Performance Moment, but hey. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12899
* module: zcommon: zprop: fix unused, remove argsusedнаб2021-12-231-0/+1
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12844
* libzfs: fix unused, remove argsusedнаб2021-12-231-5/+4
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12844
* zcommon: pre-iterate over sysfs instead of statting every featureнаб2021-12-164-400/+553
| | | | | | | | | | | | | | | | | | | | | If sufficient memory (<2K, realistically) is available, libzfs_init() can be significantly shorted by iterating over the correct sysfs directory before registrations, we can turn 168 stats into 15/18 syscalls (3 opens (6 if built in), 3 fstats, 6 getdentses, and 3 closes), a tenfoldish reduction; this is probably a bit faster, too. The list is always optional, and registration functions (and one-off users) can simply pass NULL, which will fall back to the previous mechanism Also, don't allocate in zfs_mod_supported_impl, and use use access() instead of stat(), since existence is really what we care about Also, fix pre-prop-checking compat in fallback for built-in ZFS Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12089
* zcommon: *_prop: make all zprop_index_t tables constнаб2021-12-162-28/+28
| | | | | | | | | They're already static, and there's no point in them being R/W and living outside .rodata Reviewed-by: RageLtMan <rageltman@sempervictus> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12836
* FreeBSD supports edonr follow upнаб2021-12-081-1/+2
| | | | | | | | | | | | This chases 269b5dadcfd1d5732cf763dddcd46009a332eae4 (#12735), which touched the actual code but didn't fix the comment Additionally, ignore the name. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Rich Ercolani <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12823
* Vdev Properties FeatureAllan Jude2021-11-303-14/+272
| | | | | | | | | | | | | | | | | | | | | | | | Add properties, similar to pool properties, to each vdev. This makes use of the existing per-vdev ZAP that was added as part of device evacuation/removal. A large number of read-only properties are exposed, many of the members of struct vdev_t, that provide useful statistics. Adds support for read-only "removing" vdev property. Adds the "allocating" property that defaults to "on" and can be set to "off" to prevent future allocations from that top-level vdev. Supports user-defined vdev properties. Includes support for properties.vdev in SYSFS. Co-authored-by: Allan Jude <[email protected]> Co-authored-by: Mark Maybee <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Signed-off-by: Allan Jude <[email protected]> Closes #11711
* Enable edonr in FreeBSDRich Ercolani2021-11-162-20/+2
| | | | | | | | | | | The code is integrated, builds fine, runs fine, there's not really any reason not to. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #12735
* libzfs: add keylocation=https://, backed by fetch(3) or libcurlнаб2021-05-121-1/+5
| | | | | | | | | | | Add support for http and https to the keylocation properly to allow encryption keys to be fetched from the specified URL. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Issue #9543 Closes #9947 Closes #11956
* Fix AVX512BW Fletcher code on AVX512-but-not-BW machinesRomain Dolbeau2021-04-261-1/+7
| | | | | | | | | | Introduce a specific valid function for avx512f+avx512bw (instead of checking only for avx512f). Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Adam Moss <[email protected]> Signed-off-by: Romain Dolbeau <[email protected]> Closes #11937 Closes #11938
* Fix various typosAndrea Gelmini2021-04-021-1/+1
| | | | | | | | | | Correct an assortment of typos throughout the code base. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Andrea Gelmini <[email protected]> Closes #11774
* Allow pool names that look like Solaris disk namesRyan Moeller2021-04-011-6/+0
| | | | | | | | | | | Nothing bad happens if a prefix of your pool name matches a disk name. This is a bit of a silly restriction at this point. Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11781 Closes #11813
* Add "zstd-fast" to help options for "compression" propertyJake Howard2021-03-031-1/+1
| | | | | | | This value does work as expected, and is documented in the manpage. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Jake Howard <[email protected]> Closes #11670
* Add missing checks for unsupported featuresMartin Matuška2021-02-271-0/+2
| | | | | | | | | | | | | After 35ec517 it has become possible to import ZFS pools witn an active org.illumos:edonr feature on FreeBSD, leading to a panic. In addition, "zpool status" reported all pools without edonr as upgradable and "zpool upgrade -v" reported edonr in the list of upgradable features. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Martin Matuska <[email protected]> Closes #11653
* Add "compatibility" property for zpool feature setsColm2021-02-171-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Property to allow sets of features to be specified; for compatibility with specific versions / releases / external systems. Influences the behavior of 'zpool upgrade' and 'zpool create'. Initial man page changes and test cases included. Brief synopsis: zpool create -o compatibility=off|legacy|file[,file...] pool vdev... compatibility = off : disable compatibility mode (enable all features) compatibility = legacy : request that no features be enabled compatibility = file[,file...] : read features from specified files. Only features present in *all* files will be enabled on the resulting pool. Filenames may be absolute, or relative to /etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d (/etc checked first). Only affects zpool create, zpool upgrade and zpool status. ABI changes in libzfs: * New function "zpool_load_compat" to load and parse compat sets. * Add "zpool_compat_status_t" typedef for compatibility parse status. * Add ZPOOL_PROP_COMPATIBILITY to the pool properties enum * Add ZPOOL_STATUS_COMPATIBILITY_ERR to the pool status enum An initial set of base compatibility sets are included in cmd/zpool/compatibility.d, and the Makefile for cmd/zpool is modified to install these in $pkgdatadir/compatibility.d and to create symbolic links to a reasonable set of aliases. Reviewed-by: ericloewe Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Colm Buckley <[email protected]> Closes #11468
* FreeBSD: disable edonr in zfs_mod_supported_feature()Brian Behlendorf2021-02-171-5/+8
| | | | | | | | | | | Rather than conditionally compiling out the edonr code for FreeBSD update zfs_mod_supported_feature() to indicate this feature is unsupported. This ensures that all spa features are defined on every platform, even if they are not supported. Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11605 Issue #11468
* Linux 5.10 compat: use iov_iter in uio structureBrian Behlendorf2020-12-182-288/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As of the 5.10 kernel the generic splice compatibility code has been removed. All filesystems are now responsible for registering a ->splice_read and ->splice_write callback to support this operation. The good news is the VFS provided generic_file_splice_read() and iter_file_splice_write() callbacks can be used provided the ->iter_read and ->iter_write callback support pipes. However, this is currently not the case and only iovecs and bvecs (not pipes) are ever attached to the uio structure. This commit changes that by allowing full iov_iter structures to be attached to uios. Ever since the 4.9 kernel the iov_iter structure has supported iovecs, kvecs, bvevs, and pipes so it's desirable to pass the entire thing when possible. In conjunction with this the uio helper functions (i.e uiomove(), uiocopy(), etc) have been updated to understand the new UIO_ITER type. Note that using the kernel provided uio_iter interfaces allowed the existing Linux specific uio handling code to be simplified. When there's no longer a need to support kernel's older than 4.9, then it will be possible to remove the iovec and bvec members from the uio structure and always use a uio_iter. Until then we need to maintain all of the existing types for older kernels. Some additional refactoring and cleanup was included in this change: - Added checks to configure to detect available iov_iter interfaces. Some are available all the way back to the 3.10 kernel and are used when available. In particular, uio_prefaultpages() now always uses iov_iter_fault_in_readable() which is available for all supported kernels. - The unused UIO_USERISPACE type has been removed. It is no longer needed now that the uio_seg enum is platform specific. - Moved zfs_uio.c from the zcommon.ko module to the Linux specific platform code for the zfs.ko module. This gets it out of libzfs where it was never needed and keeps this Linux specific code out of the common sources. - Removed unnecessary O_APPEND handling from zfs_iter_write(), this is redundant and O_APPEND is already handled in zfs_write(); Reviewed-by: Colin Ian King <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11351
* FreeBSD: Implement sysctl for fletcher4 implRyan Moeller2020-12-111-8/+59
| | | | | | | | | | | | There is a tunable to select the fletcher 4 checksum implementation on Linux but it was not present in FreeBSD. Implement the sysctl handler for FreeBSD and use ZFS_MODULE_PARAM_CALL to provide the tunable on both platforms. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11270
* FreeBSD: Do zcommon_init sooner to avoid FPU panicRyan Moeller2020-12-091-1/+1
| | | | | | | | | | | | | | | | | | | | | There has been a panic affecting some system configurations where the thread FPU context is disturbed during the fletcher 4 benchmarks, leading to a panic at boot. module_init() registers zcommon_init to run in the last subsystem (SI_SUB_LAST). Running it as soon as interrupts have been configured (SI_SUB_INT_CONFIG_HOOKS) makes sure we have finished the benchmarks before we start doing other things. While it's not clear *how* the FPU context was being disturbed, this does seem to avoid it. Add a module_init_early() macro to run zcommon_init() at this earlier point on FreeBSD. On Linux this is defined as module_init(). Authored by: Konstantin Belousov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11302