Commit message | Author | Age | Files | Lines
...
* Add Linux namespace delegation support | Will Andrews | 2022-06-10 | 33 | -15/+1166
  This allows ZFS datasets to be delegated to a user/mount namespace.
  Within that namespace, only the delegated datasets are visible.
  Works very similarly to Zones/Jails on other ZFS OSes.

  As a user:
  ```
  $ unshare -Um
  $ zfs list
  no datasets available
  $ echo $$
  1234
  ```

  As root:
  ```
  # zfs list
  NAME                            ZONED  MOUNTPOINT
  containers                      off    /containers
  containers/host                 off    /containers/host
  containers/host/child           off    /containers/host/child
  containers/host/child/gchild    off    /containers/host/child/gchild
  containers/unpriv               on     /unpriv
  containers/unpriv/child         on     /unpriv/child
  containers/unpriv/child/gchild  on     /unpriv/child/gchild
  # zfs zone /proc/1234/ns/user containers/unpriv
  ```

  Back to the user namespace:
  ```
  $ zfs list
  NAME                     USED  AVAIL  REFER  MOUNTPOINT
  containers               129M  47.8G    24K  /containers
  containers/unpriv        128M  47.8G    24K  /unpriv
  containers/unpriv/child  128M  47.8G   128M  /unpriv/child
  ```

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Will Andrews <[email protected]>
  Signed-off-by: Allan Jude <[email protected]>
  Signed-off-by: Mateusz Piotrowski <[email protected]>
  Co-authored-by: Allan Jude <[email protected]>
  Co-authored-by: Mateusz Piotrowski <[email protected]>
  Sponsored-by: Buddy <https://buddy.works>
  Closes #12263
* Revert parts of 938cfeb0f27303721081223816d4f251ffeb1767 | Allan Jude | 2022-06-10 | 1 | -16/+0
  When reading and writing the UID/GID, we always want the value
  relative to the root user namespace; the kernel will take care of
  remapping this to the user namespace for us.

  Calling from_kuid(user_ns, uid) with an unmapped uid returns -1,
  because that uid is outside the scope of that namespace. As a result,
  the files inside the namespace all appear to be owned by 'nobody',
  and chmod or chown cannot be called on them.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Allan Jude <[email protected]>
  Closes #12263
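The failure mode described in this revert can be modeled in a few lines. This is a toy sketch, not the kernel API: the `UidMap` type, the Python `from_kuid()` stand-in, and the container mapping are all assumptions made for illustration, and 65534 is assumed to be the overflow uid that shows up as 'nobody'.

```python
from dataclasses import dataclass

OVERFLOW_UID = 65534  # assumed overflow uid, usually displayed as 'nobody'


@dataclass(frozen=True)
class UidMap:
    """One uid_map extent: `count` ids starting at `inside` map to `outside`."""
    inside: int
    outside: int
    count: int


def from_kuid(uid_map, kuid):
    """Toy stand-in for the kernel helper: translate a root-namespace uid
    into the namespace described by uid_map, or return -1 if unmapped."""
    for extent in uid_map:
        if extent.outside <= kuid < extent.outside + extent.count:
            return extent.inside + (kuid - extent.outside)
    return -1


def uid_seen_in_namespace(uid_map, kuid):
    """What a process inside the namespace observes as the file owner."""
    mapped = from_kuid(uid_map, kuid)
    return OVERFLOW_UID if mapped == -1 else mapped


# Hypothetical container mapping: ids 0..65535 inside -> 100000..165535 outside.
container_map = [UidMap(inside=0, outside=100000, count=65536)]
```

In this model a file owned by root-namespace uid 1000 has no mapping in the container, so the translation yields -1 and the owner is presented as the overflow uid.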
* AVL: Remove obsolete branching optimizations | Alexander Motin | 2022-06-09 | 1 | -20/+4
  Modern Clang and GCC can successfully implement simple conditions
  without branching, using math and flag operations. Use of arrays for
  translation no longer helps as much as it did 14+ years ago.

  Disassembly of the code generated by Clang 13.0.0 on FreeBSD 13.1,
  Clang 14.0.4 on FreeBSD 14 and GCC 10.2.1 on Debian 11 with this
  change still shows no branching instructions.

  Profiling of the CPU-bound scan stage of a sorted scrub shows a
  reproducible reduction of time spent inside avl_find() from 6.52% to
  4.58%.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Alexander Motin <[email protected]>
  Sponsored-By: iXsystems, Inc.
  Closes #13540
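To make "arrays for translation" concrete, the sketch below pairs two small lookup tables with arithmetic expressions that compute the same values. The table names and contents are assumptions written in the style of such code, not copied from the commit, and Python cannot show the generated machine code; it only shows that the tables and the expressions agree.

```python
# Lookup tables in the style of the old approach (names are illustrative).
CHILD_TO_BALANCE = (-1, 1)    # child index 0/1 -> balance delta
BALANCE_TO_CHILD = (0, 0, 1)  # indexed by 1 + balance, balance in {-1, 0, 1}


def child_to_balance(child):
    """Arithmetic replacement for the first table: 0 -> -1, 1 -> +1."""
    return 2 * child - 1


def balance_to_child(balance):
    """Arithmetic replacement for the second table: a comparison result
    used directly as an index."""
    return int(balance > 0)
```

A compiler can turn expressions like these into flag and arithmetic instructions, which is the behavior the commit message reports for current Clang and GCC.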
* libzfs: Rename msg bufs to errbuf for consistency | Ryan Moeller | 2022-06-09 | 1 | -135/+138
  `libzfs_pool.c` uses the name `msg` where everywhere else in libzfs
  uses `errbuf` for the error message buffer.

  Use the name consistent with the rest of libzfs and use ERRBUFLEN
  instead of 1024.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #13539
* libzfs: Define the de facto standard errbuf size | Ryan Moeller | 2022-06-09 | 8 | -52/+52
  Every errbuf array in libzfs is 1024 chars.

  Define ERRBUFLEN in a shared header, and use it.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #13539
* zvol: Support blk-mq for better performance | Tony Hutter | 2022-06-09 | 18 | -148/+1437
  Add support for the kernel's block multiqueue (blk-mq) interface in
  the zvol block driver. blk-mq creates multiple request queues on
  different CPUs rather than having a single request queue. This can
  improve zvol performance with multithreaded reads/writes.

  This implementation uses the blk-mq interfaces on 4.13 or newer
  kernels. Building against older kernels will fall back to the older
  BIO interfaces.

  Note that you must set the `zvol_use_blk_mq` module param to enable
  the blk-mq API. It is disabled by default.

  In addition, this commit lets the zvol blk-mq layer process whole
  `struct request` IOs at a time, rather than breaking them down into
  their individual BIOs. This reduces dbuf lock contention and overhead
  versus the legacy zvol submit_bio() codepath.

  sequential dd to one zvol, 8k volblocksize, no O_DIRECT:

      legacy submit_bio()    292MB/s write    453MB/s read
      this commit            453MB/s write    885MB/s read

  It also introduces a new `zvol_blk_mq_chunks_per_thread` module
  parameter. This parameter sets how many volblocksize'd chunks each
  zvol thread processes. It can be used to tune your zvols for better
  read vs write performance (higher values favor write, lower favor
  read).

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Ahelenia Ziemiańska <[email protected]>
  Reviewed-by: Tony Nguyen <[email protected]>
  Signed-off-by: Tony Hutter <[email protected]>
  Closes #13148
  Issue #12483
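The dd figures quoted in this commit can be turned into speedup ratios, and the chunk parameter into a byte count per thread, with a small worked example. The helper names are hypothetical, and the value of 256 chunks is picked only for illustration; the commit message does not state a default.

```python
def speedup(before_mb_s, after_mb_s):
    """Ratio of new throughput to old throughput."""
    return after_mb_s / before_mb_s


def bytes_per_thread(volblocksize, chunks_per_thread):
    """Amount of data one zvol thread handles per unit of work, assuming
    the parameter is a plain multiplier on volblocksize."""
    return volblocksize * chunks_per_thread


write_gain = speedup(292, 453)  # legacy submit_bio() vs this commit, write
read_gain = speedup(453, 885)   # legacy submit_bio() vs this commit, read

# 8k volblocksize as in the benchmark; 256 chunks is an illustrative value.
work_unit = bytes_per_thread(8 * 1024, 256)
```

With the numbers above, the write path gains roughly 1.55x and the read path roughly 1.95x, and an 8k volblocksize with 256 chunks gives each thread 2 MiB of work at a time.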
* Introduce BLAKE3 checksums as an OpenZFS feature | Tino Reichardt | 2022-06-08 | 53 | -52/+22804
  This commit adds BLAKE3 checksums to OpenZFS. BLAKE3 has similar
  performance to Edon-R, but without the caveats around the latter.

  Homepage of BLAKE3: https://github.com/BLAKE3-team/BLAKE3
  Wikipedia: https://en.wikipedia.org/wiki/BLAKE_(hash_function)#BLAKE3

  Short description from Wikipedia:

    BLAKE3 is a cryptographic hash function based on Bao and BLAKE2,
    created by Jack O'Connor, Jean-Philippe Aumasson, Samuel Neves, and
    Zooko Wilcox-O'Hearn. It was announced on January 9, 2020, at Real
    World Crypto. BLAKE3 is a single algorithm with many desirable
    features (parallelism, XOF, KDF, PRF and MAC), in contrast to BLAKE
    and BLAKE2, which are algorithm families with multiple variants.
    BLAKE3 has a binary tree structure, so it supports a practically
    unlimited degree of parallelism (both SIMD and multithreading) given
    enough input. The official Rust and C implementations are
    dual-licensed as public domain (CC0) and the Apache License.

  Along with adding the BLAKE3 hash into the OpenZFS infrastructure, a
  new benchmarking file called chksum_bench was introduced. When read,
  it reports the speed of the available checksum functions.

  On Linux: cat /proc/spl/kstat/zfs/chksum_bench
  On FreeBSD: sysctl kstat.zfs.misc.chksum_bench

  This is an example output of an i3-1005G1 test system with Debian 11:

    implementation      1k      4k     16k     64k    256k      1m      4m
    edonr-generic     1196    1602    1761    1749    1762    1759    1751
    skein-generic      546     591     608     615     619     612     616
    sha256-generic     240     300     316     314     304     285     276
    sha512-generic     353     441     467     476     472     467     426
    blake3-generic     308     313     313     313     312     313     312
    blake3-sse2        402    1289    1423    1446    1432    1458    1413
    blake3-sse41       427    1470    1625    1704    1679    1607    1629
    blake3-avx2        428    1920    3095    3343    3356    3318    3204
    blake3-avx512      473    2687    4905    5836    5844    5643    5374

  Output on Debian 5.10.0-10-amd64 system: (Ryzen 7 5800X)

    implementation      1k      4k     16k     64k    256k      1m      4m
    edonr-generic     1840    2458    2665    2719    2711    2723    2693
    skein-generic      870     966     996     992    1003    1005    1009
    sha256-generic     415     442     453     455     457     457     457
    sha512-generic     608     690     711     718     719     720     721
    blake3-generic     301     313     311     309     309     310     310
    blake3-sse2        343    1865    2124    2188    2180    2181    2186
    blake3-sse41       364    2091    2396    2509    2463    2482    2488
    blake3-avx2        365    2590    4399    4971    4915    4802    4764

  Output on Debian 5.10.0-9-powerpc64le system: (POWER 9)

    implementation      1k      4k     16k     64k    256k      1m      4m
    edonr-generic     1213    1703    1889    1918    1957    1902    1907
    skein-generic      434     492     520     522     511     525     525
    sha256-generic     167     183     187     188     188     187     188
    sha512-generic     186     216     222     221     225     224     224
    blake3-generic     153     152     154     153     151     153     153
    blake3-sse2        391    1170    1366    1406    1428    1426    1414
    blake3-sse41       352    1049    1212    1174    1262    1258    1259

  Output on Debian 5.10.0-11-arm64 system: (Pi400)

    implementation      1k      4k     16k     64k    256k      1m      4m
    edonr-generic      487     603     629     639     643     641     641
    skein-generic      271     299     303     308     309     309     307
    sha256-generic     117     127     128     130     130     129     130
    sha512-generic     145     165     170     172     173     174     175
    blake3-generic      81      29      71      89      89      89      89
    blake3-sse2        112     323     368     379     380     371     374
    blake3-sse41       101     315     357     368     369     364     360

  Structurally, the new code is mainly split into these parts:
  - 1x cross platform generic c variant: blake3_generic.c
  - 4x assembly for X86-64 (SSE2, SSE4.1, AVX2, AVX512)
  - 2x assembly for ARMv8 (NEON converted from SSE2)
  - 2x assembly for PPC64-LE (POWER8 converted from SSE2)
  - one file for switching between the implementations

  Note the PPC64 assembly requires the VSX instruction set and the
  kfpu_begin() / kfpu_end() calls on PowerPC were updated accordingly.

  Reviewed-by: Felix Dörre <[email protected]>
  Reviewed-by: Ahelenia Ziemiańska <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tino Reichardt <[email protected]>
  Co-authored-by: Rich Ercolani <[email protected]>
  Closes #10058
  Closes #12918
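A benchmark table in the layout quoted in this commit is easy to post-process. The sketch below parses such text and picks the fastest implementation for a block size. It assumes the input has exactly the column layout of the samples in the commit message; real kstat or sysctl output may carry extra header lines that would need stripping first. The function names are hypothetical, and the sample rows are a subset of the i3-1005G1 table.

```python
SAMPLE = """\
implementation 1k 4k 16k 64k 256k 1m 4m
edonr-generic 1196 1602 1761 1749 1762 1759 1751
sha256-generic 240 300 316 314 304 285 276
blake3-generic 308 313 313 313 312 313 312
blake3-avx2 428 1920 3095 3343 3356 3318 3204
blake3-avx512 473 2687 4905 5836 5844 5643 5374
"""


def parse_chksum_bench(text):
    """Return {implementation: {block_size: speed}} from table text."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    sizes = header[1:]  # column labels after 'implementation'
    return {row[0]: dict(zip(sizes, map(int, row[1:]))) for row in rows}


def fastest(table, size):
    """Name of the implementation with the highest figure at `size`."""
    return max(table, key=lambda impl: table[impl][size])


table = parse_chksum_bench(SAMPLE)
```

On the sample rows, `fastest(table, "1k")` picks edonr-generic while `fastest(table, "64k")` picks blake3-avx512, which matches the pattern visible in the tables: the SIMD variants only pull ahead once the input is large enough.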
* autoconf: AC_MSG_CHECKING consistency | Brian Behlendorf | 2022-06-01 | 9 | -17/+17
  Make the wording more consistent for the kernel AC_MSG_CHECKING
  output (e.g. "checking whether ..."). Additionally, group some of the
  VFS interface checks with the others. No functional change.

  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Attila Fülöp <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13529
* Linux 5.19 compat: asm/fpu/internal.h | Brian Behlendorf | 2022-06-01 | 2 | -2/+23
  As of the Linux 5.19 kernel the asm/fpu/internal.h header was
  entirely removed. It has been effectively empty since the 5.16 kernel
  and provides no required functionality.

  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Attila Fülöp <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13529
* Remove wrong assertion in log spacemap | Alexander Motin | 2022-06-01 | 1 | -6/+0
  It is typical, but not generally true, that if the log summary has
  more blocks it must also have unflushed metaslabs. Normally, with
  metaslabs flushed in order, it works, but there are known exceptions,
  such as device removal or a metaslab being loaded during its flush
  attempt.

  Before 600a02b8844, if spa_flush_metaslabs() hit a loading metaslab it
  usually stopped (unless memlimit was also exceeded), but now it may
  flush more metaslabs, just skipping that particular one. This
  increased the chances of the assertion firing when the skipped
  metaslab is flushed on the next iteration, if all other metaslabs in
  that summary entry were already flushed out of order.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Alexander Motin <[email protected]>
  Sponsored-By: iXsystems, Inc.
  Closes #13486
  Closes #13513
* Corrected parameters for zstd early abort | Rich Ercolani | 2022-05-31 | 1 | -2/+2
  That'll teach me to try and recall them from the definition.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Rich Ercolani <[email protected]>
  Closes #13519
* Fix typo in zil_commit() comment block | Allan Jude | 2022-05-31 | 1 | -1/+1
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Ryan Moeller <[email protected]>
  Signed-off-by: Allan Jude <[email protected]>
  Closes #13518
* Linux 5.18 compat: META | Brian Behlendorf | 2022-05-31 | 1 | -1/+1
  Update the META file to reflect compatibility with the 5.18 kernel.

  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13527
* Linux 5.19 compat: zap_flags_t conflict | Brian Behlendorf | 2022-05-31 | 1 | -0/+5
  As of the Linux 5.19 kernel an identically named zap_flags_t typedef
  is declared in the include/linux/mm_types.h Linux header. Sadly, the
  inclusion of this header cannot be easily avoided. To resolve the
  conflict, a #define is used to remap the name in the OpenZFS sources
  when building against the Linux kernel.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13515
* Linux 5.19 compat: bdev_start_io_acct() / bdev_end_io_acct() | Brian Behlendorf | 2022-05-31 | 2 | -30/+62
  As of the Linux 5.19 kernel the disk_*_io_acct() helper functions
  have been replaced by the bdev_*_io_acct() functions.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13515
* Linux 5.19 compat: aops->read_folio() | Brian Behlendorf | 2022-05-31 | 3 | -0/+46
  As of the Linux 5.19 kernel the readpage() address space operation
  has been replaced by read_folio().

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13515
* Linux 5.19 compat: blkdev_issue_secure_erase() | Brian Behlendorf | 2022-05-31 | 2 | -9/+81
  Linux 5.19 commit torvalds/linux@44abff2c0 splits the secure erase
  functionality from the blkdev_issue_discard() function. The
  blkdev_issue_secure_erase() function must now be called to issue a
  secure erase.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13515
* Linux 5.19 compat: bdev_max_secure_erase_sectors() | Brian Behlendorf | 2022-05-31 | 3 | -24/+43
  Linux 5.19 commit torvalds/linux@44abff2c0 removed the
  blk_queue_secure_erase() helper function. The preferred interface is
  now the bdev_max_secure_erase_sectors() function, used to check for
  secure erase support.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13515
* Linux 5.19 compat: bdev_max_discard_sectors() | Brian Behlendorf | 2022-05-31 | 5 | -6/+51
  Linux 5.19 commit torvalds/linux@70200574cc removed the
  blk_queue_discard() helper function. The preferred interface is now
  the bdev_max_discard_sectors() function, used to check for discard
  support.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13515
* Linux 5.18 compat: bio_alloc() | Brian Behlendorf | 2022-05-31 | 1 | -14/+39
  As of the Linux 5.18 kernel bio_alloc() expects a block_device struct
  as an argument. This removes the need for the bio_set_dev()
  compatibility code for 5.18 and newer kernels.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13515
* Fix inflated quiesce time caused by lwb_tx during zil_commit() | Kevin Jin | 2022-05-26 | 2 | -21/+76
  In the current zil_commit() process, the transaction lwb_tx is
  assigned in zil_lwb_write_issue() and committed in
  zil_lwb_flush_vdevs_done(). Thus, during the lwb write-out process,
  the txg is held in the open or quiescing state until
  zil_lwb_flush_vdevs_done() is called. If the zil's zio latency is
  high, this causes txg_sync_thread() to starve.

  The goal here is to defer waiting for zil_lwb_flush_vdevs_done() to
  the 'syncing' txg state, that is, to zil_sync().

  This patch achieves that goal without holding the transaction. A new
  function, zil_lwb_flush_wait_all(), is introduced. It waits for the
  completion of all zil_lwb_flush_vdevs_done() calls for the given txg.

  Reviewed-by: Alexander Motin <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Prakash Surya <[email protected]>
  Signed-off-by: jxdking <[email protected]>
  Closes #12321
* Replace EXTRA_DIST with dist_noinst_DATA | Brian Behlendorf | 2022-05-26 | 27 | -57/+56
  The EXTRA_DIST variable is ignored when used in the FALSE conditional
  of a Makefile.am. This results in the `make dist` target omitting
  these files from the generated tarball unless CONFIG_USER is defined.
  This issue can be avoided by switching to use the dist_noinst_DATA
  variable which is handled as expected by autoconf.

  This change also adds support for --with-config=dist as an alias for
  --with-config=srpm and updates the GitHub workflows to use it.

  Reviewed-by: Ahelenia Ziemiańska <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13459
  Closes #13505
* Silence unused-but-set-variable warning | Ryan Moeller | 2022-05-25 | 1 | -1/+1
  This was breaking the kmod port build on FreeBSD with Clang 13.

  Use the same trick as we do for ASSERT() to make DNODE_VERIFY() use
  its parameter at compile time without actually using it at run time
  in non-debug builds.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #13507
* More speculative prefetcher improvements | Alexander Motin | 2022-05-25 | 5 | -101/+133
  - Make prefetch distance adaptive: up to 4MB the prefetch distance
    doubles for every hit, same as before, but after that it grows by
    1/8 every time the prefetch read does not complete in time to
    satisfy the demand. My tests show that 4MB is sufficient for a wide
    NVMe pool to saturate a single reader thread at 2.5GB/s, while the
    new 64MB maximum allows the same thread to reach 1.5GB/s on a wide
    HDD pool. Further distance increases may raise the speed even more,
    but less dramatically and with higher latency.
  - Allow early reuse of inactive prefetch streams: streams that never
    saw hits can be reused immediately if there is demand, while others
    can be reused after 1s of inactivity, starting with the oldest.
    After 2s of inactivity streams are deleted to free resources, same
    as before. This increases strided read performance on an HDD pool
    several times over in the presence of simultaneous random reads,
    which previously filled the zfetch_max_streams limit for seconds
    and so blocked most prefetch.
  - Always issue intermediate indirect block reads with SYNC priority.
    Each of those reads, if delayed for long, may delay up to 1024
    other block prefetches, which may hurt wide pools.

  Reviewed-by: Allan Jude <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Alexander Motin <[email protected]>
  Sponsored-By: iXsystems, Inc.
  Closes #13452
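The adaptive distance rule from the first bullet of this commit can be written down as a small state update. This is a sketch of the rule as the commit message words it, not the actual prefetcher code: the function name, the `prefetch_was_late` flag, and the rounding are assumptions; only the 4MB doubling limit, the 1/8 growth step, and the 64MB maximum come from the message.

```python
MIB = 1 << 20
DOUBLING_LIMIT = 4 * MIB  # up to here the distance doubles on every hit
MAX_DISTANCE = 64 * MIB   # new maximum named in the commit message


def next_distance(distance, prefetch_was_late):
    """Return the prefetch distance to use after one more stream hit.

    prefetch_was_late is True when the prefetch read did not complete in
    time to satisfy the demand read.
    """
    if distance < DOUBLING_LIMIT:
        # Early phase: double on every hit, capped at the 4MB limit.
        return min(distance * 2, DOUBLING_LIMIT)
    if prefetch_was_late:
        # Late phase: grow by 1/8 only when prefetch fell behind demand.
        return min(distance + distance // 8, MAX_DISTANCE)
    return distance
```

Under this model a stream on a fast pool settles at 4MB and stays there, while a stream whose prefetches keep arriving late creeps upward in 12.5% steps until it reaches 64MB.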
* automake: don't install /e/d/zfs or /e/z/zfs-functions +x | наб | 2022-05-25 | 1 | -8/+8
  _SCRIPTS means it's made +x when installing; _DATA is made -x.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #13496
  Closes #13503
* Cancel in-progress rebuilds when we finish removal | Paul Dagnelie | 2022-05-25 | 1 | -0/+2
  This issue was discovered by zloop runs. When a mirror or other
  redundant top-level vdev has a disk failure, and the disk is
  replaced, the rebuild process occurs. A removal can happen while this
  is in progress. If the removal completes before the rebuild does, the
  removal process will try to free the vdev that is still in use.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Paul Dagnelie <[email protected]>
  Closes #13498
* rpm: Keep debug symbols if configured with '--enable-debuginfo' | Umer Saleem | 2022-05-25 | 1 | -0/+4
  Do not strip debug information from packages if '--enable-debuginfo'
  is configured.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Umer Saleem <[email protected]>
  Closes #13500
* Standardize RHEL version check in packages | Brian Behlendorf | 2022-05-25 | 3 | -3/+3
  This is a follow up to 3c356622994 which standardizes how the RHEL
  version check is done. This simpler "0%{?rhel}" check is used
  elsewhere in the packages so we do the same here.

  Reviewed-by: Neal Gompa <[email protected]>
  Reviewed-by: Rich Ercolani <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13501
* Unbreak zstd build on sparc64 | Rich Ercolani | 2022-05-25 | 1 | -1/+1
  It turns out that wrapping the atomic macro in () breaks the build on
  Linux/SPARC64. Oops.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Rich Ercolani <[email protected]>
  Closes #13506
* Switch sed -E to -r for better portability | Brian Behlendorf | 2022-05-25 | 1 | -1/+1
  GNU sed 4.1.2 does not support the -E flag and this version is used
  by some cross-compiling tool chains. Switch -E to -r which is
  understood.

  Reviewed-by: Ahelenia Ziemiańska <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13502
* rpm: Use the correct version-release information in dependencies | Neal Gompa (ニール・ゴンパ) | 2022-05-24 | 1 | -15/+14
  This tightly links the subpackages together and ensures that
  everything is upgraded together.

  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Neal Gompa <[email protected]>
  Closes #13489
* Refactor Log Size Limit | Alexander Motin | 2022-05-24 | 5 | -33/+53
  The original Log Size Limit implementation blocked all writes once
  the limit was reached, until the TXG was committed and the log was
  freed. This caused huge delays followed by speed spikes in
  application writes. This implementation instead smoothly throttles
  writes, using exactly the same mechanism as used for dirty data.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: jxdking <[email protected]>
  Signed-off-by: Alexander Motin <[email protected]>
  Sponsored-By: iXsystems, Inc.
  Issue #12284
  Closes #13476
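The difference between blocking at a limit and throttling smoothly can be sketched with a delay curve of the kind documented for ZFS dirty data, where the delay grows as usage approaches the limit. Everything here is an assumption for illustration: the function name, the threshold arguments, and the default scale and cap values are not taken from this commit, and the real code may compute the delay differently.

```python
def write_delay_ns(used, min_threshold, max_limit,
                   delay_scale=500_000, delay_max_ns=100_000_000):
    """Per-write delay in nanoseconds for a given amount of used log space.

    Below min_threshold writes are not delayed. Between the threshold and
    the limit the delay follows scale * (used - min) / (max - used), which
    rises slowly at first and steeply near the limit. At or beyond the
    limit the delay is capped at delay_max_ns.
    """
    if used <= min_threshold:
        return 0
    if used >= max_limit:
        return delay_max_ns
    delay = delay_scale * (used - min_threshold) // (max_limit - used)
    return min(delay, delay_max_ns)
```

With a threshold of 60 and a limit of 100 (arbitrary units), usage of 80 yields a 0.5 ms delay and usage of 90 yields 1.5 ms, so writers slow down progressively instead of stalling all at once when the limit is hit.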
* Tiered early abort, zstd edition | Rich Ercolani | 2022-05-24 | 4 | -6/+134
  It turns out that "do LZ4 and zstd-1 both fail" is a great heuristic
  for "don't even bother trying higher zstd tiers".

  By way of illustration:

    $ cat /incompress | mbuffer | zfs recv -o compression=zstd-12 evenfaster/lowcomp_1M_zstd12_normal
    summary: 39.8 GiByte in 3min 40.2sec - average of 185 MiB/s
    $ echo 3 | sudo tee /sys/module/zzstd/parameters/zstd_lz4_pass
    3
    $ cat /incompress | mbuffer -m 4G | zfs recv -o compression=zstd-12 evenfaster/lowcomp_1M_zstd12_patched
    summary: 39.8 GiByte in 48.6sec - average of 839 MiB/s
    $ sudo zfs list -p -o name,used,lused,ratio evenfaster/lowcomp_1M_zstd12_normal evenfaster/lowcomp_1M_zstd12_patched
    NAME                                  USED         LUSED        RATIO
    evenfaster/lowcomp_1M_zstd12_normal   39549931520  42721221632  1.08
    evenfaster/lowcomp_1M_zstd12_patched  39626399744  42721217536  1.07
    $ python3 -c "print(39626399744 - 39549931520)"
    76468224
    $

  I'll take 76 MB out of 42 GB for > 4x speedup.

  Reviewed-by: Allan Jude <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Kjeld Schouten <[email protected]>
  Reviewed-by: Ahelenia Ziemiańska <[email protected]>
  Signed-off-by: Rich Ercolani <[email protected]>
  Closes #13244
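The heuristic in this commit can be sketched without LZ4 or zstd bindings. The example below deliberately swaps in two cheap zlib levels for the LZ4 and zstd-1 probes and zlib level 9 for the expensive zstd tier, so it shows the control flow only, not the real compressors. The function names and the 12.5% minimum saving are assumptions.

```python
import os
import zlib


def compress_or_none(data, level, min_saving=0.125):
    """Return compressed bytes, or None when the saving is below min_saving."""
    out = zlib.compress(data, level)
    return out if len(out) <= len(data) * (1 - min_saving) else None


def tiered_compress(data, cheap_levels=(1, 2), target_level=9):
    """Skip the expensive level when every cheap probe fails to compress.

    The probes run in order and stop at the first success, mirroring
    "try LZ4, then zstd-1, and give up only if both fail".
    """
    if all(compress_or_none(data, lvl) is None for lvl in cheap_levels):
        return None  # treat as incompressible; caller stores it uncompressed
    return compress_or_none(data, target_level)


incompressible = os.urandom(64 * 1024)  # random bytes do not compress
compressible = b"a" * (64 * 1024)
```

Calling `tiered_compress(incompressible)` returns `None` after two cheap attempts and never runs level 9, while `tiered_compress(compressible)` goes on to the expensive level. That trade is the one the benchmark in the commit message measures: a slightly worse ratio in exchange for not spending time on data that will not compress.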
* FreeBSD: libspl: Add locking around statfs globals | Ryan Moeller | 2022-05-24 | 1 | -1/+15
  Makes getmntent and getmntany thread-safe for external consumers of
  libzfs zpool_disable_datasets, zfs_iter_mounted,
  libzfs_mnttab_update, libzfs_mnttab_find.

  Reviewed-by: Alexander Motin <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #13484
* Modified ncompress requirement in RPM to exclude RHEL9 | Rich Ercolani | 2022-05-24 | 3 | -4/+7
  The bug this was working around is no longer present.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Rich Ercolani <[email protected]>
  Closes #13480
  Closes #13490
* zed: Take no action on scrub/resilver checksum errors | Brian Behlendorf | 2022-05-24 | 1 | -0/+20
  When scrubbing/resilvering a pool it can be counterproductive to
  cancel the scan and kick off a replace operation to a hot spare when
  encountering checksum errors. In this case, the best course of action
  is to allow the scrub/resilver to complete as quickly as possible and
  to keep the vdevs fully online if possible.

  Realistically, this is less of an issue for a RAIDZ since a
  traditional resilver must be used and checksums will be verified.
  However, this is not the case for a mirror or dRAID pool which is
  sequentially resilvered and checksum verification is deferred until
  after the replace operation completes.

  Regardless, we apply this policy to all pool types since it's a good
  idea for all vdevs. Degrading additional vdevs has the potential to
  make a bad situation worse.

  Note the checksum errors will still be reported as both an event and
  by `zpool status`. This change only prevents the ZED from proactively
  taking any action.

  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Tony Nguyen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13499
* Verify BPs in spa_load_verify_cb() and dsl_scan_visitbp() | Brian Behlendorf | 2022-05-20 | 3 | -24/+31
  We want `zpool import` to be highly robust and never panic, even when
  encountering corrupt metadata. This is already handled in the
  arc_read() code path, which covers most cases, but
  spa_load_verify_cb() relies on zio_read() and is responsible for
  verifying the block pointer.

  During import it is also possible to encounter block pointers which
  contain ZIO_COMPRESS_INHERIT and ZIO_CHECKSUM_INHERIT values. Relax
  the verification function slightly to allow this.

  Furthermore, extend dsl_scan_recurse() to verify the block pointer
  contents of level zero blocks which are not of type DMU_OT_DNODE or
  DMU_OT_OBJSET. This is handled by arc_read() in the other cases.

  Reviewed-by: Paul Dagnelie <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13124
  Closes #13360
* zdb: Fix handling of nul termination in symlink targets | Mark Johnston | 2022-05-20 | 1 | -1/+6
  The SA attribute containing the symlink target does not include a nul
  terminator, so when printing the target zdb would sometimes include
  garbage at the end of the string.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Ryan Moeller <[email protected]>
  Signed-off-by: Mark Johnston <[email protected]>
  Closes #13482
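The bug class behind this zdb fix is easy to reproduce with a byte buffer. The sketch below models a stored symlink target that is length-delimited rather than nul-terminated, followed by stale bytes. The buffer contents and function names are made up for illustration and say nothing about how zdb itself lays out memory.

```python
def read_c_string(buf):
    """Mimic printing with "%s": read until the first NUL byte (or the end)."""
    end = buf.find(b"\0")
    return buf if end == -1 else buf[:end]


def read_symlink_target(buf, length):
    """Use the stored attribute length instead of relying on a terminator."""
    return buf[:length]


target = b"/etc/hosts"
# The target is followed directly by stale bytes; no NUL sits right after it.
buf = target + b"\x7fjunk\0"
```

Reading `buf` as a C string runs past the end of the target and picks up the stale bytes, while slicing by the stored length returns exactly `/etc/hosts`. A fix in C has the same two options: bound the print by the known length or copy into a buffer that is explicitly terminated.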
* linux: libshare: smb: don't swallow net(1) errors | наб | 2022-05-18 | 1 | -2/+2
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #13191
  Closes #13470
* libzfs: return (allocated) strings instead of filling buffers | наб | 2022-05-18 | 7 | -1443/+1428
  This also expands the zfs version output from 127 characters to
  However Many Are Actually Set.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #13330
* linux: libzfs: simplify module-loaded check | наб | 2022-05-18 | 4 | -83/+103
  The short-path is now one access() call, we always modprobe zfs
  (ZFS_MODULE_LOADING, which doesn't use the libzfs boolean parsing, is
  gone), and we use a simple inotify IN_CREATE loop with a timerfd
  timeout rather than 10ms kernel-style polling.

  There's one substantial difference: ZFS_MODULE_TIMEOUT=-1 now means
  "never give up", rather than "wait 10 minutes".

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #13330
* Remove final K&R definitions | наб | 2022-05-18 | 2 | -6/+6
  Clang trunk now warns -Wstrict-prototypes on this, and they're
  removed in C2x.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #13447
* libspl/include: remove unused/empty headers | наб | 2022-05-18 | 24 | -604/+6
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #13447
* kmodtool: cleanup | наб | 2022-05-18 | 1 | -5/+1
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #13447
* rpm: don't spec obsolete_name/version anymore | наб | 2022-05-18 | 3 | -60/+10
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #13447
* Add make regen-tests to regenerate the test bundle | наб | 2022-05-18 | 2 | -1/+31
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #13447
* zed: support subject as header in zed_notify_email() | heeplr | 2022-05-18 | 2 | -3/+17
  Some minimal MUAs don't support passing the subject as a command-line
  option. This commit checks if "@SUBJECT@" is missing in
  ZED_EMAIL_OPTS and then prepends a subject header to the notification
  message. Also set a default for ${subject}.

  Reviewed-by: Ahelenia Ziemiańska <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Daniel Hiepler <[email protected]>
  Closes #13440
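The branching this commit describes for zed_notify_email() can be sketched as a pure function. The real logic lives in a shell script; this Python version only models the decision. The function name, the default subject text, and the blank line placed after the header are assumptions.

```python
def build_mail(opts, subject, body, default_subject="ZED notification"):
    """Return (mailer options, message text) for one notification.

    If the options template contains @SUBJECT@, the subject is passed on
    the command line. Otherwise it is prepended to the message as a
    'Subject:' header so that minimal MUAs still get one.
    """
    subject = subject or default_subject
    if "@SUBJECT@" in opts:
        return opts.replace("@SUBJECT@", subject), body
    return opts, f"Subject: {subject}\n\n{body}"
```

For example, `build_mail("-s '@SUBJECT@' @ADDRESS@", "pool degraded", "details")` substitutes the subject into the options and leaves the body alone, while `build_mail("@ADDRESS@", "", "details")` falls back to the default subject and moves it into the message text.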
* Expose zpool guids through kstats | Andrew | 2022-05-18 | 2 | -0/+49
  There are times when end-users may wish to have a fast and convenient
  method to get the zpool guid without having to use libzfs. This
  commit exposes the zpool guid via kstats in a similar manner to the
  zpool state.

  Reviewed-by: Alexander Motin <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Andrew Walker <[email protected]>
  Closes #13466
* Fix compiler warnings about zero-length arrays in inline bitops | Coleman Kane | 2022-05-17 | 1 | -3/+9
  The compiler appears to be expanding the unused NULL pointer into a
  zero-length array via the inline bitops code. When
  -Werror=array-bounds is used, this causes a build failure. The
  recommended solution is to allocate temporary structures, fill them
  with zeros (to avoid uninitialized data use warnings), and pass
  pointers to those to the inline calls.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #13463
  Closes #13465
* linux: libzutil: zfs_strip_path: only strip known prefixes | наб | 2022-05-16 | 1 | -1/+9
  This mirrors FreeBSD:

    # zpool create -o cachefile= testpsko media/testpsko
    # zpool create -o cachefile= testpsko2 $PWD/testpsko2
    $ ./zpool list -v
    NAME                                              SIZE   ALLOC  FREE
    filling                                           25.5T  6.85T  18.6T
      mirror-0                                        3.64T  500G   3.15T
        ata-HGST_HUS726T4TALE6L4_V6K2L4RR             -      -      -
        ata-HGST_HUS726T4TALE6L4_V6K2MHYR             -      -      -
      raidz1-1                                        21.8T  6.36T  15.5T
        ata-HGST_HUS728T8TALE6L4_VDKT237K             -      -      -
        ata-HGST_HUS728T8TALE6L4_VDGY075D             -      -      -
        ata-HGST_HUS728T8TALE6L4_VDKVRRJK             -      -      -
    cache                                             -      -      -
      nvme0n1p4                                       63.0G  12.8G  50.2G
    tarta-boot                                        240M   50.0M  190M
      mirror-0                                        240M   50.0M  190M
        tarta-boot                                    -      -      -
        tarta-boot-nvme                               -      -      -
    tarta-zoot                                        55.5G  6.96G  48.5G
      mirror-0                                        55.5G  6.96G  48.5G
        tarta-zoot                                    -      -      -
        tarta-zoot-nvme                               -      -      -
    testpsko                                          39.5G  744K   39.5G
      media/testpsko1                                 39.5G  744K   39.5G
    testpsko2                                         39.5G  130K   39.5G
      /home/nabijaczleweli/store/code/zfs/testpsko2   39.5G  130K   39.5G

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ahelenia Ziemiańska <[email protected]>
  Closes #13413
  Closes #9771
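The idea of stripping only known prefixes can be sketched as follows. The prefix list here is hypothetical and chosen to be consistent with the listing in this commit (by-id names shown bare, a device under /dev shown with its subdirectory, a file vdev shown with its full path); the actual list in libzutil may differ.

```python
# Hypothetical prefix list, longest first so the most specific match wins.
KNOWN_PREFIXES = ("/dev/disk/by-id/", "/dev/mapper/", "/dev/")


def strip_path(path):
    """Drop a known device-directory prefix; leave any other path untouched."""
    for prefix in KNOWN_PREFIXES:
        if path.startswith(prefix):
            return path[len(prefix):]
    return path
```

Under these assumptions `/dev/disk/by-id/ata-HGST_HUS726T4TALE6L4_V6K2L4RR` is shortened to the bare disk name, `/dev/media/testpsko1` becomes `media/testpsko1`, and a file vdev such as `/home/nabijaczleweli/store/code/zfs/testpsko2` keeps its full path instead of being cut down to its last component.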