Commit message (Author, Age, Files, Lines)
* Revert "Python 3.12 deprecated python3-distutils" [b_zfs_2.2.6_debian12] (Sven Göthel, 2024-10-24, 3 files, -235/+117)
| | | | | | Issue #16155, see https://github.com/openzfs/zfs/issues/16155 This reverts commit 71216b91d281e7e58f5e29ca4d4553945e080fe9.
* Tag zfs-2.2.6 [zfs-2.2.6] [b_zfs_2.2.6] (Tony Hutter, 2024-08-27, 1 file, -1/+1)
| | | | | | META file and changelog updated. Signed-off-by: Tony Hutter <[email protected]>
* Enable L2 cache of all (MRU+MFU) metadata but MFU data only (shodanshok, 2024-08-27, 2 files, -7/+18)
| | | | | | | | | | | | | | | | | | `l2arc_mfuonly` was added to avoid wasting L2 ARC on read-once MRU data and metadata. However, it can be useful to cache as much metadata as possible while, at the same time, restricting data cache to MFU buffers only. This patch allows for such behavior by setting `l2arc_mfuonly` to 2 (or higher). The list of possible values is the following: 0: cache both MRU and MFU for both data and metadata; 1: cache only MFU for both data and metadata; 2: cache both MRU and MFU for metadata, but only MFU for data. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Gionatan Danti <[email protected]> Closes #16343 Closes #16402
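The three `l2arc_mfuonly` values listed above can be read as a small decision table. The following is only a Python model of that table, not the kernel code; the function name `l2arc_eligible` and its arguments are hypothetical.

```python
def l2arc_eligible(mfuonly: int, is_mfu: bool, is_metadata: bool) -> bool:
    """Model of the l2arc_mfuonly policy described in the commit message.

    Hypothetical helper: the real check lives in the ARC code, in C.
    """
    if mfuonly == 0:
        # Cache both MRU and MFU buffers, data and metadata alike.
        return True
    if mfuonly == 1:
        # Cache MFU buffers only, for both data and metadata.
        return is_mfu
    # 2 (or higher): all metadata is eligible, but data only when it is MFU.
    return is_metadata or is_mfu
```

With `mfuonly=2`, an MRU metadata buffer is eligible while an MRU data buffer is not, which is the behavior the patch adds.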
* linux/zvol_os: fix zvol queue limits initialization (Ameer Hamza, 2024-08-26, 1 file, -3/+4)
| | | | | | | | | | | | | | | | | zvol queue limits initialization depends on `zv_volblocksize`, but it is initialized later, leading to several limits being initialized with incorrect values, including `max_discard_*` limits. This also causes the `blkdiscard` command to consistently fail, as `blk_ioctl_discard` reads `bdev_max_discard_sectors()` limits as 0, leading to failure. The fix is straightforward: initialize `zv->zv_volblocksize` early, before setting the queue limits. This PR should fix `zvol/zvol_misc/zvol_misc_trim` failure on recent PRs, as the test case issues `blkdiscard` for a zvol. Additionally, `zvol_misc_trim` was recently enabled in `6c7d41a`, which is why the issue wasn't identified earlier. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #16454
* linux/zvol_os: tidy and document queue limit/config setup (Rob Norris, 2024-08-26, 1 file, -7/+38)
| | | | | | | | | | | It gets hairier again in Linux 6.11, so I want some actual theory of operation laid out for next time. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400
* ZTS: fix zfs_copies_006_pos test on Ubuntu 20.04 (#16409) (Tino Reichardt, 2024-08-26, 1 file, -0/+2)
| | | | | | | | This test was failing before: - FAIL cli_root/zfs_copies/zfs_copies_006_pos (expected PASS) Signed-off-by: Tino Reichardt <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: George Melikov <[email protected]>
* ZTS: fix history_007_pos test on Ubuntu 24.04 (#16410) (Tino Reichardt, 2024-08-26, 1 file, -5/+1)
| | | | | | | | | The timezone "US/Mountain" isn't supported on newer Linux versions. Using the correct timezone "America/Denver", as is done on FreeBSD, will fix this. Older Linux distros should also behave correctly with this. Signed-off-by: Tino Reichardt <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: George Melikov <[email protected]>
* contrib: link zpool to zfs in bash-completion (#16376) (Shengqi Chen, 2024-08-26, 3 files, -4/+10)
| | | | | | | | | | | | Currently, users won't get completion for the `zpool` command until they trigger completion for `zfs` first. This patch adds a link to `zfs`, so users can use either command to initialize the completion. Fixes: #16320 Signed-off-by: Shengqi Chen <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Tino Reichardt <[email protected]>
* module/icp/asm-arm/sha2: enable non-SIMD asm kernels on armv5/6 (Shengqi Chen, 2024-08-26, 4 files, -22/+27)
| | | | | | | | | | | | My merged pull request #15557 fixes compilation of sha2 kernels on arm v5/6. However, the compiler guards only allow sha256/512_armv7_impl to be used when __ARM_ARCH > 6. This patch enables these ASM kernels on all arm architectures. Some compiler guards are adjusted accordingly to avoid the unnecessary compilation of SIMD (e.g., neon, armv8ce) kernels on old architectures. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Shengqi Chen <[email protected]> Closes #15623
* module/icp/asm-arm/sha2: auto detect __ARM_ARCH (Shengqi Chen, 2024-08-26, 2 files, -4/+10)
| | | | | | | | | | | This patch uses the __ARM_ARCH value set by the compiler (both GCC and Clang provide it) whenever possible instead of hardcoding it to 7. This change allows code to compile on earlier ARM architectures such as armv5te. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Shengqi Chen <[email protected]> Closes #15557
* Linux 6.10 compat: META (Tony Hutter, 2024-08-22, 1 file, -1/+1)
| | | | | | | | Update the META file to reflect compatibility with the 6.10 kernel. Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #16466
* linux/zvol_os.c: cleanup limits for non-blk mq case (Ameer Hamza, 2024-08-22, 1 file, -5/+0)
| | | | | | | | | | | Rob Norris suggested that we could clean up redundant limits for the non-blk mq case. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #16462
* linux/zvol_os.c: Fix max_discard_sectors limit for 6.8+ kernel (Ameer Hamza, 2024-08-22, 1 file, -0/+1)
| | | | | | | | | | | | | | | | In kernels 6.8 and later, the zvol block device is allocated with qlimits passed during initialization. However, the zvol driver does not set `max_hw_discard_sectors`, which is necessary to properly initialize `max_discard_sectors`. This causes the `zvol_misc_trim` test to fail on 6.8+ kernels when invoking the `blkdiscard` command. Setting `max_hw_discard_sectors` in the `HAVE_BLK_ALLOC_DISK_2ARG` case resolves the issue. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #16462
* Fix null ptr deref when renaming a zvol with snaps and snapdev=visible (#16316) (Justin Gottula, 2024-08-22, 1 file, -0/+3)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a zvol is renamed, and it has one or more snapshots, and snapdev=visible is true for the zvol, then the rename causes a kernel null pointer dereference error. This has the effect (on Linux, anyway) of killing the z_zvol taskq kthread, with locks still held; which in turn causes a variety of zvol-related operations afterward to hang indefinitely (such as udev workers, among other things). The problem occurs because of an oversight in #15486 (e36ff84c338d2f7b15aef2538f6a9507115bbf4a). As documented in dataset_kstats_create, some datasets may not actually have kstats allocated for them; and at least at the present time, this is true for snapshots. In practical terms, this means that for snapshots, dk->dk_kstats will be NULL. The dataset_kstats_rename function introduced in the patch above does not first check whether dk->dk_kstats is NULL before proceeding, unlike e.g. the nearby dataset_kstats_update_* functions. In the very particular circumstance in which a zvol is renamed, AND that zvol has one or more snapshots, AND that zvol also has snapdev=visible, zvol_rename_minors_impl will loop over not just the zvol dataset itself, but each of the zvol's snapshots as well, so that their device nodes will be renamed as well. This results in dataset_kstats_create being called for snapshots, where, as we've established, dk->dk_kstats is NULL. Fix this by simply adding a NULL check before doing anything in dataset_kstats_rename. This still allows the dataset_name kstat value for the zvol to be updated (as was the intent of the original patch), and merely blocks attempts by the code to act upon the zvol's non-kstat-having snapshots. If at some future time, kstats are added for snapshots, then things should work as intended in that case as well. 
Signed-off-by: Justin Gottula <[email protected]> Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Alan Somers <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
* Linux 6.10 compat: Fix zvol NULL pointer dereference (Tony Hutter, 2024-08-22, 1 file, -3/+4)
| | | | | | | | zvol_alloc_non_blk_mq()->blk_queue_set_write_cache() needs the disk queue setup to prevent a NULL pointer dereference. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #16453
* Linux 6.10 compat: fix rpm-kmod and builtin (Tony Hutter, 2024-08-22, 2 files, -2/+20)
| | | | | | | | | | | | | | | | | | The 6.10 kernel broke our rpm-kmod builds. The 6.10 kernel really wants the source files in the same directory as the object files. This workaround makes rpm-kmod work again. It also updates the builtin kernel codepath to work correctly with 6.10. See kernel commits: b1992c3772e6 kbuild: use $(src) instead of $(srctree)/$(src) for source directory 9a0ebe5011f4 kbuild: use $(obj)/ instead of $(src)/ for common pattern rules Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #16439 Closes #16450
* ZTS: Use /dev/urandom instead of /dev/random (Tony Hutter, 2024-08-22, 3 files, -3/+3)
| | | | | | | | Use /dev/urandom so we never have to wait on entropy. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #16442
* Linux 6.11: avoid passing "end" sentinel to register_sysctl() (Rob Norris, 2024-08-22, 3 files, -3/+66)
| | | | | | | | Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400
* Linux 6.11: add compat macro for page_mapping() (Rob Norris, 2024-08-22, 5 files, -17/+46)
| | | | | | | | | | | Since the change to folios it has just been a wrapper anyway. Linux has removed their wrapper, so we add one. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400
* Linux 6.11: add more queue_limit fields with removed setters (Rob Norris, 2024-08-22, 1 file, -8/+15)
| | | | | | | | | | | These fields are very old, so no detection necessary; we just move them into the limit setup functions. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400
* Linux 6.11: IO stats is now a queue feature flag (Rob Norris, 2024-08-22, 1 file, -4/+3)
| | | | | | | | | | Apply them with the rest of the settings. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400
* Linux 6.11: first arg to proc_handler is now const (Rob Norris, 2024-08-22, 3 files, -3/+44)
| | | | | | | | | | Detect it, and use a macro to make sure we always match the prototype. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400
* Linux 6.11: get backing_dev_info through queue gendisk (Rob Norris, 2024-08-22, 2 files, -1/+31)
| | | | | | | | | | | It's no longer available directly on the request queue, but it's easy to get from the attached disk. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400
* Linux 6.11: enable queue flush through queue limits (Rob Norris, 2024-08-22, 3 files, -14/+50)
| | | | | | | | | | | | | | | | | In 6.11 struct queue_limits gains a 'features' field, where, among other things, flush and write-cache are enabled. Detect it and use it. Along the way, the blk_queue_set_write_cache() compat wrapper gets a little cleanup. Since both flags are always set together, it's now a single bool. Also the very very ancient version that sets q->flush_flags directly couldn't actually turn it off, so I've fixed that. Not that we use it, but still. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400
* ZTS: Add a test to verify that copy_file_range obeys RLIMIT_FSIZE (Mark Johnston, 2024-08-22, 4 files, -1/+69)
| | | | | | | | | Signed-off-by: Mark Johnston <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
* FreeBSD: Fix RLIMIT_FSIZE handling for block cloning (Mark Johnston, 2024-08-22, 1 file, -7/+0)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ZFS implements copy_file_range(2) using block cloning when possible. This implementation must respect the RLIMIT_FSIZE limit. zfs_clone_range() already checks the limit, so it is safe to remove this check in zfs_freebsd_copy_file_range(). Moreover, the removed check produces false positives: the length passed to copy_file_range(2) may be larger than the input file size; as the man page notes, "for best performance, call copy_file_range() with the largest len value possible." In particular, some existing code passes SSIZE_MAX there. The check in zfs_clone_range() clamps the length to the input file's size before checking, but the removed check uses the caller supplied length, so something like $ echo a > /tmp/foo $ limits -f 1024 cat /tmp/foo > /tmp/bar fails because FreeBSD's cat(1) uses copy_file_range(2) in the manner described above. Reported-by: Philip Paeps <[email protected]> Signed-off-by: Mark Johnston <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
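The false positive described above comes down to which length is compared against the limit. The sketch below models both checks in Python under simplifying assumptions; the function names, argument names, and the byte limit used in the usage note are hypothetical and do not mirror the FreeBSD or ZFS code.

```python
SSIZE_MAX = 2**63 - 1  # the "largest len value possible" that some callers pass

def removed_check_fails(out_off: int, length: int, fsize_limit: int) -> bool:
    """Model of the removed check: uses the caller-supplied length as-is."""
    return out_off + length > fsize_limit

def clamped_check_fails(in_size: int, in_off: int, out_off: int,
                        length: int, fsize_limit: int) -> bool:
    """Model of the remaining check: clamp to the input file's size first."""
    length = min(length, max(in_size - in_off, 0))
    return out_off + length > fsize_limit
```

For a 2-byte input file and an assumed limit of 1 MiB, `removed_check_fails(0, SSIZE_MAX, 1 << 20)` reports a violation even though only 2 bytes would be written, while `clamped_check_fails(2, 0, 0, SSIZE_MAX, 1 << 20)` does not.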
* zfs: add bounds checking to zil_parse (#16308) (c1ick, 2024-08-22, 1 file, -2/+19)
| | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure log records don't stray beyond the valid memory region. There is a lack of verification of the space occupied by fixed members of lr_t in zil_parse. We can create a crafted image to trigger an out of bounds read by following these steps: 1) Do some file operations and reboot to simulate abnormal exit without umount 2) zil_chain.zc_nused: 0x1000 3) First lr_t lr_t.lrc_txtype: 0x0 lr_t.lrc_reclen: 0x1000-0xb8-0x1 lr_t.lrc_txg: 0x0 lr_t.lrc_seq: 0x1 4) Update checksum in zil_chain.zc_eck Fix: Add some checks to make sure the remaining bytes are large enough to hold a log record. Signed-off-by: XDTG <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]>
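The kind of bounds check described above can be illustrated with a toy record parser. This is a sketch only: the 8-byte header holding just a record length is a hypothetical stand-in for the fixed members of `lr_t`, and `parse_records` is not a ZFS function.

```python
import struct

# Hypothetical fixed header: a single little-endian 64-bit record length.
HDR = struct.Struct("<Q")

def parse_records(buf: bytes, nused: int) -> list:
    """Split buf[:nused] into records, refusing to read past the valid region."""
    if nused > len(buf):
        raise ValueError("nused exceeds buffer")
    records = []
    off = 0
    while off < nused:
        remaining = nused - off
        # The fixed header must fit in what is left before we read it.
        if remaining < HDR.size:
            raise ValueError("truncated record header")
        (reclen,) = HDR.unpack_from(buf, off)
        # The record must cover its own header and stay inside the region.
        if reclen < HDR.size or reclen > remaining:
            raise ValueError("record length out of bounds")
        records.append(buf[off:off + reclen])
        off += reclen
    return records
```

A first record whose length leaves a single trailing byte, similar in spirit to the crafted `lrc_reclen` above, is rejected at the header check instead of being read past the end.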
* linux/zvol_os: fix SET_ERROR with negative return codes (Rob Norris, 2024-08-22, 1 file, -4/+4)
| | | | | | | | | | | | | SET_ERROR is our facility for tracking errors internally. The negation is to match what the kernel expects from us. Thus, the negation should happen outside of the SET_ERROR. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <[email protected]> Closes #16364
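The difference between negating inside and outside the tracking call can be shown with a toy model. In this sketch `set_error` is a hypothetical Python stand-in that records the value it is given; it is not the real SET_ERROR macro.

```python
import errno

traced = []  # what the (hypothetical) tracking facility has seen

def set_error(err: int) -> int:
    """Stand-in for SET_ERROR: record the error code, then return it."""
    traced.append(err)
    return err

def op_negate_inside() -> int:
    # The tracker sees a negative value, which is not a valid errno.
    return set_error(-errno.EIO)

def op_negate_outside() -> int:
    # The tracker sees EIO; the caller still receives the negative code.
    return -set_error(errno.EIO)
```

Both functions return the same negative code to the caller, but only the second records a positive errno.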
* ZTS: fix io_uring test on RHEL 9 variants (#16411) (Tino Reichardt, 2024-08-22, 1 file, -3/+3)
| | | | | | | | Simplify the test, by using the variable "$PLATFORM_ID" in favor of "$REDHAT_SUPPORT_PRODUCT_VERSION". Signed-off-by: Tino Reichardt <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: George Melikov <[email protected]>
* Tag zfs-2.2.5 [zfs-2.2.5] (Tony Hutter, 2024-08-02, 1 file, -1/+1)
| | | | | | META file and changelog updated. Signed-off-by: Tony Hutter <[email protected]>
* [2.2.5-only] Make 'rmmod zfs' work after zfs-2.2.4 (#16406) (Tony Hutter, 2024-08-02, 1 file, -2/+11)
| | | | | | | | | | | | | | | | | | | | | db65272ae was added to zfs-2.2.4 to stub in the VDEV_PROP_RAIDZ_EXPANDING enum without adding the RAIDz expansion feature. This was needed to provide the right enum count for when the VDEV_PROP_SLOW_IO properties got added. This had the unfortunate side effect of breaking module removal though. Specifically, with the VDEV_PROP_RAIDZ_EXPANDING stub added, the module would correctly omit making kobjects for the RAIDz expansion vdev property, but then would try to blindly remove its non-existent kobjects during module unload. This commit fixes the issue by checking for an uninitialized kobject. Fixes: #16249 Signed-off-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ameer Hamza <[email protected]> Reviewed-by: Tino Reichardt <[email protected]>
* ZTS: Make do_vol_test() more deterministic (#16379) (Alexander Motin, 2024-07-30, 1 file, -9/+9)
| | | | | | | | | - Explicitly disable compression since mkfile uses a zero buffer. - Explicitly sync file systems instead of waiting for timeout. Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
* Linux 6.9: Fix UBSAN errors in sa.c (#16380) (Tony Hutter, 2024-07-23, 1 file, -0/+1)
| | | | | | | | | | | | | This is a follow-on to 156a64161b4f9da35f2e0484106173344cf78317 that ignores UBSAN errors in sa.c. Thank you @thwalker3 for the fix. Original-patch-by: @thwalker3 Signed-off-by: Tony Hutter <[email protected]> Closes #16278 Closes #16330 Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]>
* Fix long_free_dirty accounting for small files (#16264) (Chunwei Chen, 2024-07-23, 1 file, -0/+7)
| | | | | | | | | | | | | | For files smaller than recordsize, it's most likely that they don't have L1 blocks. However, current calculation will always return at least 1 L1 block. In this change, we check dnode level to figure out if it has L1 blocks or not, and return 0 if it doesn't. This will reduce the chance of unnecessary throttling when deleting a large number of small files. Signed-off-by: Chunwei Chen <[email protected]> Co-authored-by: Chunwei Chen <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]>
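The accounting change above can be sketched as a guard in front of the block estimate. This is a rough Python model, not the actual formula used in the DMU; `l1_blocks_to_dirty` and `l1_span` (the number of bytes covered by one L1 block) are hypothetical names.

```python
def l1_blocks_to_dirty(nlevels: int, length: int, l1_span: int) -> int:
    """Estimate how many L1 indirect blocks a free of `length` bytes dirties."""
    if nlevels <= 1:
        # A dnode with a single level has only L0 blocks, so no L1 is dirtied.
        return 0
    # Otherwise at least one L1 block is touched (ceiling division).
    return max(1, -(-length // l1_span))
```

Before the fix, the first branch did not exist in effect, so even a tiny file was charged for one L1 block, which adds up when many small files are deleted.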
* AUTHORS: refresh with recent new contributors (#16362) (Rob Norris, 2024-07-23, 2 files, -0/+19)
| | | | | | | | Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: George Melikov <[email protected]>
* FreeBSD: Use a statement expression to implement SET_ERROR() (#16284) (Mark Johnston, 2024-07-23, 1 file, -5/+6)
| | | | | | | | | | | This way we can avoid making assumptions about the SDT probe implementation. No functional change intended. Signed-off-by: Mark Johnston <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Rob Norris <[email protected]> Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
* zdb: dump ZAP_FLAG_UINT64_KEY ZAPs properly (#16334) (Rob Norris, 2024-07-17, 1 file, -4/+26)
| | | | | | | | | | | | These are used for DDT and BRT stores. There's limited information available to produce meaningful output, but at least we can put something on screen rather than crashing. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Tony Hutter <[email protected]>
* vdev_open: clear async fault flag after reopen (Rob Norris, 2024-07-17, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | | | | After c3f2f1aa2, vdev_fault_wanted is set on a vdev after a probe fails. An end-of-txg async task is charged with actually faulting the vdev. In a single-disk pool, the probe failure will degrade the last disk, and then suspend the pool. However, vdev_fault_wanted is not cleared. After the pool returns, the transaction finishes and the async task runs and faults the vdev, which suspends the pool again. The fix is simple: when reopening a vdev, clear the async fault flag. If the vdev is still failed, the startup probe will quickly notice and degrade/suspend it again. If not, all is well! Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Co-authored-by: Don Brady <[email protected]> Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Don Brady <[email protected]>
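The sequence of events described above can be replayed with a very small state model. This is a sketch under heavy simplification: the `Vdev` class and the three helper functions are hypothetical and only mimic the role of `vdev_fault_wanted`.

```python
class Vdev:
    """Toy vdev holding only the two flags relevant to this bug."""
    def __init__(self) -> None:
        self.fault_wanted = False
        self.faulted = False

def probe_failed(vd: Vdev) -> None:
    # A failed probe asks the end-of-txg async task to fault the vdev.
    vd.fault_wanted = True

def reopen(vd: Vdev, clear_flag: bool) -> None:
    # With the fix, reopening the vdev drops the stale async fault request.
    if clear_flag:
        vd.fault_wanted = False

def async_task(vd: Vdev) -> None:
    # Runs when the transaction finishes after the pool resumes.
    if vd.fault_wanted:
        vd.fault_wanted = False
        vd.faulted = True

def replay(clear_flag: bool) -> bool:
    """Return True if the vdev ends up faulted again after the disk returns."""
    vd = Vdev()
    probe_failed(vd)
    reopen(vd, clear_flag)
    async_task(vd)
    return vd.faulted
```

`replay(False)` models the old behavior, where the stale request faults the vdev and suspends the pool again; `replay(True)` models the behavior after the fix.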
* zts: test single-disk pool resumes properly after disk pull (Rob Norris, 2024-07-17, 4 files, -1/+105)
| | | | | | | | | | | | | A single disk pool should suspend when its disk fails and hold the IO. When the disk is returned, the pool should return and the IO be reissued, leaving everything in good shape. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Signed-off-by: Rob Norris <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Don Brady <[email protected]>
* disable automatic dependency tracking for dkms builds (Martin Wagner, 2024-07-17, 1 file, -0/+1)
| | | | | | | | | | | Previously the dkms build left some unwanted files in `/usr/lib/modules` which could cause package managers to not properly clean up old kernels. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Martin Wagner <[email protected]> Closes #16221 Closes #16241
* Some improvements to metaslabs eviction (Alexander Motin, 2024-07-17, 2 files, -2/+8)
| | | | | | | | | | | | | | | | | | | - Add old eviction for special and dedup metaslab classes. Those vdevs may be potentially big and fragmented with large metaslabs, while their asynchronous write pattern is not really different from normal class. It seems an omission to not evict old metaslabs from them. - If we have metaslab preload enabled, which means we are not too low on memory, do not evict active metaslabs even if they are not used for some time. Eviction of active metaslabs means we won't be able to write anything until we load them, which may take some time and is directly opposite to the goals of metaslab preload. For small systems the memory saving should be less important after the recent reduction in the number of allocators, and so of open metaslabs. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes #16214
* Destroy ARC buffer in case of fill error (Alexander Motin, 2024-07-17, 1 file, -0/+1)
| | | | | | | | | | | | | | In case of error dmu_buf_fill_done() returns the buffer back into DB_UNCACHED state. Since during transition from DB_UNCACHED into DB_FILL state dbuf_noread() allocates an ARC buffer, we must free it here, otherwise it will be leaked. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored by: iXsystems, Inc. Closes #15665 Closes #15802 Closes #16216
* Use memset to zero stack allocations containing unions (Rob N, 2024-07-17, 4 files, -6/+16)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | C99 6.7.8.17 says that when an undesignated initialiser is used, only the first element of a union is initialised. If the first element is not the largest within the union, how the remaining space is initialised is up to the compiler. GCC extends the initialiser to the entire union, while Clang treats the remainder as padding, and so initialises according to whatever automatic/implicit initialisation rules are currently active. When Linux is compiled with CONFIG_INIT_STACK_ALL_PATTERN, -ftrivial-auto-var-init=pattern is added to the kernel CFLAGS. This flag sets the policy for automatic/implicit initialisation of variables on the stack. Taken together, this means that when compiling under CONFIG_INIT_STACK_ALL_PATTERN on Clang, the "zero" initialiser will only zero the first element in a union, and the rest will be filled with a pattern. This is significant for aes_ctx_t, which in aes_encrypt_atomic() and aes_decrypt_atomic() is initialised to zero, but then used as a gcm_ctx_t, which is the fifth element in the union, and thus gets pattern initialisation. Later, it's assumed to be zero, resulting in a hang. As confusing and undiscoverable as it is, by the spec, we are at fault when we initialise a structure containing a union with the zero initializer. As such, this commit replaces these uses with an explicit memset(0). Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes #16135 Closes #16206
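The effect described above can be emulated from Python with `ctypes`. This is a sketch only: `Ctx` is a hypothetical union whose first member is smaller than the whole, the 0xAA fill imitates pattern auto-initialisation, and assigning the first member imitates an initialiser that only reaches that member. It does not run a C compiler or reproduce `aes_ctx_t`.

```python
import ctypes

class Ctx(ctypes.Union):
    # Hypothetical union: the first member is not the largest one.
    _fields_ = [("small", ctypes.c_uint32),
                ("big", ctypes.c_uint8 * 16)]

def first_member_only() -> bytes:
    ctx = Ctx()
    # Imitate pattern initialisation of the whole stack slot.
    ctypes.memset(ctypes.addressof(ctx), 0xAA, ctypes.sizeof(ctx))
    # Imitate a "zero" initialiser that only covers the first member.
    ctx.small = 0
    return bytes(ctx.big)

def explicit_memset() -> bytes:
    ctx = Ctx()
    ctypes.memset(ctypes.addressof(ctx), 0xAA, ctypes.sizeof(ctx))
    # The approach the commit adopts: zero every byte explicitly.
    ctypes.memset(ctypes.addressof(ctx), 0, ctypes.sizeof(ctx))
    return bytes(ctx.big)
```

In the first case only the 4 bytes shared with `small` are zero and the remaining 12 keep the pattern, which is the state that code assuming an all-zero context would trip over; the explicit memset leaves all 16 bytes zero.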
* zdb: bring crash handling over from ztest (Rob Norris, 2024-07-17, 1 file, -5/+56)
| | | | | | | | | | | | ztest has a very nice ability to show a backtrace when there's an unexpected crash. zdb is used often enough on corrupted data and can blow up too, so nice output is useful there too. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes #16181
* libspl_assert: always link -lpthread on FreeBSD (Rob N, 2024-07-17, 1 file, -0/+4)
| | | | | | | | | | | The pthread_* functions are in -lpthread on FreeBSD. Some of them are implicitly linked through libc, but on FreeBSD 13 at least pthread_getname_np() is not. Just be explicit, since -lpthread is the documented interface anyway. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes #16168
* Unbreak FreeBSD cross-build on MacOS broken in 051460b8b (Martin Matuška, 2024-07-17, 1 file, -1/+20)
| | | | | | | | | | | | | | MacOS used FreeBSD-compatible getprogname() and pthread_getname_np(). But pthread_getthreadid_np() does not exist on MacOS. This implements libspl_gettid() using pthread_threadid_np() to get the thread id of the current thread. Tested with FreeBSD GitHub actions freebsd-src/.github/workflows/cross-bootstrap-tools.yml Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Rob Norris <[email protected]> Signed-off-by: Martin Matuska <[email protected]> Closes #16167
* libspl/assert: use libunwind for backtrace when available (Rob Norris, 2024-07-17, 4 files, -3/+79)
| | | | | | | | | | | libunwind seems to do a better job of resolving symbols than backtrace(), and is also useful on platforms that don't have backtrace() (eg musl). If it's available, use it. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16140
* libspl/assert: dump backtrace in assert (Rob Norris, 2024-07-17, 4 files, -0/+37)
| | | | | | | | | | Adds a check for the backtrace() function. If available, uses it to show a stack backtrace in the assertion output. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16140
* libspl/assert: add lock around assertion output (Rob Norris, 2024-07-17, 1 file, -0/+6)
| | | | | | | | | | | | | | | | | | | | | If multiple threads trip an assertion at the same moment (quite common), they can be printing at the same time, and their output gets messy. This adds a simple lock around the whole thing, to prevent a second task printing assert output before the first has finished. Additionally, if libspl_assert_ok is not set, abort() is called without dropping the lock, so that any other asserting tasks will be killed before starting any output, rather than only getting part-way through. This is a tradeoff; it's assumed that multiple threads asserting at the same moment are likely the same fault in different instances of a thread, and so there won't be any more useful information from the other tasks anyway. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16140
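The idea of holding one lock for the whole report can be sketched with Python threads. This is an illustration of the locking pattern only, not the libspl code; `report_assert` and `run_demo` are hypothetical names, and the abort-while-holding-the-lock behavior is not modeled.

```python
import io
import threading

_assert_lock = threading.Lock()

def report_assert(out, who: str, lines) -> None:
    """Write every line of one report while holding the lock."""
    with _assert_lock:
        for line in lines:
            out.write(f"[{who}] {line}\n")

def run_demo(nthreads: int = 4) -> list:
    """Let several threads report at once and return the combined output."""
    out = io.StringIO()
    lines = ["ASSERT failed", "backtrace frame 0", "backtrace frame 1"]
    threads = [
        threading.Thread(target=report_assert, args=(out, f"thread-{i}", lines))
        for i in range(nthreads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out.getvalue().splitlines()
```

Because each report is written under the lock, the three lines from any one thread always appear together, in whatever order the threads happen to acquire it.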
* libspl/assert: show process/task details in assert output (Rob Norris, 2024-07-17, 2 files, -3/+35)
| | | | | | | | | | | | | | Makes it much easier to see what thing complained. Getting thread id, program name and thread name vary wildly between Linux and FreeBSD, so those are set up in macros. pthread_getname_np() did not appear in musl until very recently, but the same info has always been available via prctl(PR_GET_NAME), so we use that instead. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rob Norris <[email protected]> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16140