summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Import ZStandard v1.4.5Michael Niewöhner2020-08-207-2/+30110
| | | | | | | | | | | | | | | | | | | | | | | | | | ZStandard is a modern, high performance, general compression algorithm. It provides similar or better compression levels to GZIP, but with much better performance. ZStandard provides a large selection of compression levels to allow a storage administrator to select the preferred performance/compression trade-off. This commit imports the unmodified ZStandard single-file library which will be used by ZFS. The implementation of this new library is done with future updates of zstd in mind. For this reason we integrated the code in a way, that does not require modifications to the library. For more details, see `module/zstd/README.md`. The library is excluded from codecov calculation and cppcheck as unaltered dependencies do not need full codecov or cppcheck. Co-authored-by: Allan Jude <[email protected]> Co-authored-by: Kjeld Schouten-Lebbing <[email protected]> Co-authored-by: Michael Niewöhner <[email protected]> Signed-off-by: Allan Jude <[email protected]> Signed-off-by: Kjeld Schouten-Lebbing <[email protected]> Signed-off-by: Michael Niewöhner <[email protected]>
* Linux 5.7 compat: Include linux/sched.h in spl/sys/mutex.hPavel Snajdr2020-08-191-0/+1
| | | | | | | | | struct task_struct is needed for lockdep_off() in mutex.h This has popped up after e616cb8daadf (in linux-5.7-rc7). Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Pavel Snajdr <[email protected]> Closes #10741
* FreeBSD: Add option to rewind checkpoint while importing root poolMariusz Zaborski2020-08-193-3/+6
| | | | | | | | This option is used by FreeBSD boot loader. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Mariusz Zaborski <[email protected]> Closes #10738
* ZED: Do not offline a missing device if no spare is availableBrian Behlendorf2020-08-182-24/+39
| | | | | | | | | | | | | | | | | | | | | | | | Due to commit d48091d a removed device is now explicitly offlined by the ZED if no spare is available, rather than the letting ZFS detect it as UNAVAIL. This broke auto-replacing of whole-disk devices, as described in issue #10577. In short, when a new device is reinserted in the same slot, the ZED will try to ONLINE it without letting ZFS recreate the necessary partition table. This change simply avoids setting the device OFFLINE when removed if no spare is available (or if spare_on_remove is false). This change has been left minimal to allow it to be backported to 0.8.x release. The auto_offline_001_pos ZTS test has been updated accordingly. Some follow up work is planned to update the ZED so it transitions the vdev to a REMOVED state. This is a state which has always existed but there is no current interface the ZED can use to accomplish this. Therefore it's being left to a follow up PR. Reviewed-by: Gionatan Danti <[email protected]> Co-authored-by: Gionatan Danti <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10577 Closes #10730
* Fix ARC aggsum access after arc_state_fini()Brian Behlendorf2020-08-181-8/+8
| | | | | | | | | | | | | | Commit 85ec5cbae updated abd_update_scatter_stats() such that it calls arc_space_consume() and arc_space_return() when updating the scatter stats. This requires that the global aggsum value for the ARC be initialized. Normally this is not an issue, however during module unload the l2arc_do_free_on_write() function was called in l2arc_cleanup() after arc_state_fini() destroyed the aggsum values. We can resolve this issue by performing l2arc_do_free_on_write() slightly earlier in arc_fini(). Reviewed-by: Matthew Ahrens <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10739
* libzfs_core: Initialize fail_ioc_cmd to ZFS_IOC_LASTRyan Moeller2020-08-181-2/+2
| | | | | | | | | FreeBSD numbers `ZFS_IOC_*` starting at 0, so pick a different sentinel value to avoid unintentionally messing with `ZFS_IOC_POOL_CREATE` ioctls. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #10729
* FreeBSD: Fix UNIX permissions checkingMatthew Macy2020-08-187-99/+174
| | | | | | Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #10727
* Add define to enable autotrim to default to onMatthew Macy2020-08-182-2/+8
| | | | | | | | | | | | In FreeBSD trim has defaulted to on for several years. In order to minimize POLA violations on import it's important to maintain this default when importing vendored openzfs in to FreeBSD base. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #10719
* Make zc_nvlist_src_size limit tunableRyan Moeller2020-08-186-4/+51
| | | | | | | | | | | | | | | | | | We limit the size of nvlists passed to the kernel so a user cannot make the kernel do an unreasonably large allocation. On FreeBSD this limit was 128 kiB, which turns out to be a bit too small when doing some operations involving a large number of datasets or snapshots, for example replication. Make this limit tunable, with a platform-specific auto default. Linux keeps its limit at KMALLOC_MAX_SIZE. FreeBSD uses 1/4 of the system limit on user wired memory, which allows it to scale depending on system configuration. Reviewed-by: Matt Macy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Issue #6572 Closes #10706
* Remove unused `zpool_is_bootable`George Melikov2020-08-181-11/+0
| | | | | | | | | | | | | Otherwise compiler errors with: ``` libzfs_pool.c:449:1: error: 'zpool_is_bootable' defined but not used [-Werror=unused-function] ``` Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #10734
* Remove GRUB restrictionsRichard Laager2020-08-175-89/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GRUB restrictions are based around the pool's bootfs property. Given the current situation where GRUB is not staying current with OpenZFS pool features, having either a non-ZFS /boot or a separate pool with limited features are pretty much the only long-term answers for GRUB support. Only the second case matters in this context. For the restrictions to be useful, the bootfs property would have to be set on the boot pool, because that is where we need the restrictions, as that is the pool that GRUB reads from. The documentation for bootfs describes it as pointing to the root pool. That's also how it's used in the initramfs. ZFS does not allow setting bootfs to point to a dataset in another pool. (If it did, it'd be difficult-to-impossible to enforce these restrictions cross-pool). Accordingly, bootfs is pretty much useless for GRUB scenarios moving forward. Even for users who have only one pool, the existing restrictions for GRUB are incomplete. They don't prevent you from enabling the unsupported checksums, for example. For that reason, I have ripped out all the GRUB restrictions. A little longer-term, I think extending the proposed features=portable system to define a features=grub is a much more useful approach. The user could set that on the boot pool at creation, and things would Just Work. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Laager <[email protected]> Closes #8627
* ZTS: ztest may cause mmp tests failuresBrian Behlendorf2020-08-171-0/+2
| | | | | | | | | | | | The mmp_exported_import and mmp_inactive_import tests depend on ztest simulating an active pool. If ztest unexpectedly terminates due to an unrelated issue the test case will fail. Since ztest is not yet 100% reliable I've added these tests to the maybe exception list. They can be removed when the issues with ztest are resolved or if the test cases are updated to handle these unexpected failures. Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10726
* Include scatter_chunk_waste in arc_sizeMatthew Ahrens2020-08-176-11/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The ARC caches data in scatter ABD's, which are collections of pages, which are typically 4K. Therefore, the space used to cache each block is rounded up to a multiple of 4K. The ABD subsystem tracks this wasted memory in the `scatter_chunk_waste` kstat. However, the ARC's `size` is not aware of the memory used by this round-up, it only accounts for the size that it requested from the ABD subsystem. Therefore, the ARC is effectively using more memory than it is aware of, due to the `scatter_chunk_waste`. This impacts observability, e.g. `arcstat` will show that the ARC is using less memory than it effectively is. It also impacts how the ARC responds to memory pressure. As the amount of `scatter_chunk_waste` changes, it appears to the ARC as memory pressure, so it needs to resize `arc_c`. If the sector size (`1<<ashift`) is the same as the page size (or larger), there won't be any waste. If the (compressed) block size is relatively large compared to the page size, the amount of `scatter_chunk_waste` will be small, so the problematic effects are minimal. However, if using 512B sectors (`ashift=9`), and the (compressed) block size is small (e.g. `compression=on` with the default `volblocksize=8k` or a decreased `recordsize`), the amount of `scatter_chunk_waste` can be very large. On a production system, with `arc_size` at a constant 50% of memory, `scatter_chunk_waste` has been been observed to be 10-30% of memory. This commit adds `scatter_chunk_waste` to `arc_size`, and adds a new `waste` field to `arcstat`. As a result, the ARC's memory usage is more observable, and `arc_c` does not need to be adjusted as frequently. Reviewed-by: Pavel Zakharov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Wilson <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #10701
* Remove KMC_KMEM and KMC_VMEMMatthew Ahrens2020-08-175-151/+10
| | | | | | | | | | | | | | | | | `KMC_KMEM` and `KMC_VMEM` are now unused since all SPL-implemented caches are `KMC_KVMEM`. KMC_KMEM: Given the default value of `spl_kmem_cache_kmem_limit`, we don't use kmalloc to back the SPL caches, instead we use kvmalloc (KMC_KVMEM). The flag, module parameter, /proc entries, and associated code are removed. KMC_VMEM: This flag is not used, and kvmalloc() is always preferable to vmalloc(). The flag, /proc entries, and associated code are removed. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #10673
* FreeBSD: fix the build with Clang 11Ryan Moeller2020-08-176-2/+12
| | | | | | | | | | | | * Cast void * to uintptr_t before casting to boolean_t. * Avoid clashing definition of __asm when not on Linux to prevent duplicate __volatile__. This was already done in some places but not all. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matt Macy <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #10723
* FreeBSD: fix merge error in zfs_acl_ids_createMatthew Macy2020-08-171-27/+16
| | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #10721
* Fix typo in btree.cSerapheim Dimitropoulos2020-08-171-2/+2
| | | | | | | Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Signed-off-by: Serapheim Dimitropoulos <[email protected]> Closes #10725
* FreeBSD: fallback to /boot/ to look for zpool.cacheMatthew Macy2020-08-172-1/+5
| | | | | | | | | Up until now zpool.cache has always lived in /boot on FreeBSD. For the sake of compatibility fallback to /boot if zpool.cache isn't found in /etc/zfs. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #10720
* Fix reporting of L2ARC writes in arc_summary3George Amanakis2020-08-171-3/+3
| | | | | | | | | | arc_summary3 reports L2ARC writes in bytes. However, the related arc_stat is reported as hits. arc_summary2 report this correctly. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #10717
* Fix l2arc_dev_rebuild_start thread nameRyan Moeller2020-08-171-4/+5
| | | | | | | | | | | | | | | `thread_create` on FreeBSD stringifies the argument passed as the thread function to create a name for the thread. The thread name for `l2arc_dev_rebuild_start` ended up with `(void (*)(void *))` in it. Change the type signature so the function does not need to be cast when creating the thread. Rename the function to `l2arc_dev_rebuild_thread` for clarity and consistency, as well. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Amanakis <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #10716
* FreeBSD: Create taskq threads in appropriate procRyan Moeller2020-08-174-19/+27
| | | | | | | | Stepping stone toward re-enabling spa_thread on FreeBSD. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #10715
* Fix L2ARC reads when compressed ARC disabledAllan Jude2020-08-135-2/+215
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When reading compressed blocks from the L2ARC, with compressed ARC disabled, arc_hdr_size() returns LSIZE rather than PSIZE, but the actual read is PSIZE. This causes l2arc_read_done() to compare the checksum against the wrong size, resulting in checksum failure. This manifests as an increase in the kstat l2_cksum_bad and the read being retried from the main pool, making the L2ARC ineffective. Add new L2ARC tests with Compressed ARC enabled/disabled Blocks are handled differently depending on the state of the zfs_compressed_arc_enabled tunable. If a block is compressed on-disk, and compressed_arc is enabled: - the block is read from disk - It is NOT decompressed - It is added to the ARC in its compressed form - l2arc_write_buffers() may write it to the L2ARC (as is) - l2arc_read_done() compares the checksum to the BP (compressed) However, if compressed_arc is disabled: - the block is read from disk - It is decompressed - It is added to the ARC (uncompressed) - l2arc_write_buffers() will use l2arc_apply_transforms() to recompress the block, before writing it to the L2ARC - l2arc_read_done() compares the checksum to the BP (compressed) - l2arc_read_done() will use l2arc_untransform() to uncompress it This test writes out a test file to a pool consisting of one disk and one cache device, then randomly reads from it. Since the arc_max in the tests is low, this will feed the L2ARC, and result in reads from the L2ARC. We compare the value of the kstat l2_cksum_bad before and after to determine if any blocks failed to survive the trip through the L2ARC. Sponsored-by: The FreeBSD Foundation Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Allan Jude <[email protected]> Closes #10693
* Release onexit/events with any missed zfsdev_stateJorgen Lundman2020-08-133-7/+12
| | | | | | | | | | | | | | Linux and FreeBSD will most likely never see this issue. On macOS when kext is unloaded, but zed is still connected, zed will be issued ENODEV. As the cdevsw is released, the kernel will not have zfsdev_release() called to release minor/onexit/events, and it "leaks". This ensures it is cleaned up before unload. Changed the for loop from zsprev, to zsnext style, for less code duplication. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Jorgen Lundman <[email protected]> Closes #10700
* Github workflow: checkstyleGeorge Melikov2020-08-131-0/+27
| | | | | | | | | | | | Use github workflow to run checkstyle - use free (for OS projects) resources - starts for every commit and branch - work on forks, contributors may use it before creating PRs Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #10705
* cstyle.pl: echo commands for github workflowGeorge Melikov2020-08-131-4/+9
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #10705
* Remove stale .travis.ymlGeorge Melikov2020-08-131-38/+0
| | | | | | | | | | - It doesn't work now. - It has to be manually edited on tests changes. (even on test runtime changes!) - Travis gives too small time to run to be useful. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #10704
* Use zfs_dbgmsg to log metaslab_load/unloadMatthew Ahrens2020-08-121-19/+34
| | | | | | | | | | | | | Metaslabs are now (usually) loaded and unloaded infrequently, but when that is not the case, it is useful to have a log of when and why these events happened. This commit enables the zfs_dbgmsg() in metaslab_load(), and adds a zfs_dbgmsg() in metaslab_unload(). Reviewed-by: Serapheim Dimitropoulos <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #10683
* Restore ARC MFU/MRU pressureMatthew Macy2020-08-121-22/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The arc_adapt() function tunes LRU/MLU balance according to 4 types of cache hits (which is passed as state agrument): ghost LRU, LRU, MRU, ghost MRU. If this function is called with wrong cache hit (state), adaptation will be sub-optimal and performance will suffer. Some time ago upstream received this commit: 6950 ARC should cache compressed data) in arc_read() do next sequence (access to ghost buffer) Before this commit, hit to any ghost list was passed arc_adapt() before call to arc_access() which revive element in cache and change state from ghost to real hit. After this commit, the order of calls was reverted and arc_adapt() is now called only with «real» hits even if hit was in one of two ghost lists, which renders ghost lists useless and breaks the ARC algorithm. FreeBSD fixed this problem locally in Change D19094 / Commit r348772. This change is an adaptation of the above commit to the current arc code. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #10548 Closes #10618
* 'zfs share -a' should handle 'canmount=noauto'George Wilson2020-08-112-2/+21
| | | | | | | | | | | | | | | | The 'zfs share -a' currently skips any filesystems which have 'canmount=noauto' set. This behavior is unexpected since the one would expect 'zfs share -a' to share any mounted filesystem that has the 'sharenfs' property already set. This changes the behavior of 'zfs share -a' to allow the sharing of 'canmount=noauto' datasets if they are mounted. Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Don Brady <[email protected]> Reviewed-by: Prakash Surya <[email protected]> Signed-off-by: George Wilson <[email protected]> External-issue: DLPX-71313 Closes #10688
* FreeBSD: Fix module autoloading when built in baseMatthew Macy2020-08-111-2/+12
| | | | | | | | | | | The KMOD name is "zfs" instead of "openzfs" when building in FreeBSD. Define a ZFS_KMOD symbol as "zfs" when IN_BASE is defined, otherwise "openzfs". Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Ryan Moeller <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #10699
* Linux 5.9 compat: make_request_fn replaced with submit_bio interfaceColeman Kane2020-08-113-34/+70
| | | | | | | | | | The make_request_fn and associated API was replaced recently in a Linux 5.9 merge, to replace its functionality with a new submit_bio member in struct block_device_operations. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes #10696
* Linux 5.9 compat: Update NR_SLAB_RECLAIMABLE to NR_SLAB_RECLAIMABLE_BColeman Kane2020-08-113-1/+26
| | | | | | | | | | | | This change appears to primarily be a name change for the enum. Had to update the test logic so that it works so long as either one of these is present (favoring the newer one). Additionally, as this is newer, it only shows up in node_page_item, so this commit doesn't test zone_page_item for the same enum. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes #10696
* Linux 5.9 compat: add linux/blkdev.h includeColeman Kane2020-08-112-0/+6
| | | | | | | | | | | | | Many of the block device operations (often functions with bdev in the name) were moved into linux/blkdev.h from linux/fs.h. Seems that this header is already included where needed in the code, but in the autoconf tests it was missing causing false negatives. This commit has those tests include linux/fs.h (old location) and now also linux/blkdev.h (new locations). Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes #10696
* Fix typoAllan Jude2020-08-111-1/+1
| | | | | | Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Allan Jude <[email protected]> Closes #10694
* Move ZVOL_DIR back to zfs.hRyan Moeller2020-08-112-1/+3
| | | | | | | | | | This was previously moved because nothing else in-tree uses it, but evidently DilOS uses it out of tree. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #10361 Closes #10685
* FreeBSD: update vaccess signature on most recent HEADMatthew Macy2020-08-071-0/+5
| | | | | | Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #10682
* Clarify error message when a range-tree double-add occursPaul Dagnelie2020-08-071-8/+22
| | | | | | | | | | | | | In various other pieces of logic have resulted in situations where we double-free space in ZFS. This in turn results in a double-add to the range trees. These issues have been much more difficult to diagnose than they should have been, because the error handling around this case is much weaker than around the double remove case. Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Wilson <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Closes #10654
* ZTS: Remove bashisms from zfs-tests.shRyan Moeller2020-08-072-81/+70
| | | | | | | | Bring zfs-tests.sh in to compliance with the other scripts by converting it /bin/sh for to avoid a dependency on bash. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #10640
* Remove commented-out codeMatthew Ahrens2020-08-051-27/+0
| | | | | | | | | Remove dead code to make the implementation easier to understand. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Ahrens <[email protected]> Closes #10650
* Remove KM_NODEBUGMatthew Ahrens2020-08-051-1/+0
| | | | | | | | | Remove dead code to make the implementation easier to understand. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Ahrens <[email protected]> Closes #10650
* Remove KMC_NOMAGAZINEMatthew Ahrens2020-08-053-11/+2
| | | | | | | | | Remove dead code to make the implementation easier to understand. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Ahrens <[email protected]> Closes #10650
* Remove KMC_QCACHEMatthew Ahrens2020-08-052-4/+0
| | | | | | | | | Remove dead code to make the implementation easier to understand. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Ahrens <[email protected]> Closes #10650
* Remove KMC_NOHASHMatthew Ahrens2020-08-053-6/+0
| | | | | | | | | Remove dead code to make the implementation easier to understand. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Ahrens <[email protected]> Closes #10650
* Remove KMC_NOTOUCHMatthew Ahrens2020-08-054-5/+1
| | | | | | | | | Remove dead code to make the implementation easier to understand. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Ahrens <[email protected]> Closes #10650
* Remove KMC_OFFSLABMatthew Ahrens2020-08-052-96/+39
| | | | | | | | | Remove dead code to make the implementation easier to understand. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Ahrens <[email protected]> Closes #10650
* Fix i/o error handling of livelists and zap iterationMatthew Ahrens2020-08-054-69/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Pool-wide metadata is stored in the MOS (Meta Object Set). This metadata is stored in triplicate, in addition to any pool-level reduncancy (e.g. RAIDZ). However, if all 3+ copies of this metadata are not available, we can still get EIO/ECKSUM when reading from the MOS. If we encounter such an error in syncing context, we have typically already committed to making a change that we now can't do because of the corrupt/missing metadata. We typically "handle" this with a `VERIFY()` or `zfs_panic_recover()`. This prevents the system from continuing on in an undefined state, while minimizing the amount of error-handling code. However, there are some code paths that ignore these i/o errors, or `ASSERT()` that they don't happen. Since assertions are disabled on non-debug builds, they effectively ignore them as well. This can lead to ZFS continuing on in an incorrect state, potentially leading to on-disk inconsistencies. This commit adds handling for these i/o errors on MOS metadata, typically with a `VERIFY()`: * Handle error return from `zap_cursor_retrieve()` in 4 places in `dsl_deadlist.c`. * Handle error return from `zap_contains()` in `dsl_dir_hold_obj()`. Turns out this call isn't necessary because we can always call `zap_lookup()`. * Handle error return from `zap_lookup()` in `dsl_fs_ss_limit_check()`. * Handle error return from `zap_remove()` in `dsl_dir_rename_sync()`. * Handle error return from `zap_lookup()` in `dsl_dir_remove_livelist()`. * Handle error return from `dsl_process_sub_livelist()` in `spa_livelist_delete_cb()`. Additionally: * Augment the internal history log message for `zfs destroy` to note which method is used (e.g. bptree, livelist, or, synchronous) and the mintxg. * Correct a comment in `dbuf_init()`. * Correct indentation in `dsl_dir_remove_livelist()`. Reviewed by: Sara Hartse <[email protected]> Reviewed-by: George Wilson <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #10643
* FreeBSD: Add support for lockless lookupMatthew Macy2020-08-057-12/+176
| | | | | | Authored-by: mjg <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #10657
* Add missed thread_exit() to vdev_{autotrim,rebuild}_threadMatthew Macy2020-08-052-0/+4
| | | | | | | Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Ryan Moeller <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #10668
* Fix arc__wait__for__eviction tracepointPavel Snajdr2020-08-041-2/+2
| | | | | | | | | | | | | 3442c2a02d added new `arc_wait_for_eviction` tracepoint, which fails to compile, when tracepoints are enabled. The tracepoint definition begins with `DEFINE_ARC_WAIT_FOR_EVICTION_EVENT` and is a multi-line definition, so this fixes the backslash and parenthesis accordingly. Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Pavel Snajdr <[email protected]> Closes #10669
* Verify zfs module loaded before starting servicesJonathon2020-08-013-3/+3
| | | | | | | | | | | | | | | This is a minor change to the systemd service templates that verifies the zfs kernel module is loaded by the kernel prior to attempting to import any zpool. The services check for the presence of /sys/module/zfs which indicates the zfs is module is loaded. This uses the systemd built-in check ConditionPathIsDirectory. Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matthew Thode <[email protected]> Signed-off-by: Jonathon Fernyhough <[email protected]> Closes #10663