aboutsummaryrefslogtreecommitdiffstats
path: root/module
Commit message (Collapse)AuthorAgeFilesLines
* Missed wakeup when growing kmem cacheMatthew Ahrens2020-02-131-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When growing the size of a (VMEM or KVMEM) kmem cache, spl_cache_grow() always does taskq_dispatch(spl_cache_grow_work), and then waits for the KMC_BIT_GROWING to be cleared by the taskq thread. The taskq thread (spl_cache_grow_work()) does: 1. allocate new slab and add to list 2. wake_up_all(skc_waitq) 3. clear_bit(KMC_BIT_GROWING) Therefore, the waiting thread can wake up before GROWING has been cleared. It will see that the growing has not yet completed, and go back to sleep until it hits the 100ms timeout. This can have an extreme performance impact on workloads that alloc/free more than fits in the (statically-sized) magazines. These workloads allocate and free slabs with high frequency. The problem can be observed with `funclatency spl_cache_grow`, which on some workloads shows that 99.5% of the time it takes <64us to allocate slabs, but we spend ~70% of our time in outliers, waiting for the 100ms timeout. The fix is to do `clear_bit(KMC_BIT_GROWING)` before `wake_up_all(skc_waitq)`. A future investigation should evaluate if we still actually need to taskq_dispatch() at all, and if so on which kernel versions. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Pavel Zakharov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Wilson <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #9989
* Remove duplicate dbufs accountingAlexander Motin2020-02-132-4/+2
| | | | | | | | | | | | | | Since AVL already has embedded element counter, use dn_dbufs_count only for dbufs not counted there (bonus buffers) and just add them. This removes two atomics per dbuf life cycle. According to profiler it reduces time spent by dbuf_destroy() inside bottlenecked dbuf_evict_thread() from 13.36% to 9.20% of the core. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #9949
* zcp: add zfs.sync.bookmarkChristian Schwarz2020-02-112-15/+46
| | | | | | | | | | Add support for bookmark creation and cloning. Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Christian Schwarz <[email protected]> Closes #9571
* Implement bookmark copyingChristian Schwarz2020-02-113-68/+318
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This feature allows copying existing bookmarks using zfs bookmark fs#target fs#newbookmark There are some niche use cases for such functionality, e.g. when using bookmarks as markers for replication progress. Copying redaction bookmarks produces a normal bookmark that cannot be used for redacted send (we are not duplicating the redaction object). ZCP support for bookmarking (both creation and copying) will be implemented in a separate patch based on this work. Overview: - Terminology: - source = existing snapshot or bookmark - new/bmark = new bookmark - Implement bookmark copying in `dsl_bookmark.c` - create new bookmark node - copy source's `zbn_phys` to new's `zbn_phys` - zero-out redaction object id in copy - Extend existing bookmark ioctl nvlist schema to accept bookmarks as sources - => `dsl_bookmark_create_nvl_validate` is authoritative - use `dsl_dataset_is_before` check for both snapshot and bookmark sources - Adjust CLI - refactor shortname expansion logic in `zfs_do_bookmark` - Update man pages - warn about redaction bookmark handling - Add test cases - CLI - pyyzfs libzfs_core bindings Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Christian Schwarz <[email protected]> Closes #9571
* Address Coverity warnings in #9902Matthew Macy2020-02-111-4/+5
| | | | | | | | | | Coverity reports the variable may be NULL, but due to the way the dirty records are handled this cannot be the case. Add a comment and VERIFY to make this clear and silence the warning. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9962
* Add missing dmu_buf_unlock_parent() calls to dbuf_read_impl()Brian Behlendorf2020-02-101-1/+3
| | | | | | | | | | | | As explained by the comment in dbuf_read() and above dbuf_read_impl(). Under all circumstances the parent lock specified by dblt should be dropped when existing dbuf_read_impl(). This was not being done for two exist paths. Additionally, ensure the mutex is unlocked before dropping the parent lock. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9968
* Fix zdb -R with 'b' flagPaul Zuchowski2020-02-101-20/+71
| | | | | | | | | | | | | zdb -R :b fails due to the indirect block being compressed, and the 'b' and 'd' flag not working in tandem when specified. Fix the flag parsing code and create a zfs test for zdb -R block display. Also fix the zio flags where the dotted notation for the vdev portion of DVA (i.e. 0.0:offset:length) fails. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Zuchowski <[email protected]> Closes #9640 Closes #9729
* Share some code for spa deadman tunablesRyan Moeller2020-02-102-19/+31
| | | | | | | | | | | | | We need to do the same thing to update all spas on any OS for these tunables, so let's share the code. While here let's match the types of the literals initializing the variables with the type of the variable. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Olaf Faaland <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #9964
* ICP: Improve AES-GCM performanceAttila Fülöp2020-02-1010-20/+2590
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently SIMD accelerated AES-GCM performance is limited by two factors: a. The need to disable preemption and interrupts and save the FPU state before using it and to do the reverse when done. Due to the way the code is organized (see (b) below) we have to pay this price twice for each 16 byte GCM block processed. b. Most processing is done in C, operating on single GCM blocks. The use of SIMD instructions is limited to the AES encryption of the counter block (AES-NI) and the Galois multiplication (PCLMULQDQ). This leads to the FPU not being fully utilized for crypto operations. To solve (a) we do crypto processing in larger chunks while owning the FPU. An `icp_gcm_avx_chunk_size` module parameter was introduced to make this chunk size tweakable. It defaults to 32 KiB. This step alone roughly doubles performance. (b) is tackled by porting and using the highly optimized openssl AES-GCM assembler routines, which do all the processing (CTR, AES, GMULT) in a single routine. Both steps together result in up to 32x reduction of the time spend in the en/decryption routines, leading up to approximately 12x throughput increase for large (128 KiB) blocks. Lastly, this commit changes the default encryption algorithm from AES-CCM to AES-GCM when setting the `encryption=on` property. Reviewed-By: Brian Behlendorf <[email protected]> Reviewed-By: Jason King <[email protected]> Reviewed-By: Tom Caputi <[email protected]> Reviewed-By: Richard Laager <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #9749
* Factor out dbuf_sync_bonusMatthew Macy2020-02-071-31/+52
| | | | | | | | | Factor the portion of dbuf_sync_leaf() responsible for handling bonus buffers out in to its own dbuf_sync_bonus() helper function. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9909
* Linux 5.6 compat: timestamp_truncate()Brian Behlendorf2020-02-072-8/+6
| | | | | | | | | | | | The timestamp_truncate() function was added, it replaces the existing timespec64_trunc() function. This change renames our wrapper function to be consistent with the upstream name and updates the compatibility code for older kernels accordingly. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9956 Closes #9961
* Linux 5.6 compat: struct proc_opsBrian Behlendorf2020-02-073-11/+47
| | | | | | | | | | | | | | The proc_ops structure was introduced to replace the use of of the file_operations structure when registering proc handlers. This change creates a new kstat_proc_op_t typedef for compatibility which can be used to pass around the correct structure. This change additionally adds the 'const' keyword to all of the existing proc operations structures. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9961
* Reduce number of atomic_add() calls in aggsumAlexander Motin2020-02-061-32/+33
| | | | | | | | | | | | | | | | | | | | | | | Previous code used 4 atomics to do aggsum_flush_bucket() and 2 more to re-borrow after the flush. But since asc_borrowed and asc_delta are accessed only while holding asc_lock, it makes no any sense to modify as_lower_bound and as_upper_bound in multiple steps. Instead of that the new code uses only 2 atomics in all the cases, one per as_*_bound variable. I think even that is overkill, simple atomic store and load could be used here, since all modifications are done under the as_lock, but there are no such primitives in ZFS code now. While there, make borrow code consider previous borrow value, so that on mixed request patterns reduce chance of needing to borrow again if much larger request follows tiny one that needed borrow. Also reduce as_numbuckets from uint64_t to u_int. It makes no sense to use so large division operation on every aggsum_add(). Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Paul Dagnelie <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #9930
* Replace static per-cpu with dynamic per-cpu dataRomain Dolbeau2020-02-061-4/+14
| | | | | | | This solves the issue of loading the spl module on RISC-V. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Romain Dolbeau <[email protected]> Closes #9942
* Few microoptimizations to dbuf layerAlexander Motin2020-02-051-22/+9
| | | | | | | | | | | | | | | | | | | Move db_link into the same cache line as db_blkid and db_level. It allows significantly reduce avl_add() time in dbuf_create() on systems with large RAM and huge number of dbufs per dnode. Avoid few accesses to dbuf_caches[].size, which is highly congested under high IOPS and never stays in cache for a long time. Use local value we are receiving from zfs_refcount_add_many() any way. Remove cache_size_bytes_max bump from dbuf_evict_one(). I don't see a point to do it on dbuf eviction after we done it on insertion in dbuf_rele_and_unlock(). Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #9931
* Convert dbuf dirty record record list to a list_tMatthew Macy2020-02-055-87/+76
| | | | | | | | | Additionally pull in state machine comments about upcoming async cow work. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9902
* Prepare ks_data before calling kstat_install()Alexander Motin2020-02-042-7/+5
| | | | | | | | | | | | It violated sequence described in kstat.h, and at least on FreeBSD kstat_install() uses provided names to create the sysctls. If the names are not available at the time, it ends up bad. Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #9933
* Restore aclmode and remove acltype on FreeBSDRyan Moeller2020-02-041-3/+26
| | | | | | | | | | | | | | This replaces the placeholder ZFS_PROP_PRIVATE with ZFS_PROP_ACLMODE, matching what is done in the NFSv4 ACLs PR (#9709). On FreeBSD we hide ZFS_PROP_ACLTYPE, while on Linux we hide ZFS_PROP_ACLMODE. The tests already assume this arrangement. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #9913
* Fix const-correctness in raidz mathRyan Moeller2020-02-031-8/+8
| | | | | | | | Clang warns (errors) that "cast from 'const void *' to 'struct v *' drops const qualifier." Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #9917
* async zvol minor node creation interferes with receiveMatthew Ahrens2020-02-0310-59/+110
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we finish a zfs receive, dmu_recv_end_sync() calls zvol_create_minors(async=TRUE). This kicks off some other threads that create the minor device nodes (in /dev/zvol/poolname/...). These async threads call zvol_prefetch_minors_impl() and zvol_create_minor(), which both call dmu_objset_own(), which puts a "long hold" on the dataset. Since the zvol minor node creation is asynchronous, this can happen after the `ZFS_IOC_RECV[_NEW]` ioctl and `zfs receive` process have completed. After the first receive ioctl has completed, userland may attempt to do another receive into the same dataset (e.g. the next incremental stream). This second receive and the asynchronous minor node creation can interfere with one another in several different ways, because they both require exclusive access to the dataset: 1. When the second receive is finishing up, dmu_recv_end_check() does dsl_dataset_handoff_check(), which can fail with EBUSY if the async minor node creation already has a "long hold" on this dataset. This causes the 2nd receive to fail. 2. The async udev rule can fail if zvol_id and/or systemd-udevd try to open the device while the the second receive's async attempt at minor node creation owns the dataset (via zvol_prefetch_minors_impl). This causes the minor node (/dev/zd*) to exist, but the udev-generated /dev/zvol/... to not exist. 3. The async minor node creation can silently fail with EBUSY if the first receive's zvol_create_minor() trys to own the dataset while the second receive's zvol_prefetch_minors_impl already owns the dataset. To address these problems, this change synchronously creates the minor node. To avoid the lock ordering problems that the asynchrony was introduced to fix (see #3681), we create the minor nodes from open context, with no locks held, rather than from syncing contex as was originally done. Implementation notes: We generally do not need to traverse children or prefetch anything (e.g. when running the recv, snapshot, create, or clone subcommands of zfs). We only need recursion when importing/opening a pool and when loading encryption keys. The existing recursive, asynchronous, prefetching code is preserved for use in these cases. Channel programs may need to create zvol minor nodes, when creating a snapshot of a zvol with the snapdev property set. We figure out what snapshots are created when running the LUA program in syncing context. In this case we need to remember what snapshots were created, and then try to create their minor nodes from open context, after the LUA code has completed. There are additional zvol use cases that asynchronously own the dataset, which can cause similar problems. E.g. changing the volmode or snapdev properties. These are less problematic because they are not recursive and don't touch datasets that are not involved in the operation, there is still potential for interference with subsequent operations. In the future, these cases should be similarly converted to create the zvol minor node synchronously from open context. The async tasks of removing and renaming minors do not own the objset, so they do not have this problem. However, it may make sense to also convert these operations to happen synchronously from open context, in the future. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Prakash Surya <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> External-issue: DLPX-65948 Closes #7863 Closes #9885
* Left-align index propsRyan Moeller2020-01-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Index type props display as strings, which should be aligned to the left not to the right. Before: ``` FreeBSD-13_0-CURRENT-r356528 ➜ ~ zfs list -ro name,aclmode,mountpoint NAME ACLMODE MOUNTPOINT p0 passthrough /p0 p0/foo discard /p0/foo ``` After: ``` FreeBSD-13_0-CURRENT-r356528 ➜ ~ zfs list -ro name,aclmode,mountpoint NAME ACLMODE MOUNTPOINT p0 passthrough /p0 p0/foo discard /p0/foo ``` Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #9912
* dsl_bookmark_create_check: fix NULL pointer deref if dbca_errors == NULLChristian Schwarz2020-01-231-2/+6
| | | | | | | | Discovered in preparation of zcp support for creating bookmarks. Handle the case where dbca_errors is NULL. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Christian Schwarz <[email protected]> Closes #9880
* entity_namecheck: doc comment: include space as allowed characterChristian Schwarz2020-01-231-1/+1
| | | | | | | | The helper function valid_char already allows it but the doc comment was out of date. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Christian Schwarz <[email protected]> Closes #9879
* Add AltiVec RAID-ZRomain Dolbeau2020-01-234-0/+5035
| | | | | | | | | | | | | Implements the RAID-Z function using AltiVec SIMD. This is basically the NEON code translated to AltiVec. Note that the 'fletcher' algorithm requires 64-bits operations, and the initial implementations of AltiVec (PPC74xx a.k.a. G4, PPC970 a.k.a. G5) only has up to 32-bits operations, so no 'fletcher'. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Romain Dolbeau <[email protected]> Closes #9539
* dmu_send: redacted: fix memory leak on invalid redaction/from bookmarkChristian Schwarz2020-01-231-6/+6
| | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Signed-off-by: Christian Schwarz <[email protected]> Closes #9867
* Simplify FreeBSD's locking requirements in zfs_replay.cMatthew Macy2020-01-221-24/+12
| | | | | | | | | | | Now that the FreeBSD zfs_vnops code avoids asserting that a vnode lock is held when z_replay is true we can limit the FreeBSD specific changes to the couple of changes where it is necessary to drop the vnode locks because a function returns with it held. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9865
* Support inheriting properties in channel programsJason King2020-01-222-9/+89
| | | | | | | | | This adds support in channel programs to inherit properties analogous to `zfs inherit` by adding `zfs.sync.inherit` and `zfs.check.inherit` functions to the ZFS LUA API. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Jason King <[email protected]> Closes #9738
* Update tunable macro usage for disable_ivset_guid_checkMatthew Macy2020-01-211-4/+1
| | | | | | Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9861
* Re-consolidate zio_delay_interruptMatthew Macy2020-01-213-104/+71
| | | | | | | | With recent SPL changes there is no longer any need for a per platform version. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9860
* Unify target_cpu handlingBrian Behlendorf2020-01-174-32/+13
| | | | | | | | | | | | | Over the years several slightly different approaches were used in the Makefiles to determine the target architecture. This change updates both the build system and Makefile to handle this in a consistent fashion. TARGET_CPU is set to i386, x86_64, powerpc, aarch6 or sparc64 and made available in the Makefiles to be used as appropriate. Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9848
* Fix errata #4 handling for resuming streamsTom Caputi2020-01-141-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | Currently, the handling for errata #4 has two issues which allow the checks for this issue to be bypassed using resumable sends. The first issue is that drc->drc_fromsnapobj is not set in the resuming code as it is in the non-resuming code. This causes dsl_crypto_recv_key_check() to skip its checks for the from_ivset_guid. The second issue is that resumable sends do not clean up their on-disk state if they fail the checks in dmu_recv_stream() that happen before any data is received. As a result of these two bugs, a user can attempt a resumable send of a dataset without a from_ivset_guid. This will fail the initial dmu_recv_stream() checks, leaving a valid resume state. The send can then be resumed, which skips those checks, allowing the receive to be completed. This commit fixes these issues by setting drc->drc_fromsnapobj in the resuming receive path and by ensuring that resumablereceives are properly cleaned up if they fail the initial dmu_recv_stream() checks. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #9818 Closes #9829
* KMC_KVMEM disrupts kv_alloc() memory alignment expectationsloli10K2020-01-141-20/+2
| | | | | | | | | | | | | | | | | | | | | On kernels with KASAN enabled the following failure can be observed as soon as the zfs module is loaded: VERIFY(IS_P2ALIGNED(ptr, PAGE_SIZE)) failed PANIC at spl-kmem-cache.c:228:kv_alloc() The problem is kmalloc() has never guaranteed aligned allocations; this requirement resulted in zfsonlinux/spl@8b45dda which removed all kmalloc() usage in kv_alloc(). Until a GFP_ALIGNED flag (or equivalent functionality) is provided by the kernel this commit partially reverts 66955885 and 6d948c35 to prevent k(v)malloc() allocations in kv_alloc(). Reviewed-by: Kjeld Schouten <[email protected]> Reviewed-by: Michael Niewöhner <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #9813
* Change http://zfsonlinux.org links to https://zfsonlinux.orgBrian Behlendorf2020-01-131-1/+1
| | | | | | | | | | | Update the project website links contained in to repository to reference the secure https://zfsonlinux.org address. Reviewed-By: Richard Laager <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Garrett Fields <[email protected]> Reviewed-by: Kjeld Schouten <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9837
* Add 'zfs send --saved' flagTom Caputi2020-01-103-40/+159
| | | | | | | | | | | | | | | | | | This commit adds the --saved (-S) to the 'zfs send' command. This flag allows a user to send a partially received dataset, which can be useful when migrating a backup server to new hardware. This flag is compatible with resumable receives, so even if the saved send is interrupted, it can be resumed. The flag does not require any user / kernel ABI changes or any new feature flags in the send stream format. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Alek Pinchuk <[email protected]> Reviewed-by: Paul Zuchowski <[email protected]> Reviewed-by: Christian Schwarz <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #9007
* Fix "zpool add -n" for dedup, special and log devicesloli10K2020-01-062-3/+7
| | | | | | | | | | | | | | | | | | | | For dedup, special and log devices "zpool add -n" does not print correctly their vdev type: ~# zpool add -n pool dedup /tmp/dedup special /tmp/special log /tmp/log would update 'pool' to the following configuration: pool /tmp/normal /tmp/dedup /tmp/special /tmp/log This could lead storage administrators to modify their ZFS pools to unexpected and unintended vdev configurations. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #9783 Closes #9390
* Fix QAT allocation failure return valueBrian Behlendorf2020-01-061-15/+18
| | | | | | | | | | | | | | | When qat_compress() fails to allocate the required contiguous memory it mistakenly returns success. This prevents the fallback software compression from taking over and (un)compressing the block. Resolve the issue by correctly setting the local 'status' variable on all exit paths. Furthermore, initialize it to CPA_STATUS_FAIL to ensure qat_compress() always fails safe to guard against any similar bugs in the future. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9784 Closes #9788
* Static symbols exported by ICPBrian Behlendorf2020-01-021-2/+0
| | | | | | | | | | | | The crypto_cipher_init_prov and crypto_cipher_init are declared static and should not be exported by the ICP. This resolves the following warnings observed when building with the 5.4 kernel. WARNING: "crypto_cipher_init" [.../icp] is a static EXPORT_SYMBOL WARNING: "crypto_cipher_init_prov" [.../icp] is a static EXPORT_SYMBOL Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9791
* Avoid some crashes when importing a pool with corrupt metadataSteve Mokris2019-12-261-3/+11
| | | | | | | | | | | | - Skip invalid DVAs when importing pools in readonly mode (in addition to when the config is untrusted). - Upon encountering a DVA with a null VDEV, fail gracefully instead of panicking with a NULL pointer dereference. Reviewed-by: Pavel Zakharov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Steve Mokris <[email protected]> Closes #9022
* Cancel initialize and TRIM before vdev_metaslab_fini()Brian Behlendorf2019-12-261-6/+7
| | | | | | | | | | | Any running 'zpool initialize' or TRIM must be cancelled prior to the vdev_metaslab_fini() call in spa_vdev_remove_log() which will unload the metaslabs and set ms->ms_group == NULL. Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Kjeld Schouten <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #8602 Closes #9751
* cppcheck: (warning) Possible null pointer dereference: nvhBrian Behlendorf2019-12-181-1/+3
| | | | | | | | | | | Move the 'nvh = (void *)buf' assignment after the 'buf == NULL' check to resolve the warning. Interestingly, cppcheck 1.88 correctly determines that the existing code is safe, while cppcheck 1.86 reports the warning. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9732
* cppcheck: (error) Address of local auto-variable assignedBrian Behlendorf2019-12-182-0/+2
| | | | | | | | | | | | Suppress autoVariables warnings in the lua interpreter. The usage here while unconventional in intentional and the same as upstream. [module/lua/ldebug.c:327]: (error) Address of local auto-variable assigned to a function parameter. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9732
* cppcheck: (warning) Possible null pointer dereference: dnpBrian Behlendorf2019-12-181-0/+1
| | | | | | | | | | | | | | The dnp argument can only be set to NULL when the DNODE_DRY_RUN flag is set. In which case, an early return path will be executed and a NULL pointer dereference at the given location is impossible. Add an additional ASSERT to silence the cppcheck warning and document that dbp must never be NULL at the point in the function. [module/zfs/dnode.c:1566]: (warning) Possible null pointer deref: dnp Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9732
* cppcheck: (error) Shifting signed 64-bit value by 63 bitsUbuntu2019-12-181-0/+2
| | | | | | | | | As of cppcheck 1.82 surpress the warning regarding shifting too many bits for __divdi3() implemention. The algorithm used here is correct. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9732
* cppcheck: (error) Uninitialized variableUbuntu2019-12-188-22/+23
| | | | | | | | | | As of cppcheck 1.82 warnings are issued when using the list_for_each_* functions with an uninitialized variable. Functionally, this is fine but to resolve the warning initialize these variables. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9732
* cppcheck: (error) Uninitialized variableUbuntu2019-12-182-11/+6
| | | | | | | | | | | | | | | | | Resolve the following uninitialized variable warnings. In practice these were unreachable due to the goto. Replacing the goto with a return resolves the warning and yields more readable code. [module/icp/algs/modes/ccm.c:892]: (error) Uninitialized variable: ccm_param [module/icp/algs/modes/ccm.c:893]: (error) Uninitialized variable: ccm_param [module/icp/algs/modes/gcm.c:564]: (error) Uninitialized variable: gcm_param [module/icp/algs/modes/gcm.c:565]: (error) Uninitialized variable: gcm_param [module/icp/algs/modes/gcm.c:599]: (error) Uninitialized variable: gmac_param [module/icp/algs/modes/gcm.c:600]: (error) Uninitialized variable: gmac_param Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9732
* Minor performance fix for NEON RAID-ZRomain Dolbeau2019-12-171-4/+2
| | | | | | | | | The NEON code replicates too closely the SSE code, including a masked 16-bits shift. But NEON, like AltiVec (#9539), has unsigned 8-bits shift, so use that instead and drop the masking. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Romain Dolbeau <[email protected]> Closes #9725
* Fix zfs_xattr_owner_unlinked on FreeBSD and commentMatthew Macy2019-12-162-0/+18
| | | | | | | | Explain FreeBSD VFS' unfortunate idiosyncratic locking requirements. There is no functional change for other platforms. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9720
* Don't fail to apply umask for O_TMPFILE filesTomohiro Kusumi2019-12-131-0/+6
| | | | | | | | | | | | | Apply umask to `mode` which will eventually be applied to inode. This is needed since VFS doesn't apply umask for O_TMPFILE files. (Note that zpl_init_acl() applies `ip->i_mode &= ~current_umask();` only when POSIX ACL is used.) Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Tomohiro Kusumi <[email protected]> Closes #8997 Closes #8998
* Allow empty ds_props_obj to be destroyedTom Caputi2019-12-131-2/+19
| | | | | | | | | | | | | | | | | | | | | Currently, 'zfs list' and 'zfs get' commands can be slow when working with snapshots that have a ds_props_obj. This is because the code that discovers all of the properties for these snapshots needs to read this object for each snapshot, which almost always ends up causing an extra random synchronous read for each snapshot. This performance penalty exists even if the properties on that snapshot have been unset because the object is normally only freed when the snapshot is freed, even though it is only created when it is needed. This patch allows the user to regain 'zfs list' performance on these snapshots by destroying the ds_props_obj when it no longer has any entries left. In practice on a production machine, this optimization seems to make 'zfs list' about 55% faster. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Paul Zuchowski <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #9704
* Make zfs_replay.c work on FreeBSDMatthew Macy2019-12-133-11/+65
| | | | | | | | | | | | | FreeBSD's vfs currently doesn't permit file systems to do their own locking. To avoid having to have duplicate zfs functions with and without locking add locking here. With luck these changes can be removed in the future. Reviewed-by: Sean Eric Fagan <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9715