path: root/module
Commit message | Author | Age | Files | Lines
* async zvol minor node creation interferes with receive | Matthew Ahrens | 2020-02-03 | 10 | -59/+110

  When we finish a zfs receive, dmu_recv_end_sync() calls zvol_create_minors(async=TRUE). This kicks off some other threads that create the minor device nodes (in /dev/zvol/poolname/...). These async threads call zvol_prefetch_minors_impl() and zvol_create_minor(), which both call dmu_objset_own(), which puts a "long hold" on the dataset.

  Since the zvol minor node creation is asynchronous, this can happen after the `ZFS_IOC_RECV[_NEW]` ioctl and `zfs receive` process have completed.

  After the first receive ioctl has completed, userland may attempt to do another receive into the same dataset (e.g. the next incremental stream). This second receive and the asynchronous minor node creation can interfere with one another in several different ways, because they both require exclusive access to the dataset:

  1. When the second receive is finishing up, dmu_recv_end_check() does dsl_dataset_handoff_check(), which can fail with EBUSY if the async minor node creation already has a "long hold" on this dataset. This causes the 2nd receive to fail.

  2. The async udev rule can fail if zvol_id and/or systemd-udevd try to open the device while the second receive's async attempt at minor node creation owns the dataset (via zvol_prefetch_minors_impl). This causes the minor node (/dev/zd*) to exist, but the udev-generated /dev/zvol/... to not exist.

  3. The async minor node creation can silently fail with EBUSY if the first receive's zvol_create_minor() tries to own the dataset while the second receive's zvol_prefetch_minors_impl already owns the dataset.

  To address these problems, this change synchronously creates the minor node. To avoid the lock ordering problems that the asynchrony was introduced to fix (see #3681), we create the minor nodes from open context, with no locks held, rather than from syncing context as was originally done.

  Implementation notes:

  We generally do not need to traverse children or prefetch anything (e.g. when running the recv, snapshot, create, or clone subcommands of zfs). We only need recursion when importing/opening a pool and when loading encryption keys. The existing recursive, asynchronous, prefetching code is preserved for use in these cases.

  Channel programs may need to create zvol minor nodes, when creating a snapshot of a zvol with the snapdev property set. We figure out what snapshots are created when running the LUA program in syncing context. In this case we need to remember what snapshots were created, and then try to create their minor nodes from open context, after the LUA code has completed.

  There are additional zvol use cases that asynchronously own the dataset, which can cause similar problems. E.g. changing the volmode or snapdev properties. Although these are less problematic because they are not recursive and don't touch datasets that are not involved in the operation, there is still potential for interference with subsequent operations. In the future, these cases should be similarly converted to create the zvol minor node synchronously from open context.

  The async tasks of removing and renaming minors do not own the objset, so they do not have this problem. However, it may make sense to also convert these operations to happen synchronously from open context, in the future.

  Reviewed-by: Paul Dagnelie <[email protected]>
  Reviewed-by: Prakash Surya <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matthew Ahrens <[email protected]>
  External-issue: DLPX-65948
  Closes #7863
  Closes #9885
* Left-align index props | Ryan Moeller | 2020-01-31 | 1 | -1/+1

  Index type props display as strings, which should be aligned to the left, not to the right.

  Before:

  ```
  FreeBSD-13_0-CURRENT-r356528 ➜ ~ zfs list -ro name,aclmode,mountpoint
  NAME        ACLMODE  MOUNTPOINT
  p0      passthrough  /p0
  p0/foo      discard  /p0/foo
  ```

  After:

  ```
  FreeBSD-13_0-CURRENT-r356528 ➜ ~ zfs list -ro name,aclmode,mountpoint
  NAME    ACLMODE      MOUNTPOINT
  p0      passthrough  /p0
  p0/foo  discard      /p0/foo
  ```

  Reviewed-by: Igor Kozhukhov <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Alexander Motin <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #9912
* dsl_bookmark_create_check: fix NULL pointer deref if dbca_errors == NULL | Christian Schwarz | 2020-01-23 | 1 | -2/+6

  Discovered in preparation of zcp support for creating bookmarks. Handle the case where dbca_errors is NULL.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Christian Schwarz <[email protected]>
  Closes #9880
* entity_namecheck: doc comment: include space as allowed character | Christian Schwarz | 2020-01-23 | 1 | -1/+1

  The helper function valid_char already allows it but the doc comment was out of date.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Christian Schwarz <[email protected]>
  Closes #9879
* Add AltiVec RAID-Z | Romain Dolbeau | 2020-01-23 | 4 | -0/+5035

  Implements the RAID-Z function using AltiVec SIMD. This is basically the NEON code translated to AltiVec.

  Note that the 'fletcher' algorithm requires 64-bit operations, and the initial implementations of AltiVec (PPC74xx a.k.a. G4, PPC970 a.k.a. G5) only have up to 32-bit operations, so no 'fletcher'.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Romain Dolbeau <[email protected]>
  Closes #9539
* dmu_send: redacted: fix memory leak on invalid redaction/from bookmark | Christian Schwarz | 2020-01-23 | 1 | -6/+6

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Matt Ahrens <[email protected]>
  Signed-off-by: Christian Schwarz <[email protected]>
  Closes #9867
* Simplify FreeBSD's locking requirements in zfs_replay.c | Matthew Macy | 2020-01-22 | 1 | -24/+12

  Now that the FreeBSD zfs_vnops code avoids asserting that a vnode lock is held when z_replay is true, we can limit the FreeBSD specific changes to the few places where it is necessary to drop the vnode lock because a function returns with it held.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9865
* Support inheriting properties in channel programs | Jason King | 2020-01-22 | 2 | -9/+89

  This adds support in channel programs to inherit properties analogous to `zfs inherit` by adding `zfs.sync.inherit` and `zfs.check.inherit` functions to the ZFS LUA API.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Jason King <[email protected]>
  Closes #9738
* Update tunable macro usage for disable_ivset_guid_check | Matthew Macy | 2020-01-21 | 1 | -4/+1

  Reviewed-by: Ryan Moeller <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9861
* Re-consolidate zio_delay_interrupt | Matthew Macy | 2020-01-21 | 3 | -104/+71

  With recent SPL changes there is no longer any need for a per platform version.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9860
* Unify target_cpu handling | Brian Behlendorf | 2020-01-17 | 4 | -32/+13

  Over the years several slightly different approaches were used in the Makefiles to determine the target architecture. This change updates both the build system and Makefile to handle this in a consistent fashion.

  TARGET_CPU is set to i386, x86_64, powerpc, aarch64 or sparc64 and made available in the Makefiles to be used as appropriate.

  Reviewed-by: Ryan Moeller <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9848
* Fix errata #4 handling for resuming streams | Tom Caputi | 2020-01-14 | 1 | -1/+13

  Currently, the handling for errata #4 has two issues which allow the checks for this issue to be bypassed using resumable sends. The first issue is that drc->drc_fromsnapobj is not set in the resuming code as it is in the non-resuming code. This causes dsl_crypto_recv_key_check() to skip its checks for the from_ivset_guid. The second issue is that resumable sends do not clean up their on-disk state if they fail the checks in dmu_recv_stream() that happen before any data is received.

  As a result of these two bugs, a user can attempt a resumable send of a dataset without a from_ivset_guid. This will fail the initial dmu_recv_stream() checks, leaving a valid resume state. The send can then be resumed, which skips those checks, allowing the receive to be completed.

  This commit fixes these issues by setting drc->drc_fromsnapobj in the resuming receive path and by ensuring that resumable receives are properly cleaned up if they fail the initial dmu_recv_stream() checks.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #9818
  Closes #9829
* KMC_KVMEM disrupts kv_alloc() memory alignment expectations | loli10K | 2020-01-14 | 1 | -20/+2

  On kernels with KASAN enabled the following failure can be observed as soon as the zfs module is loaded:

    VERIFY(IS_P2ALIGNED(ptr, PAGE_SIZE)) failed
    PANIC at spl-kmem-cache.c:228:kv_alloc()

  The problem is kmalloc() has never guaranteed aligned allocations; this requirement resulted in zfsonlinux/spl@8b45dda which removed all kmalloc() usage in kv_alloc().

  Until a GFP_ALIGNED flag (or equivalent functionality) is provided by the kernel this commit partially reverts 66955885 and 6d948c35 to prevent k(v)malloc() allocations in kv_alloc().

  Reviewed-by: Kjeld Schouten <[email protected]>
  Reviewed-by: Michael Niewöhner <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #9813
* Change http://zfsonlinux.org links to https://zfsonlinux.org | Brian Behlendorf | 2020-01-13 | 1 | -1/+1

  Update the project website links contained in the repository to reference the secure https://zfsonlinux.org address.

  Reviewed-By: Richard Laager <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Garrett Fields <[email protected]>
  Reviewed-by: Kjeld Schouten <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9837
* Add 'zfs send --saved' flag | Tom Caputi | 2020-01-10 | 3 | -40/+159

  This commit adds the --saved (-S) flag to the 'zfs send' command. This flag allows a user to send a partially received dataset, which can be useful when migrating a backup server to new hardware. This flag is compatible with resumable receives, so even if the saved send is interrupted, it can be resumed. The flag does not require any user / kernel ABI changes or any new feature flags in the send stream format.

  Reviewed-by: Paul Dagnelie <[email protected]>
  Reviewed-by: Alek Pinchuk <[email protected]>
  Reviewed-by: Paul Zuchowski <[email protected]>
  Reviewed-by: Christian Schwarz <[email protected]>
  Reviewed-by: Matt Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #9007
* Fix "zpool add -n" for dedup, special and log devices | loli10K | 2020-01-06 | 2 | -3/+7

  For dedup, special and log devices "zpool add -n" does not correctly print their vdev type:

    ~# zpool add -n pool dedup /tmp/dedup special /tmp/special log /tmp/log
    would update 'pool' to the following configuration:
        pool
          /tmp/normal
          /tmp/dedup
          /tmp/special
          /tmp/log

  This could lead storage administrators to modify their ZFS pools to unexpected and unintended vdev configurations.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #9783
  Closes #9390
* Fix QAT allocation failure return value | Brian Behlendorf | 2020-01-06 | 1 | -15/+18

  When qat_compress() fails to allocate the required contiguous memory it mistakenly returns success. This prevents the fallback software compression from taking over and (un)compressing the block.

  Resolve the issue by correctly setting the local 'status' variable on all exit paths. Furthermore, initialize it to CPA_STATUS_FAIL to ensure qat_compress() always fails safe to guard against any similar bugs in the future.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9784
  Closes #9788
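The fail-safe pattern described in the QAT fix above can be sketched in plain C. This is only an illustration under assumptions: `status_t`, `hw_compress()` and `compress_block()` are hypothetical names, not the QAT or ZFS API, and the `simulate_alloc_failure` argument exists purely to exercise the error path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status codes; the real driver uses CpaStatus values. */
typedef enum { STATUS_SUCCESS = 0, STATUS_FAIL = -1 } status_t;

/*
 * Hypothetical stand-in for a hardware compression routine.
 * 'simulate_alloc_failure' forces the allocation-failure path.
 */
static status_t
hw_compress(const char *src, size_t len, int simulate_alloc_failure)
{
	status_t status = STATUS_FAIL;	/* fail safe by default */
	char *buf = NULL;

	assert(src != NULL && len > 0);

	if (!simulate_alloc_failure)
		buf = malloc(len);
	if (buf == NULL)
		goto out;		/* status is still STATUS_FAIL */

	/* ... the real work would happen here ... */
	status = STATUS_SUCCESS;	/* set only on the success path */
out:
	free(buf);
	return (status);
}

/* Caller: returns 1 if the "hardware" path worked, 0 if it must fall back. */
static int
compress_block(const char *src, size_t len, int simulate_alloc_failure)
{
	if (hw_compress(src, len, simulate_alloc_failure) == STATUS_SUCCESS)
		return (1);
	return (0);	/* software fallback would run here */
}
```

Because the default is failure, an exit path that forgets to set `status` leads to the fallback rather than to a block that was silently never compressed.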
* Static symbols exported by ICP | Brian Behlendorf | 2020-01-02 | 1 | -2/+0

  The crypto_cipher_init_prov and crypto_cipher_init are declared static and should not be exported by the ICP. This resolves the following warnings observed when building with the 5.4 kernel.

    WARNING: "crypto_cipher_init" [.../icp] is a static EXPORT_SYMBOL
    WARNING: "crypto_cipher_init_prov" [.../icp] is a static EXPORT_SYMBOL

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9791
* Avoid some crashes when importing a pool with corrupt metadata | Steve Mokris | 2019-12-26 | 1 | -3/+11

  - Skip invalid DVAs when importing pools in readonly mode (in addition to when the config is untrusted).

  - Upon encountering a DVA with a null VDEV, fail gracefully instead of panicking with a NULL pointer dereference.

  Reviewed-by: Pavel Zakharov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Steve Mokris <[email protected]>
  Closes #9022
* Cancel initialize and TRIM before vdev_metaslab_fini() | Brian Behlendorf | 2019-12-26 | 1 | -6/+7

  Any running 'zpool initialize' or TRIM must be cancelled prior to the vdev_metaslab_fini() call in spa_vdev_remove_log() which will unload the metaslabs and set ms->ms_group == NULL.

  Reviewed-by: Igor Kozhukhov <[email protected]>
  Reviewed-by: Kjeld Schouten <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8602
  Closes #9751
* cppcheck: (warning) Possible null pointer dereference: nvh | Brian Behlendorf | 2019-12-18 | 1 | -1/+3

  Move the 'nvh = (void *)buf' assignment after the 'buf == NULL' check to resolve the warning. Interestingly, cppcheck 1.88 correctly determines that the existing code is safe, while cppcheck 1.86 reports the warning.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9732
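The reordering described in the nvh fix above amounts to performing the cast-and-assign only after the NULL check. A minimal sketch, assuming a hypothetical `toy_header` structure and `toy_read_version()` helper (neither is part of the nvpair code):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical on-buffer header; stands in for the real header type. */
struct toy_header {
	int version;
};

/*
 * Returns 0 and stores the version on success, -1 if 'buf' is NULL.
 * The header pointer is assigned only after 'buf' has been checked,
 * so a static analyzer never sees a possibly-NULL pointer in use.
 */
static int
toy_read_version(void *buf, int *version_out)
{
	struct toy_header *hdr;

	assert(version_out != NULL);

	if (buf == NULL)
		return (-1);

	hdr = (void *)buf;	/* assignment moved below the check */
	*version_out = hdr->version;
	return (0);
}
```

Both orderings behave the same at run time as long as the pointer is not dereferenced before the check; the reordering mainly makes that obvious to tools and readers.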
* cppcheck: (error) Address of local auto-variable assigned | Brian Behlendorf | 2019-12-18 | 2 | -0/+2

  Suppress autoVariables warnings in the lua interpreter. The usage here, while unconventional, is intentional and the same as upstream.

    [module/lua/ldebug.c:327]: (error) Address of local auto-variable assigned to a function parameter.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9732
* cppcheck: (warning) Possible null pointer dereference: dnp | Brian Behlendorf | 2019-12-18 | 1 | -0/+1

  The dnp argument can only be set to NULL when the DNODE_DRY_RUN flag is set, in which case an early return path will be executed and a NULL pointer dereference at the given location is impossible. Add an additional ASSERT to silence the cppcheck warning and document that dnp must never be NULL at this point in the function.

    [module/zfs/dnode.c:1566]: (warning) Possible null pointer deref: dnp

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9732
* cppcheck: (error) Shifting signed 64-bit value by 63 bits | Ubuntu | 2019-12-18 | 1 | -0/+2

  As of cppcheck 1.82 suppress the warning regarding shifting too many bits for the __divdi3() implementation. The algorithm used here is correct.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9732
* cppcheck: (error) Uninitialized variable | Ubuntu | 2019-12-18 | 8 | -22/+23

  As of cppcheck 1.82 warnings are issued when using the list_for_each_* functions with an uninitialized variable. Functionally, this is fine but to resolve the warning initialize these variables.

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9732
* cppcheck: (error) Uninitialized variable | Ubuntu | 2019-12-18 | 2 | -11/+6

  Resolve the following uninitialized variable warnings. In practice these were unreachable due to the goto. Replacing the goto with a return resolves the warning and yields more readable code.

    [module/icp/algs/modes/ccm.c:892]: (error) Uninitialized variable: ccm_param
    [module/icp/algs/modes/ccm.c:893]: (error) Uninitialized variable: ccm_param
    [module/icp/algs/modes/gcm.c:564]: (error) Uninitialized variable: gcm_param
    [module/icp/algs/modes/gcm.c:565]: (error) Uninitialized variable: gcm_param
    [module/icp/algs/modes/gcm.c:599]: (error) Uninitialized variable: gmac_param
    [module/icp/algs/modes/gcm.c:600]: (error) Uninitialized variable: gmac_param

  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9732
* Minor performance fix for NEON RAID-Z | Romain Dolbeau | 2019-12-17 | 1 | -4/+2

  The NEON code replicates too closely the SSE code, including a masked 16-bit shift. But NEON, like AltiVec (#9539), has an unsigned 8-bit shift, so use that instead and drop the masking.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Romain Dolbeau <[email protected]>
  Closes #9725
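Why a masked 16-bit shift and a plain 8-bit shift are interchangeable can be checked with scalar C, without any SIMD intrinsics. The sketch below is an illustration only: the shift amount of one bit and the function names are assumptions, and each `uint16_t` stands in for one 16-bit lane holding two packed bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * SSE-style emulation: shift the whole 16-bit lane left by one, then
 * mask off bit 8, which received the top bit of the low byte.
 */
static uint16_t
shl1_masked16(uint16_t lane)
{
	return ((uint16_t)((uint16_t)(lane << 1) & 0xfefe));
}

/* Native byte shift: each of the two bytes is shifted independently. */
static uint16_t
shl1_bytes(uint16_t lane)
{
	uint8_t lo = (uint8_t)(lane & 0xff);
	uint8_t hi = (uint8_t)(lane >> 8);

	lo = (uint8_t)(lo << 1);
	hi = (uint8_t)(hi << 1);
	return ((uint16_t)(((uint16_t)hi << 8) | lo));
}

/* Exhaustively confirm that both forms agree on every lane value. */
static void
check_all_lanes(void)
{
	for (uint32_t v = 0; v <= 0xffff; v++)
		assert(shl1_masked16((uint16_t)v) == shl1_bytes((uint16_t)v));
}
```

On hardware with a byte-wise shift, the second form needs one instruction where the first needs a shift plus an AND, which is the saving the commit describes.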
* Fix zfs_xattr_owner_unlinked on FreeBSD and comment | Matthew Macy | 2019-12-16 | 2 | -0/+18

  Explain FreeBSD VFS' unfortunate idiosyncratic locking requirements. There is no functional change for other platforms.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9720
* Don't fail to apply umask for O_TMPFILE files | Tomohiro Kusumi | 2019-12-13 | 1 | -0/+6

  Apply umask to `mode` which will eventually be applied to inode. This is needed since VFS doesn't apply umask for O_TMPFILE files.

  (Note that zpl_init_acl() applies `ip->i_mode &= ~current_umask();` only when POSIX ACL is used.)

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Signed-off-by: Tomohiro Kusumi <[email protected]>
  Closes #8997
  Closes #8998
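The masking step mentioned above is a single bit operation. A userland sketch, assuming a hypothetical `apply_umask()` helper that takes the mask as an argument (the kernel code obtains it from `current_umask()` instead):

```c
#include <assert.h>
#include <sys/types.h>

/*
 * Clear every permission bit in 'mode' that is set in 'mask'.
 * Hypothetical helper for illustration; not a kernel or ZFS function.
 */
static mode_t
apply_umask(mode_t mode, mode_t mask)
{
	/* A umask only ever carries permission bits. */
	assert((mask & ~(mode_t)0777) == 0);
	return (mode & ~mask);
}
```

For example, a requested mode of 0666 with a umask of 0022 yields 0644, which is what a caller of open() with O_TMPFILE would expect to see on the resulting file.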
* Allow empty ds_props_obj to be destroyed | Tom Caputi | 2019-12-13 | 1 | -2/+19

  Currently, 'zfs list' and 'zfs get' commands can be slow when working with snapshots that have a ds_props_obj. This is because the code that discovers all of the properties for these snapshots needs to read this object for each snapshot, which almost always ends up causing an extra random synchronous read for each snapshot. This performance penalty exists even if the properties on that snapshot have been unset because the object is normally only freed when the snapshot is freed, even though it is only created when it is needed.

  This patch allows the user to regain 'zfs list' performance on these snapshots by destroying the ds_props_obj when it no longer has any entries left. In practice on a production machine, this optimization seems to make 'zfs list' about 55% faster.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Paul Zuchowski <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #9704
* Make zfs_replay.c work on FreeBSD | Matthew Macy | 2019-12-13 | 3 | -11/+65

  FreeBSD's vfs currently doesn't permit file systems to do their own locking. To avoid having to have duplicate zfs functions with and without locking add locking here. With luck these changes can be removed in the future.

  Reviewed-by: Sean Eric Fagan <[email protected]>
  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9715
* Fix use-after-free of vd_path in spa_vdev_remove() | Matthew Ahrens | 2019-12-11 | 1 | -11/+11

  After spa_vdev_remove_aux() is called, the config nvlist is no longer valid, as it's been replaced by the new one (with the specified device removed). Therefore any pointers into the nvlist are no longer valid. So we can't save the result of `fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)` (in vd_path) across the call to spa_vdev_remove_aux().

  Instead, use spa_strdup() to save a copy of the string before calling spa_vdev_remove_aux.

  Found by AddressSanitizer:

    ERROR: AddressSanitizer: heap-use-after-free on address ...
    READ of size 34 at 0x608000a1fcd0 thread T686
      #0 0x7fe88b0c166d (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d)
      #1 0x7fe88a5acd6e in spa_strdup spa_misc.c:1447
      #2 0x7fe88a688034 in spa_vdev_remove vdev_removal.c:2259
      #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229
      #4 0x55ffbc769fba in ztest_execute ztest.c:6714
      #5 0x55ffbc779a90 in ztest_thread ztest.c:6761
      #6 0x7fe889cbc6da in start_thread
      #7 0x7fe8899e588e in __clone

    0x608000a1fcd0 is located 48 bytes inside of 88-byte region
    freed by thread T686 here:
      #0 0x7fe88b14e7b8 in __interceptor_free
      #1 0x7fe88ae541c5 in nvlist_free nvpair.c:874
      #2 0x7fe88ae543ba in nvpair_free nvpair.c:844
      #3 0x7fe88ae57400 in nvlist_remove_nvpair nvpair.c:978
      #4 0x7fe88a683c81 in spa_vdev_remove_aux vdev_removal.c:185
      #5 0x7fe88a68857c in spa_vdev_remove vdev_removal.c:2221
      #6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove ztest.c:3229
      #7 0x55ffbc769fba in ztest_execute ztest.c:6714
      #8 0x55ffbc779a90 in ztest_thread ztest.c:6761
      #9 0x7fe889cbc6da in start_thread

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Ryan Moeller <[email protected]>
  Signed-off-by: Matthew Ahrens <[email protected]>
  Closes #9706
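The copy-before-invalidate approach described in the vd_path fix can be shown with a toy container. Everything below is hypothetical (`toy_config`, `toy_strdup()`, `toy_remove_device()`, `toy_remove_and_report()`); it only mirrors the shape of the fix, where a string owned by a structure is duplicated before the call that frees that structure's contents.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical config object that owns its path string. */
struct toy_config {
	char *path;
};

/* Portable string duplication; returns NULL on allocation failure. */
static char *
toy_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy != NULL)
		memcpy(copy, s, len);
	return (copy);
}

/* Stand-in for the removal step: it invalidates cfg->path. */
static void
toy_remove_device(struct toy_config *cfg)
{
	free(cfg->path);
	cfg->path = NULL;
}

/*
 * Take a private copy of the path BEFORE the removal invalidates it.
 * The caller owns the returned string and must free() it.
 */
static char *
toy_remove_and_report(struct toy_config *cfg)
{
	char *saved;

	assert(cfg != NULL && cfg->path != NULL);
	saved = toy_strdup(cfg->path);
	toy_remove_device(cfg);
	return (saved);
}
```

Holding on to `cfg->path` itself across `toy_remove_device()` would reproduce the bug class reported by AddressSanitizer above; the copy stays valid because its lifetime no longer depends on the container.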
* Relocate common quota functions to shared code | Ryan Moeller | 2019-12-11 | 8 | -455/+503

  The quota functions are common to all implementations and can be moved to common code. As a simplification they were moved to the Linux platform code in the initial refactoring.

  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed-by: Igor Kozhukhov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #9710
* Add FreeBSD jail support hooks | Matthew Macy | 2019-12-11 | 2 | -1/+8

  Add the 'zfs jail/unjail' subcommands along with the relevant documentation from FreeBSD. This feature is not supported on Linux and still requires the matching kernel ioctls which will be included when the FreeBSD platform code is integrated.

  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #9686
* Eliminate Linux specific inode usage from common code | Matthew Macy | 2019-12-11 | 14 | -311/+321

  Change many of the znops routines to take a znode rather than an inode so that zfs_replay code can be largely shared and in the future much of the znops code may be shared.

  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9708
* zio_decompress_data always ASSERTs successful decompression | Paul Zuchowski | 2019-12-10 | 1 | -1/+0

  This interferes with zdb_read_block trying all the decompression algorithms when the 'd' flag is specified, as some are expected to fail. Also control the output when guessing algorithms, try the more common compression types first, allow specifying lsize/psize, and fix an uninitialized variable.

  Reviewed-by: Ryan Moeller <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Paul Zuchowski <[email protected]>
  Closes #9612
  Closes #9630
* Abstract away platform specific superblock references | Matthew Macy | 2019-12-10 | 2 | -5/+17

  The zfsvfs->z_sb field is Linux specific and should be abstracted.

  Reviewed-by: Richard Laager <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9697
* Exclude data from cores unconditionally and metadata conditionally | Matthew Macy | 2019-12-09 | 1 | -2/+11

  This change allows us to align the core dump logic across platforms.

  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Don Brady <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9691
* Mark dsl_dataset_deactivate_feature_impl static | Matthew Macy | 2019-12-09 | 1 | -1/+1

  The dsl_dataset_deactivate_feature_impl() function is private and should be marked as such.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Richard Laager <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9696
* ZTS: Fix zpool_reopen_001_pos | Brian Behlendorf | 2019-12-09 | 1 | -10/+25

  Update the vdev_disk_open() retry logic to use a specified number of milliseconds to be more robust. Additionally, on failure log both the time waited and requested timeout to the internal log.

  The default maximum allowed open retry time has been increased from 500ms to 1000ms.

  Reviewed-by: Kjeld Schouten <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9680
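A time-budgeted retry loop of the kind described above can be sketched as follows. This is not the vdev_disk_open() code: `retry_open()`, `toy_open()` and the callback type are hypothetical, and the sleep is replaced by simple accounting so the example stays deterministic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical open callback: returns 0 on success, nonzero on failure. */
typedef int (*open_fn)(void *arg);

/*
 * Retry 'fn' until it succeeds or 'timeout_ms' has been consumed.
 * '*waited_ms' reports how long the loop (notionally) slept, so a
 * caller can log both the time waited and the requested timeout.
 */
static int
retry_open(open_fn fn, void *arg, unsigned timeout_ms, unsigned sleep_ms,
    unsigned *waited_ms)
{
	unsigned waited = 0;
	int err;

	assert(fn != NULL && waited_ms != NULL && sleep_ms > 0);

	while ((err = fn(arg)) != 0 && waited < timeout_ms) {
		/* Real code would sleep for sleep_ms here. */
		waited += sleep_ms;
	}
	*waited_ms = waited;
	return (err);
}

/* Toy device: fails until the counter behind 'arg' reaches zero. */
static int
toy_open(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return (-1);
	}
	return (0);
}
```

Bounding the loop by elapsed milliseconds rather than by an attempt count keeps the worst-case delay the same regardless of how long each individual attempt takes to fail.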
* Disable sysfs feature checks on FreeBSD | Matthew Macy | 2019-12-06 | 2 | -2/+8

  The sysfs infrastructure for reporting supported features and properties is Linux specific. Disable it on FreeBSD until it can be extended to be more portable.

  Reviewed-by: Kjeld Schouten <[email protected]>
  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9684
* ICP: Fix out of bounds write | Attila Fülöp | 2019-12-06 | 1 | -1/+3

  If gcm_mode_encrypt_contiguous_blocks() is called more than once in succession, with the accumulated lengths being less than blocksize, ctx->copy_to will be incorrectly advanced. Later, if out is NULL, the bcopy at line 114 will overflow ctx->gcm_copy_to since ctx->gcm_remainder_len is larger than the ctx->gcm_copy_to buffer can hold.

  The fix is to set ctx->copy_to only if it's not already set.

  For ZoL the issue may be academic, since in all my testing I wasn't able to hit either of the two conditions needed to trigger it, but other consumers can easily do so.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Tom Caputi <[email protected]>
  Signed-off-by: Attila Fülöp <[email protected]>
  Closes #9660
* Disable EDONR on FreeBSD | Matthew Macy | 2019-12-05 | 3 | -3/+26

  FreeBSD uses its own crypto framework in-kernel which, at this time, has no EDONR implementation.

  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed-by: Allan Jude <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Signed-off-by: Ryan Moeller <[email protected]>
  Closes #9664
* Refactor deadman set failmode to be cross platform | Matthew Macy | 2019-12-05 | 2 | -11/+22

  Update zfs_deadman_failmode to use the ZFS_MODULE_PARAM_CALL wrapper, and split the common and platform specific portions.

  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9670
* Replace ASSERTV macro with compiler annotation | Matthew Macy | 2019-12-05 | 23 | -55/+56

  Remove the ASSERTV macro and handle suppressing unused compiler warnings for variables only in ASSERTs using the __attribute__((unused)) compiler annotation. The annotation is understood by both gcc and clang.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Jorgen Lundman <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9671
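The annotation described above can be tried in userland with the standard `assert()` macro standing in for the ZFS ASSERT macros. A small sketch, assuming a hypothetical `count_positive()` function; the variable `valid` is referenced only by the assertion, so a build with NDEBUG defined would otherwise report it as unused.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the positive entries in 'vals'. Hypothetical example function;
 * it only demonstrates where the annotation goes.
 */
static size_t
count_positive(const int *vals, size_t n)
{
	/* Referenced only inside assert(); unused when NDEBUG is defined. */
	int valid __attribute__((unused)) = (vals != NULL || n == 0);
	size_t count = 0;

	assert(valid);

	for (size_t i = 0; i < n; i++) {
		if (vals[i] > 0)
			count++;
	}
	return (count);
}
```

Compared with wrapping the declaration in a macro that disappears in non-debug builds, the annotation keeps the declaration as ordinary C that both gcc and clang accept.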
* ICP: Fix null pointer dereference and use after free | Attila Fülöp | 2019-12-03 | 2 | -4/+9

  In gcm_mode_decrypt_contiguous_blocks(), if vmem_alloc() fails, bcopy is called with a NULL pointer destination and a length > 0. This results in undefined behavior. Further, ctx->gcm_pt_buf is freed but not set to NULL, leading to a potential write after free and a double free due to missing return value handling in crypto_update_uio(). The code as is may write to ctx->gcm_pt_buf in gcm_decrypt_final() and may free ctx->gcm_pt_buf again in aes_decrypt_atomic().

  The fix is to slightly rework error handling and check the return value in crypto_update_uio().

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Tom Caputi <[email protected]>
  Reviewed-by: Kjeld Schouten <[email protected]>
  Signed-off-by: Attila Fülöp <[email protected]>
  Closes #9659
* Fix use-after-free in case of L2ARC prefetch failure | Alexander Motin | 2019-12-03 | 1 | -3/+16

  In case an L2ARC read fails, l2arc_read_done() creates a _different_ ZIO to read data from the original storage device. Unfortunately, the pointer to the failed ZIO remains in hdr->b_l1hdr.b_acb->acb_zio_head, and if some other read tries to bump the ZIO priority, it will crash.

  The problem is reproducible by corrupting L2ARC content and reading some data with prefetch if the l2arc_noprefetch tunable is changed to 0. With the default setting the issue is probably not reproducible now.

  Reviewed-by: Tom Caputi <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Alexander Motin <[email protected]>
  Sponsored-By: iXsystems, Inc.
  Closes #9648
* Increase allowed 'special_small_blocks' maximum value | Brian Behlendorf | 2019-12-03 | 1 | -1/+1

  There may be circumstances where it's desirable that all blocks in a specified dataset be stored on the special device. Relax the artificial 128K limit and allow the special_small_blocks property to be set up to 1M. When blocks >1MB have been enabled via the zfs_max_recordsize module option, this limit is increased accordingly.

  Reviewed-by: Don Brady <[email protected]>
  Reviewed-by: Kjeld Schouten <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #9131
  Closes #9355
* Wrap module_param_call() routines under __linux__ | Matthew Macy | 2019-12-03 | 2 | -2/+2

  The module_param_call() functionality is currently still Linux-specific and should be wrapped accordingly.

  Reviewed-by: Allan Jude <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9666
* Mark write_record static | Matthew Macy | 2019-12-03 | 1 | -1/+1

  The write_record() function is private and should be marked as such.

  Reviewed-by: Allan Jude <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matt Macy <[email protected]>
  Closes #9665