Commit message Author Age Files Lines
...
* Update zts-report.py with additional testsBrian Behlendorf2020-06-221-0/+14
The following test cases may still occasionally fail and are being added to the "maybe" list for Linux until they can be updated to be entirely reliable.

cli_root/zfs_rename/zfs_rename_002_pos.ksh
cli_root/zpool_reopen/zpool_reopen_003_pos.ksh
refreserv/refreserv_raidz

These tests consistently fail only on Fedora 31+; the failures are related to the kernel rescanning the partition table on loopback devices, which is no longer reliable unless partprobe is used. In order to enable the Fedora bot by default they are also being added to the list until the tests can be updated. Any significant regression in functionality covered by these tests will still be detected by the FreeBSD builders.

alloc_class/alloc_class_009_pos
alloc_class/alloc_class_010_pos
cli_root/zpool_expand/zpool_expand_001_pos
cli_root/zpool_expand/zpool_expand_005_pos
rsend/rsend_007_pos
rsend/rsend_010_pos
rsend/rsend_011_pos
snapshot/rollback_003_pos

Signed-off-by: Brian Behlendorf <[email protected]>
Closes #10489
* Fix copy-paste error breaking FreeBSD headRyan Moeller2020-06-191-3/+3
Resolve the FreeBSD head build failure.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10480
* zfs allow/unallow should work with numeric uid/gidAndriy Gapon2020-06-191-6/+13
This should work even (especially) if there is no matching user or group name.

The change is originally by Xin Li <[email protected]>.

Original-patch-by: Xin Li <[email protected]>
Reviewed-by: Yuri Pankov <[email protected]>
Reviewed-by: Andy Stormont <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Andriy Gapon <[email protected]>
Closes #9792
Closes #10280
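One way to accept a numeric id when no matching name exists is to fall back to strict number parsing once the name lookup has failed. The helper below is a hypothetical sketch of that fallback (the name `parse_numeric_id` and its signature are assumptions), not the code from this commit:

```c
#include <ctype.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper: accept a string only if it is a plain decimal id.
 * Returns true and stores the value in *out on success.
 */
bool
parse_numeric_id(const char *s, unsigned long *out)
{
	char *end;

	/* Reject empty strings, signs and leading whitespace up front. */
	if (!isdigit((unsigned char)s[0]))
		return (false);

	errno = 0;
	unsigned long v = strtoul(s, &end, 10);
	/* Fail on overflow or on trailing non-digit characters. */
	if (errno != 0 || *end != '\0')
		return (false);

	*out = v;
	return (true);
}
```

A caller would try the name lookup first and only use this parser when the lookup finds nothing, so that existing names keep their meaning.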
* Match new vfs_checkexp KPI in FreeBSD headRyan Moeller2020-06-181-0/+10
KPI changed in FreeBSD, update accordingly.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10475
* Enable -Wmissing-prototypes/-Wstrict-prototypesArvind Sankar2020-06-183-52/+18
Switch on warning flags to detect mismatch between declaration and definition.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10470
* Switch off -Wmissing-prototypes for libgcc math functionsArvind Sankar2020-06-182-28/+32
spl-generic.c defines some of the libgcc integer library functions on 32-bit. Don't bother checking -Wmissing-prototypes since nothing should directly call these functions from C code.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10470
* Make Skein_{Get,Put}64_LSB_First inline functionsArvind Sankar2020-06-182-15/+2
Turn the generic versions into inline functions and avoid SKEIN_PORT_CODE trickery.

Also drop the PLATFORM_MUST_ALIGN check for using the fast bcopy variants. bcopy doesn't assume alignment, and the userspace version is currently different because the _ALIGNMENT_REQUIRED macro is only defined by the kernelspace headers.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10470
* Add prototypesArvind Sankar2020-06-1826-163/+169
Add prototypes/move prototypes to header files.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10470
* Add include files for prototypesArvind Sankar2020-06-1826-9/+34
Include the header with prototypes in the file that provides definitions as well, to catch any mismatch between prototype and definition.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10470
* Remove dead codeArvind Sankar2020-06-1816-263/+2
Delete unused functions.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10470
* Mark functions as staticArvind Sankar2020-06-1862-176/+157
Mark functions used only in the same translation unit as static. This only includes functions that do not have a prototype in a header file either.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10470
* Cleanup libzpool/kernel.cArvind Sankar2020-06-181-29/+12
Commit ec213971274a ("async zvol minor node creation interferes with receive") replaced zvol_create_minors with zvol_create_minor and zvol_create_minors_recursive, changing the prototype at the same time. However the stub functions in libzpool/kernel.c were defined with the old prototype. As the definitions are empty, this doesn't cause any runtime issues, but an LTO build shows warnings because of the mismatched prototypes.

Commit a0bd735adb1b ("Add support for asynchronous zvol minor operations") removed the real zvol_remove_minor, but for some reason added a stub implementation in libzpool/kernel.c with no references. Delete this dead code.

Commit 196bee4cfd57 ("Remove deduplicated send/receive code") removed zfs_onexit_del_cb and zfs_onexit_cb_data. Drop the stubs as well.

Add zvol.h include to provide prototypes, and sort the include directives.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10470
* linux: add basic fallocate(mode=0/2) compatibilityadilger2020-06-1813-23/+317
Implement semi-compatible functionality for mode=0 (preallocation) and mode=FALLOC_FL_KEEP_SIZE (preallocation beyond EOF) for ZPL.

Since ZFS does COW and snapshots, preallocating blocks for a file cannot guarantee that writes to the file will not run out of space. Even if the first overwrite was guaranteed, it would not handle any later overwrite of blocks due to COW, so strict compliance is futile. Instead, make a best-effort check that at least enough free space is currently available in the pool (with a bit of margin), then create a sparse file of the requested size and continue on with life.

This does not handle all cases (e.g. several fallocate() calls before writing into the files when the filesystem is nearly full), which would require a more complex mechanism to be implemented, probably based on a modified version of dmu_prealloc(), but is usable as-is.

A new module option zfs_fallocate_reserve_percent is used to control the reserve margin for any single fallocate call. By default, this is 110% of the requested preallocation size, so an additional 10% of available space is reserved for overhead to allow the application a good chance of finishing the write when the fallocate() succeeds. If the heuristics of this basic fallocate implementation are not desirable, the old non-functional behavior of returning EOPNOTSUPP for calls can be restored by setting zfs_fallocate_reserve_percent=0.

The parameter of zfs_statvfs() is changed to take an inode instead of a dentry, since no dentry is available in zfs_fallocate_common().

A few tests from @behlendorf cover basic fallocate functionality.

Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Arshad Hussain <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Co-authored-by: Brian Behlendorf <[email protected]>
Signed-off-by: Andreas Dilger <[email protected]>
Issue #326
Closes #10408
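The best-effort space check described in this message could look roughly like the following userspace sketch. The function name `fallocate_space_ok` and the variable `reserve_percent` are hypothetical stand-ins, not the actual ZPL code; only the meaning of the percentage (110 by default, 0 disables the feature) is taken from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for zfs_fallocate_reserve_percent (default 110). */
unsigned int reserve_percent = 110;

/*
 * Returns true when at least len * reserve_percent / 100 bytes are
 * available. A reserve_percent of 0 means the feature is switched off,
 * in which case a caller would return EOPNOTSUPP instead.
 */
bool
fallocate_space_ok(uint64_t avail, uint64_t len)
{
	if (reserve_percent == 0)
		return (false);

	/* Split the multiplication so large requests do not overflow. */
	uint64_t need = (len / 100) * reserve_percent +
	    ((len % 100) * reserve_percent) / 100;

	return (avail >= need);
}
```

With the default of 110, a 1000-byte request needs 1100 bytes available. The check says nothing about later overwrites, which is the limitation the message itself points out.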
* Avoid adding new primitives in zpool waitJorgen Lundman2020-06-181-11/+22
zpool wait brought in sem_init() and family, which is a primitive set not previously used in Open ZFS. It also happens to be deprecated on macOS. Replace with pthread API calls.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: John Gallagher <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #10468
* Disambiguate condvar API contractMatthew Macy2020-06-186-36/+93
On Illumos, callers of cv_timedwait and cv_timedwait_hires can't distinguish between the cv having been signaled and the call having timed out. Illumos handles this (for some definition of handles) by calling cv_signal in the return path if we were signaled but the return value indicates instead that we timed out. This would make sense if it were possible to query the cv for its net signal disposition. However, this isn't possible and, in spite of the fact that there are places in the code that clearly take a different and incompatible path if a timeout value is indicated, this distinction appears to be rather subtle to most developers. This problem is further compounded by the fact that on Linux, calling cv_signal in the return path wouldn't even do the right thing unless there are other waiters.

Since it is possible for the caller to independently determine how much time is remaining but it is not possible to query if the cv was in fact signaled, prioritizing signalling over timeout seems like a cleaner solution. In addition, judging from usage patterns within the code itself, it is also less error prone.

Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #10471
* Add abd_cache_reap_now for abd_chunk_cache usersMatthew Macy2020-06-174-0/+13
Apparently missed in the initial port integration was the need to reap the abd_chunk_cache on FreeBSD. This change addresses that oversight.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #10474
* zfs_ioctl: saved_poolname can be truncatedJorgen Lundman2020-06-171-11/+14
zfs_ioctl uses kmem_strdup() and kmem_strfree(), which both rely on strlen() being the same, but saved_poolname can be truncated, causing:

SPL: kernel memory allocator: buffer freed to wrong cache
SPL: buffer was allocated from kmem_alloc_16,
SPL: caller attempting free to kmem_alloc_8.
SPL: buffer=0xffffff90acc66a38 bufctl=0x0 cache: kmem_alloc_8

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #10469
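The failure mode can be illustrated without the kernel allocator. The sketch below assumes, as the message states, that both the duplicate and the free derive the buffer size from strlen(); the helper names are hypothetical and no SPL code is reproduced.

```c
#include <stddef.h>
#include <string.h>

/* Size a strlen()-based allocator would compute for a string. */
size_t
strlen_based_size(const char *s)
{
	return (strlen(s) + 1);
}

/*
 * Truncate the string to `keep` bytes (if it is longer) and return the
 * size that a strlen()-based free would now compute for it.
 */
size_t
size_after_truncation(char *s, size_t keep)
{
	if (keep < strlen(s))
		s[keep] = '\0';
	return (strlen_based_size(s));
}
```

For a 12-character name truncated to 4 characters, the size computed at allocation time is 13 bytes while the size computed at free time is 5, so a size-classed allocator would look in the wrong cache.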
* Set initial arc_c to arc_c_min instead of arc_c_maxAlexander Motin2020-06-171-4/+4
For at least 15 years since OpenSolaris, arc_c was set by default to arc_c_max and later decreased under memory pressure. I've noticed that if arc_c was set high enough to cause memory pressure as considered by ZFS, setting arc_no_grow to TRUE in arc_reap_cb_check() has no effect until both arc_kmem_reap_soon() and delay(reap_retry_ms) return. All that time ZFS can continue increasing its effective ARC size, causing more memory pressure, potentially up to the point when the OS low memory handler activates and reduces arc_c, requesting fast reclamation of just allocated memory.

The problem seems to be more serious on FreeBSD and I guess Linux, since neither of them implements/uses asynchronous kmem reclamation, so arc_kmem_reap_soon() can take more time. On older FreeBSD 11, which does not support multiple memory domains, a system with lots of RAM can get completely unresponsive for minutes due to heavy lock congestion between ARC reclamation and page daemon kmem reclamation threads. With this change to a more conservative arc_c value, ARC stops growing just in time and does not need later reclamation.

Also while there, since growing arc_c is now a more frequent situation, use aggsum_upper_bound() instead of aggsum_compare() in arc_adapt() to reduce lock congestion. This also brings it in sync with the code in arc_get_data_impl().

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes #10437
* Merge bash_completions changes from upstreamJoao Carlos Mendes Luis2020-06-161-24/+114
The current bash_completion contrib code in openzfs is very old, and some changes have been added since.

The original repo is at https://github.com/Aneurin/zfs-bash

I've been using the original @Aneurin code since my first deploy of ZoL.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: João Carlos Mendes Luís <[email protected]>
Closes #10456
* drr_begin: can't forward declare untagged structJorgen Lundman2020-06-161-10/+14
Clang++ does not allow forward declaration of untagged structs, so struct drr_begin needs to be declared before the struct that uses it.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #10453
* FreeBSD: Kernel module should depend on xdr not krpc after 1300092Ryan Moeller2020-06-161-0/+4
Since https://reviews.freebsd.org/D24408 FreeBSD provides XDR functions in the xdr module instead of krpc.

For FreeBSD 13, the MODULE_DEPEND should be changed to xdr.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10442
Closes #10443
* Make struct vdev_disk_t be platform privateJorgen Lundman2020-06-162-8/+5
Linux defines different vdev_disk_t members to macOS, but they are only used in vdev_disk.c so move the declaration there.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #10452
* Remove references to blacklist/whitelistMatthew Ahrens2020-06-163-13/+13
These terms reinforce the incorrect notion that black is bad and white is good.

Replace this language with more specific terms which are also more clear and don't rely on metaphor. Specifically:

* When vdevs are specified on the command line, they are the "selected" vdevs.
* Entries in /dev/ which should not be considered as possible disks are "excluded" devices.

Reviewed-by: John Kennedy <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Wilson <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #10457
* Fixing ABD struct allocation for FreeBSDBrian Atkinson2020-06-161-4/+13
In the event we are allocating a gang ABD in FreeBSD we are passing 0 to abd_alloc_struct(); however, this led to an allocation of ABD scatter with 0 chunks. This left the gang ABD allocation 24 bytes smaller than it should have been.

Reviewed-by: Matt Macy <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Co-authored-by: Matt Macy <[email protected]>
Signed-off-by: Brian Atkinson <[email protected]>
Closes #10431
* Fix FreeBSD condvar semanticsRyan Moeller2020-06-161-7/+20
We should return -1 instead of negative deltas, and 0 if signaled.

Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10460
* Add convenience wrappers for common uio usageJorgen Lundman2020-06-1415-250/+216
The macOS uio struct is opaque and the API must be used. These wrappers keep the changes to the code as small as possible for all platforms.

Reviewed-by: Matt Macy <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #10412
* Upstream: zil_commit_waiter() can stall foreverJorgen Lundman2020-06-141-1/+1
On macOS clock_t is unsigned, so when cv_timedwait_hires() returns -1 we loop forever. The conditional was tweaked to ignore signedness.

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #10445
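The pitfall is easy to reproduce in isolation. In this sketch `uclock_t` is a hypothetical unsigned stand-in for macOS's clock_t and both helper names are made up for illustration; the actual conditional in zil_commit_waiter() is not reproduced here.

```c
#include <stdbool.h>

typedef unsigned long uclock_t;	/* hypothetical unsigned clock type */

/*
 * Naive check: any positive value is taken to mean "time left, keep
 * waiting". With an unsigned type, -1 wraps to a huge positive value.
 */
bool
keep_waiting_naive(uclock_t rc)
{
	return (rc > 0);
}

/*
 * Signedness-agnostic check: compare against -1 converted to the same
 * type, which works whether the type is signed or unsigned.
 */
bool
keep_waiting_fixed(uclock_t rc)
{
	return (rc != (uclock_t)-1);
}
```

The naive form keeps waiting when handed the timeout value, which matches the endless loop described in the message; the equality test does not depend on the sign of the type.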
* Fix gcc10.1 truncation errorGeorge Amanakis2020-06-131-1/+1
gcc10.1 complains with:

../../include/sys/dmu.h:373:24: error: ‘%s’ directive output may be truncated writing up to 95 bytes into a region of size 75 [-Werror=format-truncation=]
  373 | #define DMU_POOL_DDT "DDT-%s-%s-%s"
      | ^~~~~~~~~~~~~~
../../module/zfs/ddt.c:256:37: note: in expansion of macro ‘DMU_POOL_DDT’
  256 | (void) snprintf(name, DDT_NAMELEN, DMU_POOL_DDT,
      | ^~~~~~~~~~~~
../../include/sys/dmu.h:373:32: note: format string is defined here
  373 | #define DMU_POOL_DDT "DDT-%s-%s-%s"
      | ^~
../../module/zfs/ddt.c:256:9: note: ‘snprintf’ output 7 or more bytes (assuming 102) into a destination of size 80
  256 | (void) snprintf(name, DDT_NAMELEN, DMU_POOL_DDT,
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  257 | zio_checksum_table[ddt->ddt_checksum].ci_name,
      | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  258 | ddt_ops[type]->ddt_op_name, ddt_class_name[class]);
      | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Increasing DDT_NAMELEN fixes it.

Reviewed-By: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #10433
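The warning boils down to snprintf() being handed a buffer that cannot hold the worst-case expansion of the format. A minimal userspace sketch, with a hypothetical helper name and arbitrary buffer sizes, shows how the return value reveals truncation at run time:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format a DDT-style object name into buf. Returns true if the whole
 * name fit, false if snprintf() had to truncate it.
 */
bool
format_ddt_name(char *buf, size_t len, const char *cksum,
    const char *type, const char *cls)
{
	/* snprintf() returns the length the full output would have had. */
	int n = snprintf(buf, len, "DDT-%s-%s-%s", cksum, type, cls);

	return (n >= 0 && (size_t)n < len);
}
```

The compiler diagnostic is the static counterpart of this check: it fires when the destination size is provably smaller than the longest possible output, which is why enlarging the buffer silences it.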
* FreeBSD: Don't require zeroing new locks before initRyan Moeller2020-06-131-4/+2
This has not proven useful enough to justify the inconvenience.

Reviewed-by: Matt Macy <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Allan Jude <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10449
* Removing ZERO_PAGE abd_alloc_zero_scatterBrian Atkinson2020-06-103-13/+23
For MIPS architectures on Linux the ZERO_PAGE macro references empty_zero_page, which is exported as a GPL symbol. The call to ZERO_PAGE in abd_alloc_zero_scatter has been removed and a single zeroed page is now allocated for each of the pages in abd_zero_scatter in the kernel ABD code path.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Brian Atkinson <[email protected]>
Closes #10428
* man.8: Add bookmark to list of typesGrischa Zengel2020-06-101-1/+2
While checking bash_completion I noticed that bookmark was missing from the list of types.

```
# zfs get type zpool2#b
NAME      PROPERTY  VALUE     SOURCE
zpool2#b  type      bookmark  -
```

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Grischa Zengel <[email protected]>
Closes #10419
* bash_completion: add missing attributesGrischa Zengel2020-06-101-3/+3
There are some attributes missing which are shown in the man pages:

zfs list -t type
A comma-separated list of types to display, where type is one of filesystem, snapshot, volume, *bookmark*, or all. For example, specifying -t snapshot displays only snapshots.

zfs get -s source
A comma-separated list of sources to display. Those properties coming from a source other than those in this list are ignored. Each source must be one of the following: local, default, inherited, temporary, *received*, and none. The default value is all sources.

zfs get -t type
A comma-separated list of types to display, where type is one of filesystem, snapshot, volume, bookmark, or all.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Grischa Zengel <[email protected]>
Closes #10418
* Remove unnecessary references to slaveryMatthew Ahrens2020-06-108-48/+42
The horrible effects of human slavery continue to impact society. The casual use of the term "slave" in computer software is an unnecessary reference to a painful human experience.

This commit removes all possible references to the term "slave".

Implementation notes:

The zpool.d/slaves script is renamed to dm-deps, which uses the same terminology as `dmsetup deps`.

References to the `/sys/class/block/$dev/slaves` directory remain. This directory name is determined by the Linux kernel. Although `dmsetup deps` provides the same information, it unfortunately requires elevated privileges, whereas the `/sys/...` directory is world-readable.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #10435
* Fixup "Avoid the GEOM topology lock recursion when autoexpanding a pool"Ryan Moeller2020-06-101-8/+9
The patch was applied to vdev_geom_open instead of vdev_geom_close by mistake.

Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10427
* Fix VPATH builds for user configArvind Sankar2020-06-102-2/+2
The cmd/zpool and lib/libzutil Makefiles use -I., which won't work with a VPATH build. Replace it with -I$(srcdir) instead.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10379
Closes #10421
* Cleanup linux module kbuild filesArvind Sankar2020-06-1016-113/+87
The linux module can be built either as an external module, or compiled into the kernel, using copy-builtin. The source and build directories are slightly different between the two cases, and currently, compiling into the kernel still refers to some files from the configured ZFS source tree, instead of the copies inside the kernel source tree. There is also duplication between copy-builtin, which creates a Kbuild file to build ZFS inside the kernel tree, and the top-level module/Makefile.in.

Fix this by moving the list of modules and the CFLAGS settings into a new module/Kbuild.in, which will be used by the kernel kbuild infrastructure, and using KBUILD_EXTMOD to distinguish the two cases within the Makefiles, in order to choose appropriate include directories etc.

Module CFLAGS setting is simplified by using subdir-ccflags-y (available since 2.6.30) to set them in the top-level Kbuild instead of each individual module. The disabling of -Wunused-but-set-variable is removed from the lua and zfs modules. The variable that the Makefile uses is actually not defined, so this has no effect; and the warning has long been disabled by the kernel Makefile itself.

The target_cpu definition in module/{zfs,zcommon} is removed as it was replaced by use of CONFIG_SPARC64 in commit 70835c5b755e ("Unify target_cpu handling").

os/linux/{spl,zfs} are removed from obj-m, as they are not modules in themselves, but are included by the Makefile in the spl and zfs module directories. The vestigial Makefiles in os and os/linux are removed.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10379
Closes #10421
* Fix typosAndrea Gelmini2020-06-0948-73/+74
Correct various typos in the comments and tests.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Andrea Gelmini <[email protected]>
Closes #10423
* File incorrectly zeroed when receiving incremental stream that toggles -LMatthew Ahrens2020-06-0915-165/+500
Background:

By increasing the recordsize property above the default of 128KB, a filesystem may have "large" blocks. By default, a send stream of such a filesystem does not contain large WRITE records; instead, it decreases objects' block sizes to 128KB and splits the large blocks into 128KB blocks, allowing the large-block filesystem to be received by a system that does not support the `large_blocks` feature. A send stream generated by `zfs send -L` (or `--large-block`) preserves the large block size on the receiving system, by using large WRITE records.

When receiving an incremental send stream for a filesystem with large blocks, if the send stream's -L flag was toggled, a bug is encountered in which the file's contents are incorrectly zeroed out. The contents of any blocks that were not modified by this send stream will be lost. "Toggled" means that the previous send used `-L`, but this incremental does not use `-L` (-L to no-L); or that the previous send did not use `-L`, but this incremental does use `-L` (no-L to -L).

Changes:

This commit addresses the problem with several changes to the semantics of zfs send/receive:

1. "-L to no-L" incrementals are rejected. If the previous send used `-L`, but this incremental does not use `-L`, the `zfs receive` will fail with this error message:

   incremental send stream requires -L (--large-block), to match previous receive.

2. "no-L to -L" incrementals are handled correctly, preserving the smaller (128KB) block size of any already-received files that used large blocks on the sending system but were split by `zfs send` without the `-L` flag.

3. A new send stream format flag is added, `SWITCH_TO_LARGE_BLOCKS`. This feature indicates that we can correctly handle "no-L to -L" incrementals. This flag is currently not set on any send streams.

In the future, we intend for incremental send streams of snapshots that have large blocks to use `-L` by default, and these streams will also have the `SWITCH_TO_LARGE_BLOCKS` feature set. This ensures that streams from the default use of `zfs send` won't encounter the bug mentioned above, because they can't be received by software with the bug.

Implementation notes:

To facilitate accessing the ZPL's generation number, `zfs_space_delta_cb()` has been renamed to `zpl_get_file_info()` and restructured to fill in a struct with ZPL-specific info including owner and generation.

In the "no-L to -L" case, if this is a compressed send stream (from `zfs send -cL`), large WRITE records that are being written to small (128KB) blocksize files need to be decompressed so that they can be written split up into multiple blocks. The zio pipeline will recompress each smaller block individually.

A new test case, `send-L_toggle`, is added, which tests the "no-L to -L" case and verifies that we get an error for the "-L to no-L" case.

Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #6224
Closes #10383
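The accept/reject rule for toggled streams described in this message reduces to a small decision function. This is a hedged sketch: the enum, its values and the function name are hypothetical, and the real receive code tracks the previous receive's state in its own way.

```c
#include <stdbool.h>

/* Hypothetical verdicts for an incoming incremental stream. */
typedef enum {
	RECV_OK,		/* stream may be received */
	RECV_REJECT_NEEDS_L	/* "-L to no-L": receive must fail */
} recv_verdict_t;

/*
 * prev_used_L: whether the previous receive came from a -L stream.
 * this_uses_L: whether the incoming incremental was sent with -L.
 */
recv_verdict_t
check_large_block_toggle(bool prev_used_L, bool this_uses_L)
{
	/* Dropping -L after using it is the rejected direction. */
	if (prev_used_L && !this_uses_L)
		return (RECV_REJECT_NEEDS_L);

	/* Same setting, or "no-L to -L", is accepted. */
	return (RECV_OK);
}
```

Only one of the four combinations is refused; the "no-L to -L" direction is accepted and relies on the receiver keeping the smaller block size for files that were already received.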
* ZTS: Fix add-o_ashift.kshIgor K2020-06-091-1/+1
Use option '-o' after action for compatibility

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Igor Kozhukhov <[email protected]>
Closes #10426
* Trim L2ARCGeorge Amanakis2020-06-0918-51/+573
The l2arc_evict() function is responsible for evicting buffers which reference the next bytes of the L2ARC device to be overwritten. Teach this function to additionally TRIM that vdev space before it is overwritten if the device has been filled with data. This is done by vdev_trim_simple() which trims by issuing a new type of TRIM, TRIM_TYPE_SIMPLE.

We also implement a "Trim Ahead" feature. It is a zfs module parameter, expressed in % of the current write size. This trims ahead of the current write size. A minimum of 64MB will be trimmed. The default is 0 which disables TRIM on L2ARC as it can put significant stress on underlying storage devices. To enable TRIM on L2ARC we set l2arc_trim_ahead > 0.

We also implement TRIM of the whole cache device upon addition to a pool, pool creation or when the header of the device is invalid upon importing a pool or onlining a cache device. This is dependent on l2arc_trim_ahead > 0. TRIM of the whole device is done with TRIM_TYPE_MANUAL so that its status can be monitored by zpool status -t. We save the TRIM state for the whole device and the time of completion on-disk in the header, and restore these upon L2ARC rebuild so that zpool status -t can correctly report them. Whole device TRIM is done asynchronously so that the user can export the pool or remove the cache device while it is trimming (i.e. if it is too slow).

We do not TRIM the whole device if persistent L2ARC has been disabled by l2arc_rebuild_enabled = 0 because we may not want to lose all cached buffers (e.g. we may want to import the pool with l2arc_rebuild_enabled = 0 only once because of memory pressure). If persistent L2ARC has been disabled by setting the module parameter l2arc_rebuild_blocks_min_l2size to a value greater than the size of the cache device then the whole device is trimmed upon creation or import of a pool if l2arc_trim_ahead > 0.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Adam D. Moss <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #9713
Closes #9789
Closes #10224
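One reading of the "Trim Ahead" sizing rule (a percentage of the current write size, a 64MB floor, and 0 meaning disabled) is sketched below. The function and macro names are hypothetical and the exact arithmetic in l2arc_evict() is not shown in this log, so treat the formula as an assumption.

```c
#include <stdint.h>

#define	MIN_TRIM_AHEAD	(64ULL * 1024 * 1024)	/* 64MB floor from the message */

/*
 * Hypothetical helper: how many bytes to TRIM ahead of the write hand.
 * write_size is the size of the current write and trim_ahead_pct plays
 * the role of l2arc_trim_ahead.
 */
uint64_t
trim_ahead_bytes(uint64_t write_size, uint64_t trim_ahead_pct)
{
	/* 0 disables TRIM on L2ARC entirely. */
	if (trim_ahead_pct == 0)
		return (0);

	uint64_t ahead = write_size * trim_ahead_pct / 100;

	/* Never trim less than the stated minimum. */
	return (ahead > MIN_TRIM_AHEAD ? ahead : MIN_TRIM_AHEAD);
}
```

Under this reading an 8MB write with a setting of 100 still trims 64MB because of the floor, while a 1GB write with a setting of 50 trims 512MB.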
* Move GFP flags kernel compatibility codeMichael Niewöhner2020-06-082-9/+12
Move the GFP flags kernel compat code from c file to kmem header.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
Closes #10424
* Linux 5.8 compat: __vmalloc()Michael Niewöhner2020-06-085-11/+46
The `pgprot` argument has been removed from `__vmalloc` in Linux 5.8, as it is now always `PAGE_KERNEL` [1].

Detect this during configure and define a wrapper for older kernels.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/mm/vmalloc.c?h=next-20200605&id=88dca4ca5a93d2c09e5bbc6a62fbfc3af83c4fca

Reviewed-by: Brian Behlendorf <[email protected]>
Co-authored-by: Sebastian Gottschall <[email protected]>
Co-authored-by: Michael Niewöhner <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
Closes #10422
* Restore support for in-kernel ZFS ioctlsPawel Jakub Dawidek2020-06-084-5/+5
In Illumos it is possible to call ioctl functions from within the kernel by passing the FKIOCTL flag. Neither FreeBSD nor Linux support that, but it doesn't hurt to keep it around, as all the code is there. Before this commit it was dead code and zc_iflags was always zero.

Restore this functionality by allowing a flag to be passed to the zfsdev_ioctl_common() function.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #10417
* Remove redundant includesPawel Jakub Dawidek2020-06-081-49/+3
Removing excessive includes takes us a small step closer to compiling this file in userland.

Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #10415
* Don't erase final byte of envblockPaul Dagnelie2020-06-081-1/+1
When we copy the envblock's contents out, we currently treat it as a normal C string. However, this functionality is supposed to more closely emulate interacting with a file. As a consequence, we were incorrectly truncating the contents of the envblock by replacing the final byte of the buffer with a null character.

Reviewed-by: Pavel Zakharov <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes #10405
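The difference between the two copy styles can be shown with plain buffers. Both helpers below are hypothetical illustrations of the behaviour described in the message, not the envblock code itself.

```c
#include <stddef.h>
#include <string.h>

/* String-style copy: the final byte is overwritten with a terminator. */
void
copy_as_string(char *dst, const char *src, size_t len)
{
	if (len == 0)
		return;
	memcpy(dst, src, len);
	dst[len - 1] = '\0';	/* the last data byte is lost here */
}

/* File-style copy: every byte of the source buffer is preserved. */
void
copy_as_bytes(char *dst, const char *src, size_t len)
{
	memcpy(dst, src, len);
}
```

When the buffer is completely full of data, the string-style copy silently drops the last byte, which is the truncation the commit removes.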
* Replace sprintf()->snprintf() and strcpy()->strlcpy()Jorgen Lundman2020-06-0722-60/+79
The strcpy() and sprintf() functions are deprecated on some platforms. Care is needed to ensure the correct size is used. If some platforms lack snprintf, we can add a #define to sprintf, likewise strlcpy().

The biggest change is adding a size parameter to zfs_id_to_fuidstr().

The various *_impl_get() functions are only used on linux and have not yet been updated.

Reviewed by: Sean Eric Fagan <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #10400
* Improve compatibility with C++ consumersRyan Moeller2020-06-0616-50/+64
C++ is a little picky about not using keywords for names, or string constness.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10409
* ztest: Fix spa_open() ENOENT failuresBrian Behlendorf2020-06-061-142/+154
The pool may not be imported when the previous pass is terminated, in which case spa_open() will return ENOENT to indicate the pool is not currently imported. Refactor the code slightly to handle this case by importing the pool and then retrying the spa_open().

The ztest_import() function was moved before ztest_run() and the import logic split into a small internal helper function. The ztest_freeze() function was also moved but no changes were made.

Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #10407
* mkfile: include missing headersalaviss2020-06-051-0/+2
Without these headers, compilation fails on musl libc with offset_t being undeclared and MIN being implicitly declared.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Hiếu Lê <[email protected]>
Closes #10406
* zfsvfs_setup(): zap_stats_t may have undefined content when accessed (#10398)Brian Behlendorf2020-06-051-3/+3
Signed-off-by: Allan Jude <[email protected]>
Co-authored-by: Allan Jude <[email protected]>