summaryrefslogtreecommitdiffstats
path: root/module
Commit message (Collapse)AuthorAgeFilesLines
* Mark dsl_dataset_deactivate_feature_impl staticMatthew Macy2019-12-091-1/+1
| | | | | | | | | The dsl_dataset_deactivate_feature_impl() function is private and should be marked as such. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9696
* ZTS: Fix zpool_reopen_001_posBrian Behlendorf2019-12-091-10/+25
| | | | | | | | | | | | Update the vdev_disk_open() retry logic to use a specified number of milliseconds to be more robust. Additionally, on failure log both the time waited and requested timeout to the internal log. The default maximum allowed open retry time has been increased from 500ms to 1000ms. Reviewed-by: Kjeld Schouten <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9680
* Disable sysfs feature checks on FreeBSDMatthew Macy2019-12-062-2/+8
| | | | | | | | | | | The sysfs infrastructure for reporting supported features and properties is Linux specific. Disable it on FreeBSD until it can be extended to be more portable. Reviewed-by: Kjeld Schouten <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9684
* ICP: Fix out of bounds writeAttila Fülöp2019-12-061-1/+3
| | | | | | | | | | | | | | | | | | | If gcm_mode_encrypt_contiguous_blocks() is called more than once in succession, with the accumulated lengths being less than blocksize, ctx->copy_to will be incorrectly advanced. Later, if out is NULL, the bcopy at line 114 will overflow ctx->gcm_copy_to since ctx->gcm_remainder_len is larger than the ctx->gcm_copy_to buffer can hold. The fix is to set ctx->copy_to only if it's not already set. For ZoL the issue may be academic, since in all my testing I wasn't able to hit neither of both conditions needed to trigger it, but other consumers can easily do so. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tom Caputi <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #9660
* Disable EDONR on FreeBSDMatthew Macy2019-12-053-3/+26
| | | | | | | | | | | FreeBSD uses its own crypto framework in-kernel which, at this time, has no EDONR implementation. Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #9664
* Refactor deadman set failmode to be cross platformMatthew Macy2019-12-052-11/+22
| | | | | | | | | Update zfs_deadman_failmode to use the ZFS_MODULE_PARAM_CALL wrapper, and split the common and platform specific portions. Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9670
* Replace ASSERTV macro with compiler annotationMatthew Macy2019-12-0523-55/+56
| | | | | | | | | | | Remove the ASSERTV macro and handle suppressing unused compiler warnings for variables only in ASSERTs using the __attribute__((unused)) compiler annotation. The annotation is understood by both gcc and clang. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9671
* ICP: Fix null pointer dereference and use after freeAttila Fülöp2019-12-032-4/+9
| | | | | | | | | | | | | | | | | | | In gcm_mode_decrypt_contiguous_blocks(), if vmem_alloc() fails, bcopy is called with a NULL pointer destination and a length > 0. This results in undefined behavior. Further ctx->gcm_pt_buf is freed but not set to NULL, leading to a potential write after free and a double free due to missing return value handling in crypto_update_uio(). The code as is may write to ctx->gcm_pt_buf in gcm_decrypt_final() and may free ctx->gcm_pt_buf again in aes_decrypt_atomic(). The fix is to slightly rework error handling and check the return value in crypto_update_uio(). Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tom Caputi <[email protected]> Reviewed-by: Kjeld Schouten <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #9659
* Fix use-after-free in case of L2ARC prefetch failureAlexander Motin2019-12-031-3/+16
| | | | | | | | | | | | | | | | In case L2ARC read failed, l2arc_read_done() creates _different_ ZIO to read data from the original storage device. Unfortunately pointer to the failed ZIO remains in hdr->b_l1hdr.b_acb->acb_zio_head, and if some other read try to bump the ZIO priority, it will crash. The problem is reproducible by corrupting L2ARC content and reading some data with prefetch if l2arc_noprefetch tunable is changed to 0. With the default setting the issue is probably not reproducible now. Reviewed-by: Tom Caputi <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #9648
* Increase allowed 'special_small_blocks' maximum valueBrian Behlendorf2019-12-031-1/+1
| | | | | | | | | | | | | | There may be circumstances where it's desirable that all blocks in a specified dataset be stored on the special device. Relax the artificial 128K limit and allow the special_small_blocks property to be set up to 1M. When blocks >1MB have been enabled via the zfs_max_recordsize module option, this limit is increased accordingly. Reviewed-by: Don Brady <[email protected]> Reviewed-by: Kjeld Schouten <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9131 Closes #9355
* Wrap module_param_call() routines under __linux__Matthew Macy2019-12-032-2/+2
| | | | | | | | | The module_param_call() functionality is currently still Linux-specific and should be wrapped accordingly. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9666
* Mark write_record staticMatthew Macy2019-12-031-1/+1
| | | | | | | | The write_record() function is private and should be marked as such. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9665
* Move linux qsort def to platform headerMatthew Macy2019-12-032-7/+3
| | | | | | | | | Moving qsort to the platform header allows each platform to provide an appropriate sorting implementation. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9663
* Adapt gitignore for modulesMichael Niewöhner2019-12-021-0/+1
| | | | | | | | | | Remove the specific gitignore rules for module left-overs and add a generic one in modules/. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Kjeld Schouten <[email protected]> Signed-off-by: Michael Niewöhner <[email protected]> Closes #9656
* Move zfs_cmd_t copyin/copyout to platform codeMatthew Macy2019-12-022-21/+24
| | | | | | | | | | | FreeBSD needs to cope with multiple version of the zfs_cmd_t structure. Allowing the platform code to pre and post process the cmd structure makes it possible to work with legacy tooling. Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9624
* Restructure nvlist_nv_alloc to work on FreeBSDMatthew Macy2019-11-301-3/+3
| | | | | | | | | | | | KM_PUSHPAGE is an Illumosism - On FreeBSD it's aliased to the same malloc flag as KM_SLEEP. The compiler naturally rejects multiple case statements with the same value. This is effectively a no-op since all callers pass a specific KM_* flag. Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9643
* Mark Linux fallocate extensions as specific to LinuxMatthew Macy2019-11-301-3/+5
| | | | | | | | | fallocate(2) is a Linux-specific system call which in unavailable on other platforms. Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9633
* Resolve ZoF differences in zfs_ioctl.hMatthew Macy2019-11-301-1/+0
| | | | | | | | | | FreeBSD needs to be able to pass the jail id to the jail/unjail ioctls and the struct file in the device structure is unused. Reviewed-by: Kjeld Schouten <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9625
* Remove zfs_vdev_elevator module optionBrian Behlendorf2019-11-271-111/+19
| | | | | | | | | | | | | As described in commit f81d5ef6 the zfs_vdev_elevator module option is being removed. Users who require this functionality should update their systems to set the disk scheduler using a udev rule. Reviewed-by: Richard Laager <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #8664 Closes #9417 Closes #9609
* Prevent unnecessary resilver restartsjwpoduska2019-11-273-83/+107
| | | | | | | | | | | | | | | | | | If a device is participating in an active resilver, then it will have a non-empty DTL. Operations like vdev_{open,reopen,probe}() can cause the resilver to be restarted (or deferred to be restarted later), which is unnecessary if the DTL is still covered by the current scan range. This is similar to the logic in vdev_dtl_should_excise() where the DTL can only be excised if it's max txg is in the resilvered range. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: John Gallagher <[email protected]> Reviewed-by: Kjeld Schouten <[email protected]> Signed-off-by: John Poduska <[email protected]> Issue #840 Closes #9155 Closes #9378 Closes #9551 Closes #9588
* Check for unlinked znodes after igrab()Mauricio Faria de Oliveira2019-11-211-5/+7
| | | | | | | | | | | | | | | | | | The changes in commit 41e1aa2a / PR #9583 introduced a regression on tmpfile_001_pos: fsetxattr() on a O_TMPFILE file descriptor started to fail with errno ENODATA: openat(AT_FDCWD, "/test", O_RDWR|O_TMPFILE, 0666) = 3 <...> fsetxattr(3, "user.test", <...>, 64, 0) = -1 ENODATA The originally proposed change on PR #9583 is not susceptible to it, so just move the code/if-checks around back in that way, to fix it. Reviewed-by: Pavel Snajdr <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Original-patch-by: Heitor Alves de Siqueira <[email protected]> Signed-off-by: Mauricio Faria de Oliveira <[email protected]> Closes #9602
* Add zfs_file_* interface, remove vnodesMatthew Macy2019-11-2130-1027/+701
| | | | | | | | | | | | | | | | | | Provide a common zfs_file_* interface which can be implemented on all platforms to perform normal file access from either the kernel module or the libzpool library. This allows all non-portable vnode_t usage in the common code to be replaced by the new portable zfs_file_t. The associated vnode and kobj compatibility functions, types, and macros have been removed from the SPL. Moving forward, vnodes should only be used in platform specific code when provided by the native operating system. Reviewed-by: Sean Eric Fagan <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9556
* Partially revert 5a6ac4cBrian Behlendorf2019-11-182-0/+45
| | | | | | | | | | | | | | | | | | | | | | | | Reinstate the zpl_revalidate() functionality to resolve a regression where dentries for open files during a rollback are not invalidated. The unrelated functionality for automatically unmounting .zfs/snapshots was not reverted. Nor was the addition of shrink_dcache_sb() to the zfs_resume_fs() function. This issue was not immediately caught by the CI because the test case intended to catch it was included in the list of ZTS tests which may occasionally fail for unrelated reasons. Remove all of the rollback tests from this list to help identify the frequency of any spurious failures. The rollback_003_pos.ksh test case exposes a real issue with the long standing code which needs to be investigated. Regardless, it has been enable with a small workaround in the test case itself. Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Pavel Snajdr <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9587 Closes #9592
* Break out of zfs_zget early if unlinked znodeHeitor Alves de Siqueira2019-11-151-9/+17
| | | | | | | | | | | If zp->z_unlinked is set, we're working with a znode that has been marked for deletion. If that's the case, we can skip the "goto again" loop and return ENOENT, as the znode should not be discovered. Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Heitor Alves de Siqueira <[email protected]> Closes #9583
* Prevent NULL pointer dereference in blkg_tryget() on EL8 kernelsloli10K2019-11-131-1/+1
| | | | | | | | | | | | | | blkg_tryget() as shipped in EL8 kernels does not seem to handle NULL @blkg as input; this is different from its mainline counterpart where NULL is accepted. To prevent dereferencing a NULL pointer when dealing with block devices which do not set a root_blkg on the request queue perform the NULL check in vdev_bio_associate_blkg(). Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Kjeld Schouten <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #9546 Closes #9577
* Check for __GFP_RECLAIM instead of GFP_KERNELMichael Niewöhner2019-11-131-3/+8
| | | | | | | | | | | Check for __GFP_RECLAIM instead of GFP_KERNEL because zfs modifies IO and FS flags which breaks the check for GFP_KERNEL. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Sebastian Gottschall <[email protected]> Signed-off-by: Michael Niewöhner <[email protected]> Closes #9034
* Add kmem_cache flag for forcing kvmallocMichael Niewöhner2019-11-133-9/+38
| | | | | | | | | | | | | | This adds a new KMC_KVMEM flag was added to enforce use of the kvmalloc allocator in kmem_cache_create even for large blocks, which may also increase performance in some specific cases (e.g. zstd), too. Default to KVMEM instead of VMEM in spl_kmem_cache_create. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Signed-off-by: Sebastian Gottschall <[email protected]> Signed-off-by: Michael Niewöhner <[email protected]> Closes #9034
* Make use of kvmalloc if available and fix vmem_alloc implementationMichael Niewöhner2019-11-132-11/+95
| | | | | | | | | | | | | | | | | This patch implements use of kvmalloc for GFP_KERNEL allocations, which may increase performance if the allocator is able to allocate physical memory, if kvmalloc is available as a public kernel interface (since v4.12). Otherwise it will simply fall back to virtual memory (vmalloc). Also fix vmem_alloc implementation which can lead to slow allocations since the first attempt with kmalloc does not make use of the noretry flag but tells the linux kernel to retry several times before it fails. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Signed-off-by: Sebastian Gottschall <[email protected]> Signed-off-by: Michael Niewöhner <[email protected]> Closes #9034
* Add missing documentation for some KMC flagsMichael Niewöhner2019-11-131-5/+5
| | | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matt Ahrens <[email protected]> Signed-off-by: Sebastian Gottschall <[email protected]> Signed-off-by: Michael Niewöhner <[email protected]> Closes #9034
* Linux compat: Minimum kernel version 3.10Brian Behlendorf2019-11-1217-541/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Increase the minimum supported kernel version from 2.6.32 to 3.10. This removes support for the following Linux enterprise distributions. Distribution | Kernel | End of Life ---------------- | ------ | ------------- Ubuntu 12.04 LTS | 3.2 | Apr 28, 2017 SLES 11 | 3.0 | Mar 32, 2019 RHEL / CentOS 6 | 2.6.32 | Nov 30, 2020 The following changes were made as part of removing support. * Updated `configure` to enforce a minimum kernel version as specified in the META file (Linux-Minimum: 3.10). configure: error: *** Cannot build against kernel version 2.6.32. *** The minimum supported kernel version is 3.10. * Removed all `configure` kABI checks and matching C code for interfaces which solely predate the Linux 3.10 kernel. * Updated all `configure` kABI checks to fail when an interface is missing which was in the 3.10 kernel up to the latest 5.1 kernel. Removed the HAVE_* preprocessor defines for these checks and updated the code to unconditionally use the verified interface. * Inverted the detection logic in several kABI checks to match the new interface as it appears in 3.10 and newer and not the legacy interface. * Consolidated the following checks in to individual files. Due the large number of changes in the checks it made sense to handle this now. It would be desirable to group other related checks in the same fashion, but this as left as future work. - config/kernel-blkdev.m4 - Block device kABI checks - config/kernel-blk-queue.m4 - Block queue kABI checks - config/kernel-bio.m4 - Bio interface kABI checks * Removed the kABI checks for sops->nr_cached_objects() and sops->free_cached_objects(). These interfaces are currently unused. Signed-off-by: Brian Behlendorf <[email protected]> Closes #9566
* Remove zpl_revalidatePavel Snajdr2019-11-112-63/+30
| | | | | | | | | | | | | | | | | | | | | | | This patch removes the need for zpl_revalidate altogether. There were 3 main reasons why we used d_revalidate: 1. periodic automounted snapshots umount deferral 2. negative dentries created before snapshot rollback 3. stale inodes referenced by dentry cache after snapshot rollback Periodic snapshots deferral solution introduces zfs_exit_fs function, which is called as a part of ZFS_EXIT(zfsvfs_t) macro. Negative dentries and stale inodes are solved by flushing the dcache for the particular dataset on zfs_resume_fs call. This patch also removes now unused HAVE_S_D_OP configure test. Reviewed-by: Aleksa Sarai <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Pavel Snajdr <[email protected]> Closes #8774 Closes #9549
* Improve logging of 128KB writesAlexander Motin2019-11-111-7/+13
| | | | | | | | | | | | | | | | | | | | | | Before my ZIL space optimization few years ago 128KB writes were logged as two 64KB+ records in two 128KB log blocks. After that change it became ~127KB+/1KB+ in two 128KB log blocks to free space in the second block for another record. Unfortunately in case of 128KB only writes, when space in the second block remained unused, that change increased write latency by unbalancing checksum computation and write times between parallel threads. It also didn't help with SLOG space efficiency in that case. This change introduces new 68KB log block size, used for both writes below 67KB and 128KB-sharp writes. Writes of 68-127KB are still using one 128KB block to not increase processing overhead. Writes above 131KB are still using full 128KB blocks, since possible saving there is small. Mixed loads will likely also fall back to previous 128KB, since code uses maximum of the last 16 requested block sizes. Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes #9409
* Preliminary support for RV64GRomain Dolbeau2019-11-063-0/+95
| | | | | | | This adds basic support for RISC-V, specifically RV64G. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Romain Dolbeau <[email protected]> Closes #9540
* Move platform specific parts of zfs_znode.h to platform codeMatthew Macy2019-11-062-2/+17
| | | | | | | | | Some of the znode fields are different and functions consuming an inode don't exist on FreeBSD. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9536
* Add tracepoints for taskq entry lifetime eventsPrakash Surya2019-11-012-4/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds some new DTRACE_PROBE* endpoints so that we can observe taskq latencies on a system. Additionally, a new "taskqlatency.bt" script is added to do this observation via "bpftrace". Lastly, a "zfs-trace.sh" script is added to wrap "bpftrace" with the proper options required to run and use "taskqlatency.bt". For example, with these changes in place, a user can run the following: $ cd ./contrib/bpftrace $ sudo ./zfs-trace.sh taskqlatency.bt Attaching 6 probes... ^C Here's some example output, showing latency information for time spent executing the taskq entry's function: @exec_lat_us[dp_sync_taskq, userquota_updates_task]: [2, 4) 5 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [4, 8) 0 | | [8, 16) 1 |@@@@@@@@@@ | [16, 32) 2 |@@@@@@@@@@@@@@@@@@@@ | @exec_lat_us[z_wr_int_h, zio_execute]: [8, 16) 16 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [16, 32) 2 |@@@@@@ | @exec_lat_us[z_wr_iss_h, zio_execute]: [16, 32) 4 |@@@@@@@@@@@@@@@@ | [32, 64) 13 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [64, 128) 1 |@@@@ | @exec_lat_us[z_ioctl_int, zio_execute]: [2, 4) 1 |@@@@ | [4, 8) 11 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [8, 16) 8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | @exec_lat_us[dp_sync_taskq, sync_dnodes_task]: [2, 4) 1 |@@@@@@ | [4, 8) 7 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [8, 16) 8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [16, 32) 2 |@@@@@@@@@@@@@ | [32, 64) 4 |@@@@@@@@@@@@@@@@@@@@@@@@@@ | [64, 128) 1 |@@@@@@ | [128, 256) 0 | | [256, 512) 1 |@@@@@@ Here's some example output, showing latency information for time spent waiting on the taskq, prior to starting execution of entry's function: @queue_lat_us[dp_sync_taskq]: [2, 4) 1 |@@@@ | [4, 8) 7 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [8, 16) 2 |@@@@@@@@ | [16, 32) 3 |@@@@@@@@@@@@@ | [32, 64) 12 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [64, 128) 6 |@@@@@@@@@@@@@@@@@@@@@@@@@@ | [128, 256) 0 | | [256, 512) 1 |@@@@ | @queue_lat_us[z_wr_iss]: [4, 8) 4 |@@@@ | [8, 16) 13 |@@@@@@@@@@@@@@@ | [16, 32) 6 |@@@@@@@ | [32, 64) 2 |@@ | [64, 128) 12 |@@@@@@@@@@@@@@ | [128, 256) 15 |@@@@@@@@@@@@@@@@@@ | [256, 512) 33 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [512, 1K) 27 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [1K, 2K) 7 |@@@@@@@@ | [2K, 4K) 14 |@@@@@@@@@@@@@@@@ | [4K, 8K) 14 |@@@@@@@@@@@@@@@@ | [8K, 16K) 23 |@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [16K, 32K) 43 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| @queue_lat_us[z_wr_int]: [2, 4) 10 |@@@@@ | [4, 8) 71 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [8, 16) 88 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@| [16, 32) 50 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [32, 64) 65 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ | [64, 128) 43 |@@@@@@@@@@@@@@@@@@@@@@@@@ | [128, 256) 19 |@@@@@@@@@@@ | [256, 512) 3 |@ | [512, 1K) 1 | | Reviewed by: Brad Lewis <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Prakash Surya <[email protected]> Closes #9525
* Enable use of DTRACE_PROBE* macros in "spl" modulePrakash Surya2019-11-0119-15/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | This change modifies some of the infrastructure for enabling the use of the DTRACE_PROBE* macros, such that we can use tehm in the "spl" module. Currently, when the DTRACE_PROBE* macros are used, they get expanded to create new functions, and these dynamically generated functions become part of the "zfs" module. Since the "spl" module does not depend on the "zfs" module, the use of DTRACE_PROBE* in the "spl" module would result in undefined symbols being used in the "spl" module. Specifically, DTRACE_PROBE* would turn into a function call, and the function being called would be a symbol only contained in the "zfs" module; which results in a linker and/or runtime error. Thus, this change adds the necessary logic to the "spl" module, to mirror the tracing functionality available to the "zfs" module. After this change, we'll have a "trace_zfs.h" header file which defines the probes available only to the "zfs" module, and a "trace_spl.h" header file which defines the probes available only to the "spl" module. Reviewed by: Brad Lewis <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Prakash Surya <[email protected]> Closes #9525
* Wrap Linux module macrosMatthew Macy2019-11-017-40/+41
| | | | | | | | | | MODULE_VERSION is already defined on FreeBSD. Wrap all of the used MODULE_* macros for the sake of consistency and portability. Add a user space noop version to reduce the need for _KERNEL ifdefs. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9542
* Prefix struct rangelockMatthew Macy2019-11-014-53/+59
| | | | | | | | | | A struct rangelock already exists on FreeBSD. Add a zfs_ prefix as per our convention to prevent any conflict with existing symbols. This change is a follow up to 2cc479d0. Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9534
* Fix icp build on FreeBSDMatthew Macy2019-11-012-23/+22
| | | | | | | | | | | | - ROTATE_LEFT is not used by amd64, move it down within the scope it's used to silence a clang warning. - __unused is an alias for the compiler annotation __attribute__((__unused__)) on FreeBSD. Rename the field to ____unused. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9538
* Return an error code from zfs_acl_chmod_setattrMatthew Macy2019-11-012-2/+5
| | | | | | | | The FreeBSD implementation can fail, allow this function to fail and add the required error handling for Linux. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9541
* Include prototypes for vdev_initializeMatthew Macy2019-10-311-1/+2
| | | | | | | | Address two prototype related warnings emitted by clang. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9535
* Factor Linux specific code out of spa_misc.cMatthew Macy2019-10-313-76/+117
| | | | | | | | | | Move these Linux module parameter get/set helpers in to platform specific code. Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9457
* Add AVX512BW variant of fletcherRomain Dolbeau2019-10-302-0/+52
| | | | | | | | | | | It is much faster than AVX512F when byteswapping on Skylake-SP and newer, as we can do the byteswap in a single vshufb instead of many instructions. Reviewed by: Gvozden Neskovic <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Romain Dolbeau <[email protected]> Closes #9517
* Fix 'zfs change-key' with unencrypted childTom Caputi2019-10-301-2/+6
| | | | | | | | | | | | | | Currently, when you call 'zfs change-key' on an encrypted dataset that has an unencrypted child, the code will trigger a VERIFY. This VERIFY is leftover from before we allowed unencrypted datasets to exist underneath encrypted ones. This patch fixes the issue by simply replacing the VERIFY with an early return when recursing through datasets. Reviewed by: Jason King <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #9524
* Minor spa portability fixesMatthew Macy2019-10-281-3/+3
| | | | | | | | | | | | - FreeBSD's rootpool import code uses spa_config_parse - Move the zvol_create_minors call out from under the spa_namespace_lock in spa_import. It isn't needed and it causes a lock order reversal on FreeBSD. Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9499
* Fix for ARC sysctls ignored at runtimeloli10K2019-10-263-32/+65
| | | | | | | | | | | | | | This change leverage module_param_call() to run arc_tuning_update() immediately after the ARC tunable has been updated as suggested in cffa8372 code review. A simple test case is added to the ZFS Test Suite to prevent future regressions in functionality. Reviewed-by: Matt Macy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #9487 Closes #9489
* Remove non-portable pointer is valid assertMatthew Macy2019-10-251-1/+0
| | | | | | | | | | This assert makes non portable assumptions about the state of memory returned by the memory allocator. Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9506
* Move final zvol_remove_minors to common codeMatthew Macy2019-10-252-10/+10
| | | | | | | | | | This logic is not platform dependent and should reside in the common code. Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9505
* Remove sdt.hMatthew Macy2019-10-252-2/+0
| | | | | | | | | It's mostly a noop on ZoL and it conflicts with platforms that support dtrace. Remove this header to resolve the conflict. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9497
* Linux 4.14, 4.19, 5.0+ compat: SIMD save/restoreBrian Behlendorf2019-10-2413-195/+72
| | | | | | | | | | | | | | | | | | | | Contrary to initial testing we cannot rely on these kernels to invalidate the per-cpu FPU state and restore the FPU registers. Nor can we guarantee that the kernel won't modify the FPU state which we saved in the task struck. Therefore, the kfpu_begin() and kfpu_end() functions have been updated to save and restore the FPU state using our own dedicated per-cpu FPU state variables. This has the additional advantage of allowing us to use the FPU again in user threads. So we remove the code which was added to use task queues to ensure some functions ran in kernel threads. Reviewed-by: Fabian Grünbichler <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #9346 Closes #9403