path: root/include/os
Commit message, Author, Age, Files, Lines
* Linux 6.6 compat: generic_fillattr has a new u32 request_mask added at arg2
  Author: Coleman Kane; Age: 2023-11-08; Files: 2; Lines: -0/+11

  In commit 0d72b92883c651a11059d93335f33d65c6eb653b, a new u32 argument for the request_mask was added to generic_fillattr. This is the same request_mask for statx that's present in the most recent API implemented by zpl_getattr_impl. This commit conditionally adds it to the zpl_generic_fillattr(...) macro, as well as the zfs_getattr_fast(...) implementation, when configure determines it's present in the kernel's generic_fillattr(...).

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #15263
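The conditional macro described above can be sketched roughly as follows. This is an illustration only, not the code from the commit: the `stub_*` types and function, the `HAVE_GENERIC_FILLATTR_REQUEST_MASK` configure symbol, and the `zpl_fillattr_sketch` name are all hypothetical stand-ins so that the fragment compiles outside the kernel.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for struct inode / struct kstat; not the kernel definitions. */
struct stub_inode { uint64_t size; };
struct stub_kstat { uint64_t size; uint32_t result_mask; };

/* Hypothetical configure result: set when request_mask is an argument. */
#define	HAVE_GENERIC_FILLATTR_REQUEST_MASK 1

#ifdef HAVE_GENERIC_FILLATTR_REQUEST_MASK
/* Newer-style stub: request_mask is passed as the second argument. */
static void
stub_generic_fillattr(void *idmap, uint32_t request_mask,
    struct stub_inode *ip, struct stub_kstat *sp)
{
	(void) idmap;
	sp->size = ip->size;
	sp->result_mask = request_mask;
}
#define	zpl_fillattr_sketch(idmap, rqm, ip, sp) \
	stub_generic_fillattr(idmap, rqm, ip, sp)
#else
/* Older-style stub: no request_mask; the wrapper drops the argument. */
static void
stub_generic_fillattr(void *idmap, struct stub_inode *ip,
    struct stub_kstat *sp)
{
	(void) idmap;
	sp->size = ip->size;
	sp->result_mask = 0;
}
#define	zpl_fillattr_sketch(idmap, rqm, ip, sp) \
	stub_generic_fillattr(idmap, ip, sp)
#endif
```

Callers always pass the request mask to the wrapper; whether it reaches the underlying function is decided once, at configure time.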
* Linux 6.6 compat: use inode_get/set_ctime*(...)
  Author: Coleman Kane; Age: 2023-11-08; Files: 1; Lines: -0/+11

  In Linux commit 13bc24457850583a2e7203ded05b7209ab4bc5ef, direct access to the i_ctime member of struct inode was removed. The new approach is to use accessor methods that exclusively handle passing the timestamp around by value. This change adds new tests for each of these functions and introduces zpl_* equivalents in include/os/linux/zfs/sys/zpl.h. Where the inode_get/set_ctime*() functions exist, these zpl_* calls are mapped to the new functions. On older kernels, these macros just wrap direct-access calls. The code that operated on the address of ip->i_ctime to call ZFS_TIME_DECODE() now takes a local copy using zpl_inode_get_ctime() and passes the address of that local copy to ZFS_TIME_DECODE() in all cases, rather than directly accessing the member.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #15263
  Closes #15257
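A minimal sketch of the accessor pattern described above, assuming a stand-in inode type. Only `zpl_inode_get_ctime()` is named in the commit message; `HAVE_INODE_GET_CTIME`, `zpl_inode_set_ctime_to_ts()`, the `stub_*` types and `ctime_seconds()` are assumptions made for illustration.

```c
/* Stand-ins for struct timespec64 / struct inode; not the kernel types. */
struct stub_timespec { long tv_sec; long tv_nsec; };
struct stub_inode { struct stub_timespec i_ctime; };

/* Hypothetical configure symbol; left undefined to model an older kernel. */
/* #define HAVE_INODE_GET_CTIME 1 */

#ifdef HAVE_INODE_GET_CTIME
/* Newer kernels: map straight onto the kernel accessors. */
#define	zpl_inode_get_ctime(ip)			inode_get_ctime(ip)
#define	zpl_inode_set_ctime_to_ts(ip, ts)	inode_set_ctime_to_ts(ip, ts)
#else
/* Older kernels: the macros wrap direct member access. */
#define	zpl_inode_get_ctime(ip)			((ip)->i_ctime)
#define	zpl_inode_set_ctime_to_ts(ip, ts)	((ip)->i_ctime = (ts))
#endif

/*
 * Callers take a local copy by value and pass the address of the copy,
 * instead of taking the address of ip->i_ctime directly.
 */
static long
ctime_seconds(struct stub_inode *ip)
{
	struct stub_timespec tmp = zpl_inode_get_ctime(ip);
	struct stub_timespec *tp = &tmp; /* what a decode macro would get */

	return (tp->tv_sec);
}
```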
* Add mutex_enter_interruptible() for interruptible sleeping IOCTLs
  Author: Thomas Bertschinger; Age: 2023-11-06; Files: 2; Lines: -8/+14

  Many long-running ZFS ioctls lock the spa_namespace_lock, forcing concurrent ioctls to sleep for the mutex. Previously, the only option was to call mutex_enter(), which sleeps uninterruptibly. This is a usability issue for sysadmins. For example, if the admin runs `zpool status` while a slow `zpool import` is ongoing, the admin's shell will be locked in uninterruptible sleep for a long time.

  This patch resolves this admin usability issue by introducing mutex_enter_interruptible(), which sleeps interruptibly while waiting to acquire a lock. It is implemented for both Linux and FreeBSD.

  The ZFS_IOC_POOL_CONFIGS ioctl, used by `zpool status`, is changed to use this new macro so that the command can be interrupted if it is issued during a concurrent `zpool import` (or other long-running operation).

  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Thomas Bertschinger <[email protected]>
  Closes #15360
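The calling pattern this enables might look roughly like the sketch below. It is a userland model only: the `stub_*` lock, the simulated `signal_pending` flag, the "nonzero means interrupted" return convention, and `pool_configs_sketch()` are assumptions for illustration and are not taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy lock: 'signal_pending' simulates a signal arriving while waiting. */
typedef struct {
	bool held;
	bool signal_pending;
} stub_mutex_t;

/* Returns 0 once the lock is taken, nonzero if the wait was interrupted. */
static int
stub_mutex_enter_interruptible(stub_mutex_t *mp)
{
	if (mp->signal_pending)
		return (1);
	mp->held = true;
	return (0);
}

static void
stub_mutex_exit(stub_mutex_t *mp)
{
	mp->held = false;
}

/* An ioctl-style handler that gives up with EINTR instead of blocking. */
static int
pool_configs_sketch(stub_mutex_t *namespace_lock)
{
	if (stub_mutex_enter_interruptible(namespace_lock) != 0)
		return (EINTR);
	/* ... gather pool configuration under the lock ... */
	stub_mutex_exit(namespace_lock);
	return (0);
}
```

The point of the pattern is that the caller must handle the failure path, because the lock is not held when the wait is interrupted.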
* zvol: Remove broken blk-mq optimization
  Author: Tony Hutter; Age: 2023-11-06; Files: 1; Lines: -8/+0

  This fix removes a dubious optimization in zfs_uiomove_bvec_rq() that saved the iterator contents of a rq_for_each_segment(). This optimization allowed restoring the "saved state" from a previous rq_for_each_segment() call on the same uio so that you wouldn't need to iterate through each bvec on every zfs_uiomove_bvec_rq() call. However, if the kernel is manipulating the requests/bios/bvecs under the covers between zfs_uiomove_bvec_rq() calls, then it could result in corruption from using the "saved state". This optimization results in an unbootable system after installing an OS on a zvol with blk-mq enabled.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tony Hutter <[email protected]>
  Closes #15351
* FreeBSD: Improve taskq wrapper
  Author: Alexander Motin; Age: 2023-11-06; Files: 1; Lines: -9/+9

  - Group tqent_task and tqent_timeout_task into a union. They are never used at the same time. This shrinks taskq_ent_t from 192 to 160 bytes.
  - Remove tqent_registered. Use tqent_id != 0 instead.
  - Remove tqent_cancelled. Use taskqueue pending counter instead.
  - Change tqent_type into uint_t. We don't need to pack it any more.
  - Change tqent_rc into uint_t, matching refcount(9).
  - Take shared locks in taskq_lookup().
  - Call proper taskqueue_drain_timeout() for TIMEOUT_TASK in taskq_cancel_id() and taskq_wait_id().
  - Switch from CK_LIST to regular LIST.

  Reviewed-by: Allan Jude <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Mateusz Guzik <[email protected]>
  Signed-off-by: Alexander Motin <[email protected]>
  Sponsored by: iXsystems, Inc.
  Closes #15356
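The first item in the list above relies on a general C property: members that are never live at the same time can share storage in a union. The sketch below illustrates that with made-up structures; the `stub_*` types and their sizes are invented for the example and do not match the real taskq_ent_t layout or the 192/160 byte figures.

```c
#include <stddef.h>

/* Invented stand-ins for struct task / struct timeout_task. */
struct stub_task { void *fn; void *arg; char pad[16]; };
struct stub_timeout_task { struct stub_task t; char callout[48]; };

/* Before: both members occupy space, although only one is ever used. */
struct ent_separate {
	struct stub_task task;
	struct stub_timeout_task timeout_task;
	unsigned int type;	/* says which of the two is in use */
};

/* After: the two members overlap, so the entry is only as large as */
/* the bigger of them plus the remaining fields. */
struct ent_union {
	union {
		struct stub_task task;
		struct stub_timeout_task timeout_task;
	} u;
	unsigned int type;
};
```

Comparing `sizeof (struct ent_union)` with `sizeof (struct ent_separate)` shows the saving on a given platform.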
* FreeBSD: Reduce divergence from in-tree sources
  Author: Alexander Motin; Age: 2023-10-10; Files: 7; Lines: -7/+10

  This includes random small tweaks, primarily build fixes, required when ZFS is built as part of FreeBSD base.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Tino Reichardt <[email protected]>
  Signed-off-by: Alexander Motin <[email protected]>
  Sponsored by: iXsystems, Inc.
  Closes #15368
* ARC: Remove b_bufcnt/b_ebufcnt from ARC headers
  Author: Alexander Motin; Age: 2023-10-07; Files: 1; Lines: -8/+4

  In most cases we do not care about the exact number of buffers linked to the header; we just need to know whether it is zero, non-zero or one. That can easily be checked by looking at the b_buf pointer or, in some cases, dereferencing it.

  b_ebufcnt is read only once, and in that case we already traverse the list as part of arc_buf_remove(), so a second traversal should not be expensive.

  This reduces L1 ARC header size by 8 bytes and full crypto header by 16 bytes, down to 176 and 232 bytes on FreeBSD respectively.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Alexander Motin <[email protected]>
  Sponsored by: iXsystems, Inc.
  Closes #15350
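The "zero, non-zero or one" checks mentioned above need no counter when the buffers hang off the header as a singly linked list. A minimal sketch, assuming simplified stand-in types (`sketch_buf_t`, `sketch_hdr_t`) and hypothetical helper names rather than the real ARC structures:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins: a header pointing at a linked list of buffers. */
typedef struct sketch_buf {
	struct sketch_buf *b_next;
} sketch_buf_t;

typedef struct sketch_hdr {
	sketch_buf_t *b_buf;	/* head of the list, NULL when empty */
} sketch_hdr_t;

/* Zero buffers: the head pointer is NULL. */
static bool
hdr_has_no_bufs(const sketch_hdr_t *hdr)
{
	return (hdr->b_buf == NULL);
}

/* Exactly one buffer: a head exists and nothing follows it. */
static bool
hdr_has_one_buf(const sketch_hdr_t *hdr)
{
	return (hdr->b_buf != NULL && hdr->b_buf->b_next == NULL);
}
```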
* Fix invalid pointer access in trace_dbuf.h
  Author: Chunwei Chen; Age: 2023-10-03; Files: 1; Lines: -2/+6

  In dnode_destroy, dn_objset is invalidated. However, it will later call into dbuf_destroy, in which DTRACE_SET_STATE will try to access spa_name via dn_objset, causing an illegal pointer access.

  Reviewed-by: Brian Atkinson <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Chunwei Chen <[email protected]>
  Closes #15333
* Retire z_nr_znodes
  Author: Mateusz Guzik; Age: 2023-09-19; Files: 2; Lines: -2/+0

  Added in ab26409db753 ("Linux 3.1 compat, super_block->s_shrink"), with the only consumer which needed the count getting retired in 066e82522101 ("Linux compat: Minimum kernel version 3.10").

  The counter gets in the way of not maintaining the list to begin with.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Alexander Motin <[email protected]>
  Signed-off-by: Mateusz Guzik <[email protected]>
  Closes #15274
* Linux 4.20 compat: wrapper function for iov_iter type access
  Author: Coleman Kane; Age: 2023-09-19; Files: 1; Lines: -0/+6

  An iov_iter_type() function to access the "type" member of the struct iov_iter was added at one point. Move the conditional logic to decide which method to use for accessing it into a macro and simplify the zpl_uio_init code.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Brian Atkinson <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #15100
* Linux 6.4 compat: iter_iov() function now used to get old iov member
  Author: Coleman Kane; Age: 2023-09-19; Files: 1; Lines: -0/+6

  The iov_iter->iov member is now iov_iter->__iov and must be accessed via the accessor function iter_iov(). Create a wrapper that is conditionally compiled to use the access method appropriate for the target kernel version.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Brian Atkinson <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #15100
* Linux 6.5 compat: blkdev changes
  Author: Coleman Kane; Age: 2023-09-19; Files: 1; Lines: -0/+6

  Multiple changes to the blkdev API were introduced in Linux 6.5. This includes passing (void* holder) to blkdev_put, adding a new blk_holder_ops* arg to blkdev_get_by_path, adding a new blk_mode_t type that replaces uses of fmode_t, and removing an argument from the release handler on block_device_operations that we weren't using. The open function definition has also changed to take gendisk* and blk_mode_t, so update it accordingly, too.

  Implement local wrappers for blkdev_get_by_path() and vdev_blkdev_put() so that the in-line calls are cleaner, and place the conditionally-compiled implementation details inside of both of these local wrappers. Both calls are exclusively used within vdev_disk.c, at this time.

  Add blk_mode_is_open_write() to test FMODE_WRITE / BLK_OPEN_WRITE. The wrapper function is now used to test, using the appropriate method for the kernel, whether the open mode is writable.

  Emphasize that the fmode_t arg in zvol_release is not used.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #15099
* Linux 6.5 compat: use disk_check_media_change when it exists
  Author: Coleman Kane; Age: 2023-09-19; Files: 1; Lines: -0/+1

  When disk_check_media_change() exists, then define zfs_check_media_change() to simply call disk_check_media_change() on the bd_disk member of its argument. Since disk_check_media_change() is newer than when revalidate_disk was present in bops, we should be able to safely do this via a macro, instead of recreating a new implementation of the inline function that forces revalidation.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Brian Atkinson <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #15101
* Avoid save/restoring AMX registers to avoid a SPR erratum
  Author: Rich Ercolani; Age: 2023-08-27; Files: 1; Lines: -5/+14

  Intel SPR erratum SPR4 says that if you trip into a vmexit while doing FPU save/restore, your AMX register state might misbehave... and by misbehave, I mean save all zeroes incorrectly, leading to explosions if you restore it.

  Since we're not using AMX for anything, the simple way to avoid this is to just not save/restore those when we do anything, since we're killing preemption of any sort across our save/restores.

  If we ever decide to use AMX, it's not clear that we have any way to mitigate this, on Linux...but I am not an expert.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Rich Ercolani <[email protected]>
  Closes #14989
  Closes #15168
* linux/spl/kmem_cache: undefine `kmem_cache_alloc` before defining it
  Author: Ryan Lahfa; Age: 2023-08-25; Files: 1; Lines: -0/+8

  When compiling a kernel with bcachefs and zfs, the two macros will collide, making it impossible to have both filesystems. It is sufficient to just undefine the macro before defining it.

  On why this should be in ZFS rather than bcachefs: currently, bcachefs is not an in-tree filesystem, but it has a reasonably high chance of getting included soon. This avoids the breakage in ZFS early. This patch may be distributed downstream in NixOS and is already used there.

  Reviewed-by: Brian Atkinson <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ryan Lahfa <[email protected]>
  Closes #15144
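The undefine-then-define idiom the commit title refers to can be sketched as below. The first `#define` merely simulates a definition arriving from some earlier header; `other_alloc` and `spl_alloc_sketch` are hypothetical names, and the `int` arguments are placeholders rather than the real kmem_cache_alloc() signature.

```c
/* Simulates a definition that some earlier header already provided. */
#define	kmem_cache_alloc(cache, flags)	other_alloc(cache, flags)

/* Drop any existing definition before installing our own. */
#ifdef kmem_cache_alloc
#undef kmem_cache_alloc
#endif
#define	kmem_cache_alloc(cache, flags)	spl_alloc_sketch(cache, flags)

/* Placeholder implementation so the macro has something to expand to. */
static int
spl_alloc_sketch(int cache, int flags)
{
	return (cache + flags);
}
```

Without the `#undef`, the second `#define` would be an incompatible redefinition, which compilers diagnose and which fails builds that treat warnings as errors.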
* linux: implement filesystem-side copy/clone functions for EL7
  Author: Rob Norris; Age: 2023-07-26; Files: 1; Lines: -0/+4

  Red Hat has backported copy_file_range and clone_file_range to the EL7 kernel using an "extended file operations" wrapper structure. This connects all that up to let cloning work there too.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Kay Pedersen <[email protected]>
  Signed-off-by: Rob Norris <[email protected]>
  Sponsored-By: OpenDrives Inc.
  Sponsored-By: Klara Inc.
  Closes #15050
* linux: implement filesystem-side clone ioctls
  Author: Rob Norris; Age: 2023-07-26; Files: 1; Lines: -0/+35

  Prior to Linux 4.5, the FICLONE etc ioctls were specific to BTRFS, and were implemented as regular filesystem-specific ioctls. This implements those ioctls directly in OpenZFS, allowing cloning to work on older kernels.

  There's no need to gate these behind version checks; on later kernels Linux will simply never deliver these ioctls, instead calling the appropriate VFS op.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Kay Pedersen <[email protected]>
  Signed-off-by: Rob Norris <[email protected]>
  Sponsored-By: OpenDrives Inc.
  Sponsored-By: Klara Inc.
  Closes #15050
* linux: implement filesystem-side copy/clone functions
  Author: Rob Norris; Age: 2023-07-26; Files: 1; Lines: -0/+14

  This implements the Linux VFS ops required to service the file copy/clone APIs:

    .copy_file_range (4.5+)
    .clone_file_range (4.5-4.19)
    .dedupe_file_range (4.5-4.19)
    .remap_file_range (4.20+)

  Note that dedupe_file_range() and remap_file_range(REMAP_FILE_DEDUP) are hooked up here, but are not implemented yet.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Kay Pedersen <[email protected]>
  Signed-off-by: Rob Norris <[email protected]>
  Sponsored-By: OpenDrives Inc.
  Sponsored-By: Klara Inc.
  Closes #15050
* Linux 6.5 compat: disk_check_media_change() was added
  Author: Coleman Kane; Age: 2023-07-21; Files: 1; Lines: -0/+2

  The disk_check_media_change() function was added, replacing bdev_check_media_change. This change was introduced in 6.5rc1 (444aa2c58cb3b6cfe3b7cc7db6c294d73393a894), and the new function takes a gendisk* as its argument rather than a block_device*. Thus, bdev->bd_disk is now used to pass the expected data.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #15060
* Linux 6.5 compat: BLK_STS_NEXUS renamed to BLK_STS_RESV_CONFLICT
  Author: Coleman Kane; Age: 2023-07-21; Files: 1; Lines: -0/+8

  This change was introduced in Linux commit 7ba150834b840f6f5cdd07ca69a4ccf39df59a66

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Alexander Motin <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #15059
* Linux 6.5 compat: intptr_t definition is canonically signed
  Author: Coleman Kane; Age: 2023-07-21; Files: 1; Lines: -1/+1

  Make the version here match that elsewhere in the kernel and system headers.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Alexander Motin <[email protected]>
  Signed-off-by: Coleman Kane <[email protected]>
  Closes #15058
  Signed-off-by: Brian Behlendorf <[email protected]>
* FreeBSD: catch up to __FreeBSD_version 1400093
  Author: Mateusz Guzik; Age: 2023-07-20; Files: 1; Lines: -0/+4

  Reviewed-by: Alexander Motin <[email protected]>
  Signed-off-by: Mateusz Guzik <[email protected]>
  Closes #15036
* Add a delay to tearing down threads.
  Author: Rich Ercolani; Age: 2023-06-26; Files: 1; Lines: -0/+1

  It's been observed that in certain workloads (zvol-related being a big one), ZFS will end up spending a large amount of time spinning up taskqs only to tear them down again almost immediately, then spin them up again...

  I noticed this when I looked at what my mostly-idle system was doing and wondered how on earth taskq creation/destroy was a bunch of time...

  So I added a configurable delay to avoid it tearing down tasks the first time it notices them idle, and the total number of threads at steady state went up, but the amount of time being burned just tearing down/turning up new ones almost vanished.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Rich Ercolani <[email protected]>
  Closes #14938
* Finally drop long disabled vdev cache.
  Author: Alexander Motin; Age: 2023-06-09; Files: 1; Lines: -1/+0

  It was a vdev level read cache, designed to aggregate many small reads by speculatively issuing bigger reads instead and caching the result. But since it has almost no idea about what is going on, with the exception of the ZIO_FLAG_DONT_CACHE flag set by higher layers, it was found to do more harm than good, for which reason it was disabled for the past 12 years. These days we have much better instruments to enlarge the I/Os, such as speculative and prescient prefetches, I/O scheduler, I/O aggregation etc.

  Besides just the dead code removal this removes one extra mutex lock/unlock per write inside vdev_cache_write(), not otherwise disabled and trying to do some work.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Alexander Motin <[email protected]>
  Sponsored by: iXsystems, Inc.
  Closes #14953
* Use __attribute__((malloc)) on memory allocation functions
  Author: Richard Yao; Age: 2023-05-26; Files: 3; Lines: -11/+14

  This informs the C compiler that pointers returned from these functions do not alias other pointers, which allows it to do better code optimization and should make the compiled code smaller.

  References:
  https://stackoverflow.com/a/53654773
  https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-malloc-function-attribute
  https://clang.llvm.org/docs/AttributeReference.html#malloc

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Closes #14827
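As an illustration of how the attribute is applied, the following sketch annotates a small allocation wrapper. `SKETCH_ATTR_MALLOC` and `sketch_zalloc()` are hypothetical names, not the functions touched by the commit; the attribute itself is the documented GCC/Clang `malloc` function attribute.

```c
#include <stdlib.h>
#include <string.h>

/* Expand to the attribute only on compilers known to accept it. */
#if defined(__GNUC__) || defined(__clang__)
#define	SKETCH_ATTR_MALLOC	__attribute__((malloc))
#else
#define	SKETCH_ATTR_MALLOC
#endif

/* The attribute goes on the declaration of the allocating function. */
static void *sketch_zalloc(size_t size) SKETCH_ATTR_MALLOC;

/* Returns fresh zeroed storage that nothing else points to, or NULL. */
static void *
sketch_zalloc(size_t size)
{
	void *p = malloc(size);

	if (p != NULL)
		memset(p, 0, size);
	return (p);
}
```

The attribute is only appropriate for functions whose result is new storage; a realloc-style function, which may hand back storage containing existing pointers, does not qualify.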
* zil: Add some more statistics.
  Author: Alexander Motin; Age: 2023-05-25; Files: 1; Lines: -0/+34

  In addition to the number of actual log bytes written, also account for the total bytes written including padding and the total bytes allocated (bytes <= write <= alloc). This should make it possible to monitor zil traffic and space efficiency.

  Add dtrace probe for zil block size selection.

  Make zilstat report more information and fit it into less width.

  Reviewed-by: Ameer Hamza <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Alexander Motin <[email protected]>
  Sponsored by: iXsystems, Inc.
  Closes #14863
* Wrap clang specific pragma
  Author: Brian Behlendorf; Age: 2023-05-02; Files: 1; Lines: -0/+4

  Clang specific pragmas need to be wrapped to prevent a build warning when compiling with gcc.

  Reviewed-by: Tino Reichardt <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #14814
* powerpc64: Support ELFv2 asm on Big Endian
  Author: Justin Hibbits; Age: 2023-04-27; Files: 1; Lines: -1/+1

  FreeBSD/powerpc64 is all ELFv2 since FreeBSD 13, even big endian. The existing sha256 and sha512 asm code assumes that BE is all ELFv1, and LE is ELFv2. Minor changes to add ELFv2 on the BE side get this working correctly on FreeBSD with the latest OpenZFS import.

  Reviewed-by: Tino Reichardt <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Justin Hibbits <[email protected]>
  Closes #14779
* Add loongarch64 support
  Author: Han Gao; Age: 2023-04-25; Files: 1; Lines: -1/+17

  Add loongarch64 definitions & lua module setjmp asm

  LoongArch is a new RISC ISA, which is a bit like MIPS or RISC-V.

  Reviewed-by: Richard Yao <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Han Gao <[email protected]>
  Signed-off-by: WANG Xuerui <[email protected]>
  Closes #13422
* Linux: Suppress -Wordered-compare-function-pointers in tracepoint code
  Author: Richard Yao; Age: 2023-04-20; Files: 1; Lines: -0/+4

  Clang points out that there is a comparison against -1, but we cannot fix it because that is from the kernel headers, which we must support. We can work around this by using a pragma.

  Sponsored-By: Wasabi Technology, Inc.
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Youzhong Yang <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Closes #14738
* Linux 6.3 compat: idmapped mount API changes
  Author: youzhongyang; Age: 2023-04-10; Files: 9; Lines: -37/+92

  Linux kernel 6.3 changed a bunch of APIs to use the dedicated idmap type for mounts (struct mnt_idmap). We need to detect these changes and make zfs work with the new APIs.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Youzhong Yang <[email protected]>
  Closes #14682
* module: freebsd: fix aarch64 fpu handling
  Author: Kyle Evans; Age: 2023-04-10; Files: 1; Lines: -2/+12

  Just like x86, aarch64 needs to use the fpu_kern(9) API around FPU usage, otherwise we panic promptly at boot as soon as ZFS attempts to do checksum benchmarking.

  Reviewed-by: Richard Yao <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Tino Reichardt <[email protected]>
  Signed-off-by: Kyle Evans <[email protected]>
  Closes #14715
* Miscellaneous FreeBSD compilation bugfixes
  Author: Martin Matuška; Age: 2023-04-06; Files: 4; Lines: -0/+10

  Add missing machine/md_var.h to spl/sys/simd_aarch64.h and spl/sys/simd_arm.h

  In spl/sys/simd_x86.h, PCB_FPUNOSAVE exists only on amd64, use PCB_NPXNOSAVE on i386

  On FreeBSD, sys/elf_common.h redefines AT_UID and AT_GID, so we need a hack in vnode.h similar to Linux. sys/simd.h needs to be included early.

  In zfs_freebsd_copy_file_range() we pass a (size_t *)lenp to zfs_clone_range() that expects a (uint64_t *)

  Allow compiling armv6 world by limiting ARM macros in sha256_impl.c and sha512_impl.c to __ARM_ARCH > 6

  Reviewed-by: Alexander Motin <[email protected]>
  Reviewed-by: Tino Reichardt <[email protected]>
  Reviewed-by: Richard Yao <[email protected]>
  Reviewed-by: Pawel Jakub Dawidek <[email protected]>
  Reviewed-by: Signed-off-by: WHR <[email protected]>
  Signed-off-by: Martin Matuska <[email protected]>
  Closes #14674
* linux 6.3 compat: needs REQ_PREFLUSH | REQ_OP_WRITE
  Author: youzhongyang; Age: 2023-03-31; Files: 1; Lines: -1/+1

  Modify bio_set_flush() so if kernel version is >= 4.10, flags REQ_PREFLUSH and REQ_OP_WRITE are set together.

  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Youzhong Yang <[email protected]>
  Closes #14695
* linux 6.3 compat: add another bdev_io_acct case
  Author: Rich Ercolani; Age: 2023-03-27; Files: 1; Lines: -2/+8

  Linux 6.3+, and backports from it (6.2.8+), changed the signatures on bdev_io_{start,end}_acct. Add a case for it.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Rich Ercolani <[email protected]>
  Closes #14658
  Closes #14668
* spl: cmn_err_once() should be usable in brace-less if else statements
  Author: Attila Fülöp; Age: 2023-03-15; Files: 2; Lines: -12/+12

  Commit 11913870 (#14567) added cmn_err_once() by #define'ing a compound statement but failed to consider usage in a single statement brace-less if else.

  Fix the problem by using the common "do {} while (0)" construct.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Attila Fülöp <[email protected]>
  Closes #14629
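The difference between a bare compound statement and the "do {} while (0)" construct shows up exactly in the brace-less if/else case the commit mentions. The sketch below is a userland model: `sketch_err_once()`, `sketch_log()`, `sketch_check()` and the message counter are hypothetical, and the once-only flag is not made thread-safe here.

```c
#include <stdbool.h>
#include <stdio.h>

static int sketch_msg_count;	/* number of messages actually emitted */

static void
sketch_log(const char *msg)
{
	sketch_msg_count++;
	fprintf(stderr, "%s\n", msg);
}

/*
 * With a bare "{ ... }" body, "sketch_err_once(x); else ..." would not
 * compile: the ';' ends the if statement and orphans the else.
 * Wrapping the body in do { } while (0) makes the expansion a single
 * statement that requires the trailing ';'.
 */
#define	sketch_err_once(msg)				\
do {							\
	static bool printed = false;			\
	if (!printed) {					\
		printed = true;				\
		sketch_log(msg);			\
	}						\
} while (0)

/* Brace-less if/else, the usage the fix is meant to support. */
static void
sketch_check(bool failed)
{
	if (failed)
		sketch_err_once("failure seen");
	else
		sketch_log("ok");
}
```

Each expansion site gets its own static flag, so "once" means once per call site, not once per program.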
* Refactor CONFIG_SPE check on Linux/powerpc
  Author: WHR; Age: 2023-03-15; Files: 1; Lines: -18/+10

  Commit 5401472 adds a check to call enable_kernel_spe and disable_kernel_spe only if CONFIG_SPE is defined. Refactor this check in a way similar to how CONFIG_ALTIVEC and CONFIG_VSX are checked, in order to remove redundant kfpu_begin() and kfpu_end() implementations.

  Reviewed-by: Tino Reichardt <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: WHR <[email protected]>
  Closes #14623
* Fix missing semicolons in commit 1f196e3
  Author: WHR; Age: 2023-03-15; Files: 1; Lines: -4/+4

  Reviewed-by: Tino Reichardt <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: WHR <[email protected]>
  Closes #14623
* Silence clang static analyzer warnings about stored stack addresses
  Author: Richard Yao; Age: 2023-03-14; Files: 1; Lines: -3/+0

  Clang's static analyzer complains that nvs_xdr() and nvs_native() functions return pointers to stack memory. That is technically true, but the pointers are stored in stack memory from the caller's stack frame, are not read by the caller and are deallocated when the caller returns, so this is harmless. We set the pointers to NULL to silence the warnings.

  Reviewed-by: Tino Reichardt <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Closes #14612
* Implementation of block cloning for ZFS
  Author: Pawel Jakub Dawidek; Age: 2023-03-10; Files: 3; Lines: -2/+7

  Block Cloning allows manually cloning a file (or a subset of its blocks) into another (or the same) file by just creating additional references to the data blocks without copying the data itself. Those references are kept in the Block Reference Tables (BRTs).

  The whole design of block cloning is documented in module/zfs/brt.c.

  Reviewed-by: Alexander Motin <[email protected]>
  Reviewed-by: Christian Schwarz <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Rich Ercolani <[email protected]>
  Signed-off-by: Pawel Jakub Dawidek <[email protected]>
  Closes #13392
* Workaround for Linux PowerPC GPL-only cpu_has_feature()
  Author: Low-power; Age: 2023-03-10; Files: 2; Lines: -0/+26

  Since 4.7, Linux makes the 'cpu_has_feature' interface use jump labels on powerpc if CONFIG_JUMP_LABEL_FEATURE_CHECKS is enabled; in this case, however, the inline function references the GPL-only symbol 'cpu_feature_keys'.

  ZFS currently uses 'cpu_has_feature' either directly or indirectly from several places; while it is unknown how this issue didn't break ZFS on 64-bit little-endian powerpc, it is known to break ZFS with many Linux versions on both 32-bit and 64-bit big-endian powerpc. Until this issue is fixed in Linux, we have to work around it by overriding affected inline functions without depending on 'cpu_feature_keys'.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: WHR <[email protected]>
  Closes #14590
* Fix build for Linux/powerpc without CONFIG_ALTIVEC or CONFIG_VSX
  Author: Low-power; Age: 2023-03-07; Files: 1; Lines: -8/+22

  This fixes building ZFS for Linux 4.7+ on powerpc* architectures where Linux was configured without CONFIG_ALTIVEC or CONFIG_VSX.

  Reviewed-by: Tino Reichardt <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: WHR <[email protected]>
  Closes #14591
* spl: Add cmn_err_once() to log a message only on the first call
  Author: Attila Fülöp; Age: 2023-03-07; Files: 2; Lines: -0/+50

  Reviewed-by: Richard Yao <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Brian Atkinson <[email protected]>
  Signed-off-by: Attila Fülöp <[email protected]>
  Closes #14567
* Add SHA2 SIMD feature tests for Linux
  Author: Tino Reichardt; Age: 2023-03-02; Files: 6; Lines: -13/+169

  These are added:
  - zfs_neon_available() for arm and aarch64
  - zfs_sha256_available() for arm and aarch64
  - zfs_sha512_available() for aarch64
  - zfs_shani_available() for x86_64

  Tested-by: Rich Ercolani <[email protected]>
  Tested-by: Sebastian Gottschall <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tino Reichardt <[email protected]>
  Co-Authored-By: Sebastian Gottschall <[email protected]>
  Closes #13741
* Add SHA2 SIMD feature tests for FreeBSD
  Author: Tino Reichardt; Age: 2023-03-02; Files: 7; Lines: -28/+205

  These are added:
  - zfs_neon_available() for arm and aarch64
  - zfs_sha256_available() for arm and aarch64
  - zfs_sha512_available() for aarch64
  - zfs_shani_available() for x86_64

  Changes:
  - simd_powerpc.h: change license from CDDL to BSD

  Tested-by: Rich Ercolani <[email protected]>
  Tested-by: Sebastian Gottschall <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tino Reichardt <[email protected]>
  Closes #13741
* Remove old or redundant SHA2 files
  Author: Tino Reichardt; Age: 2023-03-02; Files: 4; Lines: -347/+0

  We had three sha2.h headers in different places: the FreeBSD version, the Linux version and the generic solaris version.

  The only assembly used for acceleration was some old x86-64 openssl implementation for sha256 within the icp module.

  For FreeBSD, the whole set of FreeBSD SHA2 files had been copied into OpenZFS; these files were removed as well.

  Tested-by: Rich Ercolani <[email protected]>
  Tested-by: Sebastian Gottschall <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tino Reichardt <[email protected]>
  Closes #13741
* Linux: Assert mutex is held in mutex_exit()
  Author: Richard Yao; Age: 2023-02-28; Files: 1; Lines: -0/+1

  A spurious mutex_exit() in a development branch caused weird issues until I identified it. An assertion prior to mutex_exit() would have caught it.

  Rather than adding assertions before invocations of mutex_exit() in the code, let us simply add an assertion to mutex_exit(). It is cheap and will likely improve developer productivity.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Brian Atkinson <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Sponsored-By: Wasabi Technology, Inc.
  Closes #14541
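The idea of putting the check inside the exit path, rather than at every call site, can be sketched with a toy lock. The `sketch_mutex_t` type and functions are hypothetical; the real implementation tracks the owning thread, which this single-threaded model reduces to a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that only remembers whether it is currently held. */
typedef struct {
	bool held;
} sketch_mutex_t;

static void
sketch_mutex_enter(sketch_mutex_t *mp)
{
	assert(!mp->held);	/* no recursive entry in this model */
	mp->held = true;
}

/* The assertion lives here, so every caller gets the check for free. */
static void
sketch_mutex_exit(sketch_mutex_t *mp)
{
	assert(mp->held);	/* a spurious exit trips immediately */
	mp->held = false;
}
```

A stray second `sketch_mutex_exit()` on the same lock would abort at the assertion instead of silently corrupting lock state.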
* Use .section .rodata instead of .rodata on FreeBSD
  Author: Dimitry Andric; Age: 2023-02-24; Files: 1; Lines: -1/+1

  In commit 0a5b942d4 the FreeBSD SECTION_STATIC macro was set to ".rodata". This assembler directive is supported by LLVM (as a convenience alias for ".section .rodata") but not by GNU as.

  This caused the FreeBSD builds that are done with gcc to fail. Therefore, use ".section .rodata" instead, similar to the other asm_linkage.h headers.

  Reviewed-by: Mateusz Guzik <[email protected]>
  Reviewed-by: Attila Fülöp <[email protected]>
  Reviewed-by: Jorgen Lundman <[email protected]>
  Signed-off-by: Dimitry Andric <[email protected]>
  Closes #14526
* Fix buffered/direct/mmap I/O race
  Author: Brian Behlendorf; Age: 2023-02-23; Files: 1; Lines: -1/+1

  When a page is faulted in for memory mapped I/O the page lock may be dropped before it has been read and marked up to date. If a buffered read encounters such a page in mappedread() it must wait until the page has been updated. Failure to do so will result in a panic on debug builds and incorrect data on production builds.

  The critical part of this change is in mappedread() where pages which are not up to date are now handled. Additionally, it includes the following simplifications.

  - zfs_getpage() and zfs_fillpage() could be passed an array of pages. This could be more efficient if it was used but in practice only a single page was ever provided. These interfaces were simplified to acknowledge that.
  - update_pages() was modified to correctly set the PG_error bit on a page when it cannot be read by dmu_read().
  - Setting PG_error and PG_uptodate was moved to zfs_fillpage() from zpl_readpage_common(). This is consistent with the handling in update_pages() and mappedread().
  - Minor additional refactoring to comments and variable declarations to improve readability.
  - Add a test case to exercise concurrent buffered, direct, and mmap IO to the same file.
  - Reduce the mmap_sync test case default run time.

  Reviewed-by: Richard Yao <[email protected]>
  Reviewed-by: Brian Atkinson <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #13608
  Closes #14498
* Linux: use filemap_range_has_page()
  Author: Brian Behlendorf; Age: 2023-02-14; Files: 3; Lines: -8/+20

  As of the 4.13 kernel filemap_range_has_page() can be used to check if there is a page mapped in a given file range. When available this interface should be used which eliminates the need for the zp->z_is_mapped boolean.

  Reviewed-by: Brian Atkinson <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #14493