aboutsummaryrefslogtreecommitdiffstats
path: root/module/os/linux/zfs
Commit message (Collapse)AuthorAgeFilesLines
* Cleanup: Remove ineffective unsigned comparisons against 0Richard Yao2022-09-262-2/+2
| | | | | | | | | | | | | Coverity found a number of places where we either do MAX(unsigned, 0) or do assertions that a unsigned variable is >= 0. These do nothing, so let us drop them all. It also found a spot where we do `if (unsigned >= 0 && ...)`. Let us also drop the unsigned >= 0 check. Reviewed-by: Neal Gompa <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13871
* Linux: Fix uninitialized variable usage in zio_do_crypt_data()Richard Yao2022-09-261-3/+3
| | | | | | | | | | | | | | | | Coverity complained about this. An error from `hkdf_sha512()` before uio initialization will cause pointers to uninitialized memory to be passed to `zio_crypt_destroy_uio()`. This is a regression that was introduced by cf63739191b6cac629d053930a4aea592bca3819. Interestingly, this never affected FreeBSD, since the FreeBSD version never had that patch ported. Since moving uio initialization to the top of this function would slow down the qat_crypt() path, we only move the `memset()` calls to the top of the function. This is sufficient to fix this problem. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Neal Gompa <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13944
* Linux: Fix use-after-free in zfsvfs_create()Richard Yao2022-09-191-3/+2
| | | | | | | | | | | | | | Coverity reported that we pass a pointer to zfsvfs to `dmu_objset_disown()` after freeing zfsvfs in zfsvfs_create_impl() after a failure in zfsvfs_init(). We have nearly identical duplicate versions of this code for FreeBSD and Linux, but interestingly, the FreeBSD version of this code differs in such a way that it does not suffer from this bug. We remove the difference from the FreeBSD version to fix this bug. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13883
* zfs_enter rework followupBrian Behlendorf2022-09-161-3/+3
| | | | | | | | The zpl_fadvise() function was recently added and was not included in the initial patch. Update it accordingly. Signed-off-by: Brian Behlendorf <[email protected]> Closes #13831
* zfs_enter reworkChunwei Chen2022-09-168-208/+252
| | | | | | | | | | | | | | | Replace ZFS_ENTER and ZFS_VERIFY_ZP, which have hidden returns, with functions that return error code. The reason we want to do this is because hidden returns are not obvious and had caused some missing fail path unwinding. This patch changes the common, linux, and freebsd parts. Also fixes fail path unwinding in zfs_fsync, zpl_fsync, zpl_xattr_{list,get,set}, and zfs_lookup(). Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #13831
* Cleanup dead spa_boot codeRichard Yao2022-09-131-1/+0
| | | | | | | | | | Unused code detected by coverity. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Neal Gompa <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13868
* Add Linux posix_fadvise supportFinix19792022-09-081-0/+62
| | | | | | | | | | | | | | | | | The purpose of this PR is to accepts fadvise ioctl from userland to do read-ahead by demand. It could dramatically improve sequential read performance especially when primarycache is set to metadata or zfs_prefetch_disable is 1. If the file is mmaped, generic_fadvise is also called for page cache read-ahead besides dmu_prefetch. Only POSIX_FADV_WILLNEED and POSIX_FADV_SEQUENTIAL are supported in this PR currently. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Finix Yan <[email protected]> Closes #13694
* Linux 5.20 compat: blk_cleanup_disk()Brian Behlendorf2022-08-041-0/+4
| | | | | | | | | As of the Linux 5.20 kernel blk_cleanup_disk() has been removed, all callers should use put_disk(). Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13728
* Linux 5.20 compat: bdevname()Brian Behlendorf2022-08-041-1/+11
| | | | | | | | | As of the Linux 5.20 kernel bdevname() has been removed, all callers should use snprintf() and the "%pg" format specifier. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13728
* Add support for per dataset zil stats and use wmsum countersixhamza2022-07-202-7/+15
| | | | | | | | | | | | | | | | ZIL kstats are reported in an inclusive way, i.e., same counters are shared to capture all the activities happening in zil. Added support to report zil stats for every datset individually by combining them with already exposed dataset kstats. Wmsum uses per cpu counters and provide less overhead as compared to atomic operations. Updated zil kstats to replace wmsum counters to avoid atomic operations. Reviewed-by: Christian Schwarz <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #13636
* Expose ZFS dataset case sensitivity setting via sb_optsixhamza2022-07-141-0/+12
| | | | | | | | Makes the case sensitivity setting visible on Linux in /proc/mounts. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #13607
* Replace dead opensolaris.org license linkTino Reichardt2022-07-1129-29/+29
| | | | | | | | | The commit replaces all findings of the link: http://www.opensolaris.org/os/licensing with this one: https://opensource.org/licenses/CDDL-1.0 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #13619
* Linux: Align MODULE_LICENSE macro textBrian Behlendorf2022-07-111-3/+2
| | | | | | | | | Specify the lua and zstd license text in the manor in which the kernel MODULE_LICENSE macro requires it. The now duplicate entries were merged and a comment added to make it clear what they apply to. Reviewed-by: Christian Schwarz <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13641
* Remaining {=> const} char|void *tagнаб2022-06-291-2/+2
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13348
* Fix znode group permission different from acl maskgaoyanping2022-06-291-3/+3
| | | | | | | | | Zp->z_mode is set at the same time inode->i_mode is being changed. This has the effect of keeping both in sync without relying on zfs_znode_update_vfs. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: yanping.gao <[email protected]> Closes #13581
* expose snapshot count via stat(2) of .zfs/snapshot (#13559)Andrew2022-06-171-0/+17
| | | | | | | | | | | Increase nlinks in stat results of ./zfs/snapshot based on snapshot count. This provides quick and efficient method for administrators to get snapshot counts without having to use libzfs or list the snapdir contents. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Andrew Walker <[email protected]> Closes #13559
* Fix clang 13 compilation errorsDamian Szuberski2022-06-151-1/+4
| | | | | | | | | | | | | | | ``` os/linux/zfs/zvol_os.c:1111:3: error: ignoring return value of function declared with 'warn_unused_result' attribute [-Werror,-Wunused-result] add_disk(zv->zv_zso->zvo_disk); ^~~~~~~~ ~~~~~~~~~~~~~~~~~~~~ zpl_xattr.c:1579:1: warning: no previous prototype for function 'zpl_posix_acl_release_impl' [-Wmissing-prototypes] ``` Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: szubersk <[email protected]> Closes #13551
* Add Linux namespace delegation supportWill Andrews2022-06-104-1/+69
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows ZFS datasets to be delegated to a user/mount namespace Within that namespace, only the delegated datasets are visible Works very similarly to Zones/Jailes on other ZFS OSes As a user: ``` $ unshare -Um $ zfs list no datasets available $ echo $$ 1234 ``` As root: ``` # zfs list NAME ZONED MOUNTPOINT containers off /containers containers/host off /containers/host containers/host/child off /containers/host/child containers/host/child/gchild off /containers/host/child/gchild containers/unpriv on /unpriv containers/unpriv/child on /unpriv/child containers/unpriv/child/gchild on /unpriv/child/gchild # zfs zone /proc/1234/ns/user containers/unpriv ``` Back to the user namespace: ``` $ zfs list NAME USED AVAIL REFER MOUNTPOINT containers 129M 47.8G 24K /containers containers/unpriv 128M 47.8G 24K /unpriv containers/unpriv/child 128M 47.8G 128M /unpriv/child ``` Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Will Andrews <[email protected]> Signed-off-by: Allan Jude <[email protected]> Signed-off-by: Mateusz Piotrowski <[email protected]> Co-authored-by: Allan Jude <[email protected]> Co-authored-by: Mateusz Piotrowski <[email protected]> Sponsored-by: Buddy <https://buddy.works> Closes #12263
* zvol: Support blk-mq for better performanceTony Hutter2022-06-092-126/+660
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support for the kernel's block multiqueue (blk-mq) interface in the zvol block driver. blk-mq creates multiple request queues on different CPUs rather than having a single request queue. This can improve zvol performance with multithreaded reads/writes. This implementation uses the blk-mq interfaces on 4.13 or newer kernels. Building against older kernels will fall back to the older BIO interfaces. Note that you must set the `zvol_use_blk_mq` module param to enable the blk-mq API. It is disabled by default. In addition, this commit lets the zvol blk-mq layer process whole `struct request` IOs at a time, rather than breaking them down into their individual BIOs. This reduces dbuf lock contention and overhead versus the legacy zvol submit_bio() codepath. sequential dd to one zvol, 8k volblocksize, no O_DIRECT: legacy submit_bio() 292MB/s write 453MB/s read this commit 453MB/s write 885MB/s read It also introduces a new `zvol_blk_mq_chunks_per_thread` module parameter. This parameter represents how many volblocksize'd chunks to process per each zvol thread. It can be used to tune your zvols for better read vs write performance (higher values favor write, lower favor read). Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ahelenia Ziemiańska <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #13148 Issue #12483
* Linux 5.19 compat: aops->read_folio()Brian Behlendorf2022-05-311-0/+12
| | | | | | | | | As of the Linux 5.19 kernel the readpage() address space operation has been replaced by read_folio(). Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13515
* Linux 5.19 compat: blkdev_issue_secure_erase()Brian Behlendorf2022-05-311-9/+28
| | | | | | | | | | | Linux 5.19 commit torvalds/linux@44abff2c0 splits the secure erase functionality from the blkdev_issue_discard() function. The blkdev_issue_secure_erase() must now be issued to issue a secure erase. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13515
* Linux 5.19 compat: bdev_max_secure_erase_sectors()Brian Behlendorf2022-05-311-4/+2
| | | | | | | | | | | Linux 5.19 commit torvalds/linux@44abff2c0 removed the blk_queue_secure_erase() helper function. The preferred interface is to now use the bdev_max_secure_erase_sectors() function to check for discard support. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13515
* Linux 5.19 compat: bdev_max_discard_sectors()Brian Behlendorf2022-05-312-1/+3
| | | | | | | | | | | Linux 5.19 commit torvalds/linux@70200574cc removed the blk_queue_discard() helper function. The preferred interface is to now use the bdev_max_discard_sectors() function to check for discard support. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13515
* Linux 5.18 compat: bio_alloc()Brian Behlendorf2022-05-311-14/+39
| | | | | | | | | | As for the Linux 5.18 kernel bio_alloc() expects a block_device struct as an argument. This removes the need for the bio_set_dev() compatibility code for 5.18 and newer kernels. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13515
* autoconf: Fail when __copy_from_user_inatomic is a non-GPL symbolszubersk2022-05-111-4/+0
| | | | | | | | | A followup to 849c14e04844a2f0e1f7e42886c2cef083563f35 Fix https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1009242 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: szubersk <[email protected]> Closes #13389
* abd_os: remove redundant refcount creation for abd_childrenhping2022-05-091-1/+0
| | | | | | | | | | | | | Refcount creation for abd_zero_scatter->abd_children is redundant in abd_alloc_zero_scatter, as it has been done in abd_init_struct. In addition, abd_children is undefined when ZFS_DEBUG is disabled, the reference of abd_children in abd_alloc_zero_scatter breaks build of libzpool when ZFS_DEBUG is disabled. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Brian Atkinson <[email protected]> Signed-off-by: Ping Huang <[email protected]> Closes #13429
* Speed up WB_SYNC_NONE when a WB_SYNC_ALL occurs simultaneouslyShaan Nobee2022-05-034-12/+105
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Page writebacks with WB_SYNC_NONE can take several seconds to complete since they wait for the transaction group to close before being committed. This is usually not a problem since the caller does not need to wait. However, if we're simultaneously doing a writeback with WB_SYNC_ALL (e.g via msync), the latter can block for several seconds (up to zfs_txg_timeout) due to the active WB_SYNC_NONE writeback since it needs to wait for the transaction to complete and the PG_writeback bit to be cleared. This commit deals with 2 cases: - No page writeback is active. A WB_SYNC_ALL page writeback starts and even completes. But when it's about to check if the PG_writeback bit has been cleared, another writeback with WB_SYNC_NONE starts. The sync page writeback ends up waiting for the non-sync page writeback to complete. - A page writeback with WB_SYNC_NONE is already active when a WB_SYNC_ALL writeback starts. The WB_SYNC_ALL writeback ends up waiting for the WB_SYNC_NONE writeback. The fix works by carefully keeping track of active sync/non-sync writebacks and committing when beneficial. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Shaan Nobee <[email protected]> Closes #12662 Closes #12790
* Fix O_APPEND for Linux 3.15 and older kernelsBrian Behlendorf2022-04-271-0/+2
| | | | | | | | | | | | | | When using a Linux kernel which predates the iov_iter interface the O_APPEND flag should be applied in zpl_aio_write() via the call to generic_write_checks(). The updated pos variable was incorrectly ignored resulting in the current offset being used. This issue should only realistically impact the RHEL/CentOS 7.x kernels which are based on Linux 3.10. Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13370 Closes #13377
* Linux 5.18 compat: replace __set_page_dirty_nobuffersSatadru Pramanik2022-04-272-1/+12
| | | | | | | | | | | | | | Replace __set_page_dirty_nobuffers with filemap_dirty_folio. Upstream-commit: 6b1f86f8e9c7f9de7ca1cb987b2cf25e99b1ae3a ("Merge tag 'folio-5.18b' of git://git.infradead.org/users/willy/pagecache ") Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Authored-by: Satadru Pramanik <[email protected]> Signed-off-by: Satadru Pramanik <[email protected]> Closes #13325 Closes #13380
* PPC get_user workaroundDamian Szuberski2022-04-261-2/+6
| | | | | | | | | | | | | | | | Linux 5.12 PPC 5.12 get_user() and __copy_from_user_inatomic() inline helpers very indirectly include a reference to the GPL'd array mmu_feature_keys[] and fails to build. Workaround this by using copy_from_user() and throwing EFAULT for any calls to __copy_from_user_inatomic(). This is a workaround until a fix for Linux commit 7613f5a66becfd0e43a0f34de8518695888f5458 "powerpc/64s/kuap: Use mmu_has_feature()" is fully addressed. Reviewed-by: Brian Behlendorf <[email protected]> Authored-by: Colin Ian King <[email protected]> Signed-off-by: szubersk <[email protected]> Closes #11958 Closes #12590 Closes #13367
* Linux 5.18 compat: kobj_type.default_attrs replaced with default_groupsнаб2022-04-221-17/+22
| | | | | | | | Upstream-commit: cdb4f26a63c391317e335e6e683a614358e70aeb ("kobject: kobj_type: remove default_attrs") Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13357
* linux: module: zfs: sysfs: constify types and attrsнаб2022-04-221-5/+5
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13357
* Corrected oversight in ZERO_RANGE behaviorRich Ercolani2022-04-201-4/+6
| | | | | | | | | | | | | | It turns out, no, in fact, ZERO_RANGE and PUNCH_HOLE do have differing semantics in some ways - in particular, one requires KEEP_SIZE, and the other does not. Also added a zero-range test to catch this, corrected a flaw that made the punch-hole test succeed vacuously, and a typo in file_write. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #13329 Closes #13338
* linux: module: weld all but spl.ko into zfs.koнаб2022-04-202-46/+57
| | | | | | | | | | | | | | Originally it was thought it would be useful to split up the kmods by functionality. This would allow external consumers to only load what was needed. However, in practice we've never had a case where this functionality would be needed, and conversely managing multiple kmods can be awkward. Therefore, this change merges all but the spl.ko kmod in to a single zfs.ko kmod. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13274
* Linux 5.17 compat: GENHD_FL_EXT_DEVT / GENHD_FL_NO_PART_SCANBrian Behlendorf2022-04-191-13/+8
| | | | | | | | | | | | | | As of the 5.17 kernel the GENHD_FL_EXT_DEVT flag has been removed and the GENHD_FL_NO_PART_SCAN flag renamed GENHD_FL_NO_PART. Update zvol_alloc() to set GENHD_FL_NO_PART for the newer kernels which is sufficient. The behavior for prior kernels remains unchanged. 1ebe2e5f ("block: remove GENHD_FL_EXT_DEVT") 46e7eac6 ("block: rename GENHD_FL_NO_PART_SCAN to GENHD_FL_NO_PART") Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #13294 Closes #13297
* Linux 5.18 compat: use address_space_operations->readaheadRiccardo Schirone2022-04-041-0/+21
| | | | | | | | | ->readpages was removed and replaced by ->readahead. Define zpl_readahead for kernels that don't have ->readpages. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Riccardo Schirone <[email protected]> Closes #13278
* Linux 5.18 compat: blkg_tryget is moved to private headersRiccardo Schirone2022-04-041-2/+5
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Riccardo Schirone <[email protected]> Closes #13278
* Linux optimize access checks when ACL is trivialAndrew2022-04-012-1/+71
| | | | | | | | | | | | | | | | | | | | | Bypass check of ZFS aces if the ACL is trivial. When an ACL is trivial its permissions are represented by the mode without any loss of information. In this case, it is safe to convert the access request into equivalent mode and then pass desired mask and inode to generic_permission(). This has the added benefit of also checking whether entries in a POSIX ACL on the file grant the desired access. This commit also skips the ACL check on looking up the xattr dir since such restrictions don't exist in Linux kernel and it makes xattr lookup behavior inconsistent between SA and file-based xattrs. We also don't want to perform a POSIX ACL check while looking up the POSIX ACL if for some reason it is located in the xattr dir rather than an SA. Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Ryan Moeller <[email protected]> Signed-off-by: Andrew Walker <[email protected]> Closes #13237
* zfs_ctldir: fix incorrect argument type of rw_destroyhpingfs2022-03-301-1/+1
| | | | | | | | | | The argument type of rw_destroy is (krwlock_t *) while currently krwlock_t is passed in zfs_ctldir.c. This error is hidden because rw_destroy is defined as ((void) 0) in linux. But anyway, this mismatch should be fixed. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ping Huang <[email protected]> Closes #13272
* zvol_os: suppress compiler warning for zvol_open_timeout_mshpingfs2022-03-301-0/+3
| | | | | | | | When HAVE_BLKDEV_GET_ERESTARTSYS is defined, compiler will complain "defined but not used" warning for zvol_open_timeout_ms. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ping Huang <[email protected]> Closes #13270
* Linux 5.18 compat: replace genhd.h with blkdev.h includesнаб2022-03-281-1/+1
| | | | | | | | | | | | | blkdev.h includes genhd.h since dawn of upstream git, so this is globally safe Upstream-commit: 322cbb50de711814c42fb088f6d31901502c711a ("block: remove genhd.h") Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13251
* Linux 5.18 compat: 4-argument bio_alloc()наб2022-03-281-0/+4
| | | | | | | | | | | | | | | | | | bio_alloc(gfp_t gfp_mask, unsigned short nr_iovecs) became bio_alloc(struct block_device *bdev, unsigned short nr_vecs, unsigned int opf, gfp_t gfp_mask) passing NULL/0 continues previous behaviour Upstream-commit: 07888c665b405b1cd3577ddebfeb74f4717a84c4 ("block: pass a block_device and opf to bio_alloc") Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13251
* Fix ACL checks for NFS kernel serverRyan Moeller2022-03-183-9/+9
| | | | | | | | | | | | This PR changes ZFS ACL checks to evaluate fsuid / fsgid rather than euid / egid to avoid accidentally granting elevated permissions to NFS clients. Reviewed-by: Serapheim Dimitropoulos <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Andrew Walker <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #13221
* Forbid b{copy,zero,cmp}(). Don't include <strings.h> for <string.h>наб2022-03-151-1/+1
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12996
* Remove bcopy(), bzero(), bcmp()наб2022-03-157-93/+95
| | | | | | | | | | bcopy() has a confusing argument order and is actually a move, not a copy; they're all deprecated since POSIX.1-2001 and removed in -2008, and we shim them out to mem*() on Linux anyway Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #12996
* Adding ZERO_PAGE detectionBrian Atkinson2022-03-141-3/+12
| | | | | | | | | | | On some architectures ZERO_PAGE is unavailable because it references a GPL exported symbol of empty_zero_page. Originally e08b993 removed the call to PAGE_ZERO(0) for assignment to the abd_zero_page. However, a simple check can be done to avoid a kernel allocation and free for the abd_zero_page if ZERO_PAGE is available. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Brian Atkinson <[email protected]> Closes #13199
* Expose additional file level attributesUmer Saleem2022-03-071-15/+110
| | | | | | | | | | | | | | | | | | | | | ZFS allows to update and retrieve additional file level attributes for FreeBSD. This commit allows additional file level attributes to be updated and retrieved for Linux. These include the flags stored in the upper half of z_pflags only. Two new IOCTLs have been added for this purpose. ZFS_IOC_GETDOSFLAGS can be used to retrieve the attributes, while ZFS_IOC_SETDOSFLAGS can be used to update the attributes. Attributes that are allowed to be updated include ZFS_IMMUTABLE, ZFS_APPENDONLY, ZFS_NOUNLINK, ZFS_ARCHIVE, ZFS_NODUMP, ZFS_SYSTEM, ZFS_HIDDEN, ZFS_READONLY, ZFS_REPARSE, ZFS_OFFLINE and ZFS_SPARSE. Flags can be or'd together while calling ZFS_IOC_SETDOSFLAGS. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13118
* log xattr=sa create/remove/update to ZILJitendra Patidar2022-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As such, there are no specific synchronous semantics defined for the xattrs. But for xattr=on, it does log to ZIL and zil_commit() is done, if sync=always is set on dataset. This provides sync semantics for xattr=on with sync=always set on dataset. For the xattr=sa implementation, it doesn't log to ZIL, so, even with sync=always, xattrs are not guaranteed to be synced before xattr call returns to caller. So, xattr can be lost if system crash happens, before txg carrying xattr transaction is synced. This change adds xattr=sa logging to ZIL on xattr create/remove/update and xattrs are synced to ZIL (zil_commit() done) for sync=always. This makes xattr=sa behavior similar to xattr=on. Implementation notes: The actual logging is fairly straight-forward and does not warrant additional explanation. However, it has been 14 years since we last added new TX types to the ZIL [1], hence this is the first time we do it after the introduction of zpool features. Therefore, here is an overview of the feature activation and deactivation workflow: 1. The feature must be enabled. Otherwise, we don't log the new record type. This ensures compatibility with older software. 2. The feature is activated per-dataset, since the ZIL is per-dataset. 3. If the feature is enabled and dataset is not for zvol, any append to the ZIL chain will activate the feature for the dataset. Likewise for starting a new ZIL chain. 4. A dataset that doesn't have a ZIL chain has the feature deactivated. We ensure (3) by activating on the first zil_commit() after the feature was enabled. Since activating the features requires waiting for txg sync, the first zil_commit() after enabling the feature will be slower than usual. The downside is that this is really a conservative approximation: even if we never append a 'TX_SETSAXATTR' to the ZIL chain, we pay the penalty for feature activation. The upside is that the user is in control of when we pay the penalty, i.e., upon enabling the feature. We ensure (4) by hooking into zil_sync(), where ZIL destroy actually happens. One more piece on feature activation, since it's spread across multiple functions: zil_commit() zil_process_commit_list() if lwb == NULL // first zil_commit since zil_open zil_create() if no log block pointer in ZIL header: if feature enabled and not active: // CASE 1 enable, COALESCE txg wait with dmu_tx that allocated the log block else // log block was allocated earlier than this zil_open if feature enabled and not active: // CASE 2 enable, EXPLICIT txg wait else // already have an in-DRAM LWB if feature enabled and not active: // this happens when we enable the feature after zil_create // CASE 3 enable, EXPLICIT txg wait [1] https://github.com/illumos/illumos-gate/commit/da6c28aaf62fa55f0fdb8004aa40f88f23bf53f0 Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Christian Schwarz <[email protected]> Reviewed-by: Ahelenia Ziemiańska <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Jitendra Patidar <[email protected]> Closes #8768 Closes #9078
* module: mark arguments usedнаб2022-02-185-40/+40
| | | | | | | Reviewed-by: Alejandro Colomar <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13110
* linux: module/zfs: vnops: make null_xattr staticнаб2022-02-181-1/+1
| | | | | | | Reviewed-by: Alejandro Colomar <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #13110