aboutsummaryrefslogtreecommitdiffstats
path: root/module
Commit message (Collapse)AuthorAgeFilesLines
* Add note for printing all dbgmsg entries on FreeBSDRich Ercolani2021-05-251-1/+4
| | | | | | | | | | | | | | | I looked for a bit, and couldn't find any documentation on how to print all logged dbgmsg entries, just messages since the DTrace probe started, until @allanjude kindly pointed me toward the sysctl. So let's add that note where the DTrace probe is mentioned for FreeBSD, so other people can find it. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #12113
* Propagate vdev state due to invalid label corruptionvermavipinkumar2021-05-251-1/+2
| | | | | | | | | | Propagate vdev child state to parents on invalid label Add VDEV_AUX_BAD_LABEL to print_import_config() Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Co-authored-by: Srikanth N S <[email protected]> Signed-off-by: Vipin Kumar Verma <[email protected]> Closes #12088
* FreeBSD: Retry OCF ENOMEM errors.Alexander Motin2021-05-241-3/+5
| | | | | | | | | | | | | | | ZFS does not expect transient errors from crypto. For read they are counted as checksum errors, while for write end up in panic. To not panic on random low memory conditions retry ENOMEM errors in the OCF wrapper function. While there remove unneeded timeout and priority from msleep(). External-issue: https://reviews.freebsd.org/D30339 Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #12077
* Update tmpfile() existence detectionRich Ercolani2021-05-201-0/+5
| | | | | | | | | | | | Linux changed the tmpfile() signature again in torvalds/linux@6521f89, which in turn broke our HAVE_TMPFILE detection in configure. Update that macro to include the new case, and change the signature of zpl_tmpfile as appropriate. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes: #12060 Closes: #12087
* Fix dRAID sequential resilver silent damage handlingBrian Behlendorf2021-05-202-6/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change addresses two distinct scenarios which are possible when performing a sequential resilver to a dRAID pool with vdevs that contain silent unknown damage. Which in this circumstance took the form of the devices being intentionally overwritten with zeros. However, it could also result from a device returning incorrect data while a sequential resilver was in progress. Scenario 1) A sequential resilver is performed while all of the dRAID vdevs are ONLINE and there is silent damage present on the vdev being resilvered. In this case, nothing will be repaired by vdev_raidz_io_done_reconstruct_known_missing() because rc->rc_error isn't set on any of the raid columns. To address this vdev_draid_io_start_read() has been updated to always mark the resilvering column as ESTALE for sequential resilver IO. Scenario 2) Multiple columns contain silent damage for the same block and a sequential resilver is performed. In this case it's impossible to generate the correct data from parity unless all of the damaged columns are being sequentially resilvered (and thus only good data is used to generate parity). This is as expected and there's nothing which can be done about it. However, we need to be careful not to make to situation worse. Since we can't verify the data is actually good without a checksum, we must only repair the devices which are being sequentially resilvered. Otherwise, an incorrect repair to a device which previously contained good data could effectively lock in the damage and make reconstruction impossible. A check for this was added to vdev_raidz_io_done_verified() along with a new test case. Lastly, this change updates the redundancy_draid_spare1 and redundancy_draid_spare3 test cases to be more representative of normal dRAID replacement operation. Specifically, what we care about is that the scrub run after a sequential resilver does not find additional blocks which need repair. This would indicate the sequential resilver failed to rebuild a section of one of the devices. Note also the tests were switched to using the verify_pool() function which still checks for checksum errors. Reviewed-by: Mark Maybee <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12061
* Simple change to fix building in recent environmentsRich Ercolani2021-05-191-4/+4
| | | | | | | | | | | Renamed _fini too for symmetry. Suggested-by: @ensch Reviewed-by: Tony Nguyen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Rich Ercolani <[email protected]> Closes #12059 Closes: #11987 Closes: #12056
* Scale worker threads and taskqs with number of CPUsAlexander Motin2021-05-141-22/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While use of dynamic taskqs allows to reduce number of idle threads, hardcoded 8 taskqs of each kind is a big overkill for small systems, complicating CPU scheduling, increasing I/O reorder, etc, while providing no real locking benefits, just not needed there. On another side, 12*8 worker threads per kind are able to overload almost any system nowadays. For example, pool of several fast SSDs with SHA256 checksum makes system barely responsive during scrub, or with dedup enabled barely responsive during large file deletion. To address both problems this patch introduces ZTI_SCALE macro, alike to ZTI_BATCH, but with multiple taskqs, depending on number of CPUs, to be used in places where lock scalability is needed, while request ordering is not so much. The code is made to create new taskq for ~6 worker threads (less for small systems, but more for very large) up to 80% of CPU cores (previous 75% was not good for rounding down). Both number of threads and threads per taskq are now tunable in case somebody really wants to use all of system power for ZFS. While obviously some benchmarks show small peak performance reduction (not so big really, especially on systems with SMT, where use of the second threads does not give as much performance as the first ones), they also show dramatic latency reduction and much more smooth user- space operation in case of high CPU usage by ZFS. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #11966
* Fix dmu_recv_stream test for resumablePaul Zuchowski2021-05-131-2/+2
| | | | | | | | | | | Use dsl_dataset_has_resume_receive_state() not dsl_dataset_is_zapified() to check if stream is resumable. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Alek Pinchuk <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Paul Zuchowski <[email protected]> Closes #12034
* FreeBSD: Implement xattr=saRyan Moeller2021-05-133-147/+395
| | | | | | | | | | | | | | | | | | | | FreeBSD historically has not cared about the xattr property; it was always treated as xattr=on. With xattr=on, xattrs are stored as files in a hidden xattr directory. With xattr=sa, xattrs are stored as system attributes and get cached in nvlists during xattr operations. This makes SA xattrs simpler and more efficient to manipulate. FreeBSD needs to implement the SA xattr operations for feature parity with Linux and to ensure that SA xattrs are accessible when migrated or replicated from Linux. Following the example set by Linux, refactor our existing extattr vnops to split off the parts handling dir style xattrs, and add the corresponding SA handling parts. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11997
* FreeBSD: Use SET_ERROR to trace xattr name errorsRyan Moeller2021-05-131-4/+4
| | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11997
* Revert "Fix raw sends on encrypted datasets when copying back snapshots"Brian Behlendorf2021-05-133-31/+10
| | | | | | | | | | | | | | | Commit d1d4769 takes into account the encryption key version to decide if the local_mac could be zeroed out. However, this could lead to failure mounting encrypted datasets created with intermediate versions of ZFS encryption available in master between major releases. In order to prevent this situation revert d1d4769 pending a more comprehensive fix which addresses the mount failure case. Reviewed-by: George Amanakis <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #11294 Issue #12025 Issue #12300 Closes #12033
* libzfs: add keylocation=https://, backed by fetch(3) or libcurlнаб2021-05-121-1/+5
| | | | | | | | | | | Add support for http and https to the keylocation properly to allow encryption keys to be fetched from the specified URL. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Issue #9543 Closes #9947 Closes #11956
* linux 5.13 compat: bdevops->revalidate_disk() removedColeman Kane2021-05-111-0/+2
| | | | | | | | | | | | | | Linux kernel commit 0f00b82e5413571ed225ddbccad6882d7ea60bc7 removes the revalidate_disk() handler from struct block_device_operations. This caused a regression, and this commit eliminates the call to it and the assignment in the block_device_operations static handler assignment code, when configure identifies that the kernel doesn't support that API handler. Reviewed-by: Colin Ian King <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Coleman Kane <[email protected]> Closes #11967 Closes #11977
* Remove unimplemented virus scanning hooksRyan Moeller2021-05-104-57/+0
| | | | | | | Reviewed-by: Adam Moss <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11972
* module/zfs: remove zfs_zevent_console and zfs_zevent_colsнаб2021-05-101-313/+0
| | | | | | | | | | | | | | | zfs_zevent_console committed multiple printk()s per line without properly continuing them ‒ a single event could easily be fragmented across over thirty lines, making it useless for direct application zfs_zevent_cols exists purely to wrap the output from zfs_zevent_console The niche this was supposed to fill can be better served by something akin to the all-syslog ZEDLET Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #7082 Closes #11996
* Fix dRAID self-healing short columnsBrian Behlendorf2021-05-081-1/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When dRAID performs a normal read operation only the data columns in the raid map are read from disk. This is enough information to calculate the checksum, verify it, and return the needed data to the application. It's only in the event of a checksum failure that the additional parity and any empty columns must be read since they are required for parity reconstruction. Reading these additional columns is handled by vdev_raidz_read_all() which calls vdev_draid_map_alloc_empty() to expand the raid_map_t and submit IOs for the missing columns. This all works correctly, but it fails to account for any "short" columns. These are data columns which are padded with a empty skip sector at the end. Since that empty sector is not needed for a normal read it's not read when columns is first read from disk. However, like the parity and empty columns the skip sector is needed to perform reconstruction. The fix is to mark any "short" columns as never being read by clearing the rc_tried flag when expanding the raid_map_t. This will cause the entire column to re-read from disk in the event of a checksum failure allowing the self-healing functionality to repair the block. Note that this only effects the self-healing feature because when scrubbing a pool the parity, data, and empty columns are all read initially to verify their contents. Furthermore, only blocks which contain "short" columns would be effected, and only when the memory backing the skip sector wasn't already zeroed out. This change extends the existing redundancy_raidz.ksh test case to verify self-healing (as well as resilver and scrub). Then applies the same test case to dRAID with a slightly modified version of the test script called redundancy_draid.ksh. The unused variable combrec was also removed from both test cases. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Mark Maybee <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #12010
* Replace ZoL with OpenZFS where applicableнаб2021-05-071-1/+1
| | | | | | | | | | | | | | | | Afterward, git grep ZoL matches: * README.md: * [ZoL Site](https://zfsonlinux.org) - Correct * etc/default/zfs.in:# ZoL userland configuration. - Changing this would induce a needless upgrade-check, if the user has modified the configuration; this can be updated the next time the defaults change * module/zfs/dmu_send.c: * ZoL < 0.7 does not handle [...] - Before 0.7 is ZoL, so fair enough Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Issue #11956
* FreeBSD: Remove !FreeBSD ifdef'd codeRyan Moeller2021-05-071-35/+1
| | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11994
* Clean up use of zfs_log_create in zfs_dirRyan Moeller2021-05-072-4/+4
| | | | | | | | | | zfs_log_create returns void, so there is no reason to cast its return value to void at the call site. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11994
* Return required size when encode_fh size too smallAlyssa Ross2021-05-072-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | Quoting <linux/exportfs.h>: > encode_fh() should return the fileid_type on success and on error > returns 255 (if the space needed to encode fh is greater than > @max_len*4 bytes). On error @max_len contains the minimum size (in 4 > byte unit) needed to encode the file handle. ZFS was not setting max_len in the case where the handle was too small. As a result of this, the `t_name_to_handle_at.c' example in name_to_handle_at(2) did not work on ZFS. zfsctl_fid() will itself set max_len if called with a fid that is too small, so if we give zfs_fid() that behavior as well, the fix is quite easy: if the handle is too small, just use a zero-size fid instead of the handle. Tested by running t_name_to_handle_at on a normal file, a directory, a .zfs directory, and a snapshot. Thanks-to: Puck Meerburg <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Nguyen <[email protected]> Signed-off-by: Alyssa Ross <[email protected]> Closes #11995
* Simplify/fix dnode_move() for dn_zfetchAlexander Motin2021-05-071-7/+1
| | | | | | | | | | | | | Previous code tried to keep prefetch streams while moving dnode. But it was at least not updating per-stream zs_fetchback pointers, causing use-after-free on next access. Instead of that I see much easier and cleaner to just drop old prefetch state and start new from scratch. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Sponsored-By: iXsystems, Inc. Closes #11936 Closes #11998
* FreeBSD: Initialize/destroy zp->z_lockRyan Moeller2021-05-061-0/+2
| | | | | | | | | | zp->z_lock is used in shared code for protecting projid and scantime. We don't exercise these paths much if at all on FreeBSD, so have been lucky enough not to have issues with the uninitialized locks so far. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #12003
* Miscellaneous code cleanupRyan Moeller2021-04-303-9/+5
| | | | | | | | | | | | | Remove some extra whitespace. Use pointer-typed asserts in Linux's znode cache destructor for more info when debugging. Simplify a couple of conversions from inode to znode when we already have the znode. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11974
* FreeBSD: Clean up ASSERT/VERIFY use in moduleRyan Moeller2021-04-3023-238/+233
| | | | | | | | | | | | | | Convert use of ASSERT() to ASSERT0(), ASSERT3U(), ASSERT3S(), ASSERT3P(), and likewise for VERIFY(). In some cases it ended up making more sense to change the code, such as VERIFY on nvlist operations that I have converted to use fnvlist instead. In one place I changed an internal struct member from int to boolean_t to match its use. Some asserts that combined multiple checks with && in a single assert have been split to separate asserts, to make it apparent which check fails. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11971
* FreeBSD: Prune some unneeded definitionsRyan Moeller2021-04-302-2/+2
| | | | | | | | | IS_XATTRDIR is never used. v_count is only used in two places, one immediately followed by the use of the real name, v_usecount. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11973
* vdev_mirror: don't scrub/resilver devices that can't be readNathaniel Wesley Filardo2021-04-271-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This ensures that we don't accumulate checksum errors against offline or unavailable devices but, more importantly, means that we don't needlessly create DTL entries for offline devices that are already up-to-date. Consider a 3-way mirror, with disk A always online (and so always with an empty DTL) and B and C only occasionally online. When A & B resilver with C offline, B's DTL will effectively be appended to C's due to these spurious ZIOs even as the resilver empties B's DTL: * These ZIOs land in vdev_mirror_scrub_done() and flag an error * That flagged error causes vdev_mirror_io_done() to see unexpected_errors, so it issues a ZIO_TYPE_WRITE repair ZIO, which inherits ZIO_FLAG_SCAN_THREAD because zio_vdev_child_io() includes that flag in ZIO_VDEV_CHILD_FLAGS. * That ZIO fails, too, and eventually zio_done() gets its hands on it and calls vdev_stat_update(). * vdev_stat_update() sees the error and this zio... * is not speculative, * is not due to EIO (but rather ENXIO, since the device is closed) * has an ->io_vd != NULL (specifically, the offline leaf device) * is a write * is for a txg != 0 (but rather the read block's physical birth txg) * has ZIO_FLAG_SCAN_THREAD asserted * So: vdev_stat_update() calls vdev_dtl_dirty() on the offline vdev. Then, when A & C resilver with B offline, that story gets replayed and C's DTL will be appended to B's. In fact, one does not need this permanently-broken-mirror scenario to induce badness: breaking a mirror with no DTLs and then scrubbing will create DTLs for all offline devices. These DTLs will persist until the entire mirror is reassembled for the duration of the *resilver*, which, incidentally, will not consider the devices with good data to be sources of good data in the case of a read failure. Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Nathaniel Wesley Filardo <[email protected]> Closes #11930
* Drop "All rights reserved" from files by [email protected]Martin Matuška2021-04-271-1/+0
| | | | | | | | | | This obeys the change in freebsd/freebsd-src@bce7ee9d4 External-issue: https://reviews.freebsd.org/D26980 Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Martin Matuska <[email protected]> Closes #11947
* FreeBSD: damage control racing .. lookups in face of mkdir/rmdirMateusz Guzik2021-04-261-0/+27
| | | | | | | External-issue: https://reviews.freebsd.org/D29769 Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Mateusz Guzik <[email protected]> Closes #11926
* Fix AVX512BW Fletcher code on AVX512-but-not-BW machinesRomain Dolbeau2021-04-261-1/+7
| | | | | | | | | | Introduce a specific valid function for avx512f+avx512bw (instead of checking only for avx512f). Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Adam Moss <[email protected]> Signed-off-by: Romain Dolbeau <[email protected]> Closes #11937 Closes #11938
* ICP: Silence objtool "stack pointer realignment" warningsAttila Fülöp2021-04-171-0/+6
| | | | | | | | | | | | | | Objtool requires the use of a DRAP register while aligning the stack. Since a DRAP register is a gcc concept and we are notoriously low on registers in the crypto code, it's not worth the effort to mimic gcc generated stack realignment. We simply silence the warning by adding the offending object files to OBJECT_FILES_NON_STANDARD. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #6950 Closes #11914
* Combine zio caches if possibleMateusz Guzik2021-04-171-24/+50
| | | | | | | | | | | This deduplicates 2 sets of caches which use the same allocation size. Memory savings fluctuate a lot, one sample result is FreeBSD running "make buildworld" saving ~180MB RAM in reduced page count associated with zio caches. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Mateusz Guzik <[email protected]> Closes #11877
* ICP: Add missing stack frame info to SHA asm filesAttila Fülöp2021-04-163-4/+72
| | | | | | | | | Since the assembly routines calculating SHA checksums don't use a standard stack layout, CFI directives are needed to unroll the stack. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #11733
* Fix crash in zio_done error reportingPaul Zuchowski2021-04-161-2/+3
| | | | | | | | | Fix NULL pointer dereference when reporting checksum error for gang block in zio_done. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Zuchowski <[email protected]> Closes #11872 Closes #11896
* linux/spl: proc: use global table_{min,max} values instead of local onesнаб2021-04-151-6/+6
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #11879
* linux/spl: base proc_dohostid() on proc_dostring()наб2021-04-151-76/+17
| | | | | | | | | | | | | | | | | | | | | | This fixes /proc/sys/kernel/spl/hostid on kernels with mainline commit 32927393dc1ccd60fb2bdc05b9e8e88753761469 ("sysctl: pass kernel pointers to ->proc_handler") ‒ 5.7-rc1 and up The access_ok() check in copy_to_user() in proc_copyout_string() would always fail, so all userspace reads and writes would fail with EINVAL proc_dostring() strips only the final new-line, but simple_strtoul() doesn't actually need a back-trimmed string ‒ writing "012345678 \n" is still allowed, as is "012345678zupsko", &c. This alters what happens when an invalid value is written ‒ previously it'd get set to what-ever simple_strtoul() returned (probably 0, thereby resetting it to default), now it does nothing Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #11878 Closes #11879
* module/zfs/zvol.c: purge unused zvol_volmode_cb_argнаб2021-04-151-4/+0
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #11879
* ZFS traverse_visitbp optimization to limit prefetchJitendra Patidar2021-04-151-14/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Traversal code, traverse_visitbp() does visit blocks recursively. Indirect (Non L0) Block of size 128k could contain, 1024 block pointers of 128 bytes. In case of full traverse OR incremental traverse, where all blocks were modified, it could traverse large number of blocks pointed by indirect. Traversal code does issue prefetch of blocks traversed below indirect. This could result into large number of async reads queued on vdev queue. So, account for prefetch issued for blocks pointed by indirect and limit max prefetch in one go. Module Param: zfs_traverse_indirect_prefetch_limit: Limit of prefetch while traversing an indirect block. Local counters: prefetched: Local counter to account for number prefetch done. pidx: Index for which next prefetch to be issued. ptidx: Index at which next prefetch to be triggered. Keep "ptidx" somewhere in the middle of blocks prefetched, so that blocks prefetch read gets the enough time window before their demand read is issued. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Jitendra Patidar <[email protected]> Closes #11802 Closes #11803
* Add SIGSTOP and SIGTSTP handling to issigPaul Dagnelie2021-04-151-0/+51
| | | | | | | | | | | | | | | | This change adds SIGSTOP and SIGTSTP handling to the issig function; this mirrors its behavior on Solaris. This way, long running kernel tasks can be stopped with the appropriate signals. Note that doing so with ctrl-z on the command line doesn't return control of the tty to the shell, because tty handling is done separately from stopping the process. That can be future work, if people feel that it is a necessary addition. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Signed-off-by: Paul Dagnelie <[email protected]> Issue #810 Issue #10843 Closes #11801
* FreeBSD: use vnlru_free_vfsops if availableMateusz Guzik2021-04-121-1/+21
| | | | | | | | Fixes issues when zfs is used along with other filesystems. External-issue: https://cgit.freebsd.org/src/commit/?id=e9272225e6bed840b00eef1c817b188c172338ee Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Mateusz Guzik <[email protected]> Closes #11881
* FreeBSD: add missing seqc write begin/end around zfs_acl_chown_setattrMateusz Guzik2021-04-121-0/+2
| | | | | | | | It happens to trip over an assert but does not matter for correctness at this time. Done for future proofing. Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Mateusz Guzik <[email protected]> Closes #11884
* FreeBSD: add support for lockless symlink lookupMateusz Guzik2021-04-122-2/+99
| | | | | Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Mateusz Guzik <[email protected]> Closes #11883
* ZTS: fix removal_condense_export test caseBrian Behlendorf2021-04-111-2/+5
| | | | | | | | | | | | | It's been observed in the CI that the required 25% of obsolete bytes in the mapping can be to high a threshold for this test resulting in condensing never being triggered and a test failure. To prevent these failures make the existing zfs_condense_indirect_obsolete_pct tuning available so the obsolete percentage can be reduced from 25% to 5% during this test. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11869
* Balance parentheses in parameter descriptionspstef2021-04-112-2/+2
| | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Piotr Paweł Stefaniak <[email protected]> Closes #11882
* Move zfsdev_state_{init,destroy} to common codeRyan Moeller2021-04-083-106/+92
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11833
* Eliminate zfsdev_get_state_implRyan Moeller2021-04-082-14/+4
| | | | | | | | After 3937ab20f zfsdev_get_state_impl can become zfsdev_get_state. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11833
* zpl_inode.c: Fix SMACK interoperabilityTerraTech2021-04-081-12/+24
| | | | | | | | | | | | | | | | | | | | | | SMACK needs to have the ZFS dentry security field setup before SMACK's d_instantiate() hook is called as it requires functioning '__vfs_getxattr()' calls to properly set the labels. Fxes: 1) file instantiation properly setting the object label to the subject's label 2) proper file labeling in a transmutable directory Functions Updated: 1) zpl_create() 2) zpl_mknod() 3) zpl_mkdir() 4) zpl_symlink() External-issue: https://github.com/cschaufler/smack-next/issues/1 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: TerraTech <[email protected]> Closes #11646 Closes #11839
* Use dsl_scan_setup_check() to setup a scrubBrian Behlendorf2021-04-082-3/+3
| | | | | | | | | | | | When a rebuild completes it will automatically schedule a follow up scrub to verify all of the block checksums. Before setting up the scrub execute the counterpart dsl_scan_setup_check() function to confirm the scrub can be started. Prior to this change we'd only check vdev_rebuild_active() which isn't as comprehensive, and using the check function keeps all of this logic in one place. Reviewed-by: Mark Maybee <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #11849
* Fix double sha1/sha1.o line in module/icp/Makefile.inTino Reichardt2021-04-081-1/+0
| | | | | | Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #11852
* Ratelimit deadman zevents as with delay zeventsRyan Moeller2021-04-072-3/+8
| | | | | | | | | | | | | | | Just as delay zevents can flood the zevent pipe when a vdev becomes unresponsive, so do the deadman zevents. Ratelimit deadman zevents according to the same tunable as for delay zevents. Enable deadman tests on FreeBSD and add a test for deadman event ratelimiting. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Don Brady <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #11786
* kmem_alloc(KM_SLEEP) should use kvmalloc()Matthew Ahrens2021-04-061-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | `kmem_alloc(size>PAGESIZE, KM_SLEEP)` is backed by `kmalloc()`, which finds contiguous physical memory. If there isn't enough contiguous physical memory available (e.g. due to physical page fragmentation), the OOM killer will be invoked to make more memory available. This is not ideal because processes may be killed when there is still plenty of free memory (it just happens to be in individual pages, not contiguous runs of pages). We have observed this when allocating the ~13KB `zfs_cmd_t`, for example in `zfsdev_ioctl()`. This commit changes the behavior of `kmem_alloc(size>PAGESIZE, KM_SLEEP)` when there are insufficient contiguous free pages. In this case we will find individual pages and stitch them together using virtual memory. This is accomplished by using `kvmalloc()`, which implements the described behavior by trying `kmalloc(__GFP_NORETRY)` and falling back on `vmalloc()`. The behavior of `kmem_alloc(KM_NOSLEEP)` is not changed; it continues to use `kmalloc(GPF_ATOMIC | __GFP_NORETRY)`. This is because `vmalloc()` may sleep. Reviewed-by: Tony Nguyen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Wilson <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #11461