aboutsummaryrefslogtreecommitdiffstats
path: root/lib/libzfs
Commit message (Collapse)AuthorAgeFilesLines
* nvpair: Constify string functionsRichard Yao2023-03-1412-86/+95
| | | | | | | | | | | | | | After addressing coverity complaints involving `nvpair_name()`, the compiler started complaining about dropping const. This lead to a rabbit hole where not only `nvpair_name()` needed to be constified, but also `nvpair_value_string()`, `fnvpair_value_string()` and a few other static functions, plus variable pointers throughout the code. The result became a fairly big change, so it has been split out into its own patch. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14612
* zpool_valid_proplist() should not corrupt nvpair name string on errorRichard Yao2023-03-141-0/+1
| | | | | | | | | | | | | | The strings returned from parsing nvlists should be immutable, but to simplify the code when we want a substring from it, we sometimes will write a NULL into it and then restore the value afterward. Provided there is no concurrent access, this is okay, unless we forget to restore the value afterward. This was caught when constifying string functions related to nvlists. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14612
* zcommon: Refactor FPU state handling in fletcher4Attila Fülöp2023-03-141-8/+11
| | | | | | | | | | | | | | | | | | Currently calls to kfpu_begin() and kfpu_end() are split between the init() and fini() functions of the particular SIMD implementation. This was done in #14247 as an optimization measure for the ABD adapter. Unfortunately the split complicates FPU handling on platforms that use a local FPU state buffer, like Windows and macOS. To ease porting, we introduce a boolean struct member in fletcher_4_ops_t, indicating use of the FPU, and move the FPU state handling from the SIMD implementations to the call sites. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #14600
* ABI files: sync with new Ubuntu release in CIGeorge Melikov2023-03-101-1841/+5265
| | | | | | | | | | We may try to build ZFS inside container too, but let's just sync them for now. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Richard Yao <[email protected]> Signed-off-by: George Melikov <[email protected]> Closes #14605
* Implementation of block cloning for ZFSPawel Jakub Dawidek2023-03-102-5/+12
| | | | | | | | | | | | | | | Block Cloning allows to manually clone a file (or a subset of its blocks) into another (or the same) file by just creating additional references to the data blocks without copying the data itself. Those references are kept in the Block Reference Tables (BRTs). The whole design of block cloning is documented in module/zfs/brt.c. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Christian Schwarz <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Rich Ercolani <[email protected]> Signed-off-by: Pawel Jakub Dawidek <[email protected]> Closes #13392
* Better handling for future crypto parametersRob N2023-03-072-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The intent is that this is like ENOTSUP, but specifically for when something can't be done because we have no support for the requested crypto parameters; eg unlocking a dataset or receiving a stream encrypted with a suite we don't support. Its not intended to be recoverable without upgrading ZFS itself. If the request could be made to work by enabling a feature or modifying some other configuration item, then some other code should be used. load-key: In the future we might have more crypto suites (ie new values for the `encryption` property. Right now trying to load a key on such a future crypto suite will look up suite parameters off the end of the crypto table, resulting in misbehaviour and/or crashes (or, with debug enabled, trip the assertion in `zio_crypt_key_unwrap`). Instead, lets check the value we got from the dataset, and if we can't handle it, abort early. recv: When receiving a raw stream encrypted with an unknown crypto suite, `zfs recv` would report a generic `invalid backup stream` (EINVAL). While technically correct, its not super helpful, so lets ship a more specific error code and message. Reviewed-by: Tino Reichardt <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Richard Yao <[email protected]> Signed-off-by: Rob Norris <[email protected]> Closes #14577
* Add generic implementation handling and SHA2 implTino Reichardt2023-03-021-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The skeleton file module/icp/include/generic_impl.c can be used for iterating over different implementations of algorithms. It is used by SHA256, SHA512 and BLAKE3 currently. The Solaris SHA2 implementation got replaced with a version which is based on public domain code of cppcrypto v0.10. These assembly files are taken from current openssl master: - sha256-x86_64.S: x64, SSSE3, AVX, AVX2, SHA-NI (x86_64) - sha512-x86_64.S: x64, AVX, AVX2 (x86_64) - sha256-armv7.S: ARMv7, NEON, ARMv8-CE (arm) - sha512-armv7.S: ARMv7, NEON (arm) - sha256-armv8.S: ARMv7, NEON, ARMv8-CE (aarch64) - sha512-armv8.S: ARMv7, ARMv8-CE (aarch64) - sha256-ppc.S: Generic PPC64 LE/BE (ppc64) - sha512-ppc.S: Generic PPC64 LE/BE (ppc64) - sha256-p8.S: Power8 ISA Version 2.07 LE/BE (ppc64) - sha512-p8.S: Power8 ISA Version 2.07 LE/BE (ppc64) Tested-by: Rich Ercolani <[email protected]> Tested-by: Sebastian Gottschall <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #13741
* Give strlcat() full buffer lengths rather than smaller buffer lengthsRichard Yao2023-02-141-2/+2
| | | | | | | | | | | | | | | | | | strlcat() is supposed to be given the length of the destination buffer, including the existing contents. Unfortunately, I had been overzealous when I wrote a51288aabbbc176a8a73a8b3cd56f79607db32cf, since I gave it the length of the destination buffer, minus the existing contents. This likely caused a regression on large strings. On the topic of being overzealous, the use of strlcat() in dmu_send_estimate_fast() was unnecessary because recv_clone_name is a fixed length string. We continue using strlcat() mostly as defensive programming, in case the string length is ever changed, even though it is unnecessary. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14476
* Fix variable shadowing in libzfs_mountReno Reckling2023-02-021-3/+3
| | | | | | | | | | We accidentally reused variable name "i" for inner and outer loops. Reviewed-by: Rich Ercolani <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Richard Yao <[email protected]> Signed-off-by: Reno Reckling <[email protected]> Closes #14452 Closes #14445
* Fix console progress reporting for recursive sendAmeer Hamza2023-02-021-2/+6
| | | | | | | | | | | After commit 19d3961, progress reporting (-v) with replication flag enabled does not report the progress on the console. This commit fixes the issue by updating the logic to check for pa->progress instead of pa_verbosity in send_progress_thread(). Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #14448
* Reject streams that set ->drr_payloadlen to unreasonably large valuesRichard Yao2023-01-231-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the zstream code, Coverity reported: "The argument could be controlled by an attacker, who could invoke the function with arbitrary values (for example, a very high or negative buffer size)." It did not report this in the kernel. This is likely because the userspace code stored this in an int before passing it into the allocator, while the kernel code stored it in a uint32_t. However, this did reveal a potentially real problem. On 32-bit systems and systems with only 4GB of physical memory or less in general, it is possible to pass a large enough value that the system will hang. Even worse, on Linux systems, the kernel memory allocator is not able to support allocations up to the maximum 4GB allocation size that this allows. This had already been limited in userspace to 64MB by `ZFS_SENDRECV_MAX_NVLIST`, but we need a hard limit in the kernel to protect systems. After some discussion, we settle on 256MB as a hard upper limit. Attempting to receive a stream that requires more memory than that will result in E2BIG being returned to user space. Reported-by: Coverity (CID-1529836) Reported-by: Coverity (CID-1529837) Reported-by: Coverity (CID-1529838) Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14285
* Configure zed's diagnosis engine with vdev propertiesrob-wing2023-01-233-1/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce four new vdev properties: checksum_n checksum_t io_n io_t These properties can be used for configuring the thresholds of zed's diagnosis engine and are interpeted as <N> events in T <seconds>. When this property is set to a non-default value on a top-level vdev, those thresholds will also apply to its leaf vdevs. This behavior can be overridden by explicitly setting the property on the leaf vdev. Note that, these properties do not persist across vdev replacement. For this reason, it is advisable to set the property on the top-level vdev instead of the leaf vdev. The default values for zed's diagnosis engine (10 events, 600 seconds) remains unchanged. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Allan Jude <[email protected]> Signed-off-by: Rob Wing <[email protected]> Sponsored-by: Seagate Technology LLC Closes #13805
* zfs_receive_one: Check for the more likely error firstAllan Jude2023-01-201-5/+5
| | | | | | | | | | | | If zfs_receive_one() gets back EINVAL, check for the more likely case, embedded block pointers + encryption and return that error, before falling back to the less likely case, a resumable stream when the kernel has not been upgraded to support resume. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Allan Jude <[email protected]> Sponsored-by: rsync.net Sponsored-by: Klara Inc. Closes #14379
* Use setproctitle to report progress of zfs sendAmeer Hamza2023-01-172-15/+61
| | | | | | | | | | | | | | | | | This allows parsing of zfs send progress by checking the process title. Doing so requires some changes to the send code in libzfs_sendrecv.c; primarily these changes move some of the accounting around, to allow for the code to be verbose as normal, or set the process title. Unlike BSD, setproctitle() isn't standard in Linux; thus, borrowed it from libbsd with slight modifications. Authored-by: Sean Eric Fagan <[email protected]> Co-authored-by: Ryan Moeller <[email protected]> Co-authored-by: Ameer Hamza <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #14376
* zed: add hotplug support for spare vdevsAmeer Hamza2023-01-091-4/+1
| | | | | | | | | | | | | | | | This commit supports for spare vdev hotplug. The spare vdev associated with all the pools will be marked as "Removed" when the drive is physically detached and will become "Available" when the drive is reattached. Currently, the spare vdev status does not change on the drive removal and the same is the case with reattachment. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #14295
* deadlock between spa_errlog_lock and dp_config_rwlockMatthew Ahrens2022-12-222-27/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There is a lock order inversion deadlock between `spa_errlog_lock` and `dp_config_rwlock`: A thread in `spa_delete_dataset_errlog()` is running from a sync task. It is holding the `dp_config_rwlock` for writer (see `dsl_sync_task_sync()`), and waiting for the `spa_errlog_lock`. A thread in `dsl_pool_config_enter()` is holding the `spa_errlog_lock` (see `spa_get_errlog_size()`) and waiting for the `dp_config_rwlock` (as reader). Note that this was introduced by #12812. This commit address this by defining the lock ordering to be dp_config_rwlock first, then spa_errlog_lock / spa_errlist_lock. spa_get_errlog() and spa_get_errlog_size() can acquire the locks in this order, and then process_error_block() and get_head_and_birth_txg() can verify that the dp_config_rwlock is already held. Additionally, a buffer overrun in `spa_get_errlog()` is corrected. Many code paths didn't check if `*count` got to zero, instead continuing to overwrite past the beginning of the userspace buffer at `uaddr`. Tested by having some errors in the pool (via `zinject -t data /path/to/file`), one thread running `zpool iostat 0.001`, and another thread runs `zfs destroy` (in a loop, although it hits the first time). This reproduces the problem easily without the fix, and works with the fix. Reviewed-by: Mark Maybee <[email protected]> Reviewed-by: George Amanakis <[email protected]> Reviewed-by: George Wilson <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #14239 Closes #14289
* Add color output to zfs diff.Ethan Coe-Renner2022-12-152-2/+47
| | | | | | | This adds support to color zfs diff (in the style of git diff) conditional on the ZFS_COLOR environment variable. Signed-off-by: Ethan Coe-Renner <[email protected]>
* Allow receiver to override encryption properties in case of replicationAmeer Hamza2022-12-131-1/+11
| | | | | | | | | | | | | | Currently, the receiver fails to override the encryption property for the plain replicated dataset with the error: "cannot receive incremental stream: encryption property 'encryption' cannot be set for incremental streams.". The problem is resolved by allowing the receiver to override the encryption property for plain replicated send. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #14253 Closes #13533
* zfs list: Allow more fields in ZFS_ITER_SIMPLE modeAllan Jude2022-12-137-51/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the fields to be listed and sorted by are constrained to those populated by dsl_dataset_fast_stat(), then zfs list is much faster, as it does not need to open each objset and reads its properties. A previous optimization by Pawel Dawidek (0cee24064a79f9c01fc4521543c37acea538405f) took advantage of this to make listing snapshot names sorted only by name much faster. However, it was limited to `-o name -s name`, this work extends this optimization to work with: - name - guid - createtxg - numclones - inconsistent - redacted - origin and could be further extended to any other properties supported by dsl_dataset_fast_stat() or similar, that do not require extra locking or reading from disk. This was committed before (9a9e2e343dfa2af28bf7910de77ae73aa006de62), but was reverted due to a regression when used with an older kernel. If the kernel does not populate zc->zc_objset_stats, we now fallback to getting the properties via the slower interface, to avoid problems with newer userland and older kernels. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Allan Jude <[email protected]> Closes #14110
* Do not pass -1 to strerror() from zfs_send_cb_impl()Richard Yao2022-12-082-2/+5
| | | | | | | | | | | | | | | | | | | | | `zfs_send_cb_impl()` calls `dump_filesystems()`, which calls `dump_filesystem()`, which will return `-1` as an error when `zfs_open()` returns `NULL`. This will be passed to `zfs_standard_error()`, which passes it to `zfs_standard_error_fmt()`, which passes it to `strerror()`. To fix this, we modify zfs_open() to set `errno` whenever it returns NULL. Most of the cases already have `errno` set (since they pass it to `zfs_standard_error_fmt()`, which makes this easy. Then we modify `dump_filesystem()` to pass `errno` instead of `-1`. Reported-by: Coverity (CID-1524598) Reviewed-by: Damian Szuberski <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14264
* Fix GCC 12 compilation errorsszubersk2022-11-301-0/+30
| | | | | | | | | Squelch false positives reported by GCC 12 with UBSan. Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: szubersk <[email protected]> Closes #14150
* FreeBSD: do_mount() passes wrong string length to helperRichard Yao2022-11-181-18/+8
| | | | | | | | | | | | | | | | | | | It should pass `MNT_LINE_MAX`, but passes `sizeof (mntpt)`. This is harmless because the strlen is not actually used by the helper, but FreeBSD's Coverity scans complained about it. This was missed in my audit of various string functions since it is not actually passed to a string function. Upon review, it was noticed that the helper function does not need to be a separate function, so I have inlined it as cleanup. Reported-by: Coverity (CID 1432079) Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: szubersk <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14136
* Allow to control failfastMariusz Zaborski2022-11-101-1/+2
| | | | | | | | | | | | | | | | | | | | | Linux defaults to setting "failfast" on BIOs, so that the OS will not retry IOs that fail, and instead report the error to ZFS. In some cases, such as errors reported by the HBA driver, not the device itself, we would wish to retry rather than generating vdev errors in ZFS. This new property allows that. This introduces a per vdev option to disable the failfast option. This also introduces a global module parameter to define the failfast mask value. Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Allan Jude <[email protected]> Signed-off-by: Allan Jude <[email protected]> Signed-off-by: Mariusz Zaborski <[email protected]> Sponsored-by: Seagate Technology LLC Submitted-by: Klara, Inc. Closes #14056
* Make 1-bit bitfields unsignedBrooks Davis2022-11-031-3/+3
| | | | | | | | | | | | | | | | This fixes -Wsingle-bit-bitfield-constant-conversion warning from clang-16 like: lib/libzfs/libzfs_dataset.c:4529:19: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion] flags.nounmount = B_TRUE; ^ ~~~~~~ Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Richard Yao <[email protected]> Signed-off-by: Brooks Davis <[email protected]> Closes #14125
* recvd_props_mode: use a uintptr_t to stash nvlistsBrooks Davis2022-11-031-5/+5
| | | | | | | | | Avoid assuming than a uint64_t can hold a pointer. Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Richard Yao <[email protected]> Signed-off-by: Brooks Davis <[email protected]> Closes #14131
* FreeBSD: Fix regression from kmem_scnprintf() in libzfsRichard Yao2022-11-011-1/+3
| | | | | | | | | | | | kmem_scnprintf() is only available in libzpool. Recent buildbot issues with showing FreeBSD results kept us from seeing this before 97143b9d314d54409244f3995576d8cc8c1ebf0a was merged. The code has been changed to sanitize the output from `kmem_scnprintf()`. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14111
* Introduce kmem_scnprintf()Richard Yao2022-10-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | `snprintf()` is meant to protect against buffer overflows, but operating on the buffer using its return value, possibly by calling it again, can cause a buffer overflow, because it will return how many characters it would have written if it had enough space even when it did not. In a number of places, we repeatedly call snprintf() by successively incrementing a buffer offset and decrementing a buffer length, by its return value. This is a potentially unsafe usage of `snprintf()` whenever the buffer length is reached. CodeQL complained about this. To fix this, we introduce `kmem_scnprintf()`, which will return 0 when the buffer is zero or the number of written characters, minus 1 to exclude the NULL character, when the buffer was too small. In all other cases, it behaves like snprintf(). The name is inspired by the Linux and XNU kernels' `scnprintf()`. The implementation was written before I thought to look at `scnprintf()` and had a good name for it, but it turned out to have identical semantics to the Linux kernel version. That lead to the name, `kmem_scnprintf()`. CodeQL only catches this issue in loops, so repeated use of snprintf() outside of a loop was not caught. As a result, a thorough audit of the codebase was done to examine all instances of `snprintf()` usage for potential problems and a few were caught. Fixes for them are included in this patch. Unfortunately, ZED is one of the places where `snprintf()` is potentially used incorrectly. Since using `kmem_scnprintf()` in it would require changing how it is linked, we modify its usage to make it safe, no matter what buffer length is used. In addition, there was a bug in the use of the return value where the NULL format character was not being written by pwrite(). That has been fixed. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14098
* Cleanup: Delete unnecessary pointer check from vdev_to_nvlist_iter()Richard Yao2022-10-181-1/+1
| | | | | | | | | This confused Clang's static analyzer, making it think there was a possible NULL pointer dereference. There is no NULL pointer dereference. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #14042
* Cleanup: Address Clang's static analyzer's unused code complaintsRichard Yao2022-10-145-6/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These were categorized as the following: * Dead assignment 23 * Dead increment 4 * Dead initialization 6 * Dead nested assignment 18 Most of these are harmless, but since actual issues can hide among them, we correct them. That said, there were a few return values that were being ignored that appeared to merit some correction: * `destroy_callback()` in `cmd/zfs/zfs_main.c` ignored the error from `destroy_batched()`. We handle it by returning -1 if there is an error. * `zfs_do_upgrade()` in `cmd/zfs/zfs_main.c` ignored the error from `zfs_for_each()`. We handle it by doing a binary OR of the error value from the subsequent `zfs_for_each()` call to the existing value. This is how errors are mostly handled inside `zfs_for_each()`. The error value here is passed to exit from the zfs command, so doing a binary or on it is better than what we did previously. * `get_zap_prop()` in `module/zfs/zcp_get.c` ignored the error from `dsl_prop_get_ds()` when the property is not of type string. We return an error when it does. There is a small concern that the `zfs_get_temporary_prop()` call would handle things, but in the case that it does not, we would be pushing an uninitialized numval onto the lua stack. It is expected that `dsl_prop_get_ds()` will succeed anytime that `zfs_get_temporary_prop()` does, so that not giving it a chance to fix things is not a problem. * `draid_merge_impl()` in `tests/zfs-tests/cmd/draid.c` used `nvlist_add_nvlist()` twice in ways in which errors are expected to be impossible, so we switch to `fnvlist_add_nvlist()`. A few notable ones did not merit use of the return value, so we suppressed it with `(void)`: * `write_free_diffs()` in `lib/libzfs/libzfs_diff.c` ignored the error value from `describe_free()`. A look through the commit history revealed that this was intentional. * `arc_evict_hdr()` in `module/zfs/arc.c` did not need to use the returned handle from `arc_hdr_realloc()` because it is already referenced in lists. * `spa_vdev_detach()` in `module/zfs/spa.c` has a comment explicitly saying not to use the error from `vdev_label_init()` because whatever causes the error could be the reason why a detach is being done. Unfortunately, I am not presently able to analyze the kernel modules with Clang's static analyzer, so I could have missed some cases of this. In cases where reports were present in code that is duplicated between Linux and FreeBSD, I made a conscious effort to fix the FreeBSD version too. After this commit is merged, regressions like dee8934 should become extremely obvious with Clang's static analyzer since a regression would appear in the results as the only instance of unused code. That assumes that Coverity does not catch the issue first. My local branch with fixes from all of my outstanding non-draft pull requests shows 118 reports from Clang's static anlayzer after this patch. That is down by 51 from 169. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Cedric Berger <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13986
* Handle possible null pointers from malloc/strdup/strndup()Richard Yao2022-10-061-2/+3
| | | | | | | | | | | | | | | | | | | | | | | GCC 12.1.1_p20220625's static analyzer caught these. Of the two in the btree test, one had previously been caught by Coverity and Smatch, but GCC flagged it as a false positive. Upon examining how other test cases handle this, the solution was changed from `ASSERT3P(node, !=, NULL);` to using `perror()` to be consistent with the fixes to the other fixes done to the ZTS code. That approach was also used in ZED since I did not see a better way of handling this there. Also, upon inspection, additional unchecked pointers from malloc()/calloc()/strdup() were found in ZED, so those were handled too. In other parts of the code, the existing methods to avoid issues from memory allocators returning NULL were used, such as using `umem_alloc(size, UMEM_NOFAIL)` or returning `ENOMEM`. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13979
* Add membar_sync abi changeUmer Saleem2022-10-041-0/+1
| | | | | | | | | | It appears membar_sync was not present in libzfs.abi with other membar_* functions. This commit updates libzfs.abi for membar_sync. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13969
* Expose libzutil error info in libpc_handle_tUmer Saleem2022-10-041-4/+28
| | | | | | | | | | | | | | | | | | | | In libzutil, for zpool_search_import and zpool_find_config, we use libpc_handle_t internally, which does not maintain error code and it is not exposed in the interface. Due to this, the error information is not propagated to the caller. Instead, an error message is printed on stderr. This commit adds lpc_error field in libpc_handle_t and exposes it in the interface, which can be used by the users of libzutil to get the appropriate error information and handle it accordingly. Users of the API can also control if they want to print the error message on stderr. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13969
* Fix userland dereference NULL return value bugsRichard Yao2022-09-301-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | * `zstream_do_token()` does not handle failures from `libzfs_init()` * `ztest_global_vars_to_zdb_args()` does not handle failures from `calloc()`. * `zfs_snapshot_nvl()` will pass an offset to a NULL pointer as a source to `strlcpy()` if the provided nvlist is `NULL`. We handle these by doing what the existing error handling does for other errors involving these functions. Coverity complained about these. It had complained about several more, but one was fixed by 570ca4441e0583c8dcb5c7179f5eb331d1172784 and another was a false positive. The remaining complaints labelled "dereferece null return vaue" involve fetching things stored in in-kernel data structures via `list_head()/list_next()`, `AVL_PREV()/AVL_NEXT()` and `zfs_btree_find()`. Most of them occur in void functions that have no error handling. They are much harder to analyze than the two fixed in this patch, so they are left for a follow-up patch. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13971
* zed: mark disks as REMOVED when they are removedAmeer Hamza2022-09-282-0/+43
| | | | | | | | | | | | | ZED does not take any action for disk removal events if there is no spare VDEV available. Added zpool_vdev_remove_wanted() in libzfs and vdev_remove_wanted() in vdev.c to remove the VDEV through ZED on removal event. This means that if you are running zed and remove a disk, it will be properly marked as REMOVED. Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #13797
* Fix unsafe string operationsRichard Yao2022-09-272-14/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Coverity caught unsafe use of `strcpy()` in `ztest_dmu_objset_own()`, `nfs_init_tmpfile()` and `dump_snapshot()`. It also caught an unsafe use of `strlcat()` in `nfs_init_tmpfile()`. Inspired by this, I did an audit of every single usage of `strcpy()` and `strcat()` in the code. If I could not prove that the usage was safe, I changed the code to use either `strlcpy()` or `strlcat()`, depending on which function was originally used. In some cases, `snprintf()` was used to replace multiple uses of `strcat` because it was cleaner. Whenever I changed a function, I preferred to use `sizeof(dst)` when the compiler is able to provide the string size via that. When it could not because the string was passed by a caller, I checked the entire call tree of the function to find out how big the buffer was and hard coded it. Hardcoding is less than ideal, but it is safe unless someone shrinks the buffer sizes being passed. Additionally, Coverity reported three more string related issues: * It caught a case where we do an overlapping memory copy in a call to `snprintf()`. We fix that via `kmem_strdup()` and `kmem_strfree()`. * It caught `sizeof (buf)` being used instead of `buflen` in `zdb_nicenum()`'s call to `zfs_nicenum()`, which is passed to `snprintf()`. We change that to pass `buflen`. * It caught a theoretical unterminated string passed to `strcmp()`. This one is likely a false positive, but we have the information needed to do this more safely, so we change this to silence the false positive not just in coverity, but potentially other static analysis tools too. We switch to `strncmp()`. * There was a false positive in tests/zfs-tests/cmd/dir_rd_update.c. We suppress it by switching to `snprintf()` since other static analysis tools might complain about it too. Interestingly, there is a possible real bug there too, since it assumes that the passed directory path ends with '/'. We add a '/' to fix that potential bug. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13913
* Cleanup: Switch to strlcpy from strncpyRichard Yao2022-09-272-8/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Coverity found a bug in `zfs_secpolicy_create_clone()` where it is possible for us to pass an unterminated string when `zfs_get_parent()` returns an error. Upon inspection, it is clear that using `strlcpy()` would have avoided this issue. Looking at the codebase, there are a number of other uses of `strncpy()` that are unsafe and even when it is used safely, switching to `strlcpy()` would make the code more readable. Therefore, we switch all instances where we use `strncpy()` to use `strlcpy()`. Unfortunately, we do not portably have access to `strlcpy()` in tests/zfs-tests/cmd/zfs_diff-socket.c because it does not link to libspl. Modifying the appropriate Makefile.am to try to link to it resulted in an error from the naming choice used in the file. Trying to disable the check on the file did not work on FreeBSD because Clang ignores `#undef` when a definition is provided by `-Dstrncpy(...)=...`. We workaround that by explictly including the C file from libspl into the test. This makes things build correctly everywhere. We add a deprecation warning to `config/Rules.am` and suppress it on the remaining `strncpy()` usage. `strlcpy()` is not portably avaliable in tests/zfs-tests/cmd/zfs_diff-socket.c, so we use `snprintf()` there as a substitute. This patch does not tackle the related problem of `strcpy()`, which is even less safe. Thankfully, a quick inspection found that it is used far more correctly than strncpy() was used. A quick inspection did not find any problems with `strcpy()` usage outside of zhack, but it should be said that I only checked around 90% of them. Lastly, some of the fields in kstat_t varied in size by 1 depending on whether they were in userspace or in the kernel. The origin of this discrepancy appears to be 04a479f7066ccdaa23a6546955303b172f4a6909 where it was made for no apparent reason. It conflicts with the comment on KSTAT_STRLEN, so we shrink the kernel field sizes to match the userspace field sizes. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13876
* Enforce "-F" flag on resuming recv of full/newfs on existing datasetJitendra Patidar2022-09-272-0/+43
| | | | | | | | | | | | | | | | | | | | | | When receiving full/newfs on existing dataset, then it should be done with "-F" flag. Its enforced for initial receive in checks done in zfs_receive_one function of libzfs. Similarly, on resuming full/newfs recv on existing dataset, it should be done with "-F" flag. When dataset doesn't exist, then full/new recv is done on newly created dataset and it's marked INCONSISTENT. But when receiving on existing dataset, recv is first done on %recv and its marked INCONSISTENT. Existing dataset is not marked INCONSISTENT. Resume of full/newfs receive with dataset not INCONSISTENT indicates that its resuming newfs on existing dataset. So, enforce "-F" flag in this case. Also return an error from dmu_recv_resume_begin_check() in zfs kernel, when its resuming full/newfs recv without force. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Jitendra Patidar <[email protected]> Closes #13856 Closes #13857
* Fix userland resource leaksRichard Yao2022-09-231-0/+1
| | | | | | | | | | | | Coverity caught these. With the exception of the file descriptor leak in tests/zfs-tests/cmd/draid.c, they are all memory leaks. Also, there is a piece of dead code in zfs_get_enclosure_sysfs_path(). We delete it as cleanup. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13921
* Call va_end() before return in zpool_standard_error_fmt()Richard Yao2022-09-201-1/+1
| | | | | | | | | | | | | | | | | Commit ecd6cf800b63704be73fb264c3f5b6e0dafc068d by marks in OpenSolaris at Tue Jun 26 07:44:24 2007 -0700 introduced a bug where we fail to call `va_end()` before returning. The man page for va_start() says: "Each invocation of va_start() must be matched by a corresponding invocation of va_end() in the same function." Coverity complained about this. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13904
* Handle ECKSUM as new EZFS_CKSUM ‒ "insufficient replicas"наб2022-09-161-0/+6
| | | | | | | | | | Add a meaningful error message for ECKSUM to common error messages. Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #6805 Closes #13808 Closes #13898
* zpool_load_compat() should create strings of length ZFS_MAXPROPLENRichard Yao2022-09-121-2/+2
| | | | | | | | | | Otherwise, `strlcat()` can overflow them. Coverity found this. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Neal Gompa <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #13866
* Make zfs-share service resilient to stale exportsDon Brady2022-09-092-17/+50
| | | | | | | | | | | | | | | | The are a few cases where stale entries in /etc/exports.d/zfs.exports will cause the nfs-server service to fail when starting up. Since the nfs-server startup consumes /etc/exports.d/zfs.exports, the zfs-share service (which rebuilds the list of zfs exports) should run before the nfs-server service. To make the zfs-share service resilient to stale exports, this change truncates the zfs config file as part of the zfs share -a operation. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Don Brady <[email protected]> Closes #13775
* Revert "Avoid panic with recordsize > 128k, raw sending and no large_blocks"Brian Behlendorf2022-08-251-10/+0
| | | | | | | | | | This reverts commit 80a650b7bb04bce3aef5e4cfd1d966e3599dafd4. This change inadvertently introduced a regression in ztest where one of the new ASSERTs is triggered in dsl_scan_visitbp(). Reviewed-by: George Amanakis <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #12275 Closes #13799
* Updates for snapshots_changed propertyUmer Saleem2022-08-241-1/+1
| | | | | | | | | | | | | | | | Currently, snapshots_changed property is stored in dd_props_zapobj, due to which the property is assumed to be local. This causes a difference in behavior with respect to other readonly properties. This commit stores the snapshots_changed property in dd_object. Source is not set to local in this case, which makes it consistent with other readonly properties. This commit also updates the date string format to include seconds. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13785
* libzfs: Remove unused zpool_get_physpath()Ryan Moeller2022-08-042-155/+0
| | | | | | | | | | This is an oddly specific function that has never had any consumers in the history of this repo. Get rid of it and the pile of helper functions that exist for it. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #13724
* Add snapshots_changed as propertyUmer Saleem2022-08-022-1/+22
| | | | | | | | | | | | | | | | | | | Make dd_snap_cmtime property persistent across mount and unmount operations by storing in ZAP and restore the value from ZAP on hold into dd_snap_cmtime instead of updating it. Expose dd_snap_cmtime as 'snapshots_changed' property that provides a mechanism to quickly determine whether snapshot list for dataset has changed without having to mount a dataset or iterate the snapshot list. It specifies the time at which a snapshot for a dataset was last created or deleted. This allows us to be more efficient how often we query snapshots. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Umer Saleem <[email protected]> Closes #13635
* Implement a new type of zfs receive: corrective receive (-c)Alek P2022-07-281-12/+85
| | | | | | | | | | | | | | This type of recv is used to heal corrupted data when a replica of the data already exists (in the form of a send file for example). With the provided send stream, corrective receive will read from disk blocks described by the WRITE records. When any of the reads come back with ECKSUM we use the data from the corresponding WRITE record to rewrite the corrupted block. Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Paul Zuchowski <[email protected]> Signed-off-by: Alek Pinchuk <[email protected]> Closes #9372
* Add createtxg sort support for simple snapshot iteratorAmeer Hamza2022-07-251-0/+14
| | | | | | | | | | | | | | | - When iterating snapshots with name only, e.g., "-o name -s name", libzfs uses simple snapshot iterator and results are displayed in alphabetic order. This PR adds support for faster version of createtxg sort by avoiding nvlist parsing for properties. Flags "-o name -s createtxg" will enable createtxg sort while using simple snapshot iterator. - Added support to read createtxg property directly from zfs handle for filesystem, volume and snapshot types instead of parsing nvlist. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #13577
* Expose ZFS dataset case sensitivity setting via sb_optsixhamza2022-07-141-0/+7
| | | | | | | | Makes the case sensitivity setting visible on Linux in /proc/mounts. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Alexander Motin <[email protected]> Signed-off-by: Ameer Hamza <[email protected]> Closes #13607
* Replace dead opensolaris.org license linkTino Reichardt2022-07-1116-16/+16
| | | | | | | | | The commit replaces all findings of the link: http://www.opensolaris.org/os/licensing with this one: https://opensource.org/licenses/CDDL-1.0 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tino Reichardt <[email protected]> Closes #13619