Commit message | Author | Age | Files | Lines
* OpenZFS 9862 - fix typo in comment in vdev_impl.h | Allan Jude | 2018-10-18 | 1 | -1/+1
  Authored by: Allan Jude <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Brian Behlendorf <[email protected]>
  Reviewed by: Tony Hutter <[email protected]>
  Approved by: Robert Mustacchi <[email protected]>
  Ported-by: George Melikov <[email protected]>
  OpenZFS-issue: https://www.illumos.org/issues/9862
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/84927f52
  Closes #8036
* Allow copy-builtin to work with modified sources | Matthew Thode | 2018-10-17 | 2 | -25/+14
  `scripts/make_gitrev.sh` had 'set -e', so if any command failed it would fail and cause copy-builtin to fail (copy-builtin also has `set -e`).

  This commit also simplifies scripts/make_gitrev.sh to always write a file by using a cleanup function. It also simplifies other areas of the script (making it much shorter).

  Reviewed-by: John Kennedy <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matthew Thode <[email protected]>
  Closes #8022
  Closes #8025
* zpool: allow sharing of spare device among pools | LOLi | 2018-10-17 | 5 | -3/+130
  ZFS allows, by default, sharing of spare devices among different pools; this commit simply restores this functionality for disk devices and adds an additional test case to the ZFS Test Suite to prevent future regressions.

  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: loli10K <[email protected]>
  Closes #7999
* Linux does not HAVE_SMB_SHARE | Matthew Ahrens | 2018-10-17 | 3 | -254/+0
  Since Linux does not have an in-kernel SMB server, we don't need the code to manage it.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Richard Elling <[email protected]>
  Signed-off-by: Matthew Ahrens <[email protected]>
  Closes #8032
* Linux does not HAVE_DNLC | Matthew Ahrens | 2018-10-17 | 3 | -66/+1
  Since Linux does not have the Directory Name Lookup Cache, we don't need the code to manage it.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Tim Chase <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Richard Elling <[email protected]>
  Signed-off-by: Matthew Ahrens <[email protected]>
  Closes #8031
* Advise users to retain issue/PR templates | bunder2015 | 2018-10-17 | 2 | -1/+4
  Occasionally we get issues and PRs from users who delete the templates. Advise users that their issues and PRs may be closed if they do not fill out the templates as we really need this information.

  Also updating PR template to drop unneeded approval toggle as we are now using issue labels for status tracking.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Matthew Ahrens <[email protected]>
  Signed-off-by: bunder2015 <[email protected]>
  Closes #8029
* Add types to featureflags in zfs | Paul Dagnelie | 2018-10-16 | 11 | -104/+276
  The boolean featureflags in use thus far in ZFS are extremely useful, but because they take advantage of the zap layer, more interesting data than just a true/false value can be stored in a featureflag. In redacted send/receive, this is used to store the list of redaction snapshots for a redacted dataset.

  This change adds the ability for ZFS to store types other than a boolean in a featureflag. The only other implemented type is a uint64_t array. It also modifies the interfaces around dataset features to accommodate the new capabilities, and adds a few new functions to increase encapsulation.

  This functionality will be used by the Redacted Send/Receive feature.

  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Paul Dagnelie <[email protected]>
  Closes #7981
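As a rough illustration of the idea in the commit above (a feature that stores either a boolean or a uint64_t array), here is a minimal Python model. It is only a sketch: `FeatureValue`, `activate`, and the feature names in the usage are hypothetical and are not the ZFS interfaces.

```python
from dataclasses import dataclass
from typing import Tuple, Union

UINT64_MAX = 2**64 - 1


@dataclass(frozen=True)
class FeatureValue:
    """Hypothetical value of an active feature: a bool or a tuple of uint64s."""
    value: Union[bool, Tuple[int, ...]]

    def __post_init__(self):
        if isinstance(self.value, bool):
            return
        # Array-typed features may only hold values that fit in a uint64_t.
        for v in self.value:
            if not isinstance(v, int) or not 0 <= v <= UINT64_MAX:
                raise ValueError("array entries must fit in uint64_t")


def activate(features: dict, name: str, value=True) -> None:
    """Record a feature as active, storing either True or an array of uint64s."""
    stored = value if isinstance(value, bool) else tuple(value)
    features[name] = FeatureValue(stored)
```

Usage: `activate(features, "example_bool")` stores a plain flag, while `activate(features, "example_array", [11, 42])` stores a list of object numbers alongside it.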
* deadlock between mm_sem and tx assign in zfs_write() and page fault | ilbsmart | 2018-10-16 | 6 | -54/+152
  The sequence leading to the deadlock:
  1. Thread #1 calls `zfs_write`, which assigns txg "n".
  2. In the same process, thread #2 takes an mmap page fault (which means `mm_sem` is held). `zfs_dirty_inode` fails to open a txg and waits for the previous txg "n" to complete.
  3. Thread #1 calls `uiomove` to write, but a page fault occurs in `uiomove`, which means it needs `mm_sem`. Since `mm_sem` is held by thread #2, thread #1 is stuck and cannot complete, so txg "n" never completes.

  So thread #1 and thread #2 are deadlocked.

  Reviewed-by: Chunwei Chen <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed-by: Matthew Ahrens <[email protected]>
  Signed-off-by: Grady Wong <[email protected]>
  Closes #7939
* Add zts-report.py to python shebang exclusion | Brian Behlendorf | 2018-10-15 | 1 | -1/+1
  Include zts-report.py in the __brp_mangle_shebangs_exclude_from to resolve build failures in Fedora 28 and newer.

  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8020
  Issue #7360
* OpenZFS 9847 - leaking dd_clones (DMU_OT_DSL_CLONES) objects (#7979) | Matthew Ahrens | 2018-10-12 | 3 | -6/+253
  OpenZFS 9847 - leaking dd_clones (DMU_OT_DSL_CLONES) objects

  We're leaking the dd_clones objects in dsl_dir_destroy_sync. This bug appears to have been around forever. Thankfully the amount of space typically involved is tiny.

  In addition this adds a mechanism in ZDB to find objects in the MOS which are leaked (not referenced anywhere).

  Porting notes:
  * Added dd_crypto_obj to ZDB MOS object leak tracking

  Authored by: Matthew Ahrens <[email protected]>
  Reviewed-by: George Wilson <[email protected]>
  Reviewed-by: Serapheim Dimitropoulos <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Ported-by: Matthew Ahrens <[email protected]>
  OpenZFS-issue: https://illumos.org/issues/9847
  Closes #7979
* Improved error handling for extreme rewinds | Brian Behlendorf | 2018-10-12 | 6 | -67/+116
  The vdev_checkpoint_sm_object(), vdev_obsolete_sm_object(), and vdev_obsolete_counts_are_precise() functions assume that the only way a zap_lookup() can fail is if the requested entry is missing. While this is the most common cause, it's not the only cause. Attempting to access a damaged ZAP will result in other errors.

  The most likely scenario for accessing a damaged ZAP is during an extreme rewind pool import. Under these conditions the pool is expected to contain damaged objects and the import code was updated to handle this gracefully. Getting an ECKSUM error from these ZAPs after the pool is imported is far less likely, therefore the behavior for those call paths was not modified.

  Reviewed-by: Tim Chase <[email protected]>
  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Serapheim Dimitropoulos <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #7809
  Closes #7921
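The distinction drawn in the commit above (a missing entry is expected, any other lookup failure must be reported) can be sketched as follows. This is a simplified Python model, not the vdev code: `sm_object_lookup` and the injected `zap_lookup` callable are hypothetical, and `errno.EIO` merely stands in for a checksum error from a damaged ZAP.

```python
import errno


def sm_object_lookup(zap_lookup, key):
    """Return (error, object_number).

    A missing entry is not an error and yields object 0; any other failure
    (for example from a damaged ZAP) is passed back to the caller instead of
    being treated as "missing".
    """
    err, obj = zap_lookup(key)
    if err == errno.ENOENT:
        return 0, 0
    if err != 0:
        return err, 0
    return 0, obj
```

A caller on the import path can then decide whether a non-zero error is fatal or something an extreme rewind should tolerate.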
* Revert "Allow ECKSUM in vdev_checkpoint_sm_object()" | Brian Behlendorf | 2018-10-12 | 1 | -8/+4
  This reverts commit e927fc8a522e1c0db89955cc555841aa23bbd634.

  Reviewed by: Tim Chase <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Serapheim Dimitropoulos <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #7921
* Define timestruc_t for Lustre compatibility | Tony Hutter | 2018-10-12 | 1 | -0/+3
  Lustre 2.8 (and possibly other versions) is still using timestruc_t, which was removed in spl-0.7.10 in favor of inode_timespec_t. Add in a backwards compatibility #define for timestruc_t so that Lustre builds.

  Reviewed by: Brian Behlendorf <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Signed-off-by: Tony Hutter <[email protected]>
  Closes #8014
* OpenZFS 9689 - zfs range lock code should not be zpl-specific | Matt Ahrens | 2018-10-11 | 10 | -598/+487
  The ZFS range locking code in zfs_rlock.c/h depends on ZPL-specific data structures, specifically znode_t. However, it's also used by the ZVOL code, which uses a "dummy" znode_t to pass to the range locking code.

  We should clean this up so that the range locking code is generic and can be used equally by ZPL and ZVOL, and also can be used by future consumers that may need to run in userland (libzpool) as well as the kernel.

  Porting notes:
  * Added missing sys/avl.h include to sys/zfs_rlock.h.
  * Removed 'dbuf is within the locked range' ASSERTs from dmu_sync(). This was needed because ztest does not yet use a locked_range_t.
  * Removed "Approved by:" tag requirement from OpenZFS commit check to prevent needless warnings when integrating changes which have not been merged to illumos.
  * Reverted free_list range lock changes which were originally needed to defer the cv_destroy() which was called immediately after cv_broadcast(). With d2733258 this should be safe but if not we may need to reintroduce this logic.
  * Reverts: The following two commits were reverted and squashed into this change in order to make it easier to apply OpenZFS 9689.
    - d88895a0, which removed the dummy znode from zvol_state
    - e3a07cd0, which updated ztest to use range locks
  * Preserved optimized rangelock comparison function. Preserved the rangelock free list. The cv_destroy() function will block waiting for all processes in cv_wait() to be scheduled and drop their reference. This is done to ensure it's safe to free the condition variable. However, blocking while holding the rl->rl_lock mutex can result in a deadlock on Linux. A free list is introduced to defer the cv_destroy() and kmem_free() until after the mutex is released.

  Authored by: Matthew Ahrens <[email protected]>
  Reviewed by: Brian Behlendorf <[email protected]>
  Reviewed by: Serapheim Dimitropoulos <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed by: Brad Lewis <[email protected]>
  Ported-by: Brian Behlendorf <[email protected]>
  OpenZFS-issue: https://illumos.org/issues/9689
  OpenZFS-commit: https://github.com/openzfs/openzfs/pull/680
  External-issue: DLPX-58662
  Closes #7980
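The free-list pattern mentioned in the porting notes (unlink entries while holding the mutex, run the potentially blocking teardown only after dropping it) can be modeled in a few lines of Python. This is a sketch under stated assumptions: `RangeLockTable` and `Waiter` are hypothetical stand-ins, and `destroy()` only records whether the mutex was held rather than blocking like cv_destroy().

```python
import threading


class Waiter:
    """Stand-in for a lock entry that owns a condition variable."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def destroy(self, table_mutex):
        # Record whether the table mutex was held during teardown.
        self.log.append((self.name, table_mutex.locked()))


class RangeLockTable:
    def __init__(self):
        self.mutex = threading.Lock()
        self.entries = {}
        self.log = []

    def add(self, name):
        with self.mutex:
            self.entries[name] = Waiter(name, self.log)

    def release(self, names):
        free_list = []
        with self.mutex:
            # Only unlink under the mutex; defer the teardown.
            for name in names:
                free_list.append(self.entries.pop(name))
        # Mutex dropped: the potentially blocking teardown happens here.
        for waiter in free_list:
            waiter.destroy(self.mutex)
```

The point of the design is that nothing that may sleep runs inside the `with self.mutex` block, so a waiter that still needs the mutex to finish waking up cannot deadlock against its own teardown.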
* Fix changelist mounted-dataset iteration | Alek P | 2018-10-10 | 9 | -28/+201
  Commit 0c6d093 caused a regression in the inherit codepath. The fix is to restrict the changelist iteration on mountpoints and add proper handling for 'legacy' mountpoints.

  Reviewed by: Serapheim Dimitropoulos <[email protected]>
  Reviewed by: Brian Behlendorf <[email protected]>
  Signed-off-by: Alek Pinchuk <[email protected]>
  Closes #7988
  Closes #7991
* Check scheduler for "noop" before setting "noop" | Garrett Fields | 2018-10-10 | 1 | -1/+1
  Originally the code only checked for the presence of "/sys/block/$i/queue/scheduler". "sh: write error: Invalid argument" was produced when trying to set "noop" on certain devices (e.g. virtio) when it isn't a listed option.

  This modification continues to check for the presence of "/sys/block/$i/queue/scheduler" and also checks that it contains "noop" as an option before setting "noop".

  Reviewed-by: Richard Laager <[email protected]>
  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Richard Elling <[email protected]>
  Signed-off-by: Garrett Fields <[email protected]>
  Closes #8004
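The check described above amounts to testing whether "noop" appears among the options listed in the scheduler file, where the kernel marks the active scheduler with square brackets. A minimal Python sketch of that test follows; the actual change is in a shell script, and `supports_scheduler` is a hypothetical helper name.

```python
def supports_scheduler(scheduler_file_text: str, name: str = "noop") -> bool:
    """Return True if `name` is one of the options listed in the contents of
    /sys/block/<dev>/queue/scheduler (the active one appears in brackets)."""
    options = [token.strip("[]") for token in scheduler_file_text.split()]
    return name in options
```

Usage: read the file for a device and only write "noop" to it when `supports_scheduler(text)` is true, which avoids the "Invalid argument" write error on devices that do not offer it.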
* Print "(repairing)" in zpool status again | Tony Hutter | 2018-10-09 | 4 | -4/+84
  Historically, zpool status prints "(repairing)" for any drives that have errors during a scrub:

      NAME            STATE     READ WRITE CKSUM
      mypool          ONLINE       0     0     0
        mirror-0      ONLINE       0     0     0
          /tmp/file1  ONLINE      13     0     0  (repairing)
          /tmp/file2  ONLINE       0     0     0
          /tmp/file3  ONLINE       0     0     0

  This was accidentally broken in "OpenZFS 9166 - zfs storage pool checkpoint" (d2734cc). This patch adds it back in.

  Reviewed-by: Serapheim Dimitropoulos <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tony Hutter <[email protected]>
  Closes #7779
  Closes #7978
* Refactor dmu_recv into its own file | Paul Dagnelie | 2018-10-09 | 12 | -2863/+2969
  This change moves the bottom half of dmu_send.c (where the receive logic is kept) into a new file, dmu_recv.c, and does similarly for receive-related changes in header files.

  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Brian Behlendorf <[email protected]>
  Signed-off-by: Paul Dagnelie <[email protected]>
  Closes #7982
* Fix arc_release() refcount | Brian Behlendorf | 2018-10-09 | 1 | -1/+1
  Update arc_release to use arc_buf_size(). This hunk was accidentally dropped when porting compressed send/recv, 2aa34383b.

  Reviewed-by: Matthew Ahrens <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #8000
* Add zfs_refcount_transfer_ownership_many() | Brian Behlendorf | 2018-10-09 | 3 | -9/+21
  When debugging is enabled and a zfs_refcount_t contains multiple holders using the same key, but different ref_counts, the wrong reference_t may be transferred. Add a zfs_refcount_transfer_ownership_many() function, like the existing zfs_refcount_*_many() functions, to match and transfer the correct refcount_t.

  This issue may occur when using encryption with refcount debugging enabled. An arc_buf_hdr_t can have references for both the hdr->b_l1hdr.b_pabd and hdr->b_crypt_hdr.b_rabd, both of which use the hdr as the reference holder. When unsharing the buffer the p_abd should be transferred.

  This issue does not impact production builds because refcount holders are not tracked.

  Reviewed-by: Matthew Ahrens <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #7219
  Closes #8000
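To make the matching problem concrete, here is a small Python model of tracked references where two entries share a holder but differ in count. It is a sketch only: `Reference` and `transfer_ownership_many` are hypothetical names that mirror the idea, not the kernel data structures.

```python
from dataclasses import dataclass


@dataclass
class Reference:
    holder: str   # who holds the reference
    number: int   # how many counts this reference represents


def transfer_ownership_many(refs, number, current_holder, new_holder):
    """Move the reference matching BOTH holder and count to a new holder.

    Matching on the holder alone could pick the wrong entry when one holder
    has several references with different counts.
    """
    for ref in refs:
        if ref.holder == current_holder and ref.number == number:
            ref.holder = new_holder
            return True
    return False
```

With references `("hdr", 512)` and `("hdr", 4096)` tracked for the same holder, asking to transfer the 4096 count moves only the second entry; a holder-only match would have moved the first one it found.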
* Create /proc/sys/kernel/spl/gitrev with git hash | Matthew Ahrens | 2018-10-08 | 9 | -9/+74
  The existing mechanisms for determining what code is running in the kernel do not always correctly report the git hash. The versions reported there do not reflect changes made since `configure` was run (i.e. incremental builds do not update the version) and they are misleading if git tags are not set up properly. This applies to `modinfo zfs`, `dmesg`, and `/sys/module/zfs/version`.

  There are complicated requirements on how the existing version is generated. Therefore we are leaving that alone, and adding a new mechanism to record and retrieve the git hash: `cat /proc/sys/kernel/spl/gitrev`

  The gitrev is re-generated at compile time, when running `make` (including for incremental builds). The value is the output of `git describe` (or "unknown" if not in a git repo or there are uncommitted changes).

  We're also removing /proc/sys/kernel/spl/version, which was never very useful.

  Reviewed by: Pavel Zakharov <[email protected]>
  Reviewed by: Brian Behlendorf <[email protected]>
  Reviewed by: Tim Chase <[email protected]>
  Signed-off-by: Matthew Ahrens <[email protected]>
  Closes #7931
  Closes #7965
* OpenZFS 9617 - too-frequent TXG sync causes excessive write inflation | Matthew Ahrens | 2018-10-04 | 4 | -11/+18
  Porting notes:
  * Renamed zfs_dirty_data_sync_pct to zfs_dirty_data_sync_percent and changed the type to be consistent with the other dirty module params.
  * Updated zfs-module-parameters.5 accordingly.

  Authored by: Matthew Ahrens <[email protected]>
  Reviewed by: Serapheim Dimitropoulos <[email protected]>
  Reviewed by: Brad Lewis <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed by: Andrew Stormont <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Approved by: Robert Mustacchi <[email protected]>
  Ported-by: Brian Behlendorf <[email protected]>
  OpenZFS-issue: https://illumos.org/issues/9617
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7928f4ba
  Closes #7976
* Warn if checking programs are not installed | Matthew Ahrens | 2018-10-04 | 1 | -0/+10
  `make checkstyle` silently skips checks if the required programs are not installed (e.g. shellcheck, mandoc). Therefore developers may not realize that they are not getting the full suite of code checks. This also applies to more specific targets like `make shellcheck`.

  We should print a warning message when a check is skipped due to missing tools.

  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed by: Pavel Zakharov <[email protected]>
  Reviewed by: Prakash Surya <[email protected]>
  Signed-off-by: Matthew Ahrens <[email protected]>
  Closes #7984
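The behaviour asked for above (warn instead of silently skipping) can be illustrated with a short Python sketch. The real change lives in the Makefile; `missing_tools` and `warn_skipped` are hypothetical helpers, and the lookup function is injectable so the logic can be exercised without the tools installed.

```python
import shutil
import sys


def missing_tools(tools, which=shutil.which):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def warn_skipped(tools, which=shutil.which, stream=sys.stderr):
    """Print one warning per missing tool and return how many were missing."""
    missing = missing_tools(tools, which)
    for tool in missing:
        print(f"warning: skipping check, {tool} is not installed", file=stream)
    return len(missing)
```

Usage: `warn_skipped(["shellcheck", "mandoc"])` reports which of the two checkers named in the commit message are absent on the current machine.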
* Add codecheck make target | Matthew Ahrens | 2018-10-04 | 1 | -1/+3
  We'd like to have tooling that verifies code style, while ignoring the commit message. For example, code does not need to be signed off in order to be tested.

  Current workarounds are to run `make checkstyle` and ignore the commit message errors, or to run `make cstyle shellcheck flake8 mancheck testscheck`, and make sure that list stays updated.

  The solution is to add a new make target, `codecheck`, which does all the code checks. `checkstyle` is now simply `codecheck` + `commitcheck`.

  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Reviewed by: Pavel Zakharov <[email protected]>
  Signed-off-by: Matthew Ahrens <[email protected]>
  Closes #7985
* Fix ASSERT macros to not over-expand | Paul Dagnelie | 2018-10-03 | 2 | -33/+103
  The code reuse in the definitions of the ASSERT and VERIFY macros results in expansion of their arguments before they are stringified, which produces ugly and undesirable output.

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Paul Dagnelie <[email protected]>
  Closes #7884
* Add new fnvlist_lookup_* functions | Paul Dagnelie | 2018-10-03 | 2 | -16/+105
  Reviewed by: Serapheim Dimitropoulos <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Paul Dagnelie <[email protected]>
  Closes #7977
* Verify 'zfs destroy' will unshare the dataset | Prakash Surya | 2018-10-03 | 4 | -22/+104
  This change adds a new test case to the zfs-test suite to verify that when 'zfs destroy' is used on a shared dataset, the dataset will be unshared after the destroy operation completes.

  Reviewed by: loli10K <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Prakash Surya <[email protected]>
  Closes #7941
* Fix "zfs destroy" when "sharenfs=on" is used | Prakash Surya | 2018-10-03 | 3 | -1/+6
  When using "zfs destroy" on a dataset that is using "sharenfs=on" and has been automatically exported (by libzfs), the dataset will not be automatically unexported as it should be. This workflow appears to have been broken by this commit:

      3fd3e56cfd543d7d7a1bf502bfc0db6e24139668

  In that change, the "zfs_unmount" function was modified to use the "mnt.mnt_special" field when determining the mount point that is being unmounted, rather than "mnt.mnt_mountp".

  As a result, when "mntpt" is passed into "zfs_unshare_proto", its value is now the dataset name rather than the mountpoint. Thus, when this value is used with the "is_shared" function (via "zfs_unshare_proto") it will not find a match (since that function assumes it'll be passed the mountpoint) and incorrectly reports that the dataset is not shared.

  This can be easily reproduced with the following commands:

      $ sudo zpool create tank xvdb
      $ sudo zfs create -o sharenfs=on tank/fish
      $ sudo zfs destroy tank/fish
      $ sudo zfs list -r tank
      NAME   USED  AVAIL  REFER  MOUNTPOINT
      tank  97.5K  7.27G    24K  /tank
      $ sudo exportfs
      /tank/fish      <world>
      $ sudo cat /etc/dfs/sharetab
      /tank/fish      -       nfs     rw,crossmnt

  At this point, the "tank/fish" filesystem doesn't exist, but it's still listed as exported when looking at "exportfs" and "/etc/dfs/sharetab".

  Also note, this change brings us back in sync with the illumos code, as it pertains to this one line; on illumos, "mnt.mnt_mountp" is used.

  Reviewed by: loli10K <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Co-authored-by: George Wilson <[email protected]>
  Signed-off-by: Prakash Surya <[email protected]>
  Issue #6143
  Closes #7941
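The mismatch described above is easy to see in a toy model: a lookup keyed by mountpoint finds nothing when handed a dataset name. The Python sketch below parses sharetab-style text such as the line shown in the reproduction; `is_shared` here is a hypothetical re-implementation for illustration, not the libzfs function.

```python
def is_shared(sharetab_text: str, mountpoint: str) -> bool:
    """Return True if `mountpoint` is the first field of any sharetab line."""
    for line in sharetab_text.splitlines():
        fields = line.split()
        if fields and fields[0] == mountpoint:
            return True
    return False


SHARETAB = "/tank/fish\t-\tnfs\trw,crossmnt\n"

shared_by_mountpoint = is_shared(SHARETAB, "/tank/fish")  # mountpoint: matches
shared_by_name = is_shared(SHARETAB, "tank/fish")         # dataset name: no match
```

Passing the dataset name ("tank/fish") instead of the mountpoint ("/tank/fish") makes the share look absent, so the unshare step is skipped and the export lingers.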
* OpenZFS 9677 - panic from zio_write_gang_block() | Brad Lewis | 2018-10-03 | 1 | -6/+14
  Panic from zio_write_gang_block() when creating dump device on fragmented rpool.

  Authored by: Brad Lewis <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed by: Prashanth Sreenivasa <[email protected]>
  Approved by: Robert Mustacchi <[email protected]>
  Ported-by: Behlendorf <[email protected]>
  OpenZFS-issue: https://illumos.org/issues/9677
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7341a7d
  Closes #7975
* OpenZFS 9616 - Bogus error when attempting to set property on read-only pool | Andrew Stormont | 2018-10-03 | 1 | -3/+8
  Authored by: Andrew Stormont <[email protected]>
  Reviewed by: Paul Dagnelie <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Approved by: Robert Mustacchi <[email protected]>
  Ported-by: Brian Behlendorf <[email protected]>
  OpenZFS-issue: https://illumos.org/issues/9616
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/f62db44d
  Closes #7974
* Refcounted DSL Crypto Key Mappings | Tom Caputi | 2018-10-03 | 9 | -101/+195
  Since native ZFS encryption was merged, we have been fighting against a series of bugs that come down to the same problem: Key mappings (which must be present during all I/O operations) are created and destroyed based on dataset ownership, but I/Os have traditionally been allowed to "leak" into the next txg after the dataset is disowned.

  In the past we have attempted to solve this problem by trying to ensure that datasets are disowned after all I/O is finished by calling txg_wait_synced(), but we have repeatedly found edge cases that need to be squashed and code paths that might incur a high number of txg syncs.

  This patch attempts to resolve this issue differently, by adding a reference to the key mapping for each txg it is dirtied in. By doing so, we can remove many of the unnecessary calls to txg_wait_synced() we have added in the past and ensure we don't need to deal with this problem in the future.

  Reviewed-by: Jorgen Lundman <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tom Caputi <[email protected]>
  Closes #7949
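The approach described above (one extra reference on the key mapping per txg in which the dataset is dirtied) can be modeled in Python as follows. This is a deliberately simplified sketch: `KeyMappingTable` and its methods are hypothetical and only track counts, not keys.

```python
class KeyMappingTable:
    """Toy model: a mapping lives as long as its reference count is non-zero."""

    def __init__(self):
        self.refs = {}      # dataset -> reference count
        self.dirty = set()  # (dataset, txg) pairs that hold a reference

    def own(self, dataset):
        # Owning the dataset creates (or references) its key mapping.
        self.refs[dataset] = self.refs.get(dataset, 0) + 1

    def dirty_in_txg(self, dataset, txg):
        # The first dirtying in a given txg takes one extra reference.
        if (dataset, txg) not in self.dirty:
            self.dirty.add((dataset, txg))
            self.refs[dataset] += 1

    def _release(self, dataset):
        self.refs[dataset] -= 1
        if self.refs[dataset] == 0:
            del self.refs[dataset]

    def disown(self, dataset):
        self._release(dataset)

    def sync_done(self, txg):
        # Once a txg has synced, drop the references it was holding.
        for dataset, t in [pair for pair in self.dirty if pair[1] == txg]:
            self.dirty.discard((dataset, t))
            self._release(dataset)

    def has_mapping(self, dataset):
        return dataset in self.refs
```

In this model, disowning a dataset that was dirtied in txg 5 leaves the mapping in place until `sync_done(5)` runs, so in-flight I/O for that txg still finds its key without forcing a wait at disown time.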
* OpenZFS 9700 - ZFS resilvered mirror does not balance reads | Jerry Jelinek | 2018-10-02 | 1 | -0/+4
  Authored by: Jerry Jelinek <[email protected]>
  Reviewed by: Toomas Soome <[email protected]>
  Reviewed by: Sanjay Nadkarni <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: Richard Elling <[email protected]>
  Approved by: Matthew Ahrens <[email protected]>
  Ported-by: Brian Behlendorf <[email protected]>
  OpenZFS-issue: https://illumos.org/issues/9700
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/82f63c3c
  Closes #7973
* OpenZFS 9763 - zfs(1M): broken formatting in allow/unallow description | Yuri Pankov | 2018-10-02 | 1 | -2/+3
  Porting notes:
  * Two of the three changes from the upstream patch were already applied for Linux. Only the last one is required.

  Authored by: Yuri Pankov <[email protected]>
  Reviewed by: Robert Mustacchi <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Approved by: Gordon Ross <[email protected]>
  Ported-by: Brian Behlendorf <[email protected]>
  OpenZFS-issue: https://illumos.org/issues/9763
  OpenZFS-commit: https://github.com/openzfs/openzfs/commit/8a702e55
  Closes #7972
* changelist should be able to iter on mounts | Alek P | 2018-10-02 | 6 | -68/+366
  Modified changelist_gather()ing for the mountpoint property. Now instead of iterating on all dataset descendants, we read /proc/self/mounts and iterate on the mounted descendant datasets only.

  Switched changelist implementation from a uu_list_* to uu_avl_* in order to reduce the changelist code-path's worst case time complexity.

  Reviewed by: Don Brady <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Alek Pinchuk <[email protected]>
  Closes #7967
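As an illustration of iterating only over mounted descendants, the Python sketch below filters /proc/self/mounts-style text for zfs entries at or under a parent dataset. `mounted_descendants` is a hypothetical helper; it ignores the octal escaping that the mounts file applies to spaces in paths.

```python
def mounted_descendants(mounts_text: str, parent: str):
    """Return (dataset, mountpoint) pairs for zfs mounts at or below `parent`."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields are: source, mountpoint, fstype, options, dump, pass.
        if len(fields) < 3 or fields[2] != "zfs":
            continue
        name, mountpoint = fields[0], fields[1]
        if name == parent or name.startswith(parent + "/"):
            result.append((name, mountpoint))
    return result
```

The work done is proportional to the number of mounted filesystems rather than to the number of descendant datasets, which is the saving the commit describes.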
* ZTS: Fix snapshot_009_pos, snapshot_010_pos | Brian Behlendorf | 2018-10-01 | 4 | -10/+9
  Mitigate the likelihood of the newly created volumes being busy when the 'zfs destroy -r' is issued by waiting for udev to settle. Since this is not an ironclad fix I've added the test case to the known list of possible failures and referenced issue #7961.

  Finally, in case this test does fail, fix the cleanup logic so subsequent tests won't incorrectly fail.

  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed-by: John Kennedy <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #7961
  Closes #7962
* Prefix all refcount functions with zfs_ | Tim Schumacher | 2018-10-01 | 23 | -403/+422
  Recent changes in the Linux kernel made it necessary to prefix the refcount_add() function with zfs_ due to a name collision. To bring the other functions in line with that and to avoid future collisions, prefix the other refcount functions as well.

  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tim Schumacher <[email protected]>
  Closes #7963
* Remove duplicate macro in dsl_dir.h | Matthew Ahrens | 2018-10-01 | 1 | -1/+0
  The DD_FIELD_LAST_REMAP_TXG macro was added twice (with the same value). This change removes one of them.

  Reviewed-by: Giuseppe Di Natale <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Reviewed by: Pavel Zakharov <[email protected]>
  Reviewed-by: John Kennedy <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Matthew Ahrens <[email protected]>
  Closes #7968
* Refine split block reconstruction | Brian Behlendorf | 2018-10-01 | 3 | -123/+274
  Due to a flaw in 4589f3ae the number of unique combinations could be calculated incorrectly. This could result in the random combinations reconstruction being used when it would have been possible to check all combinations.

  This change fixes the unique combinations calculation and simplifies the reconstruction logic by maintaining a per-segment list of unique copies.

  The vdev_indirect_splits_damage() function was introduced to validate both the enumeration and random reconstruction logic with ztest. It is implemented such that it will never make a known recoverable block unrecoverable.

  Reviewed-by: Matthew Ahrens <[email protected]>
  Reviewed-by: Serapheim Dimitropoulos <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #6900
  Closes #7934
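With a per-segment list of unique copies, the number of unique combinations is the product of the list lengths, and that count decides between checking every combination and sampling at random. The Python sketch below shows that decision; `reconstruction_candidates`, `max_enumerated`, and `attempts` are hypothetical names and thresholds, not the vdev_indirect tunables.

```python
import itertools
import random


def count_combinations(unique_copies):
    """Product of the number of unique copies available for each segment."""
    total = 1
    for copies in unique_copies:
        total *= len(copies)
    return total


def reconstruction_candidates(unique_copies, max_enumerated, attempts, rng):
    """Enumerate every combination when feasible, otherwise sample randomly."""
    if count_combinations(unique_copies) <= max_enumerated:
        return list(itertools.product(*unique_copies))
    return [tuple(rng.choice(copies) for copies in unique_copies)
            for _ in range(attempts)]
```

Usage: `reconstruction_candidates([["a", "b"], ["c"], ["d", "e", "f"]], 10, 3, random.Random(0))` enumerates all six combinations, whereas a limit of 4 would fall back to three random picks.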
* Fixes for procfs files backed by linked lists | John Gallagher | 2018-09-26 | 24 | -558/+1086
  There are some issues with the way the seq_file interface is implemented for kstats backed by linked lists (zfs_dbgmsgs and certain per-pool debugging info):

  * We don't account for the fact that seq_file sometimes visits a node multiple times, which results in missing messages when read through procfs.
  * We don't keep separate state for each reader of a file, so concurrent readers will receive incorrect results.
  * We don't account for the fact that entries may have been removed from the list between read syscalls, so reading from these files in procfs can cause the system to crash.

  This change fixes these issues and adds procfs_list, a wrapper around a linked list which abstracts away the details of implementing the seq_file interface for a list and exposing the contents of the list through procfs.

  Reviewed by: Don Brady <[email protected]>
  Reviewed-by: Serapheim Dimitropoulos <[email protected]>
  Reviewed by: Brad Lewis <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: John Gallagher <[email protected]>
  External-issue: LX-1211
  Closes #7819
* Fix flake 8 style warnings | Gregor Kopka | 2018-09-26 | 2 | -35/+42
  Ran zts-report.py and test-runner.py from ./tests/test-runner/bin/ through 2to3 (https://docs.python.org/2/library/2to3.html).

  Checked the result, fixed:
  - 'maxint' -> 'maxsize' that 2to3 missed.
  - replaced the 'cmp=' parameter of a 'sorted()' call with a 'key=' version.
  - try/except wrapping of configparser import as there are still python 2.7 systems that lack a compatibility shim

  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Gregor Kopka <[email protected]>
  Closes #7925
  Closes #7952
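The three kinds of fixes listed above look roughly like this in isolation. The snippet is a generic Python sketch of the patterns, not an excerpt from zts-report.py or test-runner.py; `by_length_then_name` is a hypothetical example function.

```python
import sys

# Python 3 names the module configparser; Python 2.7 without a compatibility
# shim only provides ConfigParser, hence the guarded import.
try:
    import configparser
except ImportError:
    import ConfigParser as configparser

# sys.maxint no longer exists in Python 3; sys.maxsize works on both.
LARGEST = sys.maxsize


def by_length_then_name(names):
    """Sort with key= instead of the removed cmp= parameter."""
    return sorted(names, key=lambda name: (len(name), name))
```

A `key=` function returning a tuple reproduces the usual "compare this, then that" logic of an old `cmp=` callback without needing `functools.cmp_to_key`.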
* Linux 4.19-rc3+ compat: Remove refcount_t compat | Tim Schumacher | 2018-09-26 | 32 | -127/+116
  torvalds/linux@59b57717f ("blkcg: delay blkg destruction until after writeback has finished") added a refcount_t to the blkcg structure. Due to the refcount_t compatibility code, zfs_refcount_t was used by mistake.

  Resolve this by removing the compatibility code and replacing the occurrences of refcount_t with zfs_refcount_t.

  Reviewed-by: Franz Pletz <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Tim Schumacher <[email protected]>
  Closes #7885
  Closes #7932
* Fix small sysfs leak | Brian Behlendorf | 2018-09-26 | 1 | -10/+15
  When zfs_kobj_init() is called with an attr_cnt of 0 only the kobj->zko_default_attrs is allocated. It subsequently won't get freed in zfs_kobj_release since the free is wrapped in a kobj->zko_attr_count != 0 conditional.

  Split the block in zfs_kobj_release() to make sure the kobj->zko_default_attrs are freed in this case.

  Additionally, fix a minor spelling mistake and typo in zfs_kobj_init() which could also cause a leak but in practice is almost certain not to fail.

  Reviewed-by: Richard Elling <[email protected]>
  Reviewed-by: Tim Chase <[email protected]>
  Reviewed-by: John Gallagher <[email protected]>
  Reviewed-by: Don Brady <[email protected]>
  Reviewed-by: George Melikov <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #7957
* Zpool iostat: remove latency/queue scaling | Gregor Kopka | 2018-09-25 | 1 | -6/+6
  Bandwidth and iops are averages per second, while *_wait values are averages per request for latency or, for queue depths, an instantaneous measurement at the end of an interval (according to man zpool).

  When calculating the first two it makes sense to do x/interval_duration (x being the increase in total bytes or number of requests over the duration of the interval, interval_duration in seconds) to 'scale' from amount/interval_duration to amount/second.

  But applying the same math for the latter (*_wait latencies/queue) is wrong as there is no interval_duration component in the values (these are time/requests to get to average_time/request or already an absolute number).

  This bug means the only correct continuous *_wait figures for both latencies and queue depths from 'zpool iostat -l/q' are those with duration=1, as then the wrong math cancels itself out (x/1 is a no-op).

  This removes temporal scaling from latency and queue depth figures.

  Reviewed-by: Tony Hutter <[email protected]>
  Reviewed-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Gregor Kopka <[email protected]>
  Closes #7945
  Closes #7694
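The arithmetic in the commit above can be checked with a small worked example. This Python sketch assumes a simplified model where the per-interval deltas are already known; `interval_stats` and its parameters are hypothetical and do not reflect how zpool iostat gathers its numbers.

```python
def interval_stats(delta_bytes, delta_requests, delta_wait_ns, interval_s):
    """Return (bytes per second, requests per second, average wait per request).

    Throughput figures are divided by the interval length; the average wait is
    total wait time divided by the number of requests and is NOT divided by
    the interval length again.
    """
    bandwidth = delta_bytes / interval_s
    iops = delta_requests / interval_s
    avg_wait_ns = delta_wait_ns / delta_requests if delta_requests else 0.0
    return bandwidth, iops, avg_wait_ns
```

For 500 requests moving 10 MB with 1 s of accumulated wait over a 5 s interval, the average wait is 2 ms per request; dividing that by 5 again, as the old code effectively did, would report 0.4 ms.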
* Revert "Fix flake 8 style warnings" | Brian Behlendorf | 2018-09-24 | 2 | -37/+35
  This reverts commit b8fd4310c54444eecb66140d99a6156f4353b29b which accidentally introduced a regression for some versions of python.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #7929
* Fix statfs(2) for 32-bit user space    Brian Behlendorf    2018-09-24    5    -5/+64
| | | | | | | | | | | | | | | | | | | | | | | | When handling a 32-bit statfs() system call the returned fields, although 64-bit in the kernel, must be limited to 32 bits or an EOVERFLOW error will be returned. This is less of an issue for block counts since the default reported block size is 128KiB. But since it is possible to set a smaller block size, these values will be scaled as needed to fit in a 32-bit unsigned long. Unlike most other filesystems the total possible file counts are more likely to overflow because they are calculated based on the available free space in the pool. In order to prevent this the reported value must be capped at 2^32-1. This is only for statfs(2) reporting, there are no changes to the internal ZFS limits. Reviewed-by: Andreas Dilger <[email protected]> Reviewed-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #7927 Closes #7122 Closes #7937
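The two adjustments described here, scaling block counts and capping file counts, might look roughly like the sketch below. It is written under assumptions: the function names are hypothetical, and doubling the reported block size until the count fits is one plausible way to do the scaling, not necessarily what the commit implements.

```python
UINT32_MAX = 2**32 - 1


def scale_blocks(block_size, total_blocks):
    """Double the reported block size (assumed strategy) until the block
    count fits in a 32-bit unsigned value."""
    while total_blocks > UINT32_MAX:
        block_size *= 2
        total_blocks //= 2
    return block_size, total_blocks


def cap_files(total_files):
    """Cap the reported file count at 2^32-1, as the message describes."""
    return min(total_files, UINT32_MAX)


# A hypothetical filesystem reporting 2^34 blocks of 512 bytes.
reported = scale_blocks(512, 2**34)
```

The reported capacity (block size times block count) is preserved up to rounding, and only the values handed back to a 32-bit caller change.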
* ZTS: Fix removal_resume_export    LOLi    2018-09-24    2    -48/+4
| | | | | | | | | | | This change simplifies the test case by removing part of the logic which was introducing a race condition and thus causing spurious failures: we use attempt_during_removal() from removal.kshlib instead which has been observed to be more stable. Reviewed by: Matthew Ahrens <[email protected]> Reviewed-by: John Kennedy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #7894 Closes #7913
* Fix flake 8 style warnings    Gregor Kopka    2018-09-24    2    -35/+37
| | | | | | | | | | | | | Ran zts-report.py and test-runner.py from ./tests/test-runner/bin/ through the 2to3 (https://docs.python.org/2/library/2to3.html). Checked the result, fixed: - 'maxint' -> 'maxsize' that 2to3 missed. - 'cmp=' parameter for a 'sorted()' with a 'key=' version. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: John Wren Kennedy <[email protected]> Signed-off-by: Gregor Kopka <[email protected]> Closes #7925 Closes #7929
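The two manual fixes listed in this message correspond to well-known Python 2 to Python 3 changes. The snippet below is a generic illustration, not the code from zts-report.py or test-runner.py: the sample data and the comparator are hypothetical.

```python
import sys
from functools import cmp_to_key

# Python 2 had sys.maxint; Python 3 only provides sys.maxsize.
largest = sys.maxsize

# Hypothetical (test name, runtime) pairs.
results = [("test_b", 3), ("test_a", 1), ("test_c", 2)]

# Python 2: sorted(results, cmp=lambda a, b: cmp(a[0], b[0]))
# Python 3: sorted() no longer accepts cmp=, so a key function is used.
by_name = sorted(results, key=lambda item: item[0])


def compare_runtime(a, b):
    """Old-style comparator returning a negative, zero, or positive value."""
    return a[1] - b[1]


# An existing comparator can be kept by wrapping it with cmp_to_key().
by_runtime = sorted(results, key=cmp_to_key(compare_runtime))
```

A plain key function is usually the simpler choice; cmp_to_key() is mainly useful when a comparator cannot easily be expressed as a key.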
* vdev_disk_error() prints ASCII SOH to debug log    LOLi    2018-09-21    1    -4/+3
| | | | | | | | | | | | | | | | | | Currently vdev_disk_error() prepends its messages sent to the internal ZFS debug log with KERN_WARNING, which is currently defined as follows: #define KERN_SOH "\001" #define KERN_WARNING KERN_SOH "4" Since "\001" (ASCII Start Of Header) is not printable this results in weird characters displayed when inspecting the debug log. This commit simply removes this superfluous prefix passed to zfs_dbgmsg(). Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #7936
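The effect of the prefix can be reproduced outside the kernel. This sketch mirrors the two macro definitions quoted above as Python strings; the message text is a hypothetical placeholder, not an actual vdev_disk_error() message.

```python
# Mirrors the kernel macros quoted in the commit message.
KERN_SOH = "\001"
KERN_WARNING = KERN_SOH + "4"

# Hypothetical debug message text, for illustration only.
text = "zio error=5"
prefixed = KERN_WARNING + text

# The prefix adds one unprintable control byte plus the level digit.
has_control_char = not prefixed.isprintable()
extra_chars = len(prefixed) - len(text)
```

The log level prefix is meaningful to printk(), which strips it, but a plain debug log stores the bytes as given, so the control character shows up when the log is read.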
* Fix reference to zpool-features(5)    DeHackEd    2018-09-21    1    -1/+1
| | | | | | | | Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: DHE <[email protected]> Closes #7938
* Add limits to spa_slop_shift tunable    LOLi    2018-09-20    1    -1/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This change adds limits to the possible spa_slop_shift values set via the sysfs interface. Accepted values are from a minimum of 1 to a maximum of 31 (inclusive): these limits are based on the following values observed on a 128PB file-vdev test pool: spa_slop_shift=1, spa_get_slop_space=63.5PiB spa_slop_shift=2, spa_get_slop_space=31.8PiB spa_slop_shift=3, spa_get_slop_space=15.9PiB spa_slop_shift=4, spa_get_slop_space=7.9PiB spa_slop_shift=5, spa_get_slop_space=4PiB spa_slop_shift=6, spa_get_slop_space=2PiB ... spa_slop_shift=25, spa_get_slop_space=4GiB spa_slop_shift=26, spa_get_slop_space=2GiB spa_slop_shift=27, spa_get_slop_space=1016MiB spa_slop_shift=28, spa_get_slop_space=508MiB spa_slop_shift=29, spa_get_slop_space=254MiB spa_slop_shift=30, spa_get_slop_space=128MiB spa_slop_shift=31, spa_get_slop_space=128MiB spa_slop_shift=32, spa_get_slop_space=128MiB Reviewed-by: Richard Elling <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #7876 Closes #7900
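The values listed in this message follow from a right shift with a lower bound. The sketch below reproduces that pattern under stated assumptions: the 128MiB floor is inferred from the last three rows of the list, the usable pool size of 127PiB is inferred from the spa_slop_shift=1 row, and both function names are hypothetical rather than the module's own.

```python
MIB = 2**20
PIB = 2**50
MIN_SLOP = 128 * MIB  # assumed floor, inferred from the shift=30..32 rows


def validate_slop_shift(value):
    """Reject values outside the 1..31 range (inclusive) named in the message."""
    if not 1 <= value <= 31:
        raise ValueError("spa_slop_shift must be between 1 and 31")
    return value


def slop_space(pool_bytes, shift):
    """Pool size shifted right, but never below the assumed floor
    (or half the pool, for pools smaller than twice the floor)."""
    return max(pool_bytes >> shift, min(pool_bytes >> 1, MIN_SLOP))


# Assumed usable size of the 128PB test pool.
pool = 127 * PIB
slop_at_1 = slop_space(pool, validate_slop_shift(1))    # 63.5PiB
slop_at_29 = slop_space(pool, validate_slop_shift(29))  # 254MiB
slop_at_30 = slop_space(pool, validate_slop_shift(30))  # floor applies
```

Past a shift of 29 the floor takes over on a pool of this size, so larger shifts no longer change the reserved space.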