Commit message | Author | Age | Files | Lines
* Illumos 6268 - zfs diff confused by moving a file to another directory | Joshua M. Clulow | 2016-01-12 | 1 | -12/+2
    6268 zfs diff confused by moving a file to another directory
    Reviewed by: Matthew Ahrens <[email protected]>
    Reviewed by: Justin Gibbs <[email protected]>
    Approved by: Dan McDonald <[email protected]>

    References:
      https://www.illumos.org/issues/6268
      https://github.com/illumos/illumos-gate/commit/aab0441

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 6171 - dsl_prop_unregister() slows down dataset eviction. | Justin T. Gibbs | 2016-01-12 | 8 | -190/+186
    6171 dsl_prop_unregister() slows down dataset eviction.
    Reviewed by: Matthew Ahrens <[email protected]>
    Reviewed by: Prakash Surya <[email protected]>
    Approved by: Dan McDonald <[email protected]>

    References:
      https://www.illumos.org/issues/6171
      https://github.com/illumos/illumos-gate/commit/03bad06

    Porting notes:
    - Conflicts
      - 3558fd7 Prototype/structure update for Linux
      - 2cf7f52 Linux compat 2.6.39: mount_nodev()
      - 13fe019 Illumos #3464
      - 241b541 Illumos 5959 - clean up per-dataset feature count code
    - dsl_prop_unregister() preserved until out of tree consumers like
      Lustre can transition to dsl_prop_unregister_all().
    - Fixing 'space or tab at end of line' in include/sys/dsl_dataset.h

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 6288 - dmu_buf_will_dirty could be faster | Matthew Ahrens | 2016-01-12 | 1 | -10/+52
    6288 dmu_buf_will_dirty could be faster
    Reviewed by: George Wilson <[email protected]>
    Reviewed by: Paul Dagnelie <[email protected]>
    Reviewed by: Justin Gibbs <[email protected]>
    Reviewed by: Richard Elling <[email protected]>
    Approved by: Robert Mustacchi <[email protected]>

    References:
      https://www.illumos.org/issues/6288
      https://github.com/illumos/illumos-gate/commit/0f2e7d0

    Porting notes:
    - [module/zfs/dbuf.c]
      - Fix 'warning: ISO C90 forbids mixed declarations and code' by
        moving 'dbuf_dirty_record_t *dr' to start of code block.

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 6292 - exporting a pool while an async destroy | George Wilson | 2016-01-12 | 1 | -2/+15
    6292 exporting a pool while an async destroy is running can leave
    entries in the deferred tree
    Reviewed by: Paul Dagnelie <[email protected]>
    Reviewed by: Matthew Ahrens <[email protected]>
    Reviewed by: Andriy Gapon <[email protected]>
    Reviewed by: Fabian Keil <[email protected]>
    Approved by: Gordon Ross <[email protected]>

    References:
      https://www.illumos.org/issues/6292
      https://github.com/illumos/illumos-gate/commit/a443cc8

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 6319 - assertion failed in zio_ddt_write: bp->blk_birth == txg | Matthew Ahrens | 2016-01-12 | 2 | -2/+2
    6319 assertion failed in zio_ddt_write: bp->blk_birth == txg
    Reviewed by: George Wilson <[email protected]>
    Approved by: Dan McDonald <[email protected]>

    References:
      https://www.illumos.org/issues/6319
      https://github.com/illumos/illumos-gate/commit/b39b744

    Porting notes:
    - Re-enabled ztest for CentOS test slaves.

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #3449
* Illumos 5987 - zfs prefetch code needs work | Matthew Ahrens | 2016-01-12 | 10 | -658/+280
    5987 zfs prefetch code needs work
    Reviewed by: Adam Leventhal <[email protected]>
    Reviewed by: George Wilson <[email protected]>
    Reviewed by: Paul Dagnelie <[email protected]>
    Approved by: Gordon Ross <[email protected]>

    References:
      https://www.illumos.org/issues/5987 zfs prefetch code needs work
      illumos/illumos-gate@cf6106c 5987 zfs prefetch code needs work

    Porting notes:
    - [module/zfs/dbuf.c]
      - 5f6d0b6 Handle block pointers with a corrupt logical size
    - [module/zfs/dmu_zfetch.c]
      - c65aa5b Fix gcc missing parenthesis warnings
      - 428870f Update core ZFS code from build 121 to build 141.
      - 79c76d5 Change KM_PUSHPAGE -> KM_SLEEP
      - b8d06fc Switch KM_SLEEP to KM_PUSHPAGE
      - Account for ISO C90 - mixed declarations and code - warnings
      - Module parameters (new/changed):
        - Replaced zfetch_block_cap with zfetch_max_distance
          (Max bytes to prefetch per stream (default 8MB; 8 * 1024 * 1024))
        - Preserved zfs_prefetch_disable as 'int' for consistency with
          existing Linux module options.
    - [include/sys/trace_arc.h]
      - Added new tracepoints
        - DEFINE_ARC_BUF_HDR_EVENT(zfs_arc__sync__wait__for__async);
        - DEFINE_ARC_BUF_HDR_EVENT(zfs_arc__demand__hit__predictive__prefetch);
    - [man/man5/zfs-module-parameters.5]
      - Updated man page

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 6293 - ztest failure: error == 28 (0xc == 0x1c) in ztest_tx_assign() | Brian Behlendorf | 2016-01-11 | 1 | -1/+11
    6293 ztest failure: error == 28 (0xc == 0x1c) in ztest_tx_assign()
    Reviewed by: George Wilson <[email protected]>
    Reviewed by: Prakash Surya <[email protected]>
    Reviewed by: Richard Elling <[email protected]>
    Approved by: Richard Lowe <[email protected]>

    References:
      https://www.illumos.org/issues/6293
      https://github.com/illumos/illumos-gate/commit/8fe00bf

    Ported-by: Brian Behlendorf <[email protected]>
* Illumos 5039 - ztest should default to larger device sizes | Brian Behlendorf | 2016-01-11 | 1 | -1/+1
    5039 ztest should default to larger device sizes
    Reviewed by: George Wilson <[email protected]>
    Reviewed by: Max Grossman <[email protected]>
    Reviewed by: Christopher Siden <[email protected]>
    Reviewed by: Saso Kiselkov <[email protected]>
    Approved by: Richard Lowe <[email protected]>

    References:
      https://www.illumos.org/issues/5039
      https://github.com/illumos/illumos-gate/commit/539eed8

    Ported-by: Brian Behlendorf <[email protected]>
* Revert "Illumos 3749 - zfs event processing should work on R/O root filesystems" | Brian Behlendorf | 2016-01-11 | 4 | -78/+20
    This reverts commit b47637ecdc7b647ec5bd9dfca888179eecfaa72d which
    introduced a regression in ztest.

    $ ./cmd/ztest/ztest -V
    5 vdevs, 7 datasets, 23 threads, 300 seconds...
    *** Error in `/rpool/home/behlendo/src/git/zfs/cmd/ztest/.libs/lt-ztest':
    double free or corruption (fasttop): 0x0000000000d339f0 ***
* Fix vn_rdwr() compiler warning | Brian Behlendorf | 2016-01-11 | 1 | -2/+2
    kernel.c: In function 'vn_rdwr':
    kernel.c:736:8: warning: unused variable 'status' [-Wunused-variable]

    Signed-off-by: Brian Behlendorf <[email protected]>
* Fix 'prevsnap property' build failure | Brian Behlendorf | 2016-01-11 | 1 | -1/+1
    Fix build failure accidentally introduced by 1715493. This only
    results in a failure when debugging is disabled.

    dsl_dataset.c: In function 'dsl_dataset_stats':
    dsl_dataset.c:1698:45: error: 'dp' undeclared (first use in this function)

    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 4929 - want prevsnap property | Matthew Ahrens | 2016-01-11 | 3 | -0/+9
    4929 want prevsnap property
    Reviewed by: Adam Leventhal <[email protected]>
    Reviewed by: Matt Amdur <[email protected]>
    Reviewed by: Saso Kiselkov <[email protected]>
    Reviewed by: Boris Protopopov <[email protected]>
    Reviewed by: Richard Lowe <[email protected]>
    Approved by: Dan McDonald <[email protected]>

    References:
      https://www.illumos.org/issues/4929
      https://github.com/illumos/illumos-gate/commit/b461c74

    Porting notes:
    - [include/sys/fs/zfs.h]
      - f67d70 Create an 'overlay' property
      - 11b9ec Add full SELinux support
    - [fs/zfs/dsl_dataset.c]
      - This increases the stack size of dsl_dataset_stats() but nothing
        has been changed until this is shown to be an issue.

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 4891 - want zdb option to dump all metadata | Matthew Ahrens | 2016-01-11 | 4 | -10/+62
    4891 want zdb option to dump all metadata
    Reviewed by: Sonu Pillai <[email protected]>
    Reviewed by: George Wilson <[email protected]>
    Reviewed by: Christopher Siden <[email protected]>
    Reviewed by: Dan McDonald <[email protected]>
    Reviewed by: Richard Lowe <[email protected]>
    Approved by: Garrett D'Amore <[email protected]>

    We'd like a way for zdb to dump metadata in a machine-readable
    format, so that we can bring that back from a customer site for
    in-house diagnosis. Think of it as a crash dump for zpools, which
    can be used for post-mortem analysis of a malfunctioning pool.

    References:
      https://www.illumos.org/issues/4891
      https://github.com/illumos/illumos-gate/commit/df15e41

    Porting notes:
    - [cmd/zdb/zdb.c]
      - a5778ea zdb: Introduce -V for verbatim import
      - In main() getopt 'opt' variable removed and the code was
        brought back in line with illumos.
    - [lib/libzpool/kernel.c]
      - 1e33ac1 Fix Solaris thread dependency by using pthreads
      - f0e324f Update utsname support
      - 4d58b69 Fix vn_open/vn_rdwr error handling
      - In vn_open() allocate 'dumppath' on heap instead of stack
      - Properly handle 'dump_fd == -1' error path
      - Free 'realpath' after added vn_dumpdir_code block

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 4638 - Panic in ZFS via rfs3_setattr()/rfs3_write(): dirtying snapshot! | Marcel Telka | 2016-01-11 | 1 | -0/+19
    4638 Panic in ZFS via rfs3_setattr()/rfs3_write(): dirtying snapshot!
    Reviewed by: Alek Pinchuk <[email protected]>
    Reviewed by: Ilya Usvyatsky <[email protected]>
    Reviewed by: Dan McDonald <[email protected]>
    Reviewed by: Matthew Ahrens <[email protected]>
    Reviewed by: Garrett D'Amore <[email protected]>
    Approved by: Garrett D'Amore <[email protected]>

    References:
      https://www.illumos.org/issues/4638
      https://github.com/illumos/illumos-gate/commit/2144b12

    Porting notes:
    - [module/zfs/zfs_vnops.c]
      - 3558fd7 Prototype/structure update for Linux
      - 2cf7f52 Linux compat 2.6.39: mount_nodev()
      - Use zfs_is_readonly() wrapper
      - Remove first line of comment which doesn't apply

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Man page whitespace | nathancheek | 2016-01-11 | 1 | -0/+2
    Signed-off-by: nathancheek <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4184
* Illumos 3749 - zfs event processing should work on R/O root filesystems | Will Andrews | 2016-01-11 | 4 | -20/+78
    3749 zfs event processing should work on R/O root filesystems
    Reviewed by: Matthew Ahrens <[email protected]>
    Reviewed by: Eric Schrock <[email protected]>
    Approved by: Christopher Siden <[email protected]>

    References:
      https://www.illumos.org/issues/3749
      https://github.com/illumos/illumos-gate/commit/3cb69f7

    Porting notes:
    - [include/sys/spa_impl.h]
      - ffe9d38 Add generic errata infrastructure
      - 1421c89 Add visibility in to arc_read
    - [include/sys/fm/fs/zfs.h]
      - 2668527 Add linux events
      - 6283f55 Support custom build directories and move includes
    - [module/zfs/spa_config.c]
      - Updated spa_config_sync() to match illumos with the exception
        of a Linux specific block.

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Allow 16M send/recv blocks | Brian Behlendorf | 2016-01-08 | 1 | -1/+1
    Fix an off by one error introduced by fcff0f3 which triggers an
    assertion when 16M blocks are used with send/recv. This fix was
    intentionally not folded into the Illumos commit so it can be
    easily cherry-picked by upstream.

    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 5960, 5925 | Paul Dagnelie | 2016-01-08 | 40 | -389/+1418
    5960 zfs recv should prefetch indirect blocks
    5925 zfs receive -o origin=
    Reviewed by: Prakash Surya <[email protected]>
    Reviewed by: Matthew Ahrens <[email protected]>

    References:
      https://www.illumos.org/issues/5960
      https://www.illumos.org/issues/5925
      https://github.com/illumos/illumos-gate/commit/a2cdcdd

    Porting notes:
    - [lib/libzfs/libzfs_sendrecv.c]
      - b8864a2 Fix gcc cast warnings
      - 325f023 Add linux kernel device support
      - 5c3f61e Increase Linux pipe buffer size on 'zfs receive'
    - [module/zfs/zfs_vnops.c]
      - 3558fd7 Prototype/structure update for Linux
      - c12e3a5 Restructure zfs_readdir() to fix regressions
    - [module/zfs/zvol.c]
      - Function @zvol_map_block() isn't needed in ZoL
      - 9965059 Prefetch start and end of volumes
    - [module/zfs/dmu.c]
      - Fixed ISO C90 - mixed declarations and code
      - Function dmu_prefetch() 'int i' is initialized before the
        following code block (c90 vs. c99)
    - [module/zfs/dbuf.c]
      - fc5bb51 Fix stack dbuf_hold_impl()
      - 9b67f60 Illumos 4757, 4913
      - 34229a2 Reduce stack usage for recursive traverse_visitbp()
    - [module/zfs/dmu_send.c]
      - Fixed ISO C90 - mixed declarations and code
      - b58986e Use large stacks when available
      - 241b541 Illumos 5959 - clean up per-dataset feature count code
      - 77aef6f Use vmem_alloc() for nvlists
      - 00b4602 Add linux kernel memory support

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Add missing -V option to zdb | Brian Behlendorf | 2016-01-08 | 1 | -2/+2
    Add missing getopt specifier for `zdb -V` verbatim option and
    set flag with correct bitwise operator.

    Signed-off-by: Brian Behlendorf <[email protected]>
* Fix casesensitivity=insensitive deadlock | Richard Sharpe | 2016-01-08 | 1 | -2/+30
    When casesensitivity=insensitive is set for the file system, we can
    deadlock in a rename if the user uses different case for each path.
    For example rename("A/some-file.txt", "a/some-file.txt").

    The simple test for this is:

    1. mkdir some-dir in a ZFS file system
    2. touch some-dir/some-file.txt
    3. mv Some-dir/some-file.txt some-dir/some-other-file.txt

    This last request deadlocks trying to relock the i_mutex on the
    inode for the parent directory.

    The solution is to use d_add_ci in zpl_lookup if we are on a file
    system that has the casesensitivity=insensitive attribute set.

    This patch checks if we are working on a case insensitive file
    system and if so, allocates storage for the case insensitive name
    and passes it to zfs_lookup and then calls d_add_ci instead of
    d_splice_alias.

    The performance impact seems to be minimal even though we have
    introduced a kmalloc and kfree in the lookup path.

    The problem was found when running Microsoft's FSCT against Samba
    on top of ZFS On Linux.

    Signed-off-by: Richard Sharpe <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4136
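The lookup idea in this commit can be illustrated in user space. The following is a toy model only, not kernel code: `CaseInsensitiveDir` and its methods are hypothetical names, and the assumption is that returning the name as stored (which is roughly what d_add_ci arranges for the dentry cache) lets differently-cased paths resolve to a single entry.

```python
class CaseInsensitiveDir:
    """Toy directory that matches names case-insensitively."""

    def __init__(self):
        # Maps the case-folded name to the name as it was first created.
        self._entries = {}

    def create(self, name):
        """Add an entry unless a differently-cased one already exists."""
        self._entries.setdefault(name.casefold(), name)

    def lookup(self, name):
        """Return the stored (canonical) name, or None if absent."""
        return self._entries.get(name.casefold())


root = CaseInsensitiveDir()
root.create("some-dir")
# Both spellings resolve to the one stored name.
canonical_a = root.lookup("Some-dir")
canonical_b = root.lookup("some-dir")
```

Because both spellings map to the same stored name, a cache keyed on that name would hold one entry for the directory rather than two.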
* Make arc_summary.py and dbufstat.py compatible with python3 | Hajo Möller | 2016-01-07 | 2 | -6/+6
    To make arc_summary.py and dbufstat.py compatible with python3
    some minor fixes were required; this was done automatically by
    `2to3 -w arc_summary.py` and `2to3 -w dbufstat.py`.

    Signed-off-by: Hajo Möller <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Signed-off-by: Reviewed-by: Richard Laager <[email protected]>
* Illumos 3604 - zdb should print bpobjs more verbosely (fix zdb hang) | Matthew Ahrens | 2016-01-05 | 1 | -0/+1
    3604 zdb should print bpobjs more verbosely (fix zdb hang)

    References:
      https://github.com/illumos/illumos-gate/commit/7706186
      https://www.illumos.org/issues/3604
      https://lists.freebsd.org/pipermail/svn-src-vendor/2015-August/002411.html
      https://lists.freebsd.org/pipermail/svn-src-head/2015-August/075195.html

    Porting notes:
    In ZoL "5810 zdb should print details of bpobj" was merged prior
    to this change so it must be applied to the new location.

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 3139 - zdb dies when it tries to determine path of unlinked file | Jeremy Jones | 2016-01-05 | 1 | -4/+30
    3139 zdb dies when it tries to determine path of unlinked file
    Reviewed by: Matt Ahrens <[email protected]>
    Reviewed by: Christopher Siden <[email protected]>
    Reviewed by: Eric Schrock <[email protected]>
    Approved by: Dan McDonald <[email protected]>

    References:
      https://github.com/illumos/illumos-gate/commit/1ce39b5
      https://www.illumos.org/issues/3139

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 2077 - lots of unreachable breaks in illumos gate | Milan Jurik | 2015-12-30 | 1 | -3/+3
    2077 lots of unreachable breaks in illumos gate
    Reviewed by: Dan McDonald <[email protected]>
    Reviewed by: Garrett D'Amore <[email protected]>
    Approved by: Richard Lowe <[email protected]>

    References:
      https://www.illumos.org/issues/2077
      https://github.com/illumos/illumos-gate/commit/33f5ff1

    Porting notes:
    - Only one file of the original patch applied to ZFS
    - Minor formatting change to align copyright block with upstream

    Ported-by: Brian Behlendorf <[email protected]>
* Illumos 5746 - more checksumming in zfs send | Matthew Ahrens | 2015-12-30 | 6 | -307/+445
    5746 more checksumming in zfs send
    Reviewed by: Christopher Siden <[email protected]>
    Reviewed by: George Wilson <[email protected]>
    Reviewed by: Bayard Bell <[email protected]>
    Approved by: Albert Lee <[email protected]>

    References:
      https://www.illumos.org/issues/5746
      https://github.com/illumos/illumos-gate/commit/98110f0
      https://github.com/zfsonlinux/zfs/issues/905

    Porting notes:
    - Minor conflicts due to:
      - https://github.com/zfsonlinux/zfs/commit/2024041
      - https://github.com/zfsonlinux/zfs/commit/044baf0
      - https://github.com/zfsonlinux/zfs/commit/88904bb
    - Fix ISO C90 warnings (-Werror=declaration-after-statement)
      - arc_buf_t *abuf;
      - dmu_buf_t *bonus;
      - zio_cksum_t cksum_orig;
      - zio_cksum_t *cksump;
    - Fix format '%llx' format specifier warning
    - Align message in zstreamdump safe_malloc() with upstream

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #3611
* Prevent SA length overflow | Ned Bass | 2015-12-30 | 4 | -10/+27
    The function sa_update() accepts a 32-bit length parameter and
    assigns it to a 16-bit field in sa_bulk_attr_t, potentially
    truncating the passed-in value. This could lead to corrupt system
    attribute (SA) records getting written to the pool. Add a VERIFY to
    sa_update() to detect cases where overflow would occur. The SA
    length is limited to 16-bit values by the on-disk format defined by
    sa_hdr_phys_t.

    The function zfs_sa_set_xattr() is vulnerable to this bug if the
    unpacked nvlist of xattrs is less than 64k in size but the packed
    size is greater than 64k. Fix this by appropriately checking the
    size of the packed nvlist before calling sa_update(). Add error
    handling to zpl_xattr_set_sa() to keep the cached list of SA-based
    xattrs consistent with the data on disk.

    Lastly, zfs_sa_set_xattr() calls dmu_tx_abort() on an assigned
    transaction if sa_update() returns an error, but the DMU only allows
    unassigned transactions to be aborted. Wrap the sa_update() call in
    a VERIFY0, remove the transaction abort, and call dmu_tx_commit()
    unconditionally. This is consistent practice with other callers of
    sa_update().

    Signed-off-by: Ned Bass <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Signed-off-by: Richard Yao <[email protected]>
    Closes #4150
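The truncation described in this commit can be modelled with plain integers. This is a sketch under stated assumptions, not the ZFS code: `truncate_to_u16`, `checked_sa_length` and the 70000-byte size are hypothetical, and the mask simply imitates what storing a wider value in a 16-bit field does.

```python
SA_MAX_LENGTH = 0xFFFF  # largest value a 16-bit length field can hold


def truncate_to_u16(length):
    """Model assigning a wider length to a 16-bit field: high bits are lost."""
    return length & 0xFFFF


def checked_sa_length(length):
    """Refuse lengths that would not survive the 16-bit field."""
    if not 0 <= length <= SA_MAX_LENGTH:
        raise OverflowError(f"SA length {length} does not fit in 16 bits")
    return length


packed_size = 70000  # hypothetical packed nvlist size, above 64k
stored = truncate_to_u16(packed_size)  # silently becomes 4464
```

Checking the packed size before the update, as `checked_sa_length` does here, turns a silent short length into an explicit error the caller can handle.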
* Illumos 5745 - zfs set allows only one dataset property to be set at a time | Chris Williamson | 2015-12-29 | 5 | -87/+193
    5745 zfs set allows only one dataset property to be set at a time
    Reviewed by: Christopher Siden <[email protected]>
    Reviewed by: George Wilson <[email protected]>
    Reviewed by: Matthew Ahrens <[email protected]>
    Reviewed by: Bayard Bell <[email protected]>
    Reviewed by: Richard PALO <[email protected]>
    Reviewed by: Steven Hartland <[email protected]>
    Approved by: Rich Lowe <[email protected]>

    References:
      https://www.illumos.org/issues/5745
      https://github.com/illumos/illumos-gate/commit/3092556

    Porting notes:
    - Fix the missing braces around initializer, zfs_cmd_t zc = {"\0"};
    - Remove extra format argument in zfs_do_set()
    - Declare at the top:
      - zfs_prop_t prop;
      - nvpair_t *elem;
      - nvpair_t *next;
      - int i;
    - Additionally initialize:
      - int added_resv = 0;
      - zfs_prop_t prop = 0;
    - Assign 0 instead of NULL for uint64_t types.
      - zc->zc_nvlist_conf = '\0';
      - zc->zc_nvlist_src = '\0';
      - zc->zc_nvlist_dst = '\0';

    Ported-by: kernelOfTruth [email protected]
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #3574
* Make xattr dir truncate and remove in one tx | Chunwei Chen | 2015-12-28 | 1 | -8/+15
    We need the truncate and remove to be in the same tx when doing
    zfs_rmnode on an xattr dir. Otherwise, if we truncate and crash,
    we'll end up with an inconsistent zap object on the delete queue.
    We do this by skipping dmu_free_long_range and letting
    zfs_znode_delete do the work.

    Signed-off-by: Chunwei Chen <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Issue #4114
    Issue #4052
    Issue #4006
    Issue #3018
    Issue #2861
* Fix empty xattr dir causing lockup | Chunwei Chen | 2015-12-28 | 1 | -0/+10
    During zfs_rmnode on an xattr dir, if the system crashes just after
    dmu_free_long_range, we would get an empty xattr dir in the delete
    queue. This would cause blkid=0 to be passed into
    zap_get_leaf_byblk when doing zfs_purgedir during mount, which
    would then try to do rw_enter on a wrong structure and cause a
    system lockup.

    We fix this by returning ENOENT when blkid is zero in
    zap_get_leaf_byblk.

    Signed-off-by: Chunwei Chen <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4114
    Closes #4052
    Closes #4006
    Closes #3018
    Closes #2861
* Fix z_xattr_lock/z_teardown_lock inversion | Brian Behlendorf | 2015-12-22 | 1 | -0/+7
    There exists a lock inversion between the z_xattr_lock and the
    z_teardown_lock. Resolve this by taking the z_teardown_lock in
    all registered xattr callbacks prior to taking the z_xattr_lock.
    This ensures the locks are always taken in the same order thus
    preventing a deadlock. Note the z_teardown_lock is taken again
    in zfs_lookup() and this is safe because the z_teardown lock is
    a re-entrant read reader/writer lock.

    * process-1
    zpl_xattr_get -> Takes zp->z_xattr_lock
    __zpl_xattr_get
    zfs_lookup -> Takes zsb->z_teardown_lock in ZFS_ENTER macro

    * process-2
    zfs_ioc_recv -> Takes zsb->z_teardown_lock in zfs_suspend_fs()
    zfs_resume_fs
    zfs_rezget -> Takes zp->z_xattr_lock

    Signed-off-by: Brian Behlendorf <[email protected]>
    Signed-off-by: Chunwei Chen <[email protected]>
    Closes #3943
    Closes #3969
    Closes #4121
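The fix relies on every path taking the two locks in the same order. A minimal user-space sketch of that rule follows; it is not the ZFS implementation. The lock and function names are stand-ins, and `threading.RLock` is an exclusive re-entrant lock, which is a simplification of the reader/writer lock the commit describes.

```python
import threading

teardown_lock = threading.RLock()  # stands in for z_teardown_lock
xattr_lock = threading.Lock()      # stands in for z_xattr_lock
completed = []


def xattr_callback(name):
    # Outer lock first, then inner lock, on every path.
    with teardown_lock:
        with xattr_lock:
            # Re-taking the outer lock is safe because it is re-entrant.
            with teardown_lock:
                completed.append(name)


def resume_path(name):
    # Same order as xattr_callback, so the two paths cannot invert.
    with teardown_lock:
        with xattr_lock:
            completed.append(name)


threads = [
    threading.Thread(target=xattr_callback, args=("get",)),
    threading.Thread(target=resume_path, args=("recv",)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Since neither thread ever holds the inner lock while waiting for the outer one, both always run to completion.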
* Revert "Fix z_xattr_lock/z_teardown_lock lock inversion" | Brian Behlendorf | 2015-12-22 | 1 | -10/+1
    This reverts commit 6b32ef572f754efc3f9edb20d022450f8e6b02d9.
* Fix ztest truncated cache file | Brian Behlendorf | 2015-12-22 | 1 | -2/+3
    Commit efc412b updated spa_config_write() for Linux 4.2 kernels to
    truncate and overwrite rather than rename the cache file. This is
    the correct fix but it should have only been applied for the kernel
    build. In user space rename(2) is needed because ztest depends on
    the cache file.

    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4129
* Identify locks flagged by lockdep | Olaf Faaland | 2015-12-22 | 8 | -8/+8
    When running a kernel with CONFIG_LOCKDEP=y, lockdep reports
    possible recursive locking in some cases and possible circular
    locking dependency in others, within the SPL and ZFS modules.

    This patch uses a mutex type defined in SPL, MUTEX_NOLOCKDEP, to
    mark such mutexes when they are initialized. This mutex type causes
    attempts to take or release those locks to be wrapped in
    lockdep_off() and lockdep_on() calls to silence the dependency
    checker and allow the use of lock_stats to examine contention.

    For RW locks, it uses an analogous lock type, RW_NOLOCKDEP.

    The goal is that these locks are ultimately changed back to type
    MUTEX_DEFAULT or RW_DEFAULT, after the locks are annotated to
    reflect their relationship (e.g. z_name_lock below) or any real
    problem with the lock dependencies is fixed.

    Some of the affected locks are:

    tc_open_lock:
    =============
    This is an array of locks, all with the same name, which
    txg_quiesce must take all of in order to move txg to the next
    state. All default to the same lockdep class, and so to lockdep
    this appears recursive.

    zp->z_name_lock:
    ================
    In zfs_rmdir,
      dzp = znode for the directory (input to zfs_dirent_lock)
      zp = znode for the entry being removed (output of zfs_dirent_lock)

    zfs_rmdir()->zfs_dirent_lock() takes z_name_lock in dzp
    zfs_rmdir() takes z_name_lock in zp

    Since both dzp and zp are type znode_t, the locks have the same
    default class, and lockdep considers it a possible recursive lock
    attempt.

    l->l_rwlock:
    ============
    zap_expand_leaf() sometimes creates two new zap leaf structures,
    via these call paths:

    zap_deref_leaf()->zap_get_leaf_byblk()->zap_leaf_open()
    zap_expand_leaf()->zap_create_leaf()->zap_expand_leaf()->zap_create_leaf()

    Because both zap_leaf_open() and zap_create_leaf() initialize
    l->l_rwlock in their (separate) leaf structures, the lockdep class
    is the same, and the linux kernel believes these might both be the
    same lock, and emits a possible recursive lock warning.

    Signed-off-by: Olaf Faaland <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #3895
* Add lock types RW_NOLOCKDEP and MUTEX_NOLOCKDEP | Olaf Faaland | 2015-12-22 | 1 | -0/+2
    Both lock types were introduced in SPL to allow some locks to be
    taken/released with linux lockdep turned off. See SPL commit for
    details.

    Add the new lock types to zfs_context.h to allow user space
    compilation.

    Depends on SPL commit 692ae8d
    SPL pull request refs/pull/480/head

    Signed-off-by: Olaf Faaland <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #3895
* Skip GPL-only symbols test when cross-compiling | Kamil Domański | 2015-12-18 | 1 | -8/+10
    Signed-off-by: Kamil Domański <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4107
* Make zio_taskq_batch_pct user configurable | DHE | 2015-12-18 | 2 | -1/+23
    Adds zio_taskq_batch_pct as an exported module parameter,
    allowing users to modify it at module load time.

    Signed-off-by: DHE <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4110
* Activate LVM volume groups before looking for zpools. | Benjamin Albrecht | 2015-12-18 | 2 | -2/+63
    Original-patch-by: @jgoerzen
    Signed-off-by: Benjamin Albrecht <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes zfsonlinux/pkg-zfs#102
    Closes #4029
* Man page white space and spelling corrections | Ned Bass | 2015-12-18 | 7 | -136/+136
    Correct some misspelled words and grammatical errors, and remove
    trailing white space in the man pages.

    Signed-off-by: Ned Bass <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4115
* Fix zfs_vdev_aggregation_limit bounds checking | Brian Behlendorf | 2015-12-18 | 1 | -11/+8
    Update the bounds checking for zfs_vdev_aggregation_limit so that
    it has a floor of zero and a maximum value of the supported block
    size for the pool.

    Additionally add an early return when zfs_vdev_aggregation_limit
    equals zero to disable aggregation. For very fast solid state or
    memory devices it may be more expensive to perform the aggregation
    than to issue the IO immediately.

    Signed-off-by: Brian Behlendorf <[email protected]>
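The bounds check and the early return can be sketched in a few lines. This is an illustration only: the function names and the 16M maximum used in the usage lines are assumptions, not values taken from the patch.

```python
def clamp_aggregation_limit(limit, max_block_size):
    """Clamp the tunable to the range [0, max_block_size]."""
    return max(0, min(limit, max_block_size))


def should_aggregate(limit, max_block_size):
    """A limit of zero disables aggregation entirely (the early return)."""
    return clamp_aggregation_limit(limit, max_block_size) != 0


MAX_BLOCK = 16 * 1024 * 1024  # assumed maximum block size for the example
too_small = clamp_aggregation_limit(-5, MAX_BLOCK)
too_large = clamp_aggregation_limit(2 * MAX_BLOCK, MAX_BLOCK)
```

Clamping first means the rest of the code only ever sees a value in the supported range, and the zero case gives administrators a simple off switch.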
* Fix vdev_queue_aggregate() deadlock | Brian Behlendorf | 2015-12-18 | 3 | -1/+22
    This deadlock may manifest itself in slightly different ways but
    at the core it is caused by a memory allocation blocking on
    filesystem reclaim in the zio pipeline. This is normally impossible
    because zio_execute() disables filesystem reclaim by setting
    PF_FSTRANS on the thread. However, kmem cache allocations may
    still indirectly block on file system reclaim while holding the
    critical vq->vq_lock as shown below.

    To resolve this issue zio_buf_alloc_flags() is introduced which
    allows allocation flags to be passed. This can then be used in
    vdev_queue_aggregate() with KM_NOSLEEP when allocating the
    aggregate IO buffer. Since aggregating the IO is purely a
    performance optimization we want this to either succeed or fail
    quickly. Trying too hard to allocate this memory under the
    vq->vq_lock can negatively impact performance and result in
    this deadlock.

    * z_wr_iss
    zio_vdev_io_start
    vdev_queue_io -> Takes vq->vq_lock
    vdev_queue_io_to_issue
    vdev_queue_aggregate
    zio_buf_alloc -> Waiting on spl_kmem_cache process

    * z_wr_int
    zio_vdev_io_done
    vdev_queue_io_done
    mutex_lock -> Waiting on vq->vq_lock held by z_wr_iss

    * txg_sync
    spa_sync
    dsl_pool_sync
    zio_wait -> Waiting on zio being handled by z_wr_int

    * spl_kmem_cache
    spl_cache_grow_work
    kv_alloc
    spl_vmalloc
    ...
    evict
    zpl_evict_inode
    zfs_inactive
    dmu_tx_wait
    txg_wait_open -> Waiting on txg_sync

    Signed-off-by: Brian Behlendorf <[email protected]>
    Signed-off-by: Chunwei Chen <[email protected]>
    Signed-off-by: Tim Chase <[email protected]>
    Closes #3808
    Closes #3867
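The "succeed or fail quickly" policy can be shown with a toy allocator. This sketch does not model the kernel allocator or the vq_lock: `try_alloc`, `issue_ios` and the `available` budget are hypothetical, and the only point is that a failed non-blocking allocation falls back to issuing the IOs individually rather than waiting.

```python
def try_alloc(size, available):
    """Non-blocking allocation: return a buffer or None, never wait."""
    if size > available:
        return None
    return bytearray(size)


def issue_ios(io_sizes, available):
    """Aggregate when a buffer can be had immediately, else issue one by one."""
    buf = try_alloc(sum(io_sizes), available)
    if buf is None:
        # Aggregation is only an optimization, so skipping it is harmless.
        return [("single", n) for n in io_sizes]
    return [("aggregate", len(buf))]


plenty = issue_ios([4096, 4096], available=8192)
starved = issue_ios([4096, 4096], available=100)
```

The caller never blocks while holding its lock; under memory pressure it simply loses the optimization.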
* Fix z_xattr_lock/z_teardown_lock lock inversion | Brian Behlendorf | 2015-12-18 | 1 | -1/+10
    There exists a lock inversion between the z_xattr_lock and the
    z_teardown_lock. Detect this case and return EBUSY so
    zfs_resume_fs() will mark the inode stale and it can be safely
    revalidated on next access.

    * process-1
    zpl_xattr_get -> Takes zp->z_xattr_lock
    __zpl_xattr_get
    zfs_lookup -> Takes zsb->z_teardown_lock in ZFS_ENTER macro

    * process-2
    zfs_ioc_recv -> Takes zsb->z_teardown_lock in zfs_suspend_fs()
    zfs_resume_fs
    zfs_rezget -> Takes zp->z_xattr_lock

    Signed-off-by: Brian Behlendorf <[email protected]>
    Signed-off-by: Chunwei Chen <[email protected]>
    Closes #3969
* Use uio for zvol_{read,write} | Chunwei Chen | 2015-12-15 | 4 | -172/+24
    Since uio now supports bvec, we can convert bio into uio and
    reuse dmu_{read,write}_uio. This way, we can remove some
    duplicate code.

    Signed-off-by: Chunwei Chen <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4078
* Fix uio_prefaultpages for 0 length iovec | Chunwei Chen | 2015-12-15 | 1 | -5/+6
    Userspace can freely pass in whatever iovec it feels like, and it's
    perfectly legal to pass an iovec which contains a zero length
    segment. In the current implementation, uio_prefaultpages would
    touch an out of bound byte in the "last byte" logic. While this
    probably wouldn't cause any critical error, we would like
    uio_prefaultpages to be able to continue gracefully.

    Signed-off-by: Chunwei Chen <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4078
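The "last byte" problem is easiest to see with addresses as plain integers. The sketch below is an assumption-laden model rather than the real uio_prefaultpages: `addresses_to_touch`, the (base, length) tuples and the 4096-byte page size are all hypothetical. It touches one byte per page step plus the final byte of each segment, and skips empty segments, for which `base + length - 1` would lie before the segment.

```python
PAGE_SIZE = 4096  # assumed page size for the sketch


def addresses_to_touch(iov):
    """Return the addresses a prefault pass would touch.

    `iov` is a list of (base, length) pairs.
    """
    touched = []
    for base, length in iov:
        if length == 0:
            continue  # empty segment: there is no valid last byte to touch
        addr = base
        while addr < base + length:
            touched.append(addr)
            addr += PAGE_SIZE
        last = base + length - 1
        if touched[-1] != last:
            touched.append(last)
    return touched


result = addresses_to_touch([(0x1000, 0), (0x2000, 10)])
```

With the zero-length segment skipped, every touched address falls inside a segment the caller actually supplied.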
* Handle damaged blk_birth in dsl_deadlist_insert() | Brian Behlendorf | 2015-12-15 | 1 | -0/+8
    If a bit were cleared in `bp->blk_birth` such that the txg birth
    was now lower than any other txg_birth in the deadlist, then
    there will be no entry before this in the tree. This should be
    impossible but regardless error handling code has been added for
    this case.

    By default this is left as a fatal case and the blk_birth is
    logged. However, setting `zfs_recover=1` will cause the bp to be
    placed at the start of the deadlist even though it contains an
    invalid blk_birth.

    Signed-off-by: Brian Behlendorf <[email protected]>
    Signed-off-by: Chunwei Chen <[email protected]>
    Closes #4086
    Closes #4089
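The "no entry before this in the tree" case can be sketched with a sorted list in place of the AVL tree. Everything here is illustrative: `find_insert_slot`, the `recover` flag standing in for `zfs_recover=1`, and the strict "below blk_birth" comparison are assumptions about the lookup, not the actual dsl_deadlist_insert() code.

```python
import bisect


def find_insert_slot(entry_txgs, blk_birth, recover=False):
    """Pick the index of the deadlist entry that should receive a block.

    `entry_txgs` is a sorted list of entry txgs. The block goes to the
    last entry whose txg is below blk_birth.
    """
    idx = bisect.bisect_left(entry_txgs, blk_birth) - 1
    if idx < 0:
        # No entry precedes this blk_birth, which should be impossible.
        if not recover:
            raise ValueError(f"invalid blk_birth {blk_birth}")
        idx = 0  # recovery mode: place it at the start of the deadlist
    return idx


entries = [10, 20, 30]
normal_slot = find_insert_slot(entries, 25)
recovered_slot = find_insert_slot(entries, 5, recover=True)
```

Without `recover`, the damaged value is reported and treated as fatal; with it, the block is parked in the first entry so processing can continue.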
* Handle block pointers with a corrupt logical size | Brian Behlendorf | 2015-12-15 | 1 | -9/+3
    Commit 5f6d0b6 was originally added to gracefully handle block
    pointers with a damaged logical size. However, it incorrectly
    assumed that all passed arc_done_func_t could handle a NULL
    arc_buf_t.

    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4069
    Closes #4080
* Remove "index" column from dbufstat.py | Olaf Faaland | 2015-12-15 | 1 | -4/+3
    Commit ca0bf58d to address arcs_mtx contention removed column
    "index" from the output of kstats/dbuf. dbufstat.py was not
    updated to reflect this, which causes it to crash when run with -bx.

    This removes "index" from hardcoded lists of columns.

    Signed-off-by: Olaf Faaland <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Closes #4096
* Revert "Switch ztest mmap(2) ASSERTs to VERIFYs" | Richard Yao | 2015-12-14 | 1 | -3/+3
    This reverts commit 202619623022722f30c2ee49931a4fa6896421c7.

    It is no longer necessary now that we pass -DDEBUG unconditionally.

    Signed-off-by: Richard Yao <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Issue #4095
* Unconditionally build zdb and ztest with -DDEBUG | Richard Yao | 2015-12-14 | 2 | -0/+3
    Illumos unconditionally builds zdb and ztest with -DDEBUG. This
    helps catch bugs and eliminates the need for commits like
    202619623022722f30c2ee49931a4fa6896421c7, which changed ASSERTs
    to VERIFYs. The following files in the illumos tree show this:

    usr/src/cmd/zdb/Makefile.com
    usr/src/cmd/ztest/Makefile.com

    Given the usefulness of having early failure in these tools, we
    should do it too.

    Signed-off-by: Richard Yao <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>
    Issue #4095
* Hold the zfs_snapentry_t before dispatch | Brian Behlendorf | 2015-12-14 | 1 | -1/+1
    While exceptionally unlikely to cause a problem the
    zfs_snapentry_t hold should be taken before the dispatch to
    prevent any possibility of the task being processed before the
    hold.

    Signed-off-by: Brian Behlendorf <[email protected]>
    Signed-off-by: Chunwei Chen <[email protected]>
* Fix snapshot automount race cause EREMOTE | Chunwei Chen | 2015-12-14 | 1 | -1/+1
    When a concurrent mount finishes just before the call to
    zfsctl_snapshot_ismounted, if we return EISDIR, the VFS will
    return with EREMOTE. We should instead just return 0, so the VFS
    may retry and would likely notice the dentry is already mounted.
    This is in line with what happens when the usermode helper
    returns EBUSY.

    Signed-off-by: Chunwei Chen <[email protected]>
    Signed-off-by: Brian Behlendorf <[email protected]>