summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Illumos 4891 - want zdb option to dump all metadataMatthew Ahrens2016-01-114-10/+62
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 4891 want zdb option to dump all metadata Reviewed by: Sonu Pillai <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Christopher Siden <[email protected]> Reviewed by: Dan McDonald <[email protected]> Reviewed by: Richard Lowe <[email protected]> Approved by: Garrett D'Amore <[email protected]> We'd like a way for zdb to dump metadata in a machine-readable format, so that we can bring that back from a customer site for in-house diagnosis. Think of it as a crash dump for zpools, which can be used for post-mortem analysis of a malfunctioning pool References: https://www.illumos.org/issues/4891 https://github.com/illumos/illumos-gate/commit/df15e41 Porting notes: - [cmd/zdb/zdb.c] - a5778ea zdb: Introduce -V for verbatim import - In main() getopt 'opt' variable removed and the code was brought back in line with illumos. - [lib/libzpool/kernel.c] - 1e33ac1 Fix Solaris thread dependency by using pthreads - f0e324f Update utsname support - 4d58b69 Fix vn_open/vn_rdwr error handling - In vn_open() allocate 'dumppath' on heap instead of stack - Properly handle 'dump_fd == -1' error path - Free 'realpath' after added vn_dumpdir_code block Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 4638 - Panic in ZFS via rfs3_setattr()/rfs3_write(): dirtying snapshot!Marcel Telka2016-01-111-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | 4638 Panic in ZFS via rfs3_setattr()/rfs3_write(): dirtying snapshot! Reviewed by: Alek Pinchuk <[email protected]> Reviewed by: Ilya Usvyatsky <[email protected]> Reviewed by: Dan McDonald <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Garrett D'Amore <[email protected]> Approved by: Garrett D'Amore <[email protected]> References: https://www.illumos.org/issues/4638 https://github.com/illumos/illumos-gate/commit/2144b12 Porting notes: - [module/zfs/zfs_vnops.c] - 3558fd7 Prototype/structure update for Linux - 2cf7f52 Linux compat 2.6.39: mount_nodev() - Use zfs_is_readonly() wrapper - Remove first line of comment which doesn't apply Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]>
* Man page whitespacenathancheek2016-01-111-0/+2
| | | | | | Signed-off-by: nathancheek <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4184
* Illumos 3749 - zfs event processing should work on R/O root filesystemsWill Andrews2016-01-114-20/+78
| | | | | | | | | | | | | | | | | | | | | | | | | 3749 zfs event processing should work on R/O root filesystems Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Eric Schrock <[email protected]> Approved by: Christopher Siden <[email protected]> References: https://www.illumos.org/issues/3749 https://github.com/illumos/illumos-gate/commit/3cb69f7 Porting notes: - [include/sys/spa_impl.h] - ffe9d38 Add generic errata infrastructure - 1421c89 Add visibility in to arc_read - [include/sys/fm/fs/zfs.h] - 2668527 Add linux events - 6283f55 Support custom build directories and move includes - [module/zfs/spa_config.c] - Updated spa_config_sync() to match illumos with the exception of a Linux specific block. Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]>
* Allow 16M send/recv blocksBrian Behlendorf2016-01-081-1/+1
| | | | | | | | | Fix an off by one error introduced by fcff0f3 which triggers an assertion when 16M blocks are used with send/recv. This fix was intentionally not folder in to the Illumos commit so it can be easily cherry-picked by upstream. Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 5960, 5925Paul Dagnelie2016-01-0840-389/+1418
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 5960 zfs recv should prefetch indirect blocks 5925 zfs receive -o origin= Reviewed by: Prakash Surya <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> References: https://www.illumos.org/issues/5960 https://www.illumos.org/issues/5925 https://github.com/illumos/illumos-gate/commit/a2cdcdd Porting notes: - [lib/libzfs/libzfs_sendrecv.c] - b8864a2 Fix gcc cast warnings - 325f023 Add linux kernel device support - 5c3f61e Increase Linux pipe buffer size on 'zfs receive' - [module/zfs/zfs_vnops.c] - 3558fd7 Prototype/structure update for Linux - c12e3a5 Restructure zfs_readdir() to fix regressions - [module/zfs/zvol.c] - Function @zvol_map_block() isn't needed in ZoL - 9965059 Prefetch start and end of volumes - [module/zfs/dmu.c] - Fixed ISO C90 - mixed declarations and code - Function dmu_prefetch() 'int i' is initialized before the following code block (c90 vs. c99) - [module/zfs/dbuf.c] - fc5bb51 Fix stack dbuf_hold_impl() - 9b67f60 Illumos 4757, 4913 - 34229a2 Reduce stack usage for recursive traverse_visitbp() - [module/zfs/dmu_send.c] - Fixed ISO C90 - mixed declarations and code - b58986e Use large stacks when available - 241b541 Illumos 5959 - clean up per-dataset feature count code - 77aef6f Use vmem_alloc() for nvlists - 00b4602 Add linux kernel memory support Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]>
* Add missing -V option to zdbBrian Behlendorf2016-01-081-2/+2
| | | | | | | Add missing getopt specifier for `zdb -V` verbatim option and set flag with correct bitwise operator. Signed-off-by: Brian Behlendorf <[email protected]>
* Fix casesensitivity=insensitive deadlockRichard Sharpe2016-01-081-2/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When casesensitivity=insensitive is set for the file system, we can deadlock in a rename if the user uses different case for each path. For example rename("A/some-file.txt", "a/some-file.txt"). The simple test for this is: 1. mkdir some-dir in a ZFS file system 2. touch some-dir/some-file.txt 3. mv Some-dir/some-file.txt some-dir/some-other-file.txt This last request deadlocks trying to relock the i_mutex on the inode for the parent directory. The solution is to use d_add_ci in zpl_lookup if we are on a file system that has the casesensitivity=insensitive attribute set. This patch checks if we are working on a case insensitive file system and if so, allocates storage for the case insensitive name and passes it to zfs_lookup and then calls d_add_ci instead of d_splice_alias. The performance impact seems to be minimal even though we have introduced a kmalloc and kfree in the lookup path. The problem was found when running Microsoft's FSCT against Samba on top of ZFS On Linux. Signed-off-by: Richard Sharpe <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4136
* Make arc_summary.py and dbufstat.py compatible with python3Hajo Möller2016-01-072-6/+6
| | | | | | | | | | To make arc_summary.py and dbufstat.py compatible with python3 some minor fixes were required, this was done automatically by `2to3 -w arc_summary.py` and `2to3 -w dbufstat.py`. Signed-off-by: Hajo Möller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Reviewed-by: Richard Laager <[email protected]>
* Illumos 3604 - zdb should print bpobjs more verbosely (fix zdb hang)Matthew Ahrens2016-01-051-0/+1
| | | | | | | | | | | | | | | | 3604 zdb should print bpobjs more verbosely (fix zdb hang) References: https://github.com/illumos/illumos-gate/commit/7706186 https://www.illumos.org/issues/3604 https://lists.freebsd.org/pipermail/svn-src-vendor/2015-August/002411.html https://lists.freebsd.org/pipermail/svn-src-head/2015-August/075195.html Porting notes: In ZoL "5810 zdb should print details of bpobj" was merged prior to this change so it must be applied to the new location. Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 3139 - zdb dies when it tries to determine path of unlinked fileJeremy Jones2016-01-051-4/+30
| | | | | | | | | | | | | | | 3139 zdb dies when it tries to determine path of unlinked file Reviewed by: Matt Ahrens <[email protected]> Reviewed by: Christopher Siden <[email protected]> Reviewed by: Eric Schrock <[email protected]> Approved by: Dan McDonald <[email protected]> References: https://github.com/illumos/illumos-gate/commit/1ce39b5 https://www.illumos.org/issues/3139 Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 2077 - lots of unreachable breaks in illumos gateMilan Jurik2015-12-301-3/+3
| | | | | | | | | | | | | | | | | 2077 lots of unreachable breaks in illumos gate Reviewed by: Dan McDonald <[email protected]> Reviewed by: Garrett D'Amore <[email protected]> Approved by: Richard Lowe <[email protected]> References: https://www.illumos.org/issues/2077 https://github.com/illumos/illumos-gate/commit/33f5ff1 Porting notes: - Only one file of the original patch applied to ZFS - Minor formating change to align copyright block with upstream Ported-by: Brian Behlendorf <[email protected]>
* Illumos 5746 - more checksumming in zfs sendMatthew Ahrens2015-12-306-307/+445
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 5746 more checksumming in zfs send Reviewed by: Christopher Siden <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Bayard Bell <[email protected]> Approved by: Albert Lee <[email protected]> References: https://www.illumos.org/issues/5746 https://github.com/illumos/illumos-gate/commit/98110f0 https://github.com/zfsonlinux/zfs/issues/905 Porting notes: - Minor conflicts due to: - https://github.com/zfsonlinux/zfs/commit/2024041 - https://github.com/zfsonlinux/zfs/commit/044baf0 - https://github.com/zfsonlinux/zfs/commit/88904bb - Fix ISO C90 warnings (-Werror=declaration-after-statement) - arc_buf_t *abuf; - dmu_buf_t *bonus; - zio_cksum_t cksum_orig; - zio_cksum_t *cksump; - Fix format '%llx' format specifier warning - Align message in zstreamdump safe_malloc() with upstream Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]> Closes #3611
* Prevent SA length overflowNed Bass2015-12-304-10/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | The function sa_update() accepts a 32-bit length parameter and assigns it to a 16-bit field in sa_bulk_attr_t, potentially truncating the passed-in value. This could lead to corrupt system attribute (SA) records getting written to the pool. Add a VERIFY to sa_update() to detect cases where overflow would occur. The SA length is limited to 16-bit values by the on-disk format defined by sa_hdr_phys_t. The function zfs_sa_set_xattr() is vulnerable to this bug if the unpacked nvlist of xattrs is less than 64k in size but the packed size is greater than 64k. Fix this by appropriately checking the size of the packed nvlist before calling sa_update(). Add error handling to zpl_xattr_set_sa() to keep the cached list of SA-based xattrs consistent with the data on disk. Lastly, zfs_sa_set_xattr() calls dmu_tx_abort() on an assigned transaction if sa_update() returns an error, but the DMU only allows unassigned transactions to be aborted. Wrap the sa_update() call in a VERIFY0, remove the transaction abort, and call dmu_tx_commit() unconditionally. This is consistent practice with other callers of sa_update(). Signed-off-by: Ned Bass <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #4150
* Illumos 5745 - zfs set allows only one dataset property to be set at a timeChris Williamson2015-12-295-87/+193
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 5745 zfs set allows only one dataset property to be set at a time Reviewed by: Christopher Siden <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Bayard Bell <[email protected]> Reviewed by: Richard PALO <[email protected]> Reviewed by: Steven Hartland <[email protected]> Approved by: Rich Lowe <[email protected]> References: https://www.illumos.org/issues/5745 https://github.com/illumos/illumos-gate/commit/3092556 Porting notes: - Fix the missing braces around initializer, zfs_cmd_t zc = {"\0"}; - Remove extra format argument in zfs_do_set() - Declare at the top: - zfs_prop_t prop; - nvpair_t *elem; - nvpair_t *next; - int i; - Additionally initialize: - int added_resv = 0; - zfs_prop_t prop = 0; - Assign 0 install of NULL for uint64_t types. - zc->zc_nvlist_conf = '\0'; - zc->zc_nvlist_src = '\0'; - zc->zc_nvlist_dst = '\0'; Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]> Closes #3574
* Make xattr dir truncate and remove in one txChunwei Chen2015-12-281-8/+15
| | | | | | | | | | | | | | | We need truncate and remove be in the same tx when doing zfs_rmnode on xattr dir. Otherwise, if we truncate and crash, we'll end up with inconsistent zap object on the delete queue. We do this by skipping dmu_free_long_range and let zfs_znode_delete to do the work. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4114 Issue #4052 Issue #4006 Issue #3018 Issue #2861
* Fix empty xattr dir causing lockupChunwei Chen2015-12-281-0/+10
| | | | | | | | | | | | | | | | | | During zfs_rmnode on a xattr dir, if the system crash just after dmu_free_long_range, we would get empty xattr dir in delete queue. This would cause blkid=0 be passed into zap_get_leaf_byblk when doing zfs_purgedir during mount, and would try to do rw_enter on a wrong structure and cause system lockup. We fix this by returning ENOENT when blkid is zero in zap_get_leaf_byblk. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4114 Closes #4052 Closes #4006 Closes #3018 Closes #2861
* Fix z_xattr_lock/z_teardown_lock inversionBrian Behlendorf2015-12-221-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | There exists a lock inversion between the z_xattr_lock and the z_teardown_lock. Resolve this by taking the z_teardown_lock in all registered xattr callbacks prior to taking the z_xattr_lock. This ensures the locks are always taken is the same order thus preventing a deadlock. Note the z_teardown_lock is taken again in zfs_lookup() and this is safe because the z_teardown lock is a re-entrant read reader/writer lock. * process-1 zpl_xattr_get -> Takes zp->z_xattr_lock __zpl_xattr_get zfs_lookup -> Takes zsb->z_teardown_lock in ZFS_ENTER macro * process-2 zfs_ioc_recv -> Takes zsb->z_teardown_lock in zfs_suspend_fs() zfs_resume_fs zfs_rezget -> Takes zp->z_xattr_lock Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #3943 Closes #3969 Closes #4121
* Revert "Fix z_xattr_lock/z_teardown_lock lock inversion"Brian Behlendorf2015-12-221-10/+1
| | | | This reverts commit 6b32ef572f754efc3f9edb20d022450f8e6b02d9.
* Fix ztest truncated cache fileBrian Behlendorf2015-12-221-2/+3
| | | | | | | | | | | Commit efc412b updated spa_config_write() for Linux 4.2 kernels to truncate and overwrite rather than rename the cache file. This is the correct fix but it should have only been applied for the kernel build. In user space rename(2) is needed because ztest depends on the cache file. Signed-off-by: Brian Behlendorf <[email protected]> Closes #4129
* Identify locks flagged by lockdepOlaf Faaland2015-12-228-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When running a kernel with CONFIG_LOCKDEP=y, lockdep reports possible recursive locking in some cases and possible circular locking dependency in others, within the SPL and ZFS modules. This patch uses a mutex type defined in SPL, MUTEX_NOLOCKDEP, to mark such mutexes when they are initialized. This mutex type causes attempts to take or release those locks to be wrapped in lockdep_off() and lockdep_on() calls to silence the dependency checker and allow the use of lock_stats to examine contention. For RW locks, it uses an analogous lock type, RW_NOLOCKDEP. The goal is that these locks are ultimately changed back to type MUTEX_DEFAULT or RW_DEFAULT, after the locks are annotated to reflect their relationship (e.g. z_name_lock below) or any real problem with the lock dependencies are fixed. Some of the affected locks are: tc_open_lock: ============= This is an array of locks, all with same name, which txg_quiesce must take all of in order to move txg to next state. All default to the same lockdep class, and so to lockdep appears recursive. zp->z_name_lock: ================ In zfs_rmdir, dzp = znode for the directory (input to zfs_dirent_lock) zp = znode for the entry being removed (output of zfs_dirent_lock) zfs_rmdir()->zfs_dirent_lock() takes z_name_lock in dzp zfs_rmdir() takes z_name_lock in zp Since both dzp and zp are type znode_t, the locks have the same default class, and lockdep considers it a possible recursive lock attempt. l->l_rwlock: ============ zap_expand_leaf() sometimes creates two new zap leaf structures, via these call paths: zap_deref_leaf()->zap_get_leaf_byblk()->zap_leaf_open() zap_expand_leaf()->zap_create_leaf()->zap_expand_leaf()->zap_create_leaf() Because both zap_leaf_open() and zap_create_leaf() initialize l->l_rwlock in their (separate) leaf structures, the lockdep class is the same, and the linux kernel believes these might both be the same lock, and emits a possible recursive lock warning. Signed-off-by: Olaf Faaland <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3895
* Add lock types RW_NOLOCKDEP and MUTEX_NOLOCKDEPOlaf Faaland2015-12-221-0/+2
| | | | | | | | | | | | | | | Both lock types were introduced in SPL to allow some locks to be taken/released with linux lockdep turned off. See SPL commit for details. Add the new lock types to zfs_context.h to allow user space compilation. Depends on SPL commit 692ae8d SPL pull request refs/pull/480/head Signed-off-by: Olaf Faaland <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3895
* Skip GPL-only symbols test when cross-compilingKamil Domański2015-12-181-8/+10
| | | | | | Signed-off-by: Kamil Domański <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4107
* Make zio_taskq_batch_pct user configurableDHE2015-12-182-1/+23
| | | | | | | | | Adds zio_taskq_batch_pct as an exported module parameter, allowing users to modify it at module load time. Signed-off-by: DHE <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4110
* Activate LVM volume groups before looking for zpools.Benjamin Albrecht2015-12-182-2/+63
| | | | | | | | Original-patch-by: @jgoerzen Signed-off-by: Benjamin Albrecht <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes zfsonlinux/pkg-zfs#102 Closes #4029
* Man page white space and spelling correctionsNed Bass2015-12-187-136/+136
| | | | | | | | | Correct some misspelled words and grammatical errors, and remove trailing white space in the man pages. Signed-off-by: Ned Bass <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4115
* Fix zfs_vdev_aggregation_limit bounds checkingBrian Behlendorf2015-12-181-11/+8
| | | | | | | | | | | | | Update the bounds checking for zfs_vdev_aggregation_limit so that it has a floor of zero and a maximum value of the supported block size for the pool. Additionally add an early return when zfs_vdev_aggregation_limit equals zero to disable aggregation. For very fast solid state or memory devices it may be more expensive to perform the aggregation than to issue the IO immediately. Signed-off-by: Brian Behlendorf <[email protected]>
* Fix vdev_queue_aggregate() deadlockBrian Behlendorf2015-12-183-1/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This deadlock may manifest itself in slightly different ways but at the core it is caused by a memory allocation blocking on file- system reclaim in the zio pipeline. This is normally impossible because zio_execute() disables filesystem reclaim by setting PF_FSTRANS on the thread. However, kmem cache allocations may still indirectly block on file system reclaim while holding the critical vq->vq_lock as shown below. To resolve this issue zio_buf_alloc_flags() is introduced which allocation flags to be passed. This can then be used in vdev_queue_aggregate() with KM_NOSLEEP when allocating the aggregate IO buffer. Since aggregating the IO is purely a performance optimization we want this to either succeed or fail quickly. Trying too hard to allocate this memory under the vq->vq_lock can negatively impact performance and result in this deadlock. * z_wr_iss zio_vdev_io_start vdev_queue_io -> Takes vq->vq_lock vdev_queue_io_to_issue vdev_queue_aggregate zio_buf_alloc -> Waiting on spl_kmem_cache process * z_wr_int zio_vdev_io_done vdev_queue_io_done mutex_lock -> Waiting on vq->vq_lock held by z_wr_iss * txg_sync spa_sync dsl_pool_sync zio_wait -> Waiting on zio being handled by z_wr_int * spl_kmem_cache spl_cache_grow_work kv_alloc spl_vmalloc ... evict zpl_evict_inode zfs_inactive dmu_tx_wait txg_wait_open -> Waiting on txg_sync Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Tim Chase <[email protected]> Closes #3808 Closes #3867
* Fix z_xattr_lock/z_teardown_lock lock inversionBrian Behlendorf2015-12-181-1/+10
| | | | | | | | | | | | | | | | | | | | | There exists a lock inversion between the z_xattr_lock and the z_teardown_lock. Detect this case and return EBUSY so zfs_resume_fs() will mark the inode stale and it can be safely revalidated on next access. * process-1 zpl_xattr_get -> Takes zp->z_xattr_lock __zpl_xattr_get zfs_lookup -> Takes zsb->z_teardown_lock in ZFS_ENTER macro * process-2 zfs_ioc_recv -> Takes zsb->z_teardown_lock in zfs_suspend_fs() zfs_resume_fs zfs_rezget -> Takes zp->z_xattr_lock Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #3969
* Use uio for zvol_{read,write}Chunwei Chen2015-12-154-172/+24
| | | | | | | | | Since uio now supports bvec, we can convert bio into uio and reuse dmu_{read,write}_uio. This way, we can remove some duplicate code. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4078
* Fix uio_prefaultpages for 0 length iovecChunwei Chen2015-12-151-5/+6
| | | | | | | | | | | | Userspace can freely pass in whatever iovec it feels like, and it's perfectly legal to pass an iovec which contains a zero length segment. In the current implementation, uio_prefaultpages would touch an out of bound byte in the "last byte" logic. While this probably wouldn't cause any critical error, we would like uio_prefaultpages to be able to continue gracefully. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4078
* Handle damaged blk_birth in dsl_deadlist_insert()Brian Behlendorf2015-12-151-0/+8
| | | | | | | | | | | | | | | | | If a bit were cleared in `bp->blk_birth` such that the txg birth was now lower than any other txg_birth in the deadlist, then there will be no entry before this in the tree. This should be impossible but regardless error handling code has been added for this case. By default this is left as a fatal case and the blk_birth is logged. However, setting `zfs_recover=1` will cause the bp to be placed at the start of the deadlist even though it contains an invalid blk_birth. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4086 Closes #4089
* Handle block pointers with a corrupt logical sizeBrian Behlendorf2015-12-151-9/+3
| | | | | | | | | | | Commit 5f6d0b6 was originally added to gracefully handle block pointers with a damaged logical size. However, it incorrectly assumed that all passed arc_done_func_t could handle a NULL arc_buf_t. Signed-off-by: Brian Behlendorf <[email protected]> Closes #4069 Closes #4080
* Remove "index" column from dbufstat.pyOlaf Faaland2015-12-151-4/+3
| | | | | | | | | | | | | | Commit ca0bf58d to address arcs_mtx contention removed column "index" from the output of kstats/dbuf. dbufstat.py was not updated to reflect this, which causes it to crash when run with -bx This removes "index" from hardcoded lists of columns. Signed-off-by: Olaf Faaland <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4096
* Revert "Switch ztest mmap(2) ASSERTs to VERIFYs"Richard Yao2015-12-141-3/+3
| | | | | | | | | | This reverts commit 202619623022722f30c2ee49931a4fa6896421c7. It is no longer necessary now that we pass -DDEBUG unconditionally. Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4095
* Unconditionally build zdb and ztest with -DDEBUGRichard Yao2015-12-142-0/+3
| | | | | | | | | | | | | | | | | Illumos unconditionally builds zdb and ztest with -DDEBUG. This helps catch bugs and eliminates the need for commits like 202619623022722f30c2ee49931a4fa6896421c7, which changed ASSERTs to VERIFYs. The following files in the illumos tree show this: usr/src/cmd/zdb/Makefile.com usr/src/cmd/ztest/Makefile.com Given the usefulness of having early failure in these tools, we should do it too. Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4095
* Hold the zfs_snapentry_t before dispatchBrian Behlendorf2015-12-141-1/+1
| | | | | | | | | While exceptionally unlikely to cause a problem the zfs_snapentry_t hold should be taken before the dispatch to prevent any possibility of the task being processed before the hold. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]>
* Fix snapshot automount race cause EREMOTEChunwei Chen2015-12-141-1/+1
| | | | | | | | | | | When a concorrent mount finishes just before calling to zfsctl_snapshot_ismounted, if we return EISDIR, the VFS will return with EREMOTE. We should instead just return 0, so VFS may retry and would likely notice the dentry is alreadly mounted. This will be inline with when usermode helper return EBUSY. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
* Change zfs_snapshot_lock from mutex to rw lockBrian Behlendorf2015-12-141-26/+26
| | | | | | | | | | By changing the zfs_snapshot_lock from a mutex to a rw lock the zfsctl_lookup_objset() function can be allowed to run concurrently. This should reduce the latency of fh_to_dentry lookups in ZFS snapshots which are being accessed over NFS. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]>
* Fix zfsctl_lookup_objset() deadlockBrian Behlendorf2015-12-141-1/+2
| | | | | | | | | | | | The zfsctl_snapshot_unmount_delay() function must not be called from zfsctl_lookup_objset() while it is currently holding the zfs_snapshot_lock. This will result in a deadlock. It is safe to call zfsctl_snapshot_unmount_delay_impl() directly because the function already has a reference on the zfs_snapentry_t. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #3997
* Set 'zfs_expire_snapshot=0' to disable auto-unmountBrian Behlendorf2015-12-141-0/+8
| | | | | | | | | There are cases where it's desirable that auto-mounted snapshots not expire after a fixed duration. They should be unmounted only when the filesystem they are a snapshot of is unmounted. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]>
* Either _ILP32 or _LP64 must be definedBrian Behlendorf2015-12-101-15/+27
| | | | | | | | | | | For some arm, powerpc, and sparc platforms it was possible that neither _ILP32 of _LP64 would be defined. Update the isa_defs.h header to explicitly set these macros and generate a compile error in the case neither are defined. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4048
* Use spa as key besides objsetid for snapentryChunwei Chen2015-12-083-12/+26
| | | | | | | | | | | | | objsetid is not unique across pool, so using it solely as key would cause panic when automounting two snapshot on different pools with the same objsetid. We fix this by adding spa pointer as additional key. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Issue #3948 Issue #3786 Issue #3887
* Use large stacks when availableBrian Behlendorf2015-12-074-19/+67
| | | | | | | | | | | | | | | | | While stack size will vary by architecture it has historically defaulted to 8K on x86_64 systems. However, as of Linux 3.15 the default thread stack size was increased to 16K. These kernels are now the default in most non- enterprise distributions which means we no longer need to assume 8K stacks. This patch takes advantage of that fact by appropriately reverting stack conservation changes which were made to ensure stability. Changes which may have had a negative impact on performance for certain workloads. This also has the side effect of bringing the code slightly more in line with upstream. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #4059
* Update arcstat.py to remove deprecated rmis reference.cable29992015-12-041-2/+2
| | | | | | | | | Running arcstat.py -x currently throws KeyError due to rmis being absent, it was removed in commit ca0bf58. Signed-off-by: cable2999 <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3931
* Fix cstyle issue from 7a02327ilovezfs2015-12-041-2/+2
| | | | | | | | Continuations should be indented four spaces. Signed-off-by: ilovezfs <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4062
* Illumos 5959 - clean up per-dataset feature count codeMatthew Ahrens2015-12-0411-178/+237
| | | | | | | | | | | | | | | | | | | | | | | 5959 clean up per-dataset feature count code Reviewed by: Toomas Soome <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Alex Reece <[email protected]> Approved by: Richard Lowe <[email protected]> References: https://www.illumos.org/issues/5959 https://github.com/illumos/illumos-gate/commit/ca0cc39 Porting notes: illumos code doesn't check for feature_get_refcount() returning ENOTSUP (which means feature is disabled) in zdb. zfsonlinux added a check in https://github.com/zfsonlinux/zfs/commit/784652c due to #3468. The check was reintroduced here. Ported-by: Witaut Bajaryn <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3965
* Add zap_prefetch() interfaceBrian Behlendorf2015-12-042-0/+24
| | | | | | | | | | | Provide a generic interface to prefetch ZAP entries by name. This functionality is being added for external consumers such as Lustre. It is based of the existing zap_prefetch_uint64() version which is used by the deduplication code. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #4061
* Ext4's typical GPT partition type not recognizedilovezfs2015-12-042-1/+187
| | | | | | | | | | | | | | | | | | | | | Adding additional entries to the efi conversion array will help prevent the overwriting of the GPTs of disks with in-use file systems in more cases. Most notably, this adds partition type 8300 "Linux filesystem" (0FC63DAF-8483-4772-8E79-3D69D8477DE4), which is often used for ext4 and btrfs, among others. This commit itself does nothing to address the underlying problematic behavior that check_slice() isn't called on partitions of an unrecognized type, even when they contain a currently mounted file system. The additional entries were derived from these two resources: https://en.wikipedia.org/wiki/GUID_Partition_Table http://sourceforge.net/p/gptfdisk/code/ci/master/tree/parttypes.cc Signed-off-by: ilovezfs <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4016
* Illumos 934 - FreeBSD's GPT not recognizedYuri Pankov2015-12-042-49/+66
| | | | | | | | | | | | | | | | Reviewed by: Alexander Eremin <[email protected]> Reviewed by: Garrett D'Amore <[email protected]> Reviewed by: Andrew Stormont <[email protected]> Reviewed by: Richard Elling <[email protected]> Approved by: Gordon Ross <[email protected]> References: https://www.illumos.org/issues/934 https://github.com/illumos/illumos-gate/commit/e21ea67 Ported-by: ilovezfs <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4016