Commit message | Author | Age | Files | Lines
* Fix vdev_queue_aggregate() deadlock | Brian Behlendorf | 2015-12-18 | 3 | -1/+22

  This deadlock may manifest itself in slightly different ways but at the core it is caused by a memory allocation blocking on filesystem reclaim in the zio pipeline. This is normally impossible because zio_execute() disables filesystem reclaim by setting PF_FSTRANS on the thread. However, kmem cache allocations may still indirectly block on filesystem reclaim while holding the critical vq->vq_lock as shown below.

  To resolve this issue zio_buf_alloc_flags() is introduced, which allows allocation flags to be passed. This can then be used in vdev_queue_aggregate() with KM_NOSLEEP when allocating the aggregate IO buffer. Since aggregating the IO is purely a performance optimization we want this to either succeed or fail quickly. Trying too hard to allocate this memory under the vq->vq_lock can negatively impact performance and result in this deadlock.

    * z_wr_iss
      zio_vdev_io_start
      vdev_queue_io -> Takes vq->vq_lock
      vdev_queue_io_to_issue
      vdev_queue_aggregate
      zio_buf_alloc -> Waiting on spl_kmem_cache process

    * z_wr_int
      zio_vdev_io_done
      vdev_queue_io_done
      mutex_lock -> Waiting on vq->vq_lock held by z_wr_iss

    * txg_sync
      spa_sync
      dsl_pool_sync
      zio_wait -> Waiting on zio being handled by z_wr_int

    * spl_kmem_cache
      spl_cache_grow_work
      kv_alloc
      spl_vmalloc
      ...
      evict
      zpl_evict_inode
      zfs_inactive
      dmu_tx_wait
      txg_wait_open -> Waiting on txg_sync

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Tim Chase <[email protected]>
  Closes #3808
  Closes #3867
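The "succeed or fail quickly" policy described above can be sketched in user-space C. This is only an illustration of the pattern, not the ZFS code: `nosleep_alloc_fn`, `choose_issue_mode`, and the two allocator stand-ins are hypothetical names, and `malloc` stands in for a non-sleeping kernel allocation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a non-sleeping allocator: may return NULL. */
typedef void *(*nosleep_alloc_fn)(size_t size);

typedef enum { ISSUE_SINGLE, ISSUE_AGGREGATE } issue_mode_t;

/*
 * Try once to get an aggregate buffer. If the allocation fails, do not
 * wait or retry while the (imagined) queue lock is held; fall back to
 * issuing the I/O unaggregated instead.
 */
static issue_mode_t
choose_issue_mode(nosleep_alloc_fn alloc, size_t agg_size, void **bufp)
{
	void *buf = alloc(agg_size);

	if (buf == NULL) {
		*bufp = NULL;
		return (ISSUE_SINGLE);
	}
	*bufp = buf;
	return (ISSUE_AGGREGATE);
}

/* Simulates memory pressure: the allocation always fails immediately. */
static void *
alloc_always_fails(size_t size)
{
	(void) size;
	return (NULL);
}

/* Simulates the common case: the allocation succeeds. */
static void *
alloc_from_heap(size_t size)
{
	return (malloc(size));
}
```

Aggregation is treated as optional here: the caller always has a valid way forward, so a failed allocation costs some throughput but can never block under the lock.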
* Fix z_xattr_lock/z_teardown_lock lock inversion | Brian Behlendorf | 2015-12-18 | 1 | -1/+10

  There exists a lock inversion between the z_xattr_lock and the z_teardown_lock. Detect this case and return EBUSY so zfs_resume_fs() will mark the inode stale and it can be safely revalidated on next access.

    * process-1
      zpl_xattr_get -> Takes zp->z_xattr_lock
      __zpl_xattr_get
      zfs_lookup -> Takes zsb->z_teardown_lock in ZFS_ENTER macro

    * process-2
      zfs_ioc_recv -> Takes zsb->z_teardown_lock in zfs_suspend_fs()
      zfs_resume_fs
      zfs_rezget -> Takes zp->z_xattr_lock

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Chunwei Chen <[email protected]>
  Closes #3969
* Use uio for zvol_{read,write} | Chunwei Chen | 2015-12-15 | 4 | -172/+24

  Since uio now supports bvec, we can convert bio into uio and reuse dmu_{read,write}_uio. This way, we can remove some duplicate code.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #4078
* Fix uio_prefaultpages for 0 length iovec | Chunwei Chen | 2015-12-15 | 1 | -5/+6

  Userspace can freely pass in whatever iovec it feels like, and it's perfectly legal to pass an iovec which contains a zero length segment. In the current implementation, uio_prefaultpages would touch an out of bound byte in the "last byte" logic. While this probably wouldn't cause any critical error, we would like uio_prefaultpages to be able to continue gracefully.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #4078
* Handle damaged blk_birth in dsl_deadlist_insert() | Brian Behlendorf | 2015-12-15 | 1 | -0/+8

  If a bit were cleared in `bp->blk_birth` such that the txg birth was now lower than any other txg_birth in the deadlist, then there will be no entry before this in the tree. This should be impossible but regardless error handling code has been added for this case.

  By default this is left as a fatal case and the blk_birth is logged. However, setting `zfs_recover=1` will cause the bp to be placed at the start of the deadlist even though it contains an invalid blk_birth.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Chunwei Chen <[email protected]>
  Closes #4086
  Closes #4089
* Handle block pointers with a corrupt logical size | Brian Behlendorf | 2015-12-15 | 1 | -9/+3

  Commit 5f6d0b6 was originally added to gracefully handle block pointers with a damaged logical size. However, it incorrectly assumed that all passed arc_done_func_t could handle a NULL arc_buf_t.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #4069
  Closes #4080
* Remove "index" column from dbufstat.py | Olaf Faaland | 2015-12-15 | 1 | -4/+3

  Commit ca0bf58d to address arcs_mtx contention removed column "index" from the output of kstats/dbuf. dbufstat.py was not updated to reflect this, which causes it to crash when run with -bx.

  This removes "index" from hardcoded lists of columns.

  Signed-off-by: Olaf Faaland <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #4096
* Revert "Switch ztest mmap(2) ASSERTs to VERIFYs" | Richard Yao | 2015-12-14 | 1 | -3/+3

  This reverts commit 202619623022722f30c2ee49931a4fa6896421c7. It is no longer necessary now that we pass -DDEBUG unconditionally.

  Signed-off-by: Richard Yao <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #4095
* Unconditionally build zdb and ztest with -DDEBUG | Richard Yao | 2015-12-14 | 2 | -0/+3

  Illumos unconditionally builds zdb and ztest with -DDEBUG. This helps catch bugs and eliminates the need for commits like 202619623022722f30c2ee49931a4fa6896421c7, which changed ASSERTs to VERIFYs. The following files in the illumos tree show this:

    usr/src/cmd/zdb/Makefile.com
    usr/src/cmd/ztest/Makefile.com

  Given the usefulness of having early failure in these tools, we should do it too.

  Signed-off-by: Richard Yao <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #4095
* Hold the zfs_snapentry_t before dispatch | Brian Behlendorf | 2015-12-14 | 1 | -1/+1

  While exceptionally unlikely to cause a problem the zfs_snapentry_t hold should be taken before the dispatch to prevent any possibility of the task being processed before the hold.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Chunwei Chen <[email protected]>
* Fix snapshot automount race cause EREMOTE | Chunwei Chen | 2015-12-14 | 1 | -1/+1

  When a concurrent mount finishes just before the call to zfsctl_snapshot_ismounted and we return EISDIR, the VFS will return EREMOTE. We should instead return 0 so the VFS may retry, at which point it would likely notice the dentry is already mounted. This is in line with the case where the usermode helper returns EBUSY.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
* Change zfs_snapshot_lock from mutex to rw lock | Brian Behlendorf | 2015-12-14 | 1 | -26/+26

  By changing the zfs_snapshot_lock from a mutex to a rw lock the zfsctl_lookup_objset() function can be allowed to run concurrently. This should reduce the latency of fh_to_dentry lookups in ZFS snapshots which are being accessed over NFS.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Chunwei Chen <[email protected]>
* Fix zfsctl_lookup_objset() deadlock | Brian Behlendorf | 2015-12-14 | 1 | -1/+2

  The zfsctl_snapshot_unmount_delay() function must not be called from zfsctl_lookup_objset() while it is currently holding the zfs_snapshot_lock. This will result in a deadlock. It is safe to call zfsctl_snapshot_unmount_delay_impl() directly because the function already has a reference on the zfs_snapentry_t.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Chunwei Chen <[email protected]>
  Closes #3997
* Set 'zfs_expire_snapshot=0' to disable auto-unmount | Brian Behlendorf | 2015-12-14 | 1 | -0/+8

  There are cases where it's desirable that auto-mounted snapshots not expire after a fixed duration. They should be unmounted only when the filesystem they are a snapshot of is unmounted.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Chunwei Chen <[email protected]>
* Either _ILP32 or _LP64 must be defined | Brian Behlendorf | 2015-12-10 | 1 | -15/+27

  For some arm, powerpc, and sparc platforms it was possible that neither _ILP32 nor _LP64 would be defined. Update the isa_defs.h header to explicitly set these macros and generate a compile error if neither is defined.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Chunwei Chen <[email protected]>
  Closes #4048
* Use spa as key besides objsetid for snapentry | Chunwei Chen | 2015-12-08 | 3 | -12/+26

  objsetid is not unique across pools, so using it as the sole key would cause a panic when automounting two snapshots on different pools with the same objsetid. We fix this by adding the spa pointer as an additional key.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Issue #3948
  Issue #3786
  Issue #3887
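A compound key of this kind can be sketched as a two-level comparator. The struct and function names below are hypothetical and are not taken from the ZFS sources; the pool pointer is treated as an opaque identity.

```c
#include <stdint.h>

/* Hypothetical key: pool identity plus the per-pool object set id. */
typedef struct snapentry_key {
	const void *spa;	/* opaque pool pointer, compared by address */
	uint64_t objsetid;	/* only unique within one pool */
} snapentry_key_t;

/*
 * Order first by pool, then by objsetid. Two snapshots with the same
 * objsetid on different pools therefore compare as different keys.
 */
static int
snapentry_key_compare(const snapentry_key_t *a, const snapentry_key_t *b)
{
	uintptr_t pa = (uintptr_t)a->spa;
	uintptr_t pb = (uintptr_t)b->spa;

	if (pa != pb)
		return (pa < pb ? -1 : 1);
	if (a->objsetid != b->objsetid)
		return (a->objsetid < b->objsetid ? -1 : 1);
	return (0);
}
```

A comparator that returns only -1, 0, or 1 is the shape tree containers such as AVL trees usually expect, which is why the result is normalized rather than returned as a raw difference.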
* Use large stacks when available | Brian Behlendorf | 2015-12-07 | 4 | -19/+67

  While stack size will vary by architecture it has historically defaulted to 8K on x86_64 systems. However, as of Linux 3.15 the default thread stack size was increased to 16K. These kernels are now the default in most non-enterprise distributions which means we no longer need to assume 8K stacks.

  This patch takes advantage of that fact by appropriately reverting stack conservation changes which were made to ensure stability but which may have had a negative impact on performance for certain workloads. This also has the side effect of bringing the code slightly more in line with upstream.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Closes #4059
* Update arcstat.py to remove deprecated rmis reference. | cable2999 | 2015-12-04 | 1 | -2/+2

  Running arcstat.py -x currently throws KeyError because rmis is absent; it was removed in commit ca0bf58.

  Signed-off-by: cable2999 <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3931
* Fix cstyle issue from 7a02327 | ilovezfs | 2015-12-04 | 1 | -2/+2

  Continuations should be indented four spaces.

  Signed-off-by: ilovezfs <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #4062
* Illumos 5959 - clean up per-dataset feature count code | Matthew Ahrens | 2015-12-04 | 11 | -178/+237

  5959 clean up per-dataset feature count code

  Reviewed by: Toomas Soome <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed by: Alex Reece <[email protected]>
  Approved by: Richard Lowe <[email protected]>

  References:
    https://www.illumos.org/issues/5959
    https://github.com/illumos/illumos-gate/commit/ca0cc39

  Porting notes:
    illumos code doesn't check for feature_get_refcount() returning ENOTSUP (which means feature is disabled) in zdb. zfsonlinux added a check in https://github.com/zfsonlinux/zfs/commit/784652c due to #3468. The check was reintroduced here.

  Ported-by: Witaut Bajaryn <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3965
* Add zap_prefetch() interface | Brian Behlendorf | 2015-12-04 | 2 | -0/+24

  Provide a generic interface to prefetch ZAP entries by name. This functionality is being added for external consumers such as Lustre. It is based on the existing zap_prefetch_uint64() version which is used by the deduplication code.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Closes #4061
* Ext4's typical GPT partition type not recognized | ilovezfs | 2015-12-04 | 2 | -1/+187

  Adding additional entries to the efi conversion array will help prevent the overwriting of the GPTs of disks with in-use file systems in more cases. Most notably, this adds partition type 8300 "Linux filesystem" (0FC63DAF-8483-4772-8E79-3D69D8477DE4), which is often used for ext4 and btrfs, among others.

  This commit itself does nothing to address the underlying problematic behavior that check_slice() isn't called on partitions of an unrecognized type, even when they contain a currently mounted file system.

  The additional entries were derived from these two resources:

    https://en.wikipedia.org/wiki/GUID_Partition_Table
    http://sourceforge.net/p/gptfdisk/code/ci/master/tree/parttypes.cc

  Signed-off-by: ilovezfs <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #4016
* Illumos 934 - FreeBSD's GPT not recognized | Yuri Pankov | 2015-12-04 | 2 | -49/+66

  Reviewed by: Alexander Eremin <[email protected]>
  Reviewed by: Garrett D'Amore <[email protected]>
  Reviewed by: Andrew Stormont <[email protected]>
  Reviewed by: Richard Elling <[email protected]>
  Approved by: Gordon Ross <[email protected]>

  References:
    https://www.illumos.org/issues/934
    https://github.com/illumos/illumos-gate/commit/e21ea67

  Ported-by: ilovezfs <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #4016
* Only trigger SET_ERROR tracepoint event on error | Richard Yao | 2015-12-02 | 1 | -0/+9

  Currently, the SET_ERROR tracepoint triggers regardless of whether there is an error or not. On Illumos, SET_ERROR only triggers on an actual error, which avoids irrelevant noise. Linux 2.6.38 added support for conditional tracepoints, so we modify SET_ERROR to use them when they are available, making it functionally equivalent to the Illumos behavior.

  Signed-off-by: Richard Yao <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #4043
* Fix zdb_dump_block on little endian systems | Chunwei Chen | 2015-12-02 | 1 | -0/+4

  When dumping a block on a little endian system the data must be byte swapped to display correctly. Example incorrect output:

    $ echo 0123456789abcdef > aaa
    $ zdb -eR pp 3:1ee00:200

    3:1ee00:200
             0 1 2 3 4 5 6 7   8 9 a b c d e f  0123456789abcdef
    000000:  3736353433323130  6665646362613938  0123456789abcdef
    000010:  000000000000000a  0000000000000000  ................

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #4020
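The byte swap in question can be sketched with a portable helper. `bswap64_portable` is a hypothetical name used for illustration only; it is not the routine zdb itself calls.

```c
#include <stdint.h>

/*
 * Reverse the byte order of a 64-bit word. When a little endian host
 * loads eight on-disk bytes as one word and prints it in hex, the bytes
 * appear reversed; swapping first makes the hex column follow the same
 * order as the ASCII column.
 */
static uint64_t
bswap64_portable(uint64_t v)
{
	uint64_t r = 0;

	for (int i = 0; i < 8; i++) {
		r = (r << 8) | (v & 0xff);	/* append lowest byte of v */
		v >>= 8;
	}
	return (r);
}
```

Applied to the first word of the incorrect output above, 0x3736353433323130 becomes 0x3031323334353637, which reads as the ASCII codes for "01234567" in order.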
* Fix zdb calling behavior in ztest | Chunwei Chen | 2015-12-02 | 1 | -13/+41

  The current zdb calling behaviour is really fragile, and is guaranteed to segfault if ztest is not installed in either /sbin or /usr/sbin. With this patch, ztest will try to call zdb in the following order:

    1. Use the environment variable ZDB_PATH if provided.
    2. If ztest resides in the build tree, guess the in-tree zdb path.
    3. Just pass zdb to popen and let it search PATH.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3126
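The three-step lookup order can be sketched as a small helper. This is an illustration under assumptions rather than the ztest code: `resolve_zdb` and its parameters are hypothetical, and the in-tree location `cmd/zdb/zdb` is assumed for the example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Write the zdb command to run into 'out'.
 *   env_path:  value of ZDB_PATH (e.g. from getenv), or NULL if unset
 *   build_dir: build tree root if ztest runs from one, else NULL
 */
static void
resolve_zdb(const char *env_path, const char *build_dir,
    char *out, size_t len)
{
	if (env_path != NULL && env_path[0] != '\0') {
		/* 1. An explicit override always wins. */
		(void) snprintf(out, len, "%s", env_path);
	} else if (build_dir != NULL) {
		/* 2. Guess the in-tree binary (assumed layout). */
		(void) snprintf(out, len, "%s/cmd/zdb/zdb", build_dir);
	} else {
		/* 3. Bare name: the shell started by popen searches PATH. */
		(void) snprintf(out, len, "zdb");
	}
}
```

A caller would typically pass `getenv("ZDB_PATH")` as the first argument. Taking the value as a parameter keeps the helper free of global state and easy to test.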
* Prevent rm modules.* when make install | tuxoko | 2015-12-02 | 1 | -1/+1

  This was originally in fe0ed8f910c1e4288dc190546cfe98ecf545b547, but was somehow changed and no longer works. This causes the following error:

    modprobe: ERROR: ../libkmod/libkmod.c:506 lookup_builtin_file() could not open builtin file '/lib/modules/4.2.0-18-generic/modules.builtin.bin'

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #4027
* Fix --enable-linux-builtin | Brian Behlendorf | 2015-12-02 | 1 | -0/+2

  Adding VPATH support, commit 47a4a6f, required that a `src` and `obj` line be added to the top of the Makefiles. They must be removed from the Makefiles when builtin.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue zfsonlinux/spl#481
  Issue zfsonlinux/spl#498
* Linux 4.4 compat: xattr operations takes xattr_handler | Chunwei Chen | 2015-12-01 | 3 | -2/+164

  The xattr_handler->{list,get,set} operations were changed to take an xattr_handler; the handler_flags argument was removed and should be accessed via handler->flags.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #4021
* Linux 4.4 compat: make_request_fn returns blk_qc_t | Chunwei Chen | 2015-12-01 | 2 | -2/+26

  As part of block polling support in Linux 4.4, make_request_fn should return a cookie value of type blk_qc_t. For now, we make zvol_request always return BLK_QC_T_NONE until we assess whether and how we want to support block polling.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #4021
* Fix zfs_dirty_data_max overflow on 32-bit | tuxoko | 2015-11-19 | 1 | -2/+2

  On 32-bit, the calculation of zfs_dirty_data_max from physmem will overflow, making it smaller than zfs_dirty_data_sync and causing txgs to be delayed while no one is writing to disk. The end result is horrendous write speed. On a 32-bit VM with 4G of RAM, before this patch, a simple dd results in ~7MB/s. Now it can reach speeds on par with a 64-bit VM.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3973
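The class of overflow described here can be reproduced in a few lines. The sketch below is illustrative only: the function names, the 4096-byte page size, and the 10 percent figure are assumptions, and `uint32_t` stands in for a 32-bit `unsigned long`.

```c
#include <stdint.h>

#define	PAGE_SIZE_BYTES	4096u	/* assumed page size */

/*
 * All arithmetic happens in 32 bits, so the intermediate product
 * pages * page size * percent wraps modulo 2^32 before the division.
 */
static uint64_t
dirty_max_32bit(uint32_t pages, uint32_t percent)
{
	uint32_t bytes = pages * PAGE_SIZE_BYTES * percent / 100u;

	return (bytes);
}

/* Widening the first operand keeps the whole product in 64 bits. */
static uint64_t
dirty_max_widened(uint32_t pages, uint32_t percent)
{
	return ((uint64_t)pages * PAGE_SIZE_BYTES * percent / 100u);
}
```

With 1,000,000 pages (a little under 4G) and 10 percent, the widened form yields 409,600,000 bytes while the 32-bit form wraps and yields 23,052,943 bytes, a limit roughly 18 times too small.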
* Fix null pointer in arc_kmem_reap_now on 32-bit | tuxoko | 2015-11-19 | 1 | -0/+5

  On 32-bit systems, zio_buf_cache is limited to 1M. Entries for larger sizes are all NULL, so we need to avoid reaping them.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #3973
* Fix snapshot automount behavior when concurrent or fail | Chunwei Chen | 2015-11-19 | 1 | -17/+32

  When concurrent threads access the snapdir, one will succeed in the user helper mount while the others will get EBUSY. However, the original code treats those EBUSY threads as successes and goes on to call zfsctl_snapshot_add, which causes a repeated avl_add and thus a panic.

  Also, if the snapshot is already mounted somewhere else, a thread accessing the snapdir will also get EBUSY from the user helper mount. This causes strange behavior: follow_down_one will fail, and then follow_up will jump up to the mountpoint of the filesystem and confuse the hell out of the VFS.

  The patch fixes both behaviors by returning 0 immediately for the EBUSY threads. Note that this has a side effect in the second case: the VFS will retry several times before returning ELOOP.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #4018
* sysmacros: Make P2ROUNDUP not trigger int overflow | Jason Zaman | 2015-11-16 | 1 | -4/+2

  The original P2ROUNDUP and P2ROUNDUP_TYPED macros contain -x which triggers PaX's integer overflow detection for unsigned integers. Replace the macros with an equivalent version that does not trigger the overflow.

  Axioms:
    A. (-(x)) === (~((x) - 1)) === (~(x) + 1) under two's complement.
    B. ~(x & y) === ((~(x)) | (~(y))) under De Morgan's law.
    C. ~(~x) === x under double negation.

  Proof:
    0. (-(-(x) & -(align)))                             original
    1. (~(-(x) & -(align)) + 1)                         by A
    2. (((~(-(x))) | (~(-(align)))) + 1)                by B
    3. (((~(~((x) - 1))) | (~(~((align) - 1)))) + 1)    by A
    4. (((((x) - 1)) | (((align) - 1))) + 1)            by C
  Q.E.D.

  Signed-off-by: Jason Zaman <[email protected]>
  Reviewed-by: Chris Dunlop <[email protected]>
  Reviewed-by: Richard Yao <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3949
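The equivalence of step 0 and step 4 can also be checked numerically. The sketch below transcribes the two forms from the proof; the `_OLD`/`_NEW` macro names and the helper function are hypothetical, and `align` is assumed to be a power of two.

```c
#include <stdint.h>

/* Step 0 of the proof: relies on unary minus of an unsigned value. */
#define	P2ROUNDUP_OLD(x, align)	(-(-(x) & -(align)))

/* Step 4 of the proof: the same result with no negation. */
#define	P2ROUNDUP_NEW(x, align)	((((x) - 1) | ((align) - 1)) + 1)

/*
 * Returns 1 if both forms agree for every x in [1, limit], else 0.
 * 'align' must be a power of two.
 */
static int
p2roundup_forms_agree(uint64_t align, uint64_t limit)
{
	for (uint64_t x = 1; x <= limit; x++) {
		if (P2ROUNDUP_OLD(x, align) != P2ROUNDUP_NEW(x, align))
			return (0);
	}
	return (1);
}
```

For example, rounding 5 up to a multiple of 8 gives ((5 - 1) | 7) + 1 = 8, and rounding 9 gives (8 | 7) + 1 = 16. A brute-force check like this complements the proof but does not replace it.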
* zimport.sh: Add configure/make option support | Brian Behlendorf | 2015-11-16 | 1 | -7/+12

  Allow the following environment variables to control the build behavior of the zimport.sh script. This can be useful when you want a debug build or require specific build options. The default values are:

    CONFIG_OPTIONS=""
    MAKE_OPTIONS="-s -j$(nproc)"

  Signed-off-by: Brian Behlendorf <[email protected]>
* Follow 0/-E convention for module load errors | Brian Behlendorf | 2015-11-16 | 2 | -6/+2

  Because errors during module load are so rare it went unnoticed that it was possible that a positive errno was returned. This would result in the module being loaded, nothing being initialized, and a system panic shortly thereafter. This is what was causing the hard failures in the automated testing.

  Signed-off-by: Brian Behlendorf <[email protected]>
* Obey arc_meta_limit default size when changing arc_max | AndCycle | 2015-11-13 | 1 | -1/+1

  When decreasing the maximum ARC size preserve the 3/4 default ratio for the arc_meta_limit. Otherwise, the arc_meta_limit may be set the same as arc_max.

  Signed-off-by: AndCycle <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #4001
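One way to read the 3/4 rule is as a ceiling that follows the new maximum. The helper below is a sketch of that reading, not the actual ARC code; `clamp_meta_limit` and its parameters are hypothetical.

```c
#include <stdint.h>

/*
 * When the ARC maximum shrinks, cap the metadata limit at 3/4 of the
 * new maximum instead of letting it collapse to the maximum itself.
 * A limit already below that ceiling is left unchanged.
 */
static uint64_t
clamp_meta_limit(uint64_t arc_meta_limit, uint64_t new_arc_max)
{
	uint64_t ceiling = (3 * new_arc_max) / 4;

	return (arc_meta_limit < ceiling ? arc_meta_limit : ceiling);
}
```

For instance, lowering the maximum to 800 units with a previous metadata limit of 1000 yields a new limit of 600 rather than 800.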
* Add TEST configuration file for buildbot | Brian Behlendorf | 2015-11-10 | 1 | -0/+95

  The TEST file is provided as a hint to the automated test infrastructure. It controls which regression tests are run and how they are run. This file along with any lines in the commit messages which start with TEST_* are sourced by the test scripts and can be used to override the default values. For complete details see:

    https://github.com/zfsonlinux/zfs-buildbot/

  Signed-off-by: Brian Behlendorf <[email protected]>
* Fix maybe uninitialized | Brian Behlendorf | 2015-11-09 | 1 | -1/+1

  As of gcc 5.1.1 20150618 (Red Hat 5.1.1-4) the -Werror=maybe-uninitialized check detects that 'snapname' in recv_incremental_replication() may not be initialized. Explicitly initialize the variable to resolve the warning.

    libzfs_sendrecv.c: In function ‘recv_incremental_replication’:
    libzfs_sendrecv.c:2019:2: error: ‘snapname’ may be used uninitialized in
      (void) snprintf(buf, sizeof (buf), "%s@%s", fsname, snapname);

  Signed-off-by: Brian Behlendorf <[email protected]>
* Remove shareiscsi description and example from zfs(8). | Turbo Fredriksson | 2015-10-13 | 1 | -47/+9

  Signed-off-by: Turbo Fredriksson <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
* Unmount is part of the shutdown process, not the boot process. | Turbo Fredriksson | 2015-10-13 | 1 | -1/+1

  Signed-off-by: Turbo Fredriksson <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes: #3762
* Fix fail path in zfs_znode_alloc | Chunwei Chen | 2015-10-13 | 1 | -2/+1

  When sa_bulk_lookup() fails, unlock_new_inode() will spit out a WARNING. It will also recursively deadlock on ZFS_OBJ_HOLD_ENTER in zfs_zinactive().

  Since we never call insert_inode_locked in the fail path, I_NEW is never set and the inode is never hashed, so unlock_new_inode() can be safely removed.

  We set z_sa_hdl to NULL in the fail path so that the iput path will stop at zfs_inactive() without entering zfs_zinactive(). This way we can avoid the deadlock and prevent a double sa_handle_destroy().

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #3899
* Fix use-after-free in vdev_disk_physio_completion | Chunwei Chen | 2015-10-13 | 1 | -7/+10

  Currently, vdev_disk_physio_completion will try to wake up a waiter without first checking that one exists. This creates a race window in which complete is called after dr is freed.

  We add dr_wait in dio_request to indicate the existence of a waiter. Also, remove dr_rw since no one is using it, and reorder dr_ref to make the struct more compact on 64-bit.

  Signed-off-by: Chunwei Chen <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #3917
  Issue #3880
* Illumos 6267 - dn_bonus evicted too early | Justin T. Gibbs | 2015-10-13 | 5 | -43/+38

  6267 dn_bonus evicted too early

  Reviewed by: Richard Yao <[email protected]>
  Reviewed by: Xin LI <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Approved by: Richard Lowe <[email protected]>

  References:
    https://www.illumos.org/issues/6267
    https://github.com/illumos/illumos-gate/commit/d205810

  Signed-off-by: Brian Behlendorf <[email protected]>
  Ported-by: Ned Bass [email protected]
  Issue #3865
  Issue #3443
* zfs-import: Perform verbatim import using cache file | James Lee | 2015-10-13 | 2 | -64/+57

  This change modifies the import service to use the default cache file to perform a verbatim import of pools at boot. This fixes code that searched all devices and imported all visible pools.

  Using the cache file is in keeping with the way ZFS has always worked, how Solaris, Illumos, FreeBSD, and systemd perform imports, and is how it is written in the man page (zpool(1M,8)):

    All pools in this cache are automatically imported when the system boots.

  Importantly, the cache contains important information for importing multipath devices, and helps control which pools get imported in more dynamic environments like SANs, which may have thousands of visible and constantly changing pools, which the ZFS_POOL_EXCEPTIONS variable is not equipped to handle. Verbatim imports prevent rogue pools from being automatically imported and mounted where they shouldn't be.

  The change also stops the service from exporting pools at shutdown. Exporting pools is only meant to be performed explicitly by the administrator of the system.

  The old behavior of searching and importing all visible pools is preserved and can be switched on by heeding the warning and toggling the ZPOOL_IMPORT_ALL_VISIBLE variable in /etc/default/zfs.

  Signed-off-by: James Lee <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3777
  Closes #3526
* zdb: segfault in dump_bpobj_subobjs() | Tim Chase | 2015-10-13 | 1 | -2/+2

  Avoid buffer overrun on all-zero bpobj subobjects by using signed array index. Also fix the type cast on the printf() argument.

  Signed-off-by: Tim Chase <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3905
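The hazard behind this fix is a backwards scan whose index can never go negative. The sketch below shows the signed-index form; `last_nonzero` is a hypothetical helper, not the zdb function.

```c
#include <stdint.h>

/*
 * Return the index of the last nonzero entry, or -1 if every entry is
 * zero. The index is signed on purpose: with an unsigned index the
 * test 'last >= 0' is always true, so an all-zero array would make the
 * loop wrap around and read far outside the buffer.
 */
static int64_t
last_nonzero(const uint64_t *objs, uint64_t count)
{
	int64_t last;

	for (last = (int64_t)count - 1; last >= 0; last--) {
		if (objs[last] != 0)
			break;
	}
	return (last);
}
```

An all-zero or empty array yields -1, which callers can treat as "nothing to print".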
* libzfs: handle EDOM errors | DHE | 2015-10-13 | 1 | -0/+5

  EDOM may occur if a user tries to set `recordsize` too large without using "zfs set". This can be demonstrated with:

    > zpool create testpool -O recordsize=32M /dev/...

  Signed-off-by: DHE <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3911
* Fix 'arc_c < arc_c_min' panic | Brian Behlendorf | 2015-10-13 | 1 | -1/+2

  Strictly enforce keeping 'arc_c >= arc_c_min'. The ASSERTs are left in place to catch this in a debug build but logic has been added to gracefully handle this in a production build.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #3904
* Rename 'zed.service' to 'zfs-zed.service' | Turbo Fredriksson | 2015-10-02 | 3 | -3/+6

  For consistency all systemd unit files and init scripts now share the same names. This prevents an issue where the zed is started twice on systems where both the systemd and sysv infrastructure are installed concurrently.

  For backward compatibility a 'zed' alias has been added. This allows the user to interact with the service using either the name 'zed' or 'zfs-zed'.

  Signed-off-by: Turbo Fredriksson <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Issue #3837
* Fix zfs-dkms uninstall/update | Brian Behlendorf | 2015-10-02 | 1 | -5/+6

  Modern versions of dkms clean up the build directory after installing. This resulted in 'dkms uninstall' never running because the check added by commit 866c162, which verifies the existence of the zfs.release build product, would never be true.

  This patch resolves the issue by updating the conditional to check in the explicitly installed zfs_config.h file for the version.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3862