path: root/tests/zfs-tests
Commit message (Author, Age, Files, Lines)
* deadlock between mm_sem and tx assign in zfs_write() and page fault (ilbsmart, 2019-02-22, 2 files, -44/+104)
  The bug's time sequence:
  1. Thread #1, in `zfs_write`, assigns a txg "n".
  2. In the same process, thread #2 takes an mmap page fault (which means `mm_sem` is held); `zfs_dirty_inode` fails to open a txg and waits for the previous txg "n" to complete.
  3. Thread #1 calls `uiomove` to write, but a page fault occurs inside `uiomove`, which needs `mm_sem`; `mm_sem` is held by thread #2, so thread #1 is stuck and cannot complete, and txg "n" never completes.
  Thread #1 and thread #2 are therefore deadlocked.
  Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Signed-off-by: Grady Wong <[email protected]> Closes #7939
* ZTS: Update O_TMPFILE support check (Brian Behlendorf, 2018-11-08, 2 files, -9/+13)
  In CentOS 7.5 the kernel provided a compatibility wrapper to support O_TMPFILE. This results in the test setup script correctly detecting kernel support. But the ZFS module was built without O_TMPFILE support due to the non-standard CentOS kernel interface. Handle this case by updating the setup check to fail when either the kernel or the ZFS module fails to provide support. The reason will be clearly logged in the test results.
  Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7528
* Skip import activity test in more zdb code paths (Olaf Faaland, 2018-11-08, 2 files, -0/+75)
  Since zdb opens the pools read-only, it cannot damage the pool in the event the pool is already imported either on the same host or on another one. If the pool vdev structure is changing while zdb is importing the pool, it may cause zdb to crash. However this is unlikely, and in any case it's a user space process and can simply be run again.
  For this reason, zdb should disable the multihost activity test on import that is normally run. This commit fixes a few zdb code paths where that had been overlooked. It also adds tests to ensure that several common use cases handle this properly in the future.
  Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Gu Zheng <[email protected]> Signed-off-by: Olaf Faaland <[email protected]> Closes #7797 Closes #7801
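  For context, a minimal shell sketch of the kind of use case the new tests cover; the pool name 'tank', the file-backed vdev, and the comments restating the intended behavior are assumptions, not taken from the test code itself:

      truncate -s 128M /var/tmp/zdb-dev
      zpool create -o multihost=on tank /var/tmp/zdb-dev   # requires a system hostid to be set
      zpool export tank
      # zdb performs its own read-only import from the device directory (-p);
      # with this change it skips the multihost activity test instead of waiting on it
      zdb -e -p /var/tmp tank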
* Revert "zpool reopen should detect expanded devices"Tony Hutter2018-09-131-38/+16
| | | | | | | | | This reverts commit 2a16d4cfaf791ba2a6f61b29d1e3f2e7b675f913. The commit was causing a "attempt to access beyond the end of device" error: list.zfsonlinux.org/pipermail/zfs-discuss/2018-September/032217.html
* Fix object reclaim when using large dnodes (Tom Caputi, 2018-07-06, 1 file, -4/+17)
  Currently, when the receive_object() code wants to reclaim an object, it always assumes that the dnode is the legacy 512 bytes, even when the incoming bonus buffer exceeds this length. This causes a buffer overflow if --enable-debug is not provided and triggers an ASSERT if it is. This patch resolves this issue and adds an ASSERT to ensure this can't happen again.
  Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #7097 Closes #7433
* Fix problems receiving reallocated dnodes (Tim Chase, 2018-07-06, 2 files, -1/+100)
  This is a port of 047116ac - Raw sends must be able to decrease nlevels - to the zfs-0.7-stable branch. It includes the various fixes to the problem of receiving incremental streams which include reallocated dnodes in which the number of dnode slots has changed, but excludes the parts which are related to raw streams.
  From 047116ac: Currently, when a raw zfs send file includes a DRR_OBJECT record that would decrease the number of levels of an existing object, the object is reallocated with dmu_object_reclaim() which creates the new dnode using the old object's nlevels. For non-raw sends this doesn't really matter, but raw sends require that nlevels on the receive side match that of the send side so that the checksum-of-MAC tree can be properly maintained. This patch corrects the issue by freeing the object completely before allocating it again in this case.
  This patch also corrects several issues with dnode_hold_impl() and related functions that prevented dnodes (particularly multi-slot dnodes) from being reallocated properly due to the fact that existing dnodes were not being fully cleaned up when they were freed.
  This patch adds a test to make sure that zfs recv functions properly with incremental streams containing dnodes of different sizes.
  This also includes a one-liner fix from loli10K to fix a test failure: https://github.com/zfsonlinux/zfs/pull/7792#discussion_r212769264
  Authored-by: Tom Caputi <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Tim Chase <[email protected]> Ported-by: Tim Chase <[email protected]> Closes #6821 Closes #6864
  NOTE: This is the first of 3 related patches being ported to the zfs-0.7-release branch of ZoL. The other two patches should immediately follow this one.
* Fedora 28: Fix misc bounds check compiler warnings (Joao Carlos Mendes Luis, 2018-07-06, 1 file, -1/+1)
  Fix a bunch of truncation compiler warnings that show up on Fedora 28 (GCC 8.0.1).
  Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Issue #7368 Closes #7826 Closes #7830
* Allow inherited properties in zfs_check_settable() (LOLi, 2018-07-06, 1 file, -2/+4)
  This change modifies how the 'checksum' and 'dedup' properties are verified in zfs_check_settable(), handling the case where they are explicitly inherited in the dataset hierarchy when receiving a recursive send stream.
  Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tom Caputi <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #7755 Closes #7576 Closes #7757
* Fix zfs incremental send remove '-o' properties (LOLi, 2018-07-06, 1 file, -9/+13)
  When receiving an incremental send stream with intermediary snapshots, zfs_receive_one() does not correctly identify the top-level dataset: consequently we restore said snapshots as if they were child datasets in the hierarchy, forcing inheritance of any property received with 'zfs send -o' and effectively removing any locally set value.
  The test case did not correctly verify this situation because it uses adjacent snapshots, basically testing 'zfs send -i' instead of 'zfs send -I': this commit adds an additional intermediary snapshot to the test script.
  Reviewed-by: Paul Dagnelie <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #7478
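  A minimal sketch of the difference the reworked test exercises; pool, dataset, and snapshot names are hypothetical:

      zfs snapshot tank/fs@snap1
      zfs snapshot tank/fs@snap2
      zfs snapshot tank/fs@snap3
      zfs send tank/fs@snap1 | zfs receive tank/copy       # full stream of the first snapshot
      # 'zfs send -I' replicates every intermediary snapshot (snap2 and snap3 here),
      # unlike 'zfs send -i', which would send only the snap1->snap3 delta; it is the
      # intermediary-snapshot case that the fixed test now covers
      zfs send -I tank/fs@snap1 tank/fs@snap3 | zfs receive tank/copy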
* Add pool state /proc entry, "SUSPENDED" pools (Tony Hutter, 2018-07-06, 6 files, -0/+250)
  1. Add a proc entry to display the pool's state:
     $ cat /proc/spl/kstat/zfs/tank/state
     ONLINE
     This is done without using the spa config locks, so it will never hang.
  2. Fix 'zpool status' and 'zpool list -o health' output to print "SUSPENDED" instead of "ONLINE" for suspended pools.
  Reviewed-by: Olaf Faaland <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: Richard Elling <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #7331 Closes #7563
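  Both interfaces can be checked from the shell; 'tank' is a hypothetical pool name:

      cat /proc/spl/kstat/zfs/tank/state   # prints e.g. ONLINE, or SUSPENDED for a suspended pool
      zpool list -H -o health tank         # now also reports SUSPENDED instead of ONLINE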
* zpool reopen should detect expanded devices (Sara Hartse, 2018-07-06, 1 file, -16/+38)
  Update bdev_capacity to have wholedisk vdevs query the size of the underlying block device (correcting for the size of the EFI partition and partition alignment) and therefore detect expanded space. Correct vdev_get_stats_ex so that the expandsize is aligned to metaslab size and new space is only reported if it is large enough for a new metaslab.
  Reviewed by: Don Brady <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Reviewed by: John Wren Kennedy <[email protected]> Signed-off-by: sara hartse <[email protected]> External-issue: LX-165 Closes #7546 Issue #7582
* Fix ENOSPC in "Handle zap_add() failures in ..." (Chunwei Chen, 2018-07-06, 9 files, -0/+353)
  Commit cc63068 caused an ENOSPC error when copying a large number of files between two directories. The reason is that the patch limits zap leaf expansion to 2 retries, and returns ENOSPC when that fails.
  The intent of limiting retries is to prevent pointlessly growing the table to max size when adding a block full of entries with the same name in different case in mixed mode. However, it turns out we cannot use any limit on the retry. When we copy files from one directory in readdir order, we are copying in hash order, one leaf block at a time. This means that if the leaf block in the source directory has expanded 6 times, and you copy those entries in that block, by the time you need to expand the leaf in the destination directory, you need to expand it 6 times in one go. So any limit on the retry will result in an error where it shouldn't. Note that while we do use different salt for different directories, it seems that the salt/hash function doesn't provide enough randomization to the hash distance to prevent this from happening.
  Since cc63068 has already been reverted, this patch adds it back and removes the retry limit.
  Also, as it turns out, failing on zap_add() has a serious side effect for mzap_upgrade(). When upgrading from micro zap to fat zap, it will call zap_add() to transfer entries one at a time. If it hits any error halfway through, the remaining entries will be lost, causing those files to become orphans. This patch adds a VERIFY to catch it.
  Reviewed-by: Sanjeev Bagewadi <[email protected]> Reviewed-by: Richard Yao <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Albert Lee <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #7401 Closes #7421
* Add test with two kinds of file creation orders (Antonio Russo, 2018-05-07, 3 files, -1/+54)
  Data loss was identified in #7401 when many small files were copied. This adds a reproducer for this bug and other similar ones: randomly generate N files. Then, listing M of them by `ls -U` order, produce those same files in a directory of the same name. This triggers the bug consistently, provided N and M are large enough. Here, N=2^16 and M=2^13.
  Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Antonio Russo <[email protected]> Closes #7411
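  A condensed sketch of that reproducer, with hypothetical paths and much smaller N and M so it stays readable:

      SRC=/tank/src DST=/tank/dst                 # hypothetical directories on a ZFS dataset
      mkdir -p "$SRC" "$DST"
      for i in $(seq 1 1024); do                  # randomly generate N files (the test uses N=2^16)
          mktemp "$SRC/file.XXXXXXXX" >/dev/null
      done
      ls -U "$SRC" | head -n 128 | while read -r f; do
          touch "$DST/$f"                         # recreate M of them, in readdir order (test: M=2^13)
      done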
* Allow mounting datasets more than once (Seth Forshee, 2018-05-07, 2 files, -1/+106)
  Currently mounting an already mounted zfs dataset results in an error, whereas it is typically allowed with other filesystems. This causes some bad interactions with mount namespaces. Take this sequence for example:
  - Create a dataset
  - Create a snapshot of the dataset
  - Create a clone of the snapshot
  - Create a new mount namespace
  - Rename the original dataset
  The rename results in unmounting and remounting the clone in the original mount namespace, however the remount fails because the dataset is still mounted in the new mount namespace. (Note that this means the mount in the new mount namespace is never being unmounted, so perhaps the unmount/remount of the clone isn't actually necessary.)
  The problem here is a result of the way mounting is implemented in the kernel module. Since it is not mounting block devices it uses mount_nodev() instead of the usual mount_bdev(). However, mount_nodev() is written for filesystems for which each mount is a new instance (i.e. a new super block), and zfs should be able to detect when a mount request can be satisfied using an existing super block.
  Change zpl_mount() to call sget() directly with its own test callback. Passing the objset_t object as the fs data allows checking if a superblock already exists for the dataset, and in that case we just need to return a new reference for the sb's root dentry.
  Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tom Caputi <[email protected]> Signed-off-by: Alek Pinchuk <[email protected]> Signed-off-by: Seth Forshee <[email protected]> Closes #5796 Closes #7207
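  A minimal illustration of the newly allowed behavior; the dataset name and mountpoints are hypothetical:

      mkdir -p /mnt/a /mnt/b
      mount -t zfs tank/fs /mnt/a
      mount -t zfs tank/fs /mnt/b   # previously failed; now satisfied from the existing superblock
      umount /mnt/b /mnt/a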
* Clean up (k)shlib and cfg file shebangs (Giuseppe Di Natale, 2018-05-07, 23 files, -16/+490)
  Most kshlib files are imported by other scripts and do not have a shebang at the top of their files. Make all kshlib files follow this convention. Remove shebangs from cfg files as well.
  Reviewed-by: loli10K <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Giuseppe Di Natale <[email protected]> Closes #7406
* Fix "file is executable, but no shebang" warningsTony Hutter2018-05-0773-111/+243
| | | | | | | | | | | | | | Fedora 28's RPM build checks warn when executable files don't have a shebang line. These warnings are caused when we (incorrectly) include data & config files in the_SCRIPTS automake lines. Files in _SCRIPTS are marked executable by automake. This patch fixes the issue by including non-executable scripts in a _DATA line instead. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #7359 Closes #7395
* Update mmp_delay on sync or skipped, failed write (Olaf Faaland, 2018-05-07, 3 files, -6/+25)
  When an MMP write is skipped, or fails, and time since mts->mmp_last_write is already greater than mts->mmp_delay, increase mts->mmp_delay. The original code only updated mts->mmp_delay when a write succeeded, but this results in the write(s) after delays and failed write(s) reporting an ub_mmp_delay which is too low.
  Update mmp_last_write and mmp_delay if a txg sync was successful. At least one uberblock was written, thus extending the time we can be sure the pool will not be imported by another host.
  Do not allow mmp_delay to go below (MSEC2NSEC(zfs_multihost_interval) / vdev_count_leaves()) so that a period of frequent successful MMP writes, e.g. due to frequent txg syncs, does not result in an import activity check so short it is not reliable based on mmp thread writes alone.
  Remove unnecessary local variable, start. We do not use the start time of the loop iteration.
  Add a debug message in spa_activity_check() to allow verification of the import_delay value and to prove the activity check occurred.
  Alter the tests that import pools and attempt to detect an activity check. Calculate the expected duration of spa_activity_check() based on module parameters at the time the import is performed, rather than a fixed time set in mmp.cfg. The fixed time may be wrong. Also, use the default zfs_multihost_interval value so the activity check is longer and easier to recognize.
  Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Olaf Faaland <[email protected]> Closes #7330
* Fedora 28: Fix misc bounds check compiler warnings (Tony Hutter, 2018-05-07, 4 files, -21/+47)
  Fix a bunch of (mostly) sprintf/snprintf truncation compiler warnings that show up on Fedora 28 (GCC 8.0.1).
  Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #7361 Closes #7368
* Fix mmap / libaio deadlock (Brian Behlendorf, 2018-05-07, 7 files, -1/+163)
  Calling uiomove() in mappedread() under the page lock can result in a deadlock if the user space page needs to be faulted in. Resolve the issue by dropping the page lock before the uiomove(). The inode range lock protects against concurrent updates via zfs_read() and zfs_write().
  Reviewed-by: Albert Lee <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7335 Closes #7339
* Remove libattr requirement (DeHackEd, 2018-05-07, 2 files, -2/+2)
  RHEL/CentOS 6 supports sys/xattr.h eliminating the need for libattr-devel as a dependency.
  Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: DHE <[email protected]> Closes #7344 Closes #7351
* Allow to limit zed's syslog chattiness (Tony Hutter, 2018-05-07, 4 files, -6/+142)
  Some usage patterns like send/recv of replication streams can produce a large number of events. In such a case, the current all-syslog.sh zedlet will live up to its name and flood the logs with mostly redundant information. To mitigate this situation, this changeset introduces two new variables, ZED_SYSLOG_SUBCLASS_INCLUDE and ZED_SYSLOG_SUBCLASS_EXCLUDE, to zed.rc that give more control over which event classes end up in the syslog.
  Reviewed-by: loli10K <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Signed-off-by: Daniel Kobras <[email protected]> Closes #6886 Closes #7260
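  A sketch of how the new zed.rc variables might be used; only the variable names come from this change, the subclass patterns shown are illustrative:

      # /etc/zfs/zed.d/zed.rc
      # forward only these event subclasses to syslog ...
      ZED_SYSLOG_SUBCLASS_INCLUDE="checksum|statechange|scrub_*"
      # ... or, alternatively, drop the chatty history events produced by send/recv
      #ZED_SYSLOG_SUBCLASS_EXCLUDE="history_event"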
* Record skipped MMP writes in multihost_history (Olaf Faaland, 2018-05-07, 6 files, -7/+94)
  Once per pass through the MMP thread's loop, the vdev tree is walked to find a suitable leaf to write the next MMP block to. If no such leaf is found, the thread sleeps for a while and resumes at the top of the loop.
  Add an entry to multihost_history when no leaf can be found, and record the reason in the error column. The error code for such entries is a bitfield, displayed in hex:
  0x1  At least one vdev (interior or leaf) was not writeable.
  0x2  At least one writeable leaf vdev was found, but it had a pending MMP write.
  timestamp = the time in seconds since the epoch when no leaf could be found originally.
  duration = the time (in ns) during which no MMP block was written for this reason. This does not include the preceding inter-write period nor the following inter-write period.
  vdev_guid = the number of sequential cycles of the MMP thread loop when this occurred.
  Sample output, truncated to fit. For records of skipped MMP writes the right-most column, vdev_path, is reported as "-".
  id   txg  timestamp   error  duration    mmp_delay  vdev_guid     ...
  936  11   1520036441  0      146264      891422313  1740883117838 ...
  937  11   1520036441  0      163956      888356657  7320395061548 ...
  938  11   1520036442  0      130690      885314969  7320395061548 ...
  939  11   1520036442  0      2001068577  882296582  1740883117838 ...
  940  11   1520036443  0      161806      882296582  7320395061548 ...
  941  11   1520036443  0x2    0           998020546  1             ...
  942  11   1520036444  0      136585      998020546  7320395061548 ...
  943  11   1520036444  0x2    0           998020257  1             ...
  944  11   1520036445  5      2002662964  994160219  1740883117838 ...
  945  11   1520036445  0x2    998073118   994160219  3             ...
  946  11   1520036447  0      247136      994160219  7320395061548 ...
  Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Olaf Faaland <[email protected]> Closes #7212
* Introduce a destroy_dataset helper (Giuseppe Di Natale, 2018-05-07, 4 files, -23/+46)
  Datasets can be busy when calling zfs destroy. Introduce a helper function to destroy datasets and use it to destroy datasets in zfs_allow_004_pos, zfs_promote_008_pos, and zfs_destroy_002_pos.
  Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Giuseppe Di Natale <[email protected]> Closes #7224 Closes #7246 Closes #7249 Closes #7267
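  A rough ksh sketch of what such a helper can look like; this illustrates the retry idea and is not the exact helper added by the commit:

      # destroy_dataset <dataset> [destroy options] - retry 'zfs destroy' while the dataset is busy
      destroy_dataset() {
          typeset ds=$1; shift
          typeset i
          for i in 1 2 3 4 5; do
              zfs destroy "$@" "$ds" && return 0
              sleep 1                 # dataset may still be held by an exiting process
          done
          return 1
      }
      destroy_dataset tank/fs@snap    # hypothetical usage
      destroy_dataset tank/fs -r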
* Revert "Handle zap_add() failures in mixed ... "Tony Hutter2018-04-092-137/+0
| | | | | | | | | | | | | This reverts commit cc63068e95ee725cce03b1b7ce50179825a6cda5. Under certain circumstances this change can result in an ENOSPC error when adding new files to a directory. See #7401 for full details. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Issue #7401 Closes #7416
* zdb and inuse tests don't pass with real disks (Paul Zuchowski, 2018-03-14, 7 files, -15/+59)
  Due to zpool create auto-partitioning in Linux (i.e. sdb1), certain utilities need to use the partition (sdb1) while others use the whole disk name (sdb).
  Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Zuchowski <[email protected]> Closes #6939 Closes #7261
* Take user namespaces into account in policy checks (Wolfgang Bumiller, 2018-03-14, 12 files, -0/+388)
  Change file related checks to use user namespaces and make sure involved uids/gids are mappable in the current namespace. Note that checks without file ownership information will still not take user namespaces into account, as some of these should be handled via 'zfs allow' (otherwise root in a user namespace could issue commands such as `zpool export`).
  This also adds an initial user namespace regression test for the setgid bit loss, with a user_ns_exec helper usable in further tests. Additionally, configure checks for the required user namespace related features are added for:
  * ns_capable
  * kuid/kgid_has_mapping()
  * user_ns in cred_t
  Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Wolfgang Bumiller <[email protected]> Closes #6800 Closes #7270
* Fix some typos (John Eismeier, 2018-03-14, 12 files, -24/+24)
  Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: George Melikov <[email protected]> Signed-off-by: John Eismeier <[email protected]> Closes #7237
* Add scrub after resilver zed script (Tony Hutter, 2018-03-14, 8 files, -12/+115)
  * Add a zed script to kick off a scrub after a resilver. The script is disabled by default.
  * Add an optional $PATH (-P) option to zed to allow it to use a custom $PATH for its zedlets. This is needed when you're running zed under the ZTS in a local workspace.
  * Update test scripts to not copy in all-debug.sh and all-syslog.sh by default. They can be optionally copied in as part of zed_setup(). These scripts slow down zed considerably under heavy event loads and can cause events to be dropped or their delivery delayed. This was causing some sporadic failures in the 'fault' tests.
  Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #4662 Closes #7086
* Correct count_uberblocks in mmp.kshlib (Giuseppe Di Natale, 2018-03-14, 1 file, -1/+1)
  A log_must call was causing count_uberblocks to return more than just the uberblock count. Remove the log_must since it was only logging a sleep.
  Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Olaf Faaland <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Giuseppe Di Natale <[email protected]> Closes #7191
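  A sketch of the failure mode: log_must echoes a log line for the command it runs, so wrapping even a harmless sleep in log_must inside a function whose stdout is captured pollutes the captured value. The function body below is illustrative, not the actual mmp.kshlib code:

      count_uberblocks() {
          # log_must sleep 1                  # broken: its log line ends up in the $(...) below
          sleep 1                             # fixed: nothing extra is written to stdout
          zdb -lu "$1" | grep -ci uberblock   # illustrative uberblock count from the labels
      }
      ub_count=$(count_uberblocks /var/tmp/vdev-file)   # hypothetical vdev path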
* Handle zap_add() failures in mixed case mode (sanjeevbagewadi, 2018-03-14, 2 files, -0/+137)
  With "casesensitivity=mixed", zap_add() could fail when the number of files/directories with the same name (varying in case) exceeds the capacity of the leaf node of a Fatzap. This results in an ASSERT() failure as zfs_link_create() does not expect zap_add() to fail. The fix is to handle these failures and roll back the transactions.
  Reviewed by: Matt Ahrens <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Sanjeev Bagewadi <[email protected]> Closes #7011 Closes #7054
* Fix zdb -ed on objset for exported pool (Chunwei Chen, 2018-03-14, 3 files, -3/+68)
  zdb -ed on an objset for an exported pool would fail with:
  failed to own dataset 'qq/fs0': No such file or directory
  The reason is that zdb passes the objset name to spa_import, which uses that name to create a spa. Later, when dmu_objset_own tries to look up the spa using the real pool name, it can't find one. We fix this by making sure we pass the pool name rather than the objset name to spa_import.
  Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #7099 Closes #6464
* ZTS: Fix create-o_ashift test case (LOLi, 2018-03-14, 1 file, -25/+22)
  The function that fills the uberblock ring buffer on every device label has been reworked to avoid occasional failures caused by a race condition that prevents 'zpool sync' from writing some uberblocks sequentially: this happens when the pool sync ioctl dispatch code calls txg_wait_synced() while we're already waiting for a TXG to sync.
  Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #6924 Closes #6977
* Remove vn_rename and vn_remove dependency (Brian Behlendorf, 2018-03-14, 1 file, -3/+3)
  The only place vn_rename and vn_remove are used is when writing out an updated pool configuration file. By truncating the file instead of renaming and removing it we can avoid having to implement these interfaces entirely. Functionally an empty cache file is treated the same as a missing cache file. This is particularly advantageous because the Linux kernel has never provided a way to reliably implement vn_rename and vn_remove.
  The cachefile_004_pos.ksh test case was updated to understand that an empty cache file is the same as a missing one.
  The zfs-import-* systemd service files were not updated to use ConditionFileNotEmpty in place of ConditionPathExists. This means that after exporting all pools and rebooting, new pools will not be scanned for on the next boot. This small change should not impact normal usage since pools are not exported as part of a normal shutdown. Documentation was updated accordingly.
  Reviewed-by: George Melikov <[email protected]> Reviewed-by: Arkadiusz Bubała <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes zfsonlinux/spl#648 Closes #6753
* Add configure option to enable gcov analysis (Brian Behlendorf, 2018-03-14, 2 files, -1/+1)
  * Add configure option to enable gcov analysis.
  * Includes a few minor ctime fixes.
  * Add codecov.yml configuration.
  Reviewed-by: Prakash Surya <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6642
* Fix 'zfs receive -o' when used with '-e|-d' (LOLi, 2018-01-30, 1 file, -0/+36)
  When used in conjunction with one of the '-e' or '-d' zfs receive options, none of the properties requested to be set (-o) are actually applied: this is caused by a wrong assumption made about the toplevel dataset in zfs_receive_one(). Fix this by correctly detecting the toplevel dataset.
  Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #7088 Requires-spl: refs/pull/679/head
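  A minimal example of the affected invocation; pool and dataset names are hypothetical:

      zfs snapshot tank/src@snap1
      # with '-e' the stream is received into backup/src (last component of the sent
      # dataset); the '-o' property must be applied to that top-level dataset instead
      # of being silently dropped
      zfs send tank/src@snap1 | zfs receive -e -o compression=lz4 backup
      zfs get -H -o value compression backup/src    # expected: lz4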
* Fix shellcheck v0.4.6 warnings (Brian Behlendorf, 2018-01-30, 1 file, -1/+1)
  Resolve new warnings reported after upgrading to shellcheck version 0.4.6. This patch contains no functional changes.
  * egrep is non-standard and deprecated. Use grep -E instead. [SC2196]
  * Check exit code directly with e.g. 'if mycmd;', not indirectly with $?. [SC2181] Suppressed.
  Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7040
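  The two classes of fix, shown on placeholder commands ('mycmd' and the pattern are stand-ins):

      # SC2196: egrep is non-standard and deprecated
      egrep 'pattern' file.txt                      # before
      grep -E 'pattern' file.txt                    # after
      # SC2181: check the exit status directly instead of via $?
      mycmd; if [ $? -ne 0 ]; then echo fail; fi    # before
      if ! mycmd; then echo fail; fi                # after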
* Fix 'zpool add' handling of nested interior VDEVs (LOLi, 2018-01-30, 3 files, -3/+177)
  When replacing a faulted device which was previously handled by a spare, multiple levels of nested interior VDEVs will be present in the pool configuration; the following example illustrates one of the possible situations:
  NAME                            STATE     READ WRITE CKSUM
  testpool                        DEGRADED     0     0     0
    raidz1-0                      DEGRADED     0     0     0
      spare-0                     DEGRADED     0     0     0
        replacing-0               DEGRADED     0     0     0
          /var/tmp/fault-dev      UNAVAIL      0     0     0  cannot open
          /var/tmp/replace-dev    ONLINE       0     0     0
        /var/tmp/spare-dev1       ONLINE       0     0     0
      /var/tmp/safe-dev           ONLINE       0     0     0
  spares
    /var/tmp/spare-dev1           INUSE     currently in use
  This is safe and allowed, but get_replication() needs to handle this situation gracefully to let zpool add new devices to the pool.
  Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #6678 Closes #6996
* Handle broken pipes in arc_summary (Giuseppe Di Natale, 2018-01-30, 1 file, -0/+3)
  Using a command similar to 'arc_summary.py | head' causes a broken pipe exception. Gracefully exit in the case of a broken pipe in arc_summary.py.
  Reviewed-by: Richard Elling <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Giuseppe Di Natale <[email protected]> Closes #6965 Closes #6969
* Handle invalid options in arc_summary (LOLi, 2018-01-30, 2 files, -0/+39)
  If an invalid option is provided to arc_summary.py we handle any error thrown from the getopt Python module and print the usage help message.
  Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #6983
* Fix bug in distclean which removes needed files (David Quigley, 2018-01-30, 1 file, -1/+1)
  Running distclean removes the following files because of an error in Makefile.am:
  deleted: tests/zfs-tests/include/commands.cfg
  deleted: tests/zfs-tests/include/libtest.shlib
  deleted: tests/zfs-tests/include/math.shlib
  deleted: tests/zfs-tests/include/properties.shlib
  deleted: tests/zfs-tests/include/zpool_script.shlib
  Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: David Quigley <[email protected]> Closes #6636
* Fix multihost stale cache file import (Brian Behlendorf, 2017-12-18, 3 files, -5/+18)
  When the multihost property is enabled it should be impossible to import an active pool even using the force (-f) option. This patch prevents a forced import from succeeding when importing with a stale cache file.
  The root cause of the problem is that the kernel modules trusted the hostid provided in configuration. This is always correct when the configuration is generated by scanning for the pool. However, when using an existing cache file the hostid could be stale which would result in the activity check being skipped. Resolve the issue by always using the hostid read from the label configuration where the best uberblock was found.
  Reviewed-by: Olaf Faaland <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6933 Closes #6971
* Fix ZTS MMP tests and ztest -M behavior (Olaf Faaland, 2017-12-18, 3 files, -12/+11)
  Quote "$MMP_IMPORT_MSG" when it is passed as an argument, as it is a multi-word string. Some tests were passing when they should not have, because the grep was only testing for the first word.
  Correct the message expected when no hostid is set and the test attempts to enable multihost. It did not match the actual output in that situation.
  Disable ztest_reguid() when ztest is invoked with the -M option. If ztest performs a reguid, a concurrent import attempt may fail with the error "one or more devices is currently unavailable" if the guid sum is calculated on the original device guids but compared against the guid sum ztest wrote based on the new device guids.
  Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Olaf Faaland <[email protected]> Closes #6666
* Fix 'zpool create|add' replication level check (Brian Behlendorf, 2017-12-04, 1 file, -10/+56)
  When the pool configuration contains a hole due to a previous device removal ignore this top level vdev. Failure to do so will result in the current configuration being assessed to have a non-uniform replication level and the expected warning will be disabled. The zpool_add_010_pos test case was extended to cover this scenario.
  Reviewed-by: George Melikov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6907 Closes #6911
* Fix 'zfs get {user|group}objused@' functionality (LOLi, 2017-12-04, 2 files, -6/+18)
  Fix a regression accidentally introduced in 1b81ab4 that prevents 'zfs get {user|group}objused@' from correctly reporting the requested value. Update "userspace_003_pos.ksh" and "groupspace_003_pos.ksh" to verify this functionality.
  Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #6908
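  The affected properties are queried per user or group, for example (hypothetical dataset, user, and group):

      zfs get userobjused@alice tank/fs     # objects owned by user 'alice'
      zfs get groupobjused@staff tank/fs    # objects owned by group 'staff'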
* Emit history events for 'zpool create' (Brian Behlendorf, 2017-12-04, 5 files, -47/+105)
  History commands and events were being suppressed for the 'zpool create' command since the history object did not yet exist. Create the object earlier so this history doesn't get lost.
  Split the pool_destroy event into pool_destroy and pool_export so they may be distinguished.
  Updated events_001_pos and events_002_pos test cases. They now check for the expected history events and were reworked to be more reliable.
  Reviewed-by: Nathaniel Clark <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6712 Closes #6486
  Conflicts: tests/zfs-tests/tests/functional/events/events_002_pos.ksh
* Fix truncate(2) mtime and ctime handling (LOLi, 2017-11-21, 5 files, -1/+218)
  On Linux, ftruncate(2) always changes the file timestamps, even if the file size is not changed. However, in case of a successful truncate(2), the timestamps are updated only if the file size changes.
  This translates to the VFS calling the ZFS Posix Layer "setattr" function (zpl_setattr) with ATTR_MTIME and ATTR_CTIME unconditionally set on the iattr mask only when doing a ftruncate(2), while truncate(2) is left to the filesystem implementation to be dealt with.
  This behaviour is consistent with POSIX:2004/SUSv3 specifications, where there's no explicit requirement for file size changes to update the timestamps only for ftruncate(2):
  http://pubs.opengroup.org/onlinepubs/009695399/functions/truncate.html
  http://pubs.opengroup.org/onlinepubs/009695399/functions/ftruncate.html
  This has been later updated in POSIX:2008/SUSv4 where, for both truncate(2)/ftruncate(2), there's no mention of this size change requirement:
  http://austingroupbugs.net/view.php?id=489
  http://pubs.opengroup.org/onlinepubs/9699919799/functions/truncate.html
  http://pubs.opengroup.org/onlinepubs/9699919799/functions/ftruncate.html
  Unfortunately the Linux VFS is still calling into the ZPL without ATTR_MTIME/ATTR_CTIME set in the truncate(2) case: we fix this by explicitly updating the timestamps when detecting the ATTR_SIZE bit, which is always set in do_truncate(), on the iattr mask.
  Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #6811 Closes #6819
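  The user-visible effect can be checked from the shell with coreutils truncate(1), which goes through the truncate(2) path; the file path on a ZFS dataset is hypothetical:

      f=/tank/fs/file
      echo data > "$f"
      before=$(stat -c %Y "$f")     # mtime before
      sleep 1
      truncate -s 1 "$f"            # size-changing truncate(2)
      after=$(stat -c %Y "$f")
      [ "$after" -gt "$before" ] && echo "mtime updated by truncate(2)"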
* Fix chattr/cleanup failure (Brian Behlendorf, 2017-10-17, 1 file, -1/+1)
  The chattr cleanup step may fail to delete the user if there is still an active process running as that user. Retry the userdel when this occurs to eliminate spurious false positives.
  ERROR: userdel quser1 exited 8
  userdel: user quser1 is currently used by process 26814
  Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6749
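  A sketch of the retry pattern; the user name mirrors the error message above:

      # retry userdel a few times in case a process owned by the user is still exiting
      for i in 1 2 3 4 5; do
          userdel quser1 && break
          sleep 1
      done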
* Fix intra-pool resumable 'zfs send -t <token>' (LOLi, 2017-10-16, 7 files, -12/+12)
  Because resuming from a token requires "guid" -> "snapshot" mapping, we have to walk the whole dataset hierarchy to find the right snapshot to send; when both source and destination exist, for an incremental resumable stream, libzfs gets confused and picks up the wrong snapshot to send from: this results in attempting to send "destination@snap1 -> source@snap2" instead of "source@snap1 -> source@snap2", which fails with an "Invalid cross-device link" error (EXDEV).
  Fix this by adjusting the logic behind dataset traversal in zfs_iter_children() to pick the right snapshot to send from.
  Additionally update dry-run 'zfs send -t' to print its output to stderr: this is consistent with other dry-run commands.
  Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #6618 Closes #6619 Closes #6623
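  Resumable sends are driven by the receive_resume_token property on the partially received dataset; a minimal sequence with hypothetical names:

      # an interrupted 'zfs receive -s' leaves a resume token on the destination
      token=$(zfs get -H -o value receive_resume_token backup/fs)
      zfs send -nv -t "$token" 2>send-plan.txt          # dry-run report now goes to stderr
      zfs send -t "$token" | zfs receive -s backup/fs   # resume the interrupted receive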
* Fix ARC behavior on 32-bit systems (Brian Behlendorf, 2017-10-16, 4 files, -18/+15)
  With the addition of the ABD changes consumption of the virtual address space has been greatly reduced. This exposed an issue on CONFIG_HIGHMEM systems where free memory was being calculated incorrectly. Functionally this didn't cause any major problems prior to ABD because a lack of available virtual address space was used as an indicator of low memory.
  This patch makes the following changes to address the issue and in the process realigns the code further with OpenZFS. There are no substantive changes in behavior for 64-bit systems.
  * Added CONFIG_HIGHMEM case to the arc_all_memory() and arc_free_memory() functions to only consider low memory pages on CONFIG_HIGHMEM systems.
  * The arc_free_memory() function was updated to return bytes instead of pages to be consistent with the other helper functions. In user space we make up some reasonable values since currently only testing is performed in this context.
  * Adds three new values to the arcstats kstat to provide visibility in to the ARC's assessment of the memory situation: memory_all_bytes, memory_free_bytes, and memory_available_bytes.
  * Added kmem_reap() call to arc_available_memory() for 32-bit builds to realign code with OpenZFS.
  * Reduced size of test file in /async_destroy_001_pos.ksh to speed up test case. Multiple txgs are still required.
  * Move vdevs used by zpool_clear_001_pos and zpool_upgrade_002_pos to TEST_BASE_DIR location to speed up test cases.
  Reviewed-by: David Quigley <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #5352 Closes #6734
* receive_freeobjects() skips freeing some objects (Ned Bass, 2017-10-16, 2 files, -1/+83)
  When receiving a FREEOBJECTS record, receive_freeobjects() incorrectly skips a freed object in some cases. Specifically, this happens when the first object in the range to be freed doesn't exist, but the second object does. This leaves an object allocated on disk on the receiving side which is unallocated on the sending side, which may cause receiving subsequent incremental streams to fail.
  The bug was caused by an incorrect increment of the object index variable when the current object being freed doesn't exist. The increment is incorrect because incrementing the object index is handled by a call to dmu_object_next() in the increment portion of the for loop statement.
  Add a test case that exposes this bug.
  Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Ned Bass <[email protected]> Closes #6694 Closes #6695