Commit message | Author | Age | Files | Lines
* Add logic to try and recover an inode with an invalid mode | Brian Behlendorf | 2015-07-17 | 1 | -4/+11

  When an inode is detected with invalid mode bits the safe thing to do is panic the system. This indicates a problem with the contents of a dnode and it should never be possible. This is the default behavior.

  Unfortunately, due to flaws in the system attribute (SA) implementation (on all platforms) it was possible that ZFS could create a damaged dnode. This was a rare issue which only impacted dnodes which used a spill block. Normally only symlinks and files with ACLs would require a spill block. However, if the dataset had the xattr=sa property set and extended attributes were used this problem could occur. As of the 0.6.4 tag the root cause of this issue has been fixed.

  For pools which are exhibiting this damage the 'zfs_recover=1' module option may be set. This will cause ZFS to interpret the dnode with invalid mode bits as a normal file. This may allow the files to be accessed for recovery purposes.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3548
* Support parallel build trees (VPATH builds) | Turbo Fredriksson | 2015-07-17 | 42 | -388/+476

  Build products from an out of tree build should be written relative to the build directory. Sources should be referred to by their locations in the source directory.

  This is accomplished by adding the 'src' and 'obj' variables for the module Makefile.am, using relative paths to reference source files, and by setting VPATH when source files are not co-located with the Makefile. This enables the following:

      $ mkdir build
      $ cd build
      $ ../configure \
          --with-spl=$HOME/src/git/spl/ \
          --with-spl-obj=$HOME/src/git/spl/build
      $ make -s

  This change also has the advantage of resolving the following warning which is generated by modern versions of automake.

      Makefile.am:00: warning: source file 'xxx' is in a subdirectory,
      Makefile.am:00: but option 'subdir-objects' is disabled

  Signed-off-by: Turbo Fredriksson <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #1082
* Update inode under range lock | Brian Behlendorf | 2015-07-17 | 1 | -1/+1

  After a successful write the inode must be updated under the range lock. If it is updated after dropping the lock there exists a race where the znode and inode will disagree about the file size. This could result in a narrow window of time where read(2) is able to access data beyond what fstat(2) reports as the file size.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Ned Bass <[email protected]>
  Closes #3601
* Linux 4.2 compat: follow_link() / put_link() | Brian Behlendorf | 2015-07-17 | 6 | -8/+80

  As of Linux 4.2 the kernel has completely retired the nameidata structure. Among the few remaining consumers of this interface were the follow_link() and put_link() callbacks.

  This patch adds the required checks to configure to detect the interface change and updates the functions accordingly. Migrating to the simple_follow_link() interface was considered but, ironically, decided against due to the increased complexity.

  It should also be noted that the kernel follow_link() and put_link() interfaces changed several times after 4.1 but before 4.2. This means there is a narrow range of kernel commits, which never appear in an official tag of the Linux kernel, against which ZoL will not build.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Issue #3596
* Linux 4.2 compat: remove bio->bi_cnt access | Brian Behlendorf | 2015-07-17 | 1 | -9/+0

  Linux 4.2 commit torvalds/linux@dac5621 renamed bio->bi_cnt to bio->__bi_cnt. Because this value is only used once in a block of debug code it is simplest just to remove the PANIC. To my knowledge this debugging has never been hit or proved useful so this is no great loss.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Closes #3596
* Linux 4.2 compat: bdi_setup_and_register() | Brian Behlendorf | 2015-07-17 | 1 | -0/+1

  The vfs_compat.h header should include the linux/backing-dev.h header because it depends on the bdi_* functions defined there. In previous kernels this header was being indirectly included which prevented a build failure.

  Signed-off-by: Brian Behlendorf <[email protected]>
  Signed-off-by: Richard Yao <[email protected]>
  Closes #3596
* Illumos 5347 - idle pool may run itself out of space | Matthew Ahrens | 2015-07-14 | 4 | -25/+51

  5347 idle pool may run itself out of space
  Reviewed by: Alex Reece <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed by: Steven Hartland <[email protected]>
  Reviewed by: Richard Elling <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/231aab8
    https://github.com/illumos/illumos-gate/commit/4a92375 3642
    https://www.illumos.org/issues/5347
    https://github.com/zfsonlinux/zfs/commit/89b1cd6 (partial commit & fix)
    https://github.com/zfsonlinux/zfs/commit/fbeddd6 Illumos 4390
    https://github.com/zfsonlinux/zfs/commit/2696dfa Illumos 3642, 3643

  Porting notes:
    This is completing the partial fix from FreeBSD

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3586
* Illumos 5764 - "zfs send -nv" directs output to stderr | Manoj Joseph | 2015-07-14 | 2 | -10/+18

  5764 "zfs send -nv" directs output to stderr
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Paul Dagnelie <[email protected]>
  Reviewed by: Basil Crow <[email protected]>
  Reviewed by: Steven Hartland <[email protected]>
  Reviewed by: Bayard Bell <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/dc5f28a
    https://www.illumos.org/issues/5764

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3585
* Illumos 5610 - zfs clone from different source and target pools produces coredump | Alexander Eremin | 2015-07-14 | 1 | -11/+2

  5610 zfs clone from different source and target pools produces coredump
  Reviewed by: Josef 'Jeff' Sipek <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/03b1c29
    https://www.illumos.org/issues/5610
    https://www.illumos.org/issues/5824
    https://github.com/zfsonlinux/zfs/issues/2911
    https://github.com/zfsonlinux/zfs/commit/9063f65

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3584
* Illumos 1765 - assert triggered in libzfs_import.c | Prasad Joshi | 2015-07-14 | 1 | -1/+4

  1765 assert triggered in libzfs_import.c trying to import pool name beginning with a number
  Reviewed-by: Garrett D'Amore <[email protected]>
  Reviewed-by: Matthew Ahrens <[email protected]>
  Approved by: Robert Mustacchi <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/9edf9eb
    https://www.illumos.org/issues/1765

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3562
* Failure of userland copy should return EFAULT | Richard Yao | 2015-07-14 | 1 | -1/+1

  Many key internal functions pass system return codes that are safe to return to userland. In the case of ddi_copyin(9F), an error passes -1 and the documentation states very clearly that drivers should pass EFAULT to userland when this happens.

  http://illumos.org/man/9F/ddi_copyin

  This does not happen in the ZFS source code. I believe it should be changed to pass EFAULT. I caught this when writing man pages for the libzfs_core API.

  Signed-off-by: Richard Yao <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3575
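A minimal sketch of the return-code translation described above. `example_copyin()` is a hypothetical user-space stand-in that only mimics the "0 on success, -1 on failure" convention of ddi_copyin(9F), and `example_ioctl_copy()` is an invented caller; neither is ZFS code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for ddi_copyin(9F): returns 0 on success and -1 on
 * failure. A NULL pointer simulates a bad userland address.
 */
int example_copyin(const void *src, void *dst, size_t len)
{
    if (src == NULL || dst == NULL)
        return -1;
    memcpy(dst, src, len);
    return 0;
}

/*
 * The -1 from the copy routine is not a valid errno value, so the caller
 * converts any failure to EFAULT before it is returned to userland.
 */
int example_ioctl_copy(const void *user_src, void *kernel_dst, size_t len)
{
    if (example_copyin(user_src, kernel_dst, len) != 0)
        return EFAULT;
    return 0;
}
```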
* Translate sync zio to sync bio | Boris Protopopov | 2015-07-13 | 1 | -2/+8

  Translate zio requests with ZIO_PRIORITY_SYNC_READ and ZIO_PRIORITY_SYNC_WRITE into synchronous bio requests by setting READ_SYNC and WRITE_SYNC flags. Specifically, WRITE_SYNC flag turns out to have a pronounced effect when writing to an SSD-based SLOG. When WRITE_SYNC is not set (WRITE is set instead), the block trace for a SLOG device looks as follows:

      ...
      130,96 0 3 0.008968390 0 C W 830464 + 136 [0]
      130,96 0 4 0.011999161 0 C W 830720 + 136 [0]
      130,96 0 5 0.023955549 0 C W 831744 + 136 [0]
      130,96 0 6 0.024337663 19775 A W 832000 + 136 <- (130,97) 829952
      130,96 0 7 0.024338823 19775 Q W 832000 + 136 [z_wr_iss/6]
      130,96 0 8 0.024340523 19775 G W 832000 + 136 [z_wr_iss/6]
      130,96 0 9 0.024343187 19775 P N [z_wr_iss/6]
      130,96 0 10 0.024344120 19775 I W 832000 + 136 [z_wr_iss/6]
      130,96 0 11 0.026784405 0 UT N [swapper] 1
      130,96 0 12 0.026805339 202 U N [kblockd/0] 1
      130,96 0 13 0.026807199 202 D W 832000 + 136 [kblockd/0]
      130,96 0 14 0.026966948 0 C W 832000 + 136 [0]
      130,96 3 1 0.000449358 19788 A W 829952 + 136 <- (130,97) 827904
      130,96 3 2 0.000450951 19788 Q W 829952 + 136 [z_wr_iss/19]
      130,96 3 3 0.000453212 19788 G W 829952 + 136 [z_wr_iss/19]
      130,96 3 4 0.000455956 19788 P N [z_wr_iss/19]
      130,96 3 5 0.000457076 19788 I W 829952 + 136 [z_wr_iss/19]
      130,96 3 6 0.002786349 0 UT N [swapper] 1
      ...

  Here 130,97 is the partition created on the log device when adding it to the pool, whereas the base device is 130,96. As one can see, the writes to the SLOG are not marked synchronous (the S is missing next to W), and the queue unplugs occur based on the timer (UT event), resulting in slightly over 2 msec latency of writes. This results in sub-par performance of single stream synchronous writes (limited by latency of the SLOG).

  When WRITE_SYNC is set, a similar trace looks as follows:

      ...
      130,96 4 1 0.000000000 70714 A WS 4280576 + 136 <- (130,97) 4278528
      130,96 4 2 0.000000832 70714 Q WS 4280576 + 136 [(null)]
      130,96 4 3 0.000002109 70714 G WS 4280576 + 136 [(null)]
      130,96 4 4 0.000003394 70714 P N [(null)]
      130,96 4 5 0.000003846 70714 I WS 4280576 + 136 [(null)]
      130,96 4 6 0.000004854 70714 D WS 4280576 + 136 [(null)]
      130,96 5 1 0.000354487 70713 A WS 4280832 + 136 <- (130,97) 4278784
      130,96 5 2 0.000355072 70713 Q WS 4280832 + 136 [(null)]
      130,96 5 3 0.000356383 70713 G WS 4280832 + 136 [(null)]
      130,96 5 4 0.000357635 70713 P N [(null)]
      130,96 5 5 0.000358088 70713 I WS 4280832 + 136 [(null)]
      130,96 5 6 0.000359191 70713 D WS 4280832 + 136 [(null)]
      130,96 0 76 0.000159539 0 C WS 4280576 + 136 [0]
      130,96 16 85 0.000742108 70718 A WS 4281088 + 136 <- (130,97) 4279040
      130,96 16 86 0.000743197 70718 Q WS 4281088 + 136 [z_wr_iss/15]
      130,96 16 87 0.000744450 70718 G WS 4281088 + 136 [z_wr_iss/15]
      130,96 16 88 0.000745817 70718 P N [z_wr_iss/15]
      130,96 16 89 0.000746705 70718 I WS 4281088 + 136 [z_wr_iss/15]
      130,96 16 90 0.000747848 70718 D WS 4281088 + 136 [z_wr_iss/15]
      130,96 0 77 0.000604063 0 C WS 4280832 + 136 [0]
      130,96 0 78 0.000899858 0 C WS 4281088 + 136 [0]

  As one can see, all the writes are synchronous (WS), and I/O completions (e.g. from issue I to completion C) take 160-250 usec, or about 10x faster.

  Since WRITE_SYNC or READ_SYNC flags are among several factors that are considered when processing bio requests, it seems prudent to mark all the zio requests of synchronous priority with the READ/WRITE_SYNC flags to make them eligible for consideration as such by the Linux block I/O layer.

  Signed-off-by: Boris Protopopov <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3529
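The priority-to-flag translation described in this commit might look roughly like the sketch below. Everything here is an assumption made for illustration: the `EX_*` names and values are hypothetical stand-ins, not the kernel's READ/WRITE/READ_SYNC/WRITE_SYNC definitions nor ZFS's zio priority enum.

```c
/* Hypothetical priorities standing in for the ZIO_PRIORITY_* values. */
typedef enum {
    EX_PRIORITY_SYNC_READ,
    EX_PRIORITY_SYNC_WRITE,
    EX_PRIORITY_ASYNC_READ,
    EX_PRIORITY_ASYNC_WRITE
} ex_priority_t;

/* Hypothetical flag bits; EX_SYNC models the "S" seen next to W in the trace. */
enum { EX_READ = 0x0, EX_WRITE = 0x1, EX_SYNC = 0x10 };

/*
 * Pick the request flags for a bio: the direction comes from is_write, and
 * the sync modifier is added only for the two synchronous priorities.
 */
int ex_bio_flags(int is_write, ex_priority_t prio)
{
    int flags = is_write ? EX_WRITE : EX_READ;

    if (prio == EX_PRIORITY_SYNC_READ || prio == EX_PRIORITY_SYNC_WRITE)
        flags |= EX_SYNC;
    return flags;
}
```

Requests of any other priority keep their plain read or write flag, so only the latency-sensitive I/O is marked synchronous for the block layer.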
* Fix switch-bool warning | Brian Behlendorf | 2015-07-13 | 1 | -1/+1

  As of gcc version 5.1.1 a new warning has been added to detect the use of a boolean in a switch statement (-Wswitch-bool). Resolve the warning by explicitly casting the value to an integer type.

      zfs-0.6.4/module/zfs/zvol.c: In function 'zvol_request':
      error: switch condition has boolean value [-Werror=switch-bool]

  Signed-off-by: Brian Behlendorf <[email protected]>
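A small sketch of the cast-based fix. `describe_direction()` is a hypothetical function invented for this example and is not the code from zvol_request().

```c
#include <stdbool.h>

/*
 * Switching directly on a bool triggers -Wswitch-bool in newer gcc.
 * Casting the controlling expression to int keeps the same behavior
 * while avoiding the warning.
 */
const char *describe_direction(bool is_write)
{
    switch ((int)is_write) {
    case 0:
        return "read";
    case 1:
        return "write";
    default:
        return "unknown";
    }
}
```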
* Disable gcc bool-compare warning | Brian Behlendorf | 2015-07-13 | 4 | -0/+30

  As of gcc version 5.1.1 a new boolean comparison warning has been introduced. This warning is harmless but is triggered in several places in the ZFS code base. Because warnings are promoted to errors when building with debugging enabled it is necessary to disable the warning when using versions of gcc which automatically enable this check.

  Signed-off-by: Brian Behlendorf <[email protected]>
* Use truncate instead of fallocate in ziltest.sh | Brian Behlendorf | 2015-07-13 | 1 | -2/+2

  For the purposes of creating sparse files the truncate command is preferable to fallocate because generic sparse files are more widely supported by older platforms. Specifically, Debian Wheezy, which is based on a 2.6.32 kernel, used ext3 by default, which at the time did not support fallocate.

  Signed-off-by: Brian Behlendorf <[email protected]>
* Fix Xen Virtual Block Device detection | Richard Yao | 2015-07-10 | 2 | -2/+11

  We fail to make partitions on xvd (Xen Virtual Block) devices. This also causes debug builds of zpool create to return an error when given Xen virtual block devices. These devices should be given the same treatment as vd (KVM Virtual Block) devices, so we adjust the relevant code paths.

  Signed-off-by: Richard Yao <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3576
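The commit does not show the affected code paths, so the following is only a guess at the general idea: treat names with the `xvd` prefix the same way as names with the `vd` prefix. `is_virtual_block_name()` is a hypothetical helper, not a libzfs function.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical check on a kernel device name such as "xvda" or "vdb".
 * Both prefixes receive the same treatment in this sketch.
 */
bool is_virtual_block_name(const char *name)
{
    return strncmp(name, "xvd", 3) == 0 || strncmp(name, "vd", 2) == 0;
}
```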
* Illumos 5813 - zfs_setprop_error(): Handle errno value E2BIG. | Will Andrews | 2015-07-10 | 1 | -0/+6

  5813 zfs_setprop_error(): Handle errno value E2BIG.
  Reviewed by: Paul Dagnelie <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Prakash Surya <[email protected]>
  Reviewed by: Richard Elling <[email protected]>
  Approved by: Garrett D'Amore <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/6fdcb3d
    https://www.illumos.org/issues/5813

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3572
* Illumos 5661 - ZFS: "compression = on" should use lz4 if feature is enabled | Justin T. Gibbs | 2015-07-10 | 5 | -31/+61

  5661 ZFS: "compression = on" should use lz4 if feature is enabled
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Josef 'Jeff' Sipek <[email protected]>
  Reviewed by: Xin LI <[email protected]>
  Approved by: Robert Mustacchi <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/db1741f
    https://www.illumos.org/issues/5661

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3571
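The selection rule in the title can be sketched as below. The enum and function names are hypothetical, and the fallback to lzjb when the feature is not enabled is an assumption based on the legacy meaning of "on", which the commit text itself does not spell out.

```c
#include <stdbool.h>

/* Hypothetical compression settings for this sketch (not ZFS identifiers). */
typedef enum {
    EX_COMPRESS_OFF,
    EX_COMPRESS_ON,
    EX_COMPRESS_LZJB,
    EX_COMPRESS_LZ4
} ex_compress_t;

/*
 * Resolve the generic "on" setting to a concrete algorithm. Explicit
 * settings pass through unchanged; "on" becomes lz4 only when the pool
 * feature is enabled, otherwise the assumed legacy default (lzjb).
 */
ex_compress_t ex_compress_select(ex_compress_t requested, bool lz4_feature_enabled)
{
    if (requested != EX_COMPRESS_ON)
        return requested;
    return lz4_feature_enabled ? EX_COMPRESS_LZ4 : EX_COMPRESS_LZJB;
}
```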
* Illumos 5427 - memory leak in libzfs when doing rollback | Jan Kryl | 2015-07-10 | 1 | -4/+1

  5427 memory leak in libzfs when doing rollback
  Reviewed by: Michael Tsymbalyuk <[email protected]>
  Reviewed by: Steven Hartland <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/b7070b7
    https://www.illumos.org/issues/5427

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3569
* Illumos 5118 - When verifying or creating a storage pool, error messages only show one device | Basil Crow | 2015-07-10 | 1 | -16/+18

  5118 When verifying or creating a storage pool, error messages only show one device
  Reviewed by: Adam Leventhal <[email protected]>
  Reviewed by: Dan Kimmel <[email protected]>
  Reviewed by: Matt Ahrens <[email protected]>
  Reviewed by: Boris Protopopov <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/75fbdf9
    https://www.illumos.org/issues/5118

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3567
* Illumos 4966 - zpool list iterator does not update output | George Wilson | 2015-07-10 | 1 | -10/+10

  4966 zpool list iterator does not update output
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Christopher Siden <[email protected]>
  Reviewed by: Dan McDonald <[email protected]>
  Approved by: Garrett D'Amore <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/cd67d23
    https://www.illumos.org/issues/4966

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3566
* Illumos 4745 - fix AVL code misspellings | Josef 'Jeff' Sipek | 2015-07-10 | 2 | -7/+8

  4745 fix AVL code misspellings
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Richard Lowe <[email protected]>
  Approved by: Robert Mustacchi <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/6907ca4
    https://www.illumos.org/issues/4745

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3565
* Illumos 4626 - libzfs memleak in zpool_in_use() | Josef 'Jeff' Sipek | 2015-07-10 | 1 | -4/+11

  4626 libzfs memleak in zpool_in_use()
  Reviewed by: Tony Nguyen <[email protected]>
  Reviewed by: Saso Kiselkov <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://github.com/illumos/illumos-gate/commit/fb13f48
    https://www.illumos.org/issues/4626

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3563
* Move dracut directory to contrib | Brian Behlendorf | 2015-07-09 | 14 | -31/+31

  The dracut code is analogous to the initramfs code and as such it should be located in the contrib directory with initramfs for consistency.

  Signed-off-by: Brian Behlendorf <[email protected]>
* Initramfs scripts for ZoL. | Turbo Fredriksson | 2015-07-08 | 9 | -6/+1219

  * Supports booting of a ZFS snapshot. Do this by cloning the snapshot into a dataset. If the resulting dataset already exists, destroy it. Then mount it on root.
    * If the snapshot does not exist, use the base dataset (the part before '@') as boot filesystem instead.
    * If no snapshot is specified on the 'root=' kernel command line, but there is an '@', then get a list of snapshots below that filesystem and ask the user which to use.
    * Clone with 'mountpoint=none' and 'canmount=noauto' - we mount manually and explicitly.
    * For sub-filesystems that don't have a mountpoint property set, we use 'org.zol:mountpoint' to keep track of their mountpoint.
    * Allow rollback of a snapshot instead of cloning it and booting from the clone.
  * Allow mounting a root- and subfs with mountpoint=legacy set.
  * Allow mounting a filesystem which is using native encryption.
  * Support all currently used kernel command line arguments. All the different distributions have their own standard on what to specify on the kernel command line to boot off a ZFS filesystem.
    * Extra options:
      * zfsdebug=(on,yes,1) Show extra debugging information
      * zfsforce=(on,yes,1) Force import the pool
      * rollback=(on,yes,1) Rollback (instead of clone) the snapshot
  * Only try to import the pool if it hasn't already been imported.
    * This will negate the need to force import a pool that has not been exported cleanly.
    * Support exclusion of pools to import by setting ZFS_POOL_EXCEPTIONS in /etc/default/zfs.
  * Support additional configuration variable ZFS_INITRD_ADDITIONAL_DATASETS to mount additional filesystems not located under your root dataset.
  * Include /etc/modprobe.d/{zfs,spl}.conf in the initrd if it/they exist.
  * Include the udev rule to use by-vdev for pool imports.
  * Include the /etc/default/zfs file in the initrd.
  * Only try /dev/disk/by-* in the initrd if USE_DISK_BY_ID is set.
    * Use /dev/disk/by-vdev before anything.
    * Add /dev as a last ditch attempt.
    * Fall back to using the cache file, if it exists, if nothing else worked.
  * Use /sbin/modprobe instead of built-in (BusyBox) modprobe. This gets rid of the message "modprobe: can't load module zcommon". Thanks to pcoultha for finding this.

  Signed-off-by: Turbo Fredriksson <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #2116
  Closes #2114
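The "part before '@'" rule from the list above can be illustrated with a short sketch. The actual initramfs scripts are shell; this C version only shows the string split, and `ex_split_root()` is a hypothetical helper written for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Split a 'root=' style value such as "rpool/ROOT@snap1" into the base
 * dataset (copied into base, truncated to fit) and the snapshot name.
 * Returns a pointer to the snapshot name inside root, or NULL when the
 * value contains no '@' or base_len is 0.
 */
const char *ex_split_root(const char *root, char *base, size_t base_len)
{
    const char *at = strchr(root, '@');
    size_t n = (at != NULL) ? (size_t)(at - root) : strlen(root);

    if (base_len == 0)
        return NULL;
    if (n >= base_len)
        n = base_len - 1;
    memcpy(base, root, n);
    base[n] = '\0';
    return (at != NULL) ? at + 1 : NULL;
}
```

An empty string returned for the snapshot name (a trailing '@') would correspond to the case where the user is asked which snapshot to use.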
* Prevent reclaim in metaslab preload threads | Tim Chase | 2015-07-06 | 1 | -0/+2

  Reclaim during metaslab preloading can cause deadlocks involving znode z_lock and ARC buffer header ht_lock.

  Signed-off-by: Tim Chase <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3532.
* Illumos 5008 - lock contention (rrw_exit) while running a read only load | Alexander Motin | 2015-07-06 | 6 | -13/+126

  5008 lock contention (rrw_exit) while running a read only load
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed by: Alex Reece <[email protected]>
  Reviewed by: Christopher Siden <[email protected]>
  Reviewed by: Richard Yao <[email protected]>
  Reviewed by: Saso Kiselkov <[email protected]>
  Approved by: Garrett D'Amore <[email protected]>

  Porting notes:
    This patch ported perfectly cleanly to ZoL. During testing of 100% cached small-block reads, extreme contention was noticed on rrl->rr_lock from rrw_exit() due to the frequent entering and leaving of the ZPL. Illumos picked up this patch from FreeBSD and it also helps under Linux.

    On a 1-minute 4K cached read test with 10 fio processes pinned to a single socket on a 4-socket (10 threads per socket) NUMA system, contentions on rrl->rr_lock were reduced from 508799 to 43085.

  Ported-by: Tim Chase <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3555
* Illumos 5911 - ZFS "hangs" while deleting file | Matthew Ahrens | 2015-07-06 | 5 | -34/+91

  5911 ZFS "hangs" while deleting file
  Reviewed by: Bayard Bell <[email protected]>
  Reviewed by: Alek Pinchuk <[email protected]>
  Reviewed by: Simon Klinkert <[email protected]>
  Reviewed by: Dan McDonald <[email protected]>
  Approved by: Richard Lowe <[email protected]>

  References:
    https://www.illumos.org/issues/5911
    https://github.com/illumos/illumos-gate/commit/46e1baa

  Porting notes:
    Resolved the "ISO C90 forbids mixed declarations and code" warning in the dnode_free_range() function.

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3554
* Illumos 5981 - Deadlock in dmu_objset_find_dp | Arne Jansen | 2015-07-06 | 5 | -4/+40

  5981 Deadlock in dmu_objset_find_dp
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Dan McDonald <[email protected]>
  Approved by: Robert Mustacchi <[email protected]>

  References:
    https://www.illumos.org/issues/5981
    https://github.com/illumos/illumos-gate/commit/1d3f896

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3553
* Illumos 5946, 5945 | Andriy Gapon | 2015-07-06 | 2 | -0/+12

  5946 zfs_ioc_space_snaps must check that firstsnap and lastsnap refer to snapshots
  5945 zfs_ioc_send_space must ensure that fromsnap refers to a snapshot
  Reviewed by: Steven Hartland <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Approved by: Gordon Ross <[email protected]>

  References:
    https://www.illumos.org/issues/5946
    https://www.illumos.org/issues/5945
    https://github.com/illumos/illumos-gate/commit/24218be

  Ported-by: Andriy Gapon <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3552
* Illumos 5870 - dmu_recv_end_check() leaks origin_head hold if error happens in drc_force branch | Andriy Gapon | 2015-07-06 | 1 | -2/+6

  5870 dmu_recv_end_check() leaks origin_head hold if error happens in drc_force branch
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Andrew Stormont <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://www.illumos.org/issues/5870
    https://github.com/illumos/illumos-gate/commit/beddaa9

  Ported-by: Andriy Gapon <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3551
* Illumos 5909 - ensure that shared snap names don't become too long after promotion | Andriy Gapon | 2015-07-06 | 1 | -0/+6

  5909 ensure that shared snap names don't become too long after promotion
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://www.illumos.org/issues/5909
    https://github.com/illumos/illumos-gate/commit/cb5842f

  Ported-by: Andriy Gapon <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3550
* Illumos 5912 - full stream can not be force-received into a dataset if it has a snapshot | Andriy Gapon | 2015-07-06 | 1 | -4/+6

  5912 full stream can not be force-received into a dataset if it has a snapshot
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Paul Dagnelie <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://www.illumos.org/issues/5912
    https://github.com/illumos/illumos-gate/commit/5bae108

  Ported-by: Andriy Gapon <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3549
* Illumos 6033 - arc_adjust() should search MFU lists | Alek Pinchuk | 2015-07-01 | 1 | -1/+1

  6033 arc_adjust() should search MFU lists for oldest buffer when adjusting MFU size
  Reviewed by: Saso Kiselkov <[email protected]>
  Reviewed by: Xin Li <[email protected]>
  Reviewed by: Prakash Surya <[email protected]>
  Approved by: Matthew Ahrens <[email protected]>

  References:
    https://www.illumos.org/issues/6033
    https://github.com/illumos/illumos-gate/commit/31c46cf

  Ported-by: kernelOfTruth [email protected]
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3545
* man: fix spelling mistakes in manual | Colin Ian King | 2015-07-01 | 4 | -6/+6

  A few minor mistakes that should be fixed:

  zpool:
    compatability -> compatibility
  zfs:
    accessable -> accessible
    availible -> available
  zfs-events:
    availible -> available
  zfs-module-parameters:
    proceding -> proceeding

  Signed-off-by: Colin Ian King <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3544
* Illumos 5175 - implement dmu_read_uio_dbuf() to improve cached read performance | Matthew Ahrens | 2015-06-29 | 3 | -11/+78

  5175 implement dmu_read_uio_dbuf() to improve cached read performance
  Reviewed by: Adam Leventhal <[email protected]>
  Reviewed by: Alex Reece <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed by: Richard Elling <[email protected]>
  Approved by: Robert Mustacchi <[email protected]>

  References:
    https://www.illumos.org/issues/5175
    https://github.com/illumos/illumos-gate/commit/f8554bb

  Porting notes:
    This patch doesn't include the changes for COMSTAR (Common Multiprotocol SCSI Target) since it's not available for ZoL.
    http://thegreyblog.blogspot.co.at/2010/02/setting-up-solaris-comstar-and.html

  Ported by: kernelOfTruth <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3392
* Add /dev/mapper to the list of possible sources for pool devices. | Turbo Fredriksson | 2015-06-29 | 1 | -0/+9

  This is especially needed when using LUKS backed pools.

  Signed-off-by: Turbo Fredriksson <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3536
* Remove l2ad_evict from zfs_l2arc_evict_class | Tim Chase | 2015-06-29 | 1 | -5/+2

  Illumos 5701 (zpool list reports incorrect "alloc" value for cache devices) removed l2ad_evict from l2arc_dev_t. It should also be removed from the zfs_l2arc_evict_class event class.

  Signed-off-by: Tim Chase <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3534
* Add ziltest.sh | Brian Behlendorf | 2015-06-26 | 2 | -0/+302

  The ziltest.sh script is a test case designed to verify the correct functioning of the ZIL. It's being added to the scripts directory so it can be easily added to the automated regression testing.

  The general idea is to build up an intent log from a bunch of diverse user commands without actually committing them to the file system. Then copy the file system, replay the intent log and compare the file system and the copy.

  Ported-by: Don Brady <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3531
* Fix for recent zdb -h | -i crashes (seg fault) | Don Brady | 2015-06-26 | 2 | -7/+20

  Allocating SPA_MAXBLOCKSIZE on the stack is a bad idea (even with the old 128K size). Use malloc instead when allocating temporary block buffer memory.

  Signed-off-by: Don Brady <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3522
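A minimal sketch of the stack-versus-heap point made above. `EX_MAXBLOCKSIZE` is a placeholder value chosen for illustration (the commit does not state the current SPA_MAXBLOCKSIZE), and `ex_with_block_buffer()` is a hypothetical function, not zdb code.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed buffer size for this sketch only; not taken from the ZFS headers. */
#define EX_MAXBLOCKSIZE (16UL * 1024 * 1024)

/*
 * Use a heap allocation for the temporary block buffer. A local array such
 * as "char buf[EX_MAXBLOCKSIZE];" could exceed the thread's stack limit.
 * Returns 0 on success, -1 if the allocation fails.
 */
int ex_with_block_buffer(void)
{
    char *buf = malloc(EX_MAXBLOCKSIZE);

    if (buf == NULL)
        return -1;
    memset(buf, 0, EX_MAXBLOCKSIZE);  /* stand-in for real work on the block */
    free(buf);
    return 0;
}
```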
* zdb -d has false positive warning when feature@large_blocks=disabled | Don Brady | 2015-06-26 | 1 | -11/+16

  Skip large blocks feature refcount checking if feature is disabled.

  Signed-off-by: Don Brady <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3468
* Additional SYSV init script fixes (3). | Turbo Fredriksson | 2015-06-25 | 2 | -54/+22

  * In read_mtab(), fix problems (!?) in the mounts file. It will record 'rpool 1' as 'rpool\0401' instead of 'rpool\00401', which seems to be the correct form (at least as far as 'printf' is concerned). Do this using the external 'echo' command (and not the one built in to the shell) because the internal one would interpret the backslash code (incorrectly), giving us a  instead.
  * Remove reregister_mounts() - no longer needed.
  * For Gentoo, the zfs_log_failure_msg() should use eend(), not eerror() (which requires an error message, which we don't have).

  Signed-off-by: Turbo Fredriksson <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
  Closes #3488
  Closes #3509
  Closes #3514
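To make the 'rpool\0401' example concrete, here is a sketch of how a three-digit octal escape in a mounts-file field can be decoded. The init scripts themselves are shell and use printf/echo; `ex_decode_mount_field()` is a hypothetical C helper that only illustrates the escape format.

```c
#include <stddef.h>
#include <string.h>

/*
 * Decode backslash escapes of the form \NNN (exactly three octal digits)
 * from src into dst. dst must have room for strlen(src) + 1 bytes. Any
 * other character, including a backslash not followed by three octal
 * digits, is copied through unchanged.
 */
void ex_decode_mount_field(const char *src, char *dst)
{
    while (*src != '\0') {
        if (src[0] == '\\' &&
            src[1] >= '0' && src[1] <= '7' &&
            src[2] >= '0' && src[2] <= '7' &&
            src[3] >= '0' && src[3] <= '7') {
            *dst++ = (char)(((src[1] - '0') << 6) |
                            ((src[2] - '0') << 3) |
                            (src[3] - '0'));
            src += 4;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

With this reading, `\040` is consumed as the space character and the trailing `1` stays a literal digit, which is why 'rpool\0401' corresponds to 'rpool 1'.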
* Revert "Additional SYSV init script fixes." | Turbo Fredriksson | 2015-06-25 | 1 | -10/+7

  This reverts commit 036391c980c1e6504352b770eb385806a951b1cb.

  Because #3509 came just after this commit was accepted and is related to the original problem the commit was supposed to fix, we need to solve the problem in another way.

  Signed-off-by: Turbo Fredriksson <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 5368 - ARC should cache more metadata | Matthew Ahrens | 2015-06-25 | 1 | -0/+6

  5368 ARC should cache more metadata
  Reviewed by: Alex Reece <[email protected]>
  Reviewed by: Christopher Siden <[email protected]>
  Reviewed by: George Wilson <[email protected]>
  Reviewed by: Richard Elling <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://www.illumos.org/issues/5368
    https://github.com/illumos/illumos-gate/commit/3a5286a

  Porting Notes:
    The vast majority of this patch was already merged in the context of the 06358ea changes. This is just a small hunk which was missed.

  Ported-by: Brian Behlendorf <[email protected]>
* Illumos 5163 - arc should reap range_seg_cache | George Wilson | 2015-06-25 | 4 | -2/+9

  5163 arc should reap range_seg_cache
  Reviewed by: Christopher Siden <[email protected]>
  Reviewed by: Matthew Ahrens <[email protected]>
  Reviewed by: Richard Elling <[email protected]>
  Reviewed by: Saso Kiselkov <[email protected]>
  Approved by: Dan McDonald <[email protected]>

  References:
    https://www.illumos.org/issues/5163
    https://github.com/illumos/illumos-gate/commit/83803b5

  Porting Notes:
    Added umem_cache_reap_now() wrapper to suppress unused variable warning for user space build in arc_kmem_reap_now().

  Ported-by: Brian Behlendorf <[email protected]>
* Update all default taskq settingsBrian Behlendorf2015-06-2510-37/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Over the years the default values for the taskqs used on Linux have differed slightly from illumos. In the vast majority of cases this was done to avoid creating an obnoxious number of idle threads which would pollute the process listing. With the addition of support for dynamic taskqs all multi-threaded queues should be created as dynamic taskqs. This allows us to get the best of both worlds. * The illumos default values for the I/O pipeline can be restored. These values are known to work well for most workloads. The only exception is the zio write interrupt taskq which is changed to ZTI_P(12, 8). At least under Linux more threads have been shown to improve performance, see commit 7e55f4e. * Reduce the number of idle threads on the system when it's not under heavy load. The maximum number of threads will only be created when they are required. * Remove the vdev_file_taskq and rely on the system_taskq instead, which is now dynamic and may have up to 64 threads. Again this brings us back in line with upstream. * Tasks dispatched with taskq_dispatch_ent() are allowed to use dynamic taskqs. The Linux taskq implementation supports this. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Tim Chase <[email protected]> Closes #3507
* Account for ashift when gathering buffers to be written to l2arc deviceAndriy Gapon2015-06-251-13/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If we don't account for that, then we might end up overwriting disk area of buffers that have not been evicted yet, because l2arc_evict operates in terms of disk addresses. The discrepancy between the write size calculation and the actual increment to l2ad_hand was introduced in commit 3a17a7a9. The change that introduced l2ad_hand alignment was almost correct as the write size was accumulated as a sum of rounded buffer sizes. See commit illumos/illumos-gate@e14bb32. Also, we now consistently use asize / a_sz for the allocated size and psize / p_sz for the physical size. The latter accounts for a possible size reduction because of the compression, whereas the former accounts for a possible subsequent size expansion because of the alignment requirements. The code still assumes that either underlying storage subsystems or hardware is able to do read-modify-write when an L2ARC buffer size is not a multiple of a disk's block size. This is true for 4KB sector disks that provide 512B sector emulation, but may not be true in general. In other words, we currently do not have any code to make sure that an L2ARC buffer, whether compressed or not, which is used for physical I/O has a suitable size. Note that currently the cache device utilization is calculated based on the physical size, not the allocated size. The same applies to l2_asize kstat. That is wrong, but this commit does not fix that. The accounting problem was introduced partially in commit 3a17a7a9 and partially in 3038a2b (accounting became consistent but in favour of the wrong size). Porting Notes: Reworked to be C90 compatible and the 'write_psize' variable was removed because it is now unused. References: https://reviews.csiden.org/r/229/ https://reviews.freebsd.org/D2764 Ported-by: kernelOfTruth <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3400 Closes #3433 Closes #3451
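The asize / psize distinction described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not the code from arc.c: the helper names `asize_for()` and `gather_asize()` are hypothetical, and the loop merely shows why the write size has to be accumulated from sizes rounded up to the device's `1 << ashift` block size rather than from the physical (possibly compressed) sizes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: allocated size of a buffer with physical size
 * psize on a device whose minimum block is (1 << ashift) bytes.
 */
static uint64_t
asize_for(uint64_t psize, unsigned ashift)
{
	uint64_t align = UINT64_C(1) << ashift;

	/* Round up to the next multiple of a power-of-two alignment. */
	return ((psize + align - 1) & ~(align - 1));
}

/*
 * Hypothetical helper: accept buffers in order until the next one would
 * push the accumulated allocated size past target. Returns the number
 * of buffers accepted and stores the accumulated size in *write_asize.
 */
static size_t
gather_asize(const uint64_t *psizes, size_t n, unsigned ashift,
    uint64_t target, uint64_t *write_asize)
{
	uint64_t total = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		uint64_t asize = asize_for(psizes[i], ashift);

		if (total + asize > target)
			break;
		total += asize;
	}
	*write_asize = total;
	return (i);
}
```

With physical sizes of 5000, 512 and 4096 bytes and ashift=12, the physical sizes sum to 9608 bytes, but the device hand would advance by 16384 bytes. Summing physical sizes against a 12288-byte target would accept all three buffers, while the rounded accounting stops after the second.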
* Illumos 5701 - zpool list reports incorrect "alloc" value for cache devicesPrakash Surya2015-06-252-43/+140
| | | | | | | | | | | | | | | | | | | | 5701 zpool list reports incorrect "alloc" value for cache devices Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Alek Pinchuk <[email protected]> Approved by: Dan McDonald <[email protected]> References: https://www.illumos.org/issues/5701 https://github.com/illumos/illumos-gate/commit/a52fc31 Porting Notes: arc_space_return(HDR_L2ONLY_SIZE, ARC_SPACE_L2HDRS); correctly placed at arc_hdr_l2hdr_destroy(arc_buf_hdr_t *hdr). Ported-by: kernelOfTruth <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
* Add IMPLY() and EQUIV() macrosBrian Behlendorf2015-06-241-1/+17
| | | | | | | | | Added for upstream compatibility, they are of the form: * IMPLY(a, b) - if (a) then (b) * EQUIV(a, b) - if (a) then (b) *AND* if (b) then (a) Signed-off-by: Brian Behlendorf <[email protected]>
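The two forms above map directly onto boolean expressions. The following is a minimal sketch built on the standard assert(), not the definitions added by this commit: the value-returning helpers `IMPLIES()` and `IFF()` and the function `checked_len()` are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers that evaluate each relation to 0 or 1. */
#define	IMPLIES(a, b)	(!(a) || (b))
#define	IFF(a, b)	(!!(a) == !!(b))

/* Assertion forms, sketched here on top of the standard assert(). */
#define	IMPLY(a, b)	assert(IMPLIES(a, b))
#define	EQUIV(a, b)	assert(IFF(a, b))

/*
 * Example invariant: a buffer pointer is non-NULL exactly when its
 * length is non-zero.
 */
static size_t
checked_len(const char *buf, size_t len)
{
	EQUIV(buf != NULL, len != 0);
	return (len);
}
```

The `!!` in `IFF()` normalizes both operands to 0 or 1 first, so any two non-zero values compare as equivalent.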
* zfsdev_getminor() should check for invalid file handlesRichard Yao2015-06-224-9/+39
| | | | | | | | | | | | | | | | | | | | | | Unit testing at ClusterHQ found that passing an invalid file handle to zfs_ioc_hold results in a NULL pointer dereference on a system without assertions: IP: [<ffffffffa0218aa0>] zfsdev_getminor+0x10/0x20 [zfs] Call Trace: [<ffffffffa021b4b0>] zfs_onexit_fd_hold+0x20/0x40 [zfs] [<ffffffffa0214043>] zfs_ioc_hold+0x93/0xd0 [zfs] [<ffffffffa0215890>] zfsdev_ioctl+0x200/0x500 [zfs] Assertions would have caught this had they been enabled, but this is something that the kernel module should handle without failing. We resolve this by searching the linked list to ensure that the file handle's private_data points to a valid zfsdev_state_t. Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Andriy Gapon <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3506
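The validation idea can be sketched in user-space C as follows. This is an illustration only: `state_t` is a hypothetical stand-in for zfsdev_state_t, and `get_minor()` with its 0 / -1 return convention is an assumption, not the signature used in the module. The point it shows is that the untrusted pointer is only compared against known list members and is never dereferenced until it has matched one.

```c
#include <stddef.h>

/* Hypothetical stand-in for zfsdev_state_t: one node per open handle. */
typedef struct state {
	struct state *next;
	int minor;
} state_t;

/*
 * Return 0 and store the minor number if private_data points at a node
 * on the list; return -1 and leave *minorp untouched otherwise.
 */
static int
get_minor(const state_t *head, const void *private_data, int *minorp)
{
	const state_t *s;

	for (s = head; s != NULL; s = s->next) {
		if ((const void *)s == private_data) {
			*minorp = s->minor;
			return (0);
		}
	}
	return (-1);
}
```

A NULL or stray private_data value simply fails the comparison for every node, so the caller gets an error return instead of a crash.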