aboutsummaryrefslogtreecommitdiffstats
path: root/tests
Commit message (Collapse)AuthorAgeFilesLines
* OpenZFS 9164 - assert: newds == os->os_dsl_datasetAndriy Gapon2018-03-303-13/+22
| | | | | | | | | | | | | | | | | | | | Authored by: Andriy Gapon <[email protected]> Reviewed by: Matt Ahrens <[email protected]> Reviewed by: Don Brady <[email protected]> Reviewed-by: loli10K <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Approved by: Richard Lowe <[email protected]> Ported-by: Giuseppe Di Natale <[email protected]> Porting Notes: * Re-enabled and tweaked the zpool_upgrade_007_pos test case to successfully run in under 5 minutes. OpenZFS-issue: https://www.illumos.org/issues/9164 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0e776dc06a Closes #6112 Closes #7336
* Fix mmap / libaio deadlockBrian Behlendorf2018-03-288-2/+164
| | | | | | | | | | | | | | Calling uiomove() in mappedread() under the page lock can result in a deadlock if the user space page needs to be faulted in. Resolve the issue by dropping the page lock before the uiomove(). The inode range lock protects against concurrent updates via zfs_read() and zfs_write(). Reviewed-by: Albert Lee <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7335 Closes #7339
* Remove libattr requirementDeHackEd2018-03-272-2/+2
| | | | | | | | | | RHEL/CentOS 6 supports sys/xattr.h eliminating the need for libattr-devel as a dependency. Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: DHE <[email protected]> Closes #7344 Closes #7351
* Add JSON output support to channel programsAlek P2018-03-196-0/+204
| | | | | | | | | | | | | | | | | | | The changes piggyback JSON output support on top of channel programs (#6558). This way the JSON output support is targeted to scripting use cases and is easily maintainable since it really only touches one function (zfs_do_channel_program()). This patch ports Joyent's JSON nvlist library from illumos to enable easy JSON printing of channel program output nvlist. To keep the delta small I also took advantage of the fact that printing in zfs_do_channel_program() was almost always done before exiting the program. Reviewed by: Matt Ahrens <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Richard Elling <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Alek Pinchuk <[email protected]> Closes #7281
* OpenZFS 9076 - Adjust perf test concurrency settingsStephen Blinick2018-03-155-10/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ZFS Performance test concurrency should be lowered for better latency Work by Stephen Blinick. Nightly performance runs typically consist of two levels of concurrency; and both are fairly high. Since the IO runs are to a ZFS filesystem, within a zpool, which is based on some variable number of vdev's, the amount of IO driven to each device is variable. Additionally, different device types (HDD vs SSD, etc) can generally handle a different amount of concurrent IO before saturating. Nevertheless, in practice, it appears that most tests are well past the concurrency saturation point and therefore both perform with the same throughput, the maximum of the device. Because the queuedepth to the device(s) is so high however, the latency is much higher than the best possible at that throughput, and increases linearly with the increase in concurrency. This means that changes in code that impact latency during normal operation (before saturation) may not be apparent when a large component of the measured latency is from the IO sitting in a queue to be serviced. Therefore, changing the concurrency settings is recommended Authored by: Stephen Blinick <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: Dan Kimmel <[email protected]> Reviewed by: John Wren Kennedy <[email protected]> Ported-by: John Wren Kennedy <[email protected]> OpenZFS-issue: https://www.illumos.org/issues/9076 OpenZFS-commit: https://github.com/openzfs/openzfs/pull/562 Upstream bug: DLPX-45477 Closes #7302
* zdb and inuse tests don't pass with real disksPaul Zuchowski2018-03-078-15/+59
| | | | | | | | | | Due to zpool create auto-partioning in Linux (i.e. sdb1), certain utilities need to use the parition (sdb1) while others use the whole disk name (sdb). Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Paul Zuchowski <[email protected]> Closes #6939 Closes #7261
* Take user namespaces into account in policy checksWolfgang Bumiller2018-03-0713-0/+392
| | | | | | | | | | | | | | | | | | | | | | | | | Change file related checks to use user namespaces and make sure involved uids/gids are mappable in the current namespace. Note that checks without file ownership information will still not take user namespaces into account, as some of these should be handled via 'zfs allow' (otherwise root in a user namespace could issue commands such as `zpool export`). This also adds an initial user namespace regression test for the setgid bit loss, with a user_ns_exec helper usable in further tests. Additionally, configure checks for the required user namespace related features are added for: * ns_capable * kuid/kgid_has_mapping() * user_ns in cred_t Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Wolfgang Bumiller <[email protected]> Closes #6800 Closes #7270
* ZTS: fix send-c_stream_size_estimateBrian Behlendorf2018-03-071-0/+1
| | | | | | | | | | | | The test could fail when attempting to write to a newly created volume which was missing its device node. Resolve the issue by calling block_device_wait() which blocks until udev creates the needed entry. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7276 Closes #7277
* Fix dbufstats_001_posGiuseppe Di Natale2018-03-072-4/+28
| | | | | | | | | | | | | | Implement a new helper within_tolerance to test if a value is within range of a target. Because the dbufstats and dbufs kstat file are being read at slightly different times, it is possible for stats to be slightly off. Use within_tolerance to determine if the value is "close enough" to the target. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Giuseppe Di Natale <[email protected]> Closes #7239 Closes #7266
* Allow to limit zed's syslog chattinessTony Hutter2018-03-065-7/+143
| | | | | | | | | | | | | | | | | | Some usage patterns like send/recv of replication streams can produce a large number of events. In such a case, the current all-syslog.sh zedlet will hold up to its name, and flood the logs with mostly redundant information. Two mitigate this situation, this changeset introduces to new variables ZED_SYSLOG_SUBCLASS_INCLUDE and ZED_SYSLOG_SUBCLASS_EXCLUDE to zed.rc that give more control over which event classes end up in the syslog. Reviewed-by: loli10K <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Signed-off-by: Daniel Kobras <[email protected]> Closes #6886 Closes #7260
* Record skipped MMP writes in multihost_historyOlaf Faaland2018-03-067-8/+95
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Once per pass through the MMP thread's loop, the vdev tree is walked to find a suitable leaf to write the next MMP block to. If no such leaf is found, the thread sleeps for a while and resumes at the top of the loop. Add an entry to multihost_history when no leaf can be found, and record the reason in the error column. The error code for such entries is a bitfield, displayed in hex: 0x1 At least one vdev (interior or leaf) was not writeable. 0x2 At least one writeable leaf vdev was found, but it had a pending MMP write. timestamp = the time in seconds since the epoch when no leaf could be found originally. duration = the time (in ns) during which no MMP block was written for this reason. This does not include the preceeding inter-write period nor the following inter-write period. vdev_guid = the number of sequential cycles of the MMP thread looop when this occurred. Sample output, truncated to fit: For records of skipped MMP writes the right-most column, vdev_path, is reported as "-". id txg timestamp error duration mmp_delay vdev_guid ... 936 11 1520036441 0 146264 891422313 1740883117838 ... 937 11 1520036441 0 163956 888356657 7320395061548 ... 938 11 1520036442 0 130690 885314969 7320395061548 ... 939 11 1520036442 0 2001068577 882296582 1740883117838 ... 940 11 1520036443 0 161806 882296582 7320395061548 ... 941 11 1520036443 0x2 0 998020546 1 ... 942 11 1520036444 0 136585 998020546 7320395061548 ... 943 11 1520036444 0x2 0 998020257 1 ... 944 11 1520036445 5 2002662964 994160219 1740883117838 ... 945 11 1520036445 0x2 998073118 994160219 3 ... 946 11 1520036447 0 247136 994160219 7320395061548 ... Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Olaf Faaland <[email protected]> Closes #7212
* Introduce a destroy_dataset helperGiuseppe Di Natale2018-03-064-24/+47
| | | | | | | | | | | | | Datasets can be busy when calling zfs destroy. Introduce a helper function to destroy datasets and use it to destroy datasets in zfs_allow_004_pos, zfs_promote_008_pos, and zfs_destroy_002_pos. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Giuseppe Di Natale <[email protected]> Closes #7224 Closes #7246 Closes #7249 Closes #7267
* Change checksum & IO delay ratelimit valuesTony Hutter2018-03-041-6/+0
| | | | | | | | | | | | | | | | Change checksum & IO delay ratelimit thresholds from 5/sec to 20/sec. This allows zed to actually trigger if a bunch of these events arrive in a short period of time (zed has a threshold of 10 events in 10 sec). Previously, if you had, say, 100 checksum errors in 1 sec, it would get ratelimited to 5/sec which wouldn't trigger zed to fault the drive. Also, convert the checksum and IO delay thresholds to module params for easy testing. Reviewed-by: loli10K <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #7252
* ZTS: fix spurious failures in mv_fileschrisrd2018-03-021-14/+3
| | | | | | | | | | | | | | | The test could fail because of a race condition between the files being generated in the background and attempting to move the files. Wait for all file generation to complete before trying to move the files around. Also, clean up the waiting: the 'wait' command without arguments waits for all child pids. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Chris Dunlop <[email protected]> Closes #7220 Closes #7242 Closes #7258
* Add ZFS perf test for dbuf cacheJohn Wren Kennedy2018-02-287-5/+118
| | | | | | | | | | | | | This change adds a test for sequential reads out of the dbuf cache. It's essentially a copy of sequential_reads_cached, using a smaller data set. The sequential read tests are renamed to differentiate them. Authored by: Dan Kimmel <[email protected]> Reviewed by: Paul Dagnelie <[email protected]> Reviewed by: Matt Ahrens <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Brian Behlendorf <[email protected]> Signed-off-by: John Wren Kennedy <[email protected]> Closes #7225
* Fix some typosJohn Eismeier2018-02-2812-24/+24
| | | | | | | Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: George Melikov <[email protected]> Signed-off-by: John Eismeier <[email protected]> Closes #7237
* Add Python 3 rewrite of arc_summary.pyScot W. Stevenson2018-02-284-1/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add new script arc_summary3.py as a complete rewrite of the arc_summary.py tool (see issue #6873) Add new options: -g/--graph - Display crude graphic representation of ARC status and quit -r/--raw - Print all available information as minimally formatted list (for grep) -s/--section - Print a single section. This replaces -p/--page, which is kept for backwards use but marked as depreciated Add new sections with information on ZIL and SPL. Notify user if sections L2ARC and VDEV are skipped instead of failing silently. Add warning that -p/--page option is depreciated. Developed for Python 3.5. Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Richard Elling <[email protected]> Reviewed by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Scot W. Stevenson <[email protected]> Closes #6873 Closes #6892
* Fix segfault in zfs_do_bookmark()LOLi2018-02-261-2/+25
| | | | | | | | | | | | | | When invoked with wrong parameters 'zfs bookmark' fails to gracefully validate user input and crashes. This is a regression accidentally introduced in 587e228; this commit adds additional tests to the ZFS Test Suite to exercise this codepath. Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: KireinaHoro <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #7228 Closes #7229
* ZTS: Fix zfs_share_* test case failuresBrian Behlendorf2018-02-241-1/+7
| | | | | | | | | | | | Prevent false positives when running the zfs_share_* test cases due to leftover stale /var/lib/nfs/etab entries. When starting the test group re-synchronize the /var/lib/nfs/etab file with /etc/exports. At this point in the testing there will be no additional `zfs share` entries to add. Reviewed by: George Melikov <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7226
* Add scrub after resilver zed scriptTony Hutter2018-02-239-13/+117
| | | | | | | | | | | | | | | | | | | | * Add a zed script to kick off a scrub after a resilver. The script is disabled by default. * Add a optional $PATH (-P) option to zed to allow it to use a custom $PATH for its zedlets. This is needed when you're running zed under the ZTS in a local workspace. * Update test scripts to not copy in all-debug.sh and all-syslog.sh by default. They can be optionally copied in as part of zed_setup(). These scripts slow down zed considerably under heavy events loads and can cause events to be dropped or their delivery delayed. This was causing some sporadic failures in the 'fault' tests. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #4662 Closes #7086
* Want 'zfs send -b'LOLi2018-02-215-101/+213
| | | | | | | | | | | | | | | This change implements 'zfs send -b' which can be used to send only received property values whether or not they are overridden by local settings. This can be very useful during "restore" operations from a backup pool because it allows to send only the property values originally sent from the backup source, even though they were later modified on the destination either by a 'zfs set' operation, explicit 'zfs inherit' or overridden during the receive process via 'zfs receive -o|-x'. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #7156
* Raw receive should change key atomicallyTom Caputi2018-02-211-5/+25
| | | | | | | | | | | | | | | | | Currently, raw zfs sends transfer the encrypted master keys and objset_phys_t encryption parameters in the DRR_BEGIN payload of each send file. Both of these are processed as soon as they are read in dmu_recv_stream(), meaning that the new keys are set before the new snapshot is received. In addition to the fact that this changes the user's keys for the dataset earlier than they might expect, the keys were never reset to what they originally were in the event that the receive failed. This patch splits the processing into objset handling and key handling, the later of which is moved to dmu_recv_end() so that they key change can be done atomically. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #7200
* Raw receives must compress metadnode blocksTom Caputi2018-02-211-11/+0
| | | | | | | | | | | | | | | | | | | Currently, the DMU relies on ZIO layer compression to free LO dnode blocks that no longer have objects in them. However, raw receives disable all compression, meaning that these blocks can never be freed. In addition to the obvious space concerns, this could also cause incremental raw receives to fail to mount since the MAC of a hole is different from that of a completely zeroed block. This patch corrects this issue by adding a special case in zio_write_compress() which will attempt to compress these blocks to a hole even if ZIO_FLAG_RAW_ENCRYPT is set. This patch also removes the zfs_mdcomp_disable tunable, since tuning it could cause these same issues. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #7198
* Correct count_uberblocks in mmp.kshlibGiuseppe Di Natale2018-02-201-1/+1
| | | | | | | | | | | A log_must call was causing count_uberblocks to return more than just the uberblock count. Remove the log_must since it was only logging a sleep. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Olaf Faaland <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Giuseppe Di Natale <[email protected]> Closes #7191
* Project Quota on ZFSNasf-Fan2018-02-1335-17/+2210
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Project quota is a new ZFS system space/object usage accounting and enforcement mechanism. Similar as user/group quota, project quota is another dimension of system quota. It bases on the new object attribute - project ID. Project ID is a numerical value to indicate to which project an object belongs. An object only can belong to one project though you (the object owner or privileged user) can change the object project ID via 'chattr -p' or 'zfs project [-s] -p' explicitly. The object also can inherit the project ID from its parent when created if the parent has the project inherit flag (that can be set via 'chattr +P' or 'zfs project -s [-p]'). By accounting the spaces/objects belong to the same project, we can know how many spaces/objects used by the project. And if we set the upper limit then we can control the spaces/objects that are consumed by such project. It is useful when multiple groups and users cooperate for the same project, or a user/group needs to participate in multiple projects. Support the following commands and functionalities: zfs set projectquota@project zfs set projectobjquota@project zfs get projectquota@project zfs get projectobjquota@project zfs get projectused@project zfs get projectobjused@project zfs projectspace zfs allow projectquota zfs allow projectobjquota zfs allow projectused zfs allow projectobjused zfs unallow projectquota zfs unallow projectobjquota zfs unallow projectused zfs unallow projectobjused chattr +/-P chattr -p project_id lsattr -p This patch also supports tree quota based on the project quota via "zfs project" commands set as following: zfs project [-d|-r] <file|directory ...> zfs project -C [-k] [-r] <file|directory ...> zfs project -c [-0] [-d|-r] [-p id] <file|directory ...> zfs project [-p id] [-r] [-s] <file|directory ...> For "df [-i] $DIR" command, if we set INHERIT (project ID) flag on the $DIR, then the proejct [obj]quota and [obj]used values for the $DIR's project ID will be shown as the total/free (avail) resource. Keep the same behavior as EXT4/XFS does. Reviewed-by: Andreas Dilger <[email protected]> Reviewed-by Ned Bass <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Fan Yong <[email protected]> TEST_ZIMPORT_POOLS="zol-0.6.1 zol-0.6.2 master" Change-Id: Ib4f0544602e03fb61fd46a849d7ba51a6005693c Closes #6290
* OpenZFS 9004 - Some ZFS tests used files removed with 32 bit kernelJohn Wren Kennedy2018-02-092-2/+3
| | | | | | | | | | | | | | Authored by: John Wren Kennedy <[email protected]> Reviewed by: Igor Kozhukhov <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Brian Behlendorf <[email protected]> Reviewed by: George Melikov <[email protected]> Approved by: Dan McDonald <[email protected]> Ported-by: Giuseppe Di Natale <[email protected]> OpenZFS-issue: https://www.illumos.org/issues/9004 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/fafe9b241f Closes #7149
* Handle zap_add() failures in mixed case modesanjeevbagewadi2018-02-093-1/+138
| | | | | | | | | | | | | | With "casesensitivity=mixed", zap_add() could fail when the number of files/directories with the same name (varying in case) exceed the capacity of the leaf node of a Fatzap. This results in a ASSERT() failure as zfs_link_create() does not expect zap_add() to fail. The fix is to handle these failures and rollback the transactions. Reviewed by: Matt Ahrens <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Sanjeev Bagewadi <[email protected]> Closes #7011 Closes #7054
* Fix zdb -ed on objset for exported poolChunwei Chen2018-02-094-4/+69
| | | | | | | | | | | | | | | | | | zdb -ed on objset for exported pool would failed with: failed to own dataset 'qq/fs0': No such file or directory The reason is that zdb pass objset name to spa_import, it uses that name to create a spa. Later, when dmu_objset_own tries to lookup the spa using real pool name, it can't find one. We fix this by make sure we pass pool name rather than objset name to spa_import. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #7099 Closes #6464
* OpenZFS 8965 - zfs_acl_ls_001_pos fails due to no longer supported grep regexJohn Wren Kennedy2018-02-081-3/+3
| | | | | | | | | | | | | | | | | | The test used \> to detect the end of a string, but this no longer works, so use $ which works as well since the string ends the line anyway. Authored by: John Wren Kennedy <[email protected]> Reviewed by: Akash Ayare <[email protected]> Reviewed by: Pavel Zakharov <[email protected]> Reviewed by: Yuri Pankov <[email protected]> Reviewed by: Igor Kozhukhov <[email protected]> Reviewed by: Brian Behlendorf <[email protected]> Approved by: Dan McDonald <[email protected]> Ported-by: Giuseppe Di Natale <[email protected]> OpenZFS-issue: https://www.illumos.org/issues/8965 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/cb1204e444 Closes #7145
* OpenZFS 8677 - Open-Context Channel ProgramsSerapheim Dimitropoulos2018-02-0814-41/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Authored by: Serapheim Dimitropoulos <[email protected]> Reviewed by: Matt Ahrens <[email protected]> Reviewed by: Chris Williamson <[email protected]> Reviewed by: Pavel Zakharov <[email protected]> Approved by: Robert Mustacchi <[email protected]> Ported-by: Don Brady <[email protected]> We want to be able to run channel programs outside of synching context. This would greatly improve performance for channel programs that just gather information, as they won't have to wait for synching context anymore. === What is implemented? This feature introduces the following: - A new command line flag in "zfs program" to specify our intention to run in open context. (The -n option) - A new flag/option within the channel program ioctl which selects the context. - Appropriate error handling whenever we try a channel program in open-context that contains zfs.sync* expressions. - Documentation for the new feature in the manual pages. === How do we handle zfs.sync functions in open context? When such a function is found by the interpreter and we are running in open context we abort the script and we spit out a descriptive runtime error. For example, given the script below ... arg = ... fs = arg["argv"][1] err = zfs.sync.destroy(fs) msg = "destroying " .. fs .. " err=" .. err return msg if we run it in open context, we will get back the following error: Channel program execution failed: [string "channel program"]:3: running functions from the zfs.sync submodule requires passing sync=TRUE to lzc_channel_program() (i.e. do not specify the "-n" command line argument) stack traceback: [C]: in function 'destroy' [string "channel program"]:3: in main chunk === What about testing? We've introduced new wrappers for all channel program tests that run each channel program as both (startard & open-context) and expect the appropriate behavior depending on the program using the zfs.sync module. OpenZFS-issue: https://www.illumos.org/issues/8677 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/17a49e15 Closes #6558
* Increase code coverage for Lua librariesDon Brady2018-02-087-2/+1362
| | | | | | | Add test coverage for lua libraries Remove dead code in Lua implementation Signed-off-by: Don Brady <[email protected]>
* Add basic functional tests for zcp user propertiesDon Brady2018-02-083-4/+103
| | | | Signed-off-by: Don Brady <[email protected]>
* OpenZFS 8600 - ZFS channel programs - snapshotChris Williamson2018-02-0813-4/+392
| | | | | | | | | | | | | | | | | Authored by: Chris Williamson <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: John Kennedy <[email protected]> Reviewed by: Brad Lewis <[email protected]> Approved by: Robert Mustacchi <[email protected]> Ported-by: Don Brady <[email protected]> ZFS channel programs should be able to create snapshots. In addition to the base snapshot functionality, this entails extra logic to handle edge cases which were formerly not possible, such as creating then destroying a snapshot in the same transaction sync. OpenZFS-issue: https://www.illumos.org/issues/8600 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/68089b8b
* OpenZFS 8592 - ZFS channel programs - rollbackBrad Lewis2018-02-084-3/+117
| | | | | | | | | | | | | Authored by: Brad Lewis <[email protected]> Reviewed by: Chris Williamson <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Approved by: Robert Mustacchi <[email protected]> Ported-by: Don Brady <[email protected]> ZFS channel programs should be able to perform a rollback. OpenZFS-issue: https://www.illumos.org/issues/8592 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/d46b5ed6
* OpenZFS 8605 - zfs channel programs fix zfs.existsChris Williamson2018-02-084-2/+75
| | | | | | | | | | | | | | | | | Authored by: Chris Williamson <[email protected]> Reviewed by: Paul Dagnelie <[email protected]> Reviewed by: Dan Kimmel <[email protected]> Reviewed by: Matt Ahrens <[email protected]> Approved by: Robert Mustacchi <[email protected]> Ported-by: Don Brady <[email protected]> zfs.exists() in channel programs doesn't return any result, and should have a man page entry. This patch corrects zfs.exists so that it returns a value indicating if the dataset exists or not. It also adds documentation about it in the man page. OpenZFS-issue: https://www.illumos.org/issues/8605 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/1e85e111
* OpenZFS 7431 - ZFS Channel ProgramsChris Williamson2018-02-0870-2/+4059
| | | | | | | | | | | | | | | | | | | | | | | | | Authored by: Chris Williamson <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: John Kennedy <[email protected]> Reviewed by: Dan Kimmel <[email protected]> Approved by: Garrett D'Amore <[email protected]> Ported-by: Don Brady <[email protected]> Ported-by: John Kennedy <[email protected]> OpenZFS-issue: https://www.illumos.org/issues/7431 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/dfc11533 Porting Notes: * The CLI long option arguments for '-t' and '-m' don't parse on linux * Switched from kmem_alloc to vmem_alloc in zcp_lua_alloc * Lua implementation is built as its own module (zlua.ko) * Lua headers consumed directly by zfs code moved to 'include/sys/lua/' * There is no native setjmp/longjump available in stock Linux kernel. Brought over implementations from illumos and FreeBSD * The get_temporary_prop() was adapted due to VFS platform differences * Use of inline functions in lua parser to reduce stack usage per C call * Skip some ZFS Test Suite ZCP tests on sparc64 to avoid stack overflow
* Verify ZFS Test Suite scripts executabilityLOLi2018-02-076-0/+0
| | | | | | | | | | | | | | | | This change adds a make target 'testscheck' which verifies all ksh test scripts, part of the ZFS Test Suite, have their executable bit set: this should avoid adding new test scripts that cannot be executed by the test-runner.py which would result in the following warning message: Warning: Test removed from TestGroup because it failed verification. This also verifies both *.cfg and *.kshlib files used in the ZFS Test Suite are not executable. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #7131
* Raw sends must be able to decrease nlevelsTom Caputi2018-02-023-2/+101
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, when a raw zfs send file includes a DRR_OBJECT record that would decrease the number of levels of an existing object, the object is reallocated with dmu_object_reclaim() which creates the new dnode using the old object's nlevels. For non-raw sends this doesn't really matter, but raw sends require that nlevels on the receive side match that of the send side so that the checksum-of-MAC tree can be properly maintained. This patch corrects the issue by freeing the object completely before allocating it again in this case. This patch also corrects several issues with dnode_hold_impl() and related functions that prevented dnodes (particularly multi-slot dnodes) from being reallocated properly due to the fact that existing dnodes were not being fully cleaned up when they were freed. This patch adds a test to make sure that zfs recv functions properly with incremental streams containing dnodes of different sizes. Reviewed by: Matthew Ahrens <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6821 Closes #6864
* Encryption Stability and On-Disk Format FixesTom Caputi2018-02-026-4/+209
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The on-disk format for encrypted datasets protects not only the encrypted and authenticated blocks themselves, but also the order and interpretation of these blocks. In order to make this work while maintaining the ability to do raw sends, the indirect bps maintain a secure checksum of all the MACs in the block below it along with a few other fields that determine how the data is interpreted. Unfortunately, the current on-disk format erroneously includes some fields which are not portable and thus cannot support raw sends. It is not possible to easily work around this issue due to a separate and much smaller bug which causes indirect blocks for encrypted dnodes to not be compressed, which conflicts with the previous bug. In addition, the current code generates incompatible on-disk formats on big endian and little endian systems due to an issue with how block pointers are authenticated. Finally, raw send streams do not currently include dn_maxblkid when sending both the metadnode and normal dnodes which are needed in order to ensure that we are correctly maintaining the portable objset MAC. This patch zero's out the offending fields when computing the bp MAC and ensures that these MACs are always calculated in little endian order (regardless of the host system's byte order). This patch also registers an errata for the old on-disk format, which we detect by adding a "version" field to newly created DSL Crypto Keys. We allow datasets without a version (version 0) to only be mounted for read so that they can easily be migrated. We also now include dn_maxblkid in raw send streams to ensure the MAC can be maintained correctly. This patch also contains minor bug fixes and cleanups. Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #6845 Closes #6864 Closes #7052
* Fix 'zfs receive -o' when used with '-e|-d'LOLi2018-01-301-0/+36
| | | | | | | | | | | | | When used in conjunction with one of '-e' or '-d' zfs receive options none of the properties requested to be set (-o) are actually applied: this is caused by a wrong assumption made about the toplevel dataset in zfs_receive_one(). Fix this by correctly detecting the toplevel dataset. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #7088
* Add dbuf hash and dbuf cache kstatsGiuseppe Di Natale2018-01-298-1/+237
| | | | | | | | | | | | | | | | | | | | | | | Introduce kstats about the dbuf hash and dbuf cache to make it easier to inspect state. This should help with debugging and understanding of these portions of the codebase. Correct format of dbuf kstat file. Introduce a dbc column to dbufs kstat to indicate if a dbuf is in the dbuf cache. Introduce field filtering in the dbufstat python script. Introduce a no header option to the dbufstat python script. Introduce a test case to test basic mru->mfu list movement in the ARC. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Giuseppe Di Natale <[email protected]> Closes #6906
* zpool import -d to specify device pathChunwei Chen2018-01-263-0/+56
| | | | | | | | | | | | | | | | | When we know which devices have the pool we are looking for, sometime it's better if we can directly pass those device paths to zpool import instead of letting it to search through all unrelated stuff, which might take a lot of time if you have hundreds of disks. This patch allows option -d <dev_path> to zpool import. You can have multiple pairs of -d <dev_path>, and zpool import will only search through those devices. For example: zpool import -d /dev/sda -d /dev/sdb Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #7077
* Extend deadman logicBrian Behlendorf2018-01-256-0/+243
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The intent of this patch is extend the existing deadman code such that it's flexible enough to be used by both ztest and on production systems. The proposed changes include: * Added a new `zfs_deadman_failmode` module option which is used to dynamically control the behavior of the deadman. It's loosely modeled after, but independant from, the pool failmode property. It can be set to wait, continue, or panic. * wait - Wait for the "hung" I/O (default) * continue - Attempt to recover from a "hung" I/O * panic - Panic the system * Added a new `zfs_deadman_ziotime_ms` module option which is analogous to `zfs_deadman_synctime_ms` except instead of applying to a pool TXG sync it applies to zio_wait(). A default value of 300s is used to define a "hung" zio. * The ztest deadman thread has been re-enabled by default, aligned with the upstream OpenZFS code, and then extended to terminate the process when it takes significantly longer to complete than expected. * The -G option was added to ztest to print the internal debug log when a fatal error is encountered. This same option was previously added to zdb in commit fa603f82. Update zloop.sh to unconditionally pass -G to obtain additional debugging. * The FM_EREPORT_ZFS_DELAY event which was previously posted when the deadman detect a "hung" pool has been replaced by a new dedicated FM_EREPORT_ZFS_DEADMAN event. * The proposed recovery logic attempts to restart a "hung" zio by calling zio_interrupt() on any outstanding leaf zios. We may want to further restrict this to zios in either the ZIO_STAGE_VDEV_IO_START or ZIO_STAGE_VDEV_IO_DONE stages. Calling zio_interrupt() is expected to only be useful for cases when an IO has been submitted to the physical device but for some reasonable the completion callback hasn't been called by the lower layers. This shouldn't be possible but has been observed and may be caused by kernel/driver bugs. * The 'zfs_deadman_synctime_ms' default value was reduced from 1000s to 600s. * Depending on how ztest fails there may be no cache file to move. This should not be considered fatal, collect the logs which are available and carry on. * Add deadman test cases for spa_deadman() and zio_wait(). * Increase default zfs_deadman_checktime_ms to 60s. Reviewed-by: Tim Chase <[email protected]> Reviewed by: Thomas Caputi <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6999
* Fix shellcheck v0.4.6 warningsBrian Behlendorf2018-01-171-1/+1
| | | | | | | | | | | | | Resolve new warnings reported after upgrading to shellcheck version 0.4.6. This patch contains no functional changes. * egrep is non-standard and deprecated. Use grep -E instead. [SC2196] * Check exit code directly with e.g. 'if mycmd;', not indirectly with $?. [SC2181] Suppressed. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7040
* Support -fsanitize=address with --enable-asanBrian Behlendorf2018-01-105-3/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When --enable-asan is provided to configure then build all user space components with fsanitize=address. For kernel support use the Linux KASAN feature instead. https://github.com/google/sanitizers/wiki/AddressSanitizer When using gcc version 4.8 any test case which intentionally generates a core dump will fail when using --enable-asan. The default behavior is to disable core dumps and only newer versions allow this behavior to be controled at run time with the ASAN_OPTIONS environment variable. Additionally, this patch includes some build system cleanup. * Rules.am updated to set the minimum AM_CFLAGS, AM_CPPFLAGS, and AM_LDFLAGS. Any additional flags should be added on a per-Makefile basic. The --enable-debug and --enable-asan options apply to all user space binaries and libraries. * Compiler checks consolidated in always-compiler-options.m4 and renamed for consistency. * -fstack-check compiler flag was removed, this functionality is provided by asan when configured with --enable-asan. * Split DEBUG_CFLAGS in to DEBUG_CFLAGS, DEBUG_CPPFLAGS, and DEBUG_LDFLAGS. * Moved default kernel build flags in to module/Makefile.in and split in to ZFS_MODULE_CFLAGS and ZFS_MODULE_CPPFLAGS. These flags are set with the standard ccflags-y kbuild mechanism. * -Wframe-larger-than checks applied only to binaries or libraries which include source files which are built in both user space and kernel space. This restriction is relaxed for user space only utilities. * -Wno-unused-but-set-variable applied only to libzfs and libzpool. The remaining warnings are the result of an ASSERT using a variable when is always declared. * -D_POSIX_PTHREAD_SEMANTICS and -D__EXTENSIONS__ dropped because they are Solaris specific and thus not needed. * Ensure $GDB is defined as gdb by default in zloop.sh. Signed-off-by: DHE <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7027
* Disable history_004_posBrian Behlendorf2018-01-101-0/+5
| | | | | | | | | | | Occasionally observed failure of history_004_pos due to the test case not being 100% reliable. In order to prevent false positives disable this test case until it can be made reliable. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #7026 Closes #7028
* Fix 'zpool add' handling of nested interior VDEVsLOLi2017-12-283-2/+114
| | | | | | | | | | | | | | | | | | | | | | | | | | | When replacing a faulted device which was previously handled by a spare multiple levels of nested interior VDEVs will be present in the pool configuration; the following example illustrates one of the possible situations: NAME STATE READ WRITE CKSUM testpool DEGRADED 0 0 0 raidz1-0 DEGRADED 0 0 0 spare-0 DEGRADED 0 0 0 replacing-0 DEGRADED 0 0 0 /var/tmp/fault-dev UNAVAIL 0 0 0 cannot open /var/tmp/replace-dev ONLINE 0 0 0 /var/tmp/spare-dev1 ONLINE 0 0 0 /var/tmp/safe-dev ONLINE 0 0 0 spares /var/tmp/spare-dev1 INUSE currently in use This is safe and allowed, but get_replication() needs to handle this situation gracefully to let zpool add new devices to the pool. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #6678 Closes #6996
* OpenZFS 8909 - 8585 can cause a use-after-free kernel panicPrakash Surya2017-12-283-1/+76
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Authored by: Prakash Surya <[email protected]> Reviewed by: John Kennedy <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Brad Lewis <[email protected]> Reviewed by: Igor Kozhukhov <[email protected]> Reviewed by: Brian Behlendorf <[email protected]> Approved by: Robert Mustacchi <[email protected]> Ported-by: Prakash Surya <[email protected]> PROBLEM ======= There's a race condition that exists if `zil_free_lwb` races with either `zil_commit_waiter_timeout` and/or `zil_lwb_flush_vdevs_done`. Here's an example panic due to this bug: > ::status debugging crash dump vmcore.0 (64-bit) from ip-10-110-205-40 operating system: 5.11 dlpx-5.2.2.0_2017-12-04-17-28-32b6ba51fb (i86pc) image uuid: 4af0edfb-e58e-6ed8-cafc-d3e9167c7513 panic message: BAD TRAP: type=e (#pf Page fault) rp=ffffff0010555970 addr=60 occurred in module "zfs" due to a NULL pointer dereference dump content: kernel pages only > $c zio_shrink+0x12() zil_lwb_write_issue+0x30d(ffffff03dcd15cc0, ffffff03e0730e20) zil_commit_waiter_timeout+0xa2(ffffff03dcd15cc0, ffffff03d97ffcf8) zil_commit_waiter+0xf3(ffffff03dcd15cc0, ffffff03d97ffcf8) zil_commit+0x80(ffffff03dcd15cc0, 9a9) zfs_write+0xc34(ffffff03dc38b140, ffffff0010555e60, 40, ffffff03e00fb758, 0) fop_write+0x5b(ffffff03dc38b140, ffffff0010555e60, 40, ffffff03e00fb758, 0) write+0x250(42, fffffd7ff4832000, 2000) sys_syscall+0x177() If there's an outstanding lwb that's in `zil_commit_waiter_timeout` waiting to timeout, waiting on it's waiter's CV, we must be sure not to call `zil_free_lwb`. If we end up calling `zil_free_lwb`, then that LWB may be freed and can result in a use-after-free situation where the stale lwb pointer stored in the `zil_commit_waiter_t` structure of the thread waiting on the waiter's CV is used. A similar situation can occur if an lwb is issued to disk, and thus in the `LWB_STATE_ISSUED` state, and `zil_free_lwb` is called while the disk is servicing that lwb. In this situation, the lwb will be freed by `zil_free_lwb`, which will result in a use-after-free situation when the lwb's zio completes, and `zil_lwb_flush_vdevs_done` is called. This race condition is prevented in `zil_close` by calling `zil_commit` before `zil_free_lwb` is called, which will ensure all outstanding (i.e. all lwb's in the `LWB_STATE_OPEN` and/or `LWB_STATE_ISSUED` states) reach the `LWB_STATE_DONE` state before the lwb's are freed (`zil_commit` will not return untill all the lwb's are `LWB_STATE_DONE`). Further, this race condition is prevented in `zil_sync` by only calling `zil_free_lwb` for lwb's that do not have their `lwb_buf` pointer set. All lwb's not in the `LWB_STATE_DONE` state will have a non-null value for this pointer; the pointer is only cleared in `zil_lwb_flush_vdevs_done`, at which point the lwb's state will be changed to `LWB_STATE_DONE`. This race *is* present in `zil_suspend`, leading to this bug. At first glance, it would appear as though this would not be true because `zil_suspend` will call `zil_commit`, just like `zil_close`, but the problem is that `zil_suspend` will set the zilog's `zl_suspend` field prior to calling `zil_commit`. Further, in `zil_commit`, if `zl_suspend` is set, `zil_commit` will take a special branch of logic and use `txg_wait_synced` instead of performing the normal `zil_commit` logic. This call to `txg_wait_synced` might be good enough for the data to reach disk safely before it returns, but it does not ensure that all outstanding lwb's reach the `LWB_STATE_DONE` state before it returns. This is because, if there's an lwb "stuck" in `zil_commit_waiter_timeout`, waiting for it's lwb to timeout, it will maintain a non-null value for it's `lwb_buf` field and thus `zil_sync` will not free that lwb. Thus, even though the lwb's data is already on disk, the lwb will be left lingering, waiting on the CV, and will eventually timeout and be issued to disk even though the write is unnecessary. So, after `zil_commit` is called from `zil_suspend`, we incorrectly assume that there are not outstanding lwb's, and proceed to free all lwb's found on the zilog's lwb list. As a result, we free the lwb that will later be used `zil_commit_waiter_timeout`. SOLUTION ======== The solution to this, is to ensure all outstanding lwb's complete before calling `zil_free_lwb` via `zil_destroy` in `zil_suspend`. This patch accomplishes this goal by forcing the normal `zil_commit` logic when called from `zil_sync`. Now, `zil_suspend` will call `zil_commit_impl` which will always use the normal logic of waiting/issuing lwb's to disk before it returns. As a result, any lwb's outstanding when `zil_commit_impl` is called will be guaranteed to reach the `LWB_STATE_DONE` state by the time it returns. Further, no new lwb's will be created via `zil_commit` since the zilog's `zl_suspend` flag will be set. This will force all new callers of `zil_commit` to use `txg_wait_synced` instead of creating and issuing new lwb's. Thus, all lwb's left on the zilog's lwb list when `zil_destroy` is called will be in the `LWB_STATE_DONE` state, and we'll avoid this race condition. OpenZFS-issue: https://www.illumos.org/issues/8909 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ece62b6f8d Closes #6940
* Support re-prioritizing asynchronous prefetchesTom Caputi2017-12-211-7/+7
| | | | | | | | | | | | | | | | | | | | | When sequential scrubs were merged, all calls to arc_read() (including prefetch IOs) were given ZIO_PRIORITY_ASYNC_READ. Unfortunately, this behaves badly with an existing issue where prefetch IOs cannot be re-prioritized after the issue. The result is that synchronous reads end up in the same vdev_queue as the scrub IOs and can have (in some workloads) multiple seconds of latency. This patch incorporates 2 changes. The first ensures that all scrub IOs are given ZIO_PRIORITY_SCRUB to allow the vdev_queue code to differentiate between these I/Os and user prefetches. Second, this patch introduces zio_change_priority() to provide the missing capability to upgrade a zio's priority. Reviewed by: George Wilson <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #6921 Closes #6926
* Handle broken pipes in arc_summaryGiuseppe Di Natale2017-12-191-0/+3
| | | | | | | | | | | Using a command similar to 'arc_summary.py | head' causes a broken pipe exception. Gracefully exit in the case of a broken pipe in arc_summary.py. Reviewed-by: Richard Elling <[email protected]> Reviewed-by: loli10K <[email protected]> Signed-off-by: Giuseppe Di Natale <[email protected]> Closes #6965 Closes #6969