| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds five new tests to the ZTS:
* zpool_split_cliargs: verify command line options and arguments
* zpool_split_devices: verify zpool split accepts a device list
* zpool_split_encryption: verify zpool can split encrypted pools
* zpool_split_props: verify zpool split can set property values
* zpool_split_vdevs: verify vdev layout when splitting the pool
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7409
|
|
|
|
|
|
|
|
|
|
|
|
| |
calloc(3) takes `nelem` (or `nmemb` in glibc) first, and then size of
elements. No difference expected for having these in reverse order,
however should follow the standard.
http://pubs.opengroup.org/onlinepubs/009695399/functions/calloc.html
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tomohiro Kusumi <[email protected]>
Closes #7405
|
|
|
|
|
|
|
|
|
|
| |
When setting `zfs_arc_max` its minimum value is allowed
to be 64 MiB. There was an off-by-1 error which can matter
on tiny systems.
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chris Zubrzycki <[email protected]>
Closes #7417
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Mike Gerdts <[email protected]>
Reviewed by: Allan Jude <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: John Kennedy <[email protected]>
Reviewed by: Andy Stormont <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Approved by: Richard Lowe <[email protected]>
Ported-by: Don Brady <[email protected]>
Porting Notes:
* Adopted destroy_dataset in ZTS test cleanup
* Use ksh shebang instead of bash for new tests
OpenZFS-issue: https://www.illumos.org/issues/9286
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/723d0c85
Closes #7387
|
|
|
|
|
|
|
|
|
| |
Commit e4010f2 accidentally allows zpool to set pool features to
"disabled"; this should only be allowed at pool creation. This commit
adds additional checks and test coverage to 'zpool set'.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7402
|
|
|
|
|
|
|
|
|
|
|
| |
The property description has been updated with new algorithms as well.
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: Richard Elling <[email protected]>
Reviewed-by: loli10K <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: DHE <[email protected]>
Closes #7377
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, dnode_check_slots_free() works by checking dn->dn_type
in the dnode to determine if the dnode is reclaimable. However,
there is a small window of time between dnode_free_sync() in the
first call to dsl_dataset_sync() and when the useraccounting code
is run when the type is set DMU_OT_NONE, but the dnode is not yet
evictable, leading to crashes. This patch adds the ability for
dnodes to track which txg they were last dirtied in and adds a
check for this before performing the reclaim.
This patch also corrects several instances when dn_dirty_link was
treated as a list_node_t when it is technically a multilist_node_t.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7147
Closes #7388
|
|
|
|
|
|
|
|
|
|
| |
queue_flag_{set,clear}_unlocked are now private interfaces in
the Linux kernel (https://github.com/torvalds/linux/commit/8a0ac14).
Use blk_queue_flag_{set,clear} interfaces which were introduced as
of https://github.com/torvalds/linux/commit/8814ce8.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes #7410
|
|
|
|
|
|
|
|
|
|
|
| |
This patch corrects a small issue where two error messages
in the code that checks for invalid keylocations were
swapped.
Reviewed by: Matt Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7418
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit cc63068e95ee725cce03b1b7ce50179825a6cda5.
Under certain circumstances this change can result in an ENOSPC
error when adding new files to a directory. See #7401 for full
details.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Issue #7401
Cloes #7416
|
|
|
|
|
|
|
|
|
|
|
| |
When using 16MB blocks the send/recv queue's aren't quite big
enough. This change leaves the default 16M queue size which a
good value for most pools. But it additionally ensures that the
queue sizes are at least twice the allowed zfs_max_recordsize.
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7365
Closes #7404
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most kshlib files are imported by other scripts
and do not have a shebang at the top of their files.
Make all kshlib follow this convention.
Remove shebangs from cfg files as well.
Reviewed-by: loli10K <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Close #7406
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fedora 28's RPM build checks warn when executable files don't have a
shebang line. These warnings are caused when we (incorrectly)
include data & config files in the_SCRIPTS automake lines. Files in
_SCRIPTS are marked executable by automake. This patch fixes the
issue by including non-executable scripts in a _DATA line instead.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #7359
Closes #7395
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The newest Fedora packaging rules print warnings for scripts using the
/usr/bin/python shebang:
*** WARNING: mangling shebang in /usr/bin/arc_summary.py from
#!/usr/bin/python to #!/usr/bin/python2. This will become an ERROR,
fix it manually!
Fedora wants all cross compatible scripts to pick python3. Since we
don't want our users to have to pick a specific version of python, we
exclude our scripts from the RPM build check.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #7360
Closes #7399
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zfs-mount-generator implements the "systemd generator" protocol,
producing systemd.mount units from the cached outputs of zfs list,
during early boot, integrating with systemd.
Each pool has an indpendent cache of the command
zfs list -H -oname,mountpoint,canmount -tfilesystem -r $pool
which is kept synchronized by the ZEDLET
history_event-zfs-list-cacher.sh
Datasets not in the cache will be loaded later in the boot process by
zfs-mount.service, including pools without a cache.
Among other things, this allows for complex mount hierarchies.
Reviewed-by: Fabian Grünbichler <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Antonio Russo <[email protected]>
Closes #7329
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mdb doesn't have dmu_ot[], so we need a different mechanism for its
SNPRINTF_BLKPTR() to determine if the BP is encrypted vs authenticated.
Additionally, since it already relies on BP_IS_ENCRYPTED (etc),
SNPRINTF_BLKPTR might as well figure out the "crypt_type" on its own,
rather than making the caller do so.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tom Caputi <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #7390
|
|
|
|
|
|
|
|
|
|
|
|
| |
vdev_count_leaves() in the denominator may return 0, caught by Coverity.
Introduced by
* 533ea04 Update mmp_delay on sync or skipped, failed write
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Olaf Faaland <[email protected]>
Closes #7391
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, "zfs mount -a" will print a warning and fail to mount
any encrypted datasets that do not have a key loaded. This patch
makes the behavior of this failure consistent with other failure
modes ("zfs mount -a" will silently continue, explict "zfs mount"
will print a message and return an error code.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7382
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an MMP write is skipped, or fails, and time since
mts->mmp_last_write is already greater than mts->mmp_delay, increase
mts->mmp_delay. The original code only updated mts->mmp_delay when a
write succeeded, but this results in the write(s) after delays and
failed write(s) reporting an ub_mmp_delay which is too low.
Update mmp_last_write and mmp_delay if a txg sync was successful. At
least one uberblock was written, thus extending the time we can be sure
the pool will not be imported by another host.
Do not allow mmp_delay to go below (MSEC2NSEC(zfs_multihost_interval) /
vdev_count_leaves()) so that a period of frequent successful MMP writes,
e.g. due to frequent txg syncs, does not result in an import activity
check so short it is not reliable based on mmp thread writes alone.
Remove unnecessary local variable, start. We do not use the start time
of the loop iteration.
Add a debug message in spa_activity_check() to allow verification of the
import_delay value and to prove the activity check occurred.
Alter the tests that import pools and attempt to detect an activity
check. Calculate the expected duration of spa_activity_check() based on
module parameters at the time the import is performed, rather than a
fixed time set in mmp.cfg. The fixed time may be wrong. Also, use the
default zfs_multihost_interval value so the activity check is longer and
easier to recognize.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Olaf Faaland <[email protected]>
Closes #7330
|
|
|
|
|
|
|
|
|
| |
Fix a bunch of (mostly) sprintf/snprintf truncation compiler
warnings that show up on Fedora 28 (GCC 8.0.1).
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #7361
Closes #7368
|
|
|
|
|
|
|
|
|
|
|
|
| |
zfs_ioc_pool_scan leaks a spa reference when zc->zc_flags is not a
valid pool_scrub_cmd_t: this could happen if the userland binaries
and ZFS kernel module differ in version and would prevent the pool from
being exported.
Reviewed by: Matt Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7380
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use 'zpool reopen' instead of 'zpool scrub' to kick in the spare device:
this is required to avoid spurious failures caused by a race condition
in events processing by the ZFS Event Daemon:
P1 (zpool scrub) P2 (zed)
---
zfs_ioc_pool_scan()
-> dsl_scan()
-> vdev_reopen()
-> vdev_set_state(VDEV_STATE_CANT_OPEN)
zfs_ioc_vdev_attach()
-> spa_vdev_attach()
-> dsl_resilver_restart()
-> dsl_sync_task()
-> dsl_scan_setup_check()
<- dsl_scan_setup_check(): EBUSY
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7247
Closes #7342
|
|
|
|
|
|
|
|
|
|
| |
The ASSERT was erroneously copied from the next section of code.
The buffer's size should be expanded from "psize" to "asize"
if necessary.
Reviewed-by: Tom Caputi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Closes #7375
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the decryption and block authentication code in
the ZIO / ARC layers is a bit inconsistent with regards to
the ereports that are produces and the error codes that are
passed to calling functions. This patch ensures that all of
these errors (which begin as ECKSUM) are converted to EIO
before they leave the ZIO or ARC layer and that ereports
are correctly generated on each decryption / authentication
failure.
In addition, this patch fixes a bug in zio_decrypt() where
ECKSUM never gets written to zio->io_error.
Reviewed by: Matt Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7372
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Encrypted dnode blocks are always initially read as raw data and
converted to decrypted data when an encrypted bonus buffer is
needed. This allows the DMU to be used for things like fetching
the DMU master node without requiring keys to be loaded. However,
dbuf_issue_final_prefetch() does not currently read the data as
raw. The end result of this is that prefetched dnode blocks are
read twice from disk: once decrypted and then again as raw data.
This patch corrects the issue by adding the flag when appropriate.
Reviewed by: Matt Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7362
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During a receive operation zvol_create_minors_impl() can wait
needlessly for the prefetch thread because both share the same tasks
queue. This results in hung tasks:
<3>INFO: task z_zvol:5541 blocked for more than 120 seconds.
<3> Tainted: P O 3.16.0-4-amd64
<3>"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
The first z_zvol:5541 (zvol_task_cb) is waiting for the long running
traverse_prefetch_thread:260
root@linux:~# cat /proc/spl/taskq
taskq act nthr spwn maxt pri mina
spl_system_taskq/0 1 2 0 64 100 1
active: [260]traverse_prefetch_thread [zfs](0xffff88003347ae40)
wait: 5541
spl_delay_taskq/0 0 1 0 4 100 1
delay: spa_deadman [zfs](0xffff880039924000)
z_zvol/1 1 1 0 1 120 1
active: [5541]zvol_task_cb [zfs](0xffff88001fde6400)
pend: zvol_task_cb [zfs](0xffff88001fde6800)
This change adds a dedicated, per-pool, prefetch taskq to prevent the
traverse code from monopolizing the global (and limited) system_taskq by
inappropriately scheduling long running tasks on it.
Reviewed-by: Albert Lee <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #6330
Closes #6890
Closes #7343
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes zfs-mount initscript trying to mount swap volumes
as filesystems or anything that has 'none' as a mountpoint
in /etc/fstab. Additionally, fixes trying to mount swap volumes
as a filesystem on RHEL. RHEL defines mountpoint for swap
as `swap`.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Georgy Yakovlev <[email protected]>
Closes #7346
Closes #7347
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Andriy Gapon <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: Don Brady <[email protected]>
Reviewed-by: loli10K <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Approved by: Richard Lowe <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
Porting Notes:
* Re-enabled and tweaked the zpool_upgrade_007_pos test case
to successfully run in under 5 minutes.
OpenZFS-issue: https://www.illumos.org/issues/9164
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0e776dc06a
Closes #6112
Closes #7336
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a devid for nvme devices. This is very similar to how the
other 'bus' (scsi|sata|usb) devids are generated. The devid
resides in a name/value pair in the leaf vdevs in a zpool config.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Richard Elling <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #7356
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, when ZFS wants to accelerate compression with QAT, it
passes a destination buffer of the same size as the source buffer.
Unfortunately, if the data is incompressible, QAT can actually
"compress" the data to be larger than the source buffer. When this
happens, the QAT driver will return a FAILED error code and print
warnings to dmesg. This patch fixes these issues by providing the
QAT driver with an additional buffer to work with so that even
completely incompressible source data will not cause an overflow.
This patch also resolves an error handling issue where
incompressible data attempts compression twice: once by QAT and
once in software. To fix this issue, a new (and fake) error code
CPA_STATUS_INOMPRESSIBLE has been added so that the calling code
can correctly account for the difference between a hardware
failure and data that simply cannot be compressed.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Weigang Li <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7338
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes an issue where dsl_scan_prefetch_cb() might
add more prefetch I/Os to the prefetch queue after prefetching
has been completed. This was happening because that code was
checking scn->scn_suspending instead of scn->scn_prefetch_stop.
This occasionally triggered an ASSERT during ztest runs in
dsl_scan_fini() when the code attempted to destroy an AVL tree
that still had entires in it. This patch also includes a number
of spelling corrections and comment cleanups throughout
dsl_scan.c
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7353
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clear executable bit on zfs-import.in, zfs-mount.in,
zfs-share.in, and zfs-zed.in. These are automake files and
should not be marked executable. This fixes a RPM build error
on Fedora 28.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #7355
Closes #7327
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Calling uiomove() in mappedread() under the page lock can result
in a deadlock if the user space page needs to be faulted in.
Resolve the issue by dropping the page lock before the uiomove().
The inode range lock protects against concurrent updates via
zfs_read() and zfs_write().
Reviewed-by: Albert Lee <[email protected]>
Reviewed-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7335
Closes #7339
|
|
|
|
|
|
|
|
|
|
| |
RHEL/CentOS 6 supports sys/xattr.h eliminating the need for
libattr-devel as a dependency.
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: DHE <[email protected]>
Closes #7344
Closes #7351
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the wrong value
Authored by: Allan Jude <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/9321
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/92b05f3a18
Closes #7333
|
|
|
|
|
|
|
|
|
|
|
| |
If you run ./configure --with-config=srpm, it will not trigger
the user m4 scripts to populate the dracut and udev directories.
This causes a build error on Fedora 28. Make the dracut and
udev lines conditional to get around this.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #7326
Closes #7328
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, ZFS tracks statistics about calls to arc_read()
via the /proc/spl/kstat/zfs/<pool>/reads file for debugging.
Unfortunately, this file currently counts embedded bps as
disk reads since they are technically processed by the ZIO
layer. This pollutes the log since the ARC will never cache
embedded bps. This patch corrects this issue by preventing
the logging of embedded bp reads.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7334
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When given an empty string as a rootds value, bootcfg -C fails with
the error message 'could not set nextboot: '' is an invalid name'.
This should be allowed because it represents clearing the nextboot
configuration.
Authored by: Paul Dagnelie <[email protected]>
Reviewed by: Chris Williamson <[email protected]>
Reviewed by: Sebastien Roy <[email protected]>
Reviewed by: Igor Kozhukhov <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/9193
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/504645d227
Closes #7230
|
|
|
|
|
|
|
|
|
|
| |
Updated the "Intent Log" section of the "zpool" manual page to
properly reflect the actions that may be performed.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Peter Ashford <[email protected]>
Closes #6938
Closes #7318
|
|
|
|
|
|
|
|
|
|
|
| |
Resolves importing root pool during boot in dracut. This case was
inadvertently broken with the module autoloading change in #7287.
Reviewed-by: Matthew Thode <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Kash Pande <[email protected]>
Closes #7322
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zfs_dbgmsg() should record a message by default. As a general
principal, these messages shouldn't be too verbose. Furthermore, the
amount of memory used is limited to 4MB (by default).
dprintf() should only record a message if this is a debug build, and
ZFS_DEBUG_DPRINTF is set in zfs_flags. This flag is not set by default
(even on debug builds). These messages are extremely verbose, and
sometimes nontrivial to compute.
SET_ERROR() should only record a message if ZFS_DEBUG_SET_ERROR is set
in zfs_flags. This flag is not set by default (even on debug builds).
This brings our behavior in line with illumos. Note that the message
format is unchanged (including file, line, and function, even though
these are not recorded on illumos).
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Prakash Surya <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #7314
|
|
|
|
|
|
|
|
|
|
| |
This patch simply corrects some spelling / grammar errors in
the QAT and encryption code comments. No functional changes
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7319
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This treats /dev/nvme.. devices the same way as /dev/sd... devices. The
motivation behind this is that whole disk detection did not work on nvme
SSDs without that, because it DKC_UNKNOWN was returned for such devices.
Perhaps there should be a separate DKC_ type for this, but I don't know
enough about the code to know the implications of that.
Reviewed-by: Don Brady <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: timor <[email protected]>
Closes #7304
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds some comments describing the purpose of "portable"
dnode and objset flags so that it is clear when new flags should
be added to the repective flag masks. This patch includes no
functional changes.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7313
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The changes piggyback JSON output support on top of channel programs
(#6558). This way the JSON output support is targeted to scripting
use cases and is easily maintainable since it really only touches
one function (zfs_do_channel_program()).
This patch ports Joyent's JSON nvlist library from illumos to enable
easy JSON printing of channel program output nvlist. To keep the
delta small I also took advantage of the fact that printing in
zfs_do_channel_program() was almost always done before exiting
the program.
Reviewed by: Matt Ahrens <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Richard Elling <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alek Pinchuk <[email protected]>
Closes #7281
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In vdev_queue_aggregate() the zio_execute() bypass should not be
called under the vdev queue lock. This can result in a deadlock
as shown in the stack traces below.
Drop the vdev queue lock then walk the parents of the aggregate IO
to determine the list of component IOs to be bypassed. This can
be done safely without holding the io_lock since the new aggregate
IO has not yet been returned and its parents cannot change.
--- THREAD 1 ---
arc_read()
zio_nowait()
zio_vdev_io_start()
vdev_queue_io() <--- mutex_enter(vq->vq_lock)
vdev_queue_io_to_issue()
vdev_queue_aggregate()
zio_execute()
zio_vdev_io_assess()
zio_wait_for_children() <- mutex_enter(zio->io_lock)
--- THREAD 2 --- (inverse order)
arc_read()
zio_change_priority() <- mutex_enter(zio->zio_lock)
vdev_queue_change_io_priority() <- mutex_enter(vq->vq_lock)
Reviewed-by: Tom Caputi <[email protected]>
Reviewed-by: Don Brady <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7307
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds some handling to the QAT acceleration functions
that allows them to handle buffers that are not aligned with the
page cache. At the moment this never happens since callers only
happen to work with page-aligned buffers, but this code should
prevent headaches if this isn't always true in the future. This
patch also adds some cleanups to align the QAT compression code
with the encryption and checksumming code.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Weigang Li <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7305
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the pool is suspended, record whether it was due to an I/O error or
due to MMP writes failing to succeed within the required time.
Change spa_suspended from uint8_t to zio_suspend_reason_t to store the
reason.
When userspace queries pool status via spa_tryimport(), report the
reason the pool was suspended in a new key,
ZPOOL_CONFIG_SUSPENDED_REASON.
In libzfs, when interpreting the returned config nvlist, report
suspension due to MMP with a new pool status enum value,
ZPOOL_STATUS_IO_FAILURE_MMP.
In status_callback(), which generates and emits the message when 'zpool
status' is executed, add a case to print an appropriate message for the
new pool status enum value.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Olaf Faaland <[email protected]>
Closes #7296
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables acceleration of SHA256 checksums using Intel
Quick Assist Technology. This patch also fixes up and refactors
some of the code from QAT encryption to make the behavior
consistent.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chengfeix Zhu <[email protected]>
Signed-off-by: Weigang Li <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7295
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ZFS Performance test concurrency should be lowered for better latency
Work by Stephen Blinick.
Nightly performance runs typically consist of two levels of concurrency;
and both are fairly high.
Since the IO runs are to a ZFS filesystem, within a zpool, which is
based on some variable number of vdev's, the amount of IO driven to each
device is variable. Additionally, different device types (HDD vs SSD,
etc) can generally handle a different amount of concurrent IO before
saturating.
Nevertheless, in practice, it appears that most tests are well past the
concurrency saturation point and therefore both perform with the same
throughput, the maximum of the device. Because the queuedepth to the
device(s) is so high however, the latency is much higher than the best
possible at that throughput, and increases linearly with the increase in
concurrency.
This means that changes in code that impact latency during normal
operation (before saturation) may not be apparent when a large component
of the measured latency is from the IO sitting in a queue to be
serviced. Therefore, changing the concurrency settings is recommended
Authored by: Stephen Blinick <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: John Wren Kennedy <[email protected]>
Ported-by: John Wren Kennedy <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/9076
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/562
Upstream bug: DLPX-45477
Closes #7302
|