| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we know which devices have the pool we are looking for, sometime
it's better if we can directly pass those device paths to zpool import
instead of letting it to search through all unrelated stuff, which might
take a lot of time if you have hundreds of disks.
This patch allows option -d <dev_path> to zpool import. You can have
multiple pairs of -d <dev_path>, and zpool import will only search
through those devices. For example:
zpool import -d /dev/sda -d /dev/sdb
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes #7077
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intent of this patch is extend the existing deadman code
such that it's flexible enough to be used by both ztest and
on production systems. The proposed changes include:
* Added a new `zfs_deadman_failmode` module option which is
used to dynamically control the behavior of the deadman. It's
loosely modeled after, but independant from, the pool failmode
property. It can be set to wait, continue, or panic.
* wait - Wait for the "hung" I/O (default)
* continue - Attempt to recover from a "hung" I/O
* panic - Panic the system
* Added a new `zfs_deadman_ziotime_ms` module option which is
analogous to `zfs_deadman_synctime_ms` except instead of
applying to a pool TXG sync it applies to zio_wait(). A
default value of 300s is used to define a "hung" zio.
* The ztest deadman thread has been re-enabled by default,
aligned with the upstream OpenZFS code, and then extended
to terminate the process when it takes significantly longer
to complete than expected.
* The -G option was added to ztest to print the internal debug
log when a fatal error is encountered. This same option was
previously added to zdb in commit fa603f82. Update zloop.sh
to unconditionally pass -G to obtain additional debugging.
* The FM_EREPORT_ZFS_DELAY event which was previously posted
when the deadman detect a "hung" pool has been replaced by
a new dedicated FM_EREPORT_ZFS_DEADMAN event.
* The proposed recovery logic attempts to restart a "hung"
zio by calling zio_interrupt() on any outstanding leaf zios.
We may want to further restrict this to zios in either the
ZIO_STAGE_VDEV_IO_START or ZIO_STAGE_VDEV_IO_DONE stages.
Calling zio_interrupt() is expected to only be useful for
cases when an IO has been submitted to the physical device
but for some reasonable the completion callback hasn't been
called by the lower layers. This shouldn't be possible but
has been observed and may be caused by kernel/driver bugs.
* The 'zfs_deadman_synctime_ms' default value was reduced from
1000s to 600s.
* Depending on how ztest fails there may be no cache file to
move. This should not be considered fatal, collect the logs
which are available and carry on.
* Add deadman test cases for spa_deadman() and zio_wait().
* Increase default zfs_deadman_checktime_ms to 60s.
Reviewed-by: Tim Chase <[email protected]>
Reviewed by: Thomas Caputi <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6999
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Sean Eric Fagan <[email protected]>
Reviewed by: Alek Pinchuk <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Approved by: Gordon Ross <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
Porting Notes:
- Brought #defines in eventdefs.h in line with ZFS on Linux format.
- Updated zfs-events.5 with the new events.
OpenZFS-issue: https://www.illumos.org/issues/8959
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c862b93eea
Closes #7049
|
|
|
|
|
|
|
|
| |
Parameter was removed in d3c2ae1c0806
(OpenZFS 6950 - ARC should cache compressed data)
Reviewed by: Brian Behlendorf <[email protected]>
Signed-off-by: DHE <[email protected]>
Closes #7043
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Yuri Pankov <[email protected]>
Reviewed by: Alexander Pyhalov <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8899
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/b0e142e57d
Closes #7032
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ungracefully
Authored by: Yuri Pankov <[email protected]>
Reviewed by: Toomas Soome <[email protected]>
Reviewed by: Andy Stormont <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8898
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/9fa2266d9a
Closes #7031
|
|
|
|
|
|
|
|
|
| |
Replace "percent" with "%", add bold to default values.
Reviewed-by: bunder2015 <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #7018
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Prakash Surya <[email protected]>
Reviewed by: John Kennedy <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed by: Igor Kozhukhov <[email protected]>
Reviewed by: Brian Behlendorf <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Ported-by: Prakash Surya <[email protected]>
PROBLEM
=======
There's a race condition that exists if `zil_free_lwb` races with either
`zil_commit_waiter_timeout` and/or `zil_lwb_flush_vdevs_done`.
Here's an example panic due to this bug:
> ::status
debugging crash dump vmcore.0 (64-bit) from ip-10-110-205-40
operating system: 5.11 dlpx-5.2.2.0_2017-12-04-17-28-32b6ba51fb (i86pc)
image uuid: 4af0edfb-e58e-6ed8-cafc-d3e9167c7513
panic message:
BAD TRAP: type=e (#pf Page fault) rp=ffffff0010555970 addr=60 occurred in module "zfs" due to a NULL pointer dereference
dump content: kernel pages only
> $c
zio_shrink+0x12()
zil_lwb_write_issue+0x30d(ffffff03dcd15cc0, ffffff03e0730e20)
zil_commit_waiter_timeout+0xa2(ffffff03dcd15cc0, ffffff03d97ffcf8)
zil_commit_waiter+0xf3(ffffff03dcd15cc0, ffffff03d97ffcf8)
zil_commit+0x80(ffffff03dcd15cc0, 9a9)
zfs_write+0xc34(ffffff03dc38b140, ffffff0010555e60, 40, ffffff03e00fb758, 0)
fop_write+0x5b(ffffff03dc38b140, ffffff0010555e60, 40, ffffff03e00fb758, 0)
write+0x250(42, fffffd7ff4832000, 2000)
sys_syscall+0x177()
If there's an outstanding lwb that's in `zil_commit_waiter_timeout`
waiting to timeout, waiting on it's waiter's CV, we must be sure not to
call `zil_free_lwb`. If we end up calling `zil_free_lwb`, then that LWB
may be freed and can result in a use-after-free situation where the
stale lwb pointer stored in the `zil_commit_waiter_t` structure of the
thread waiting on the waiter's CV is used.
A similar situation can occur if an lwb is issued to disk, and thus in
the `LWB_STATE_ISSUED` state, and `zil_free_lwb` is called while the
disk is servicing that lwb. In this situation, the lwb will be freed by
`zil_free_lwb`, which will result in a use-after-free situation when the
lwb's zio completes, and `zil_lwb_flush_vdevs_done` is called.
This race condition is prevented in `zil_close` by calling `zil_commit`
before `zil_free_lwb` is called, which will ensure all outstanding (i.e.
all lwb's in the `LWB_STATE_OPEN` and/or `LWB_STATE_ISSUED` states)
reach the `LWB_STATE_DONE` state before the lwb's are freed
(`zil_commit` will not return untill all the lwb's are
`LWB_STATE_DONE`).
Further, this race condition is prevented in `zil_sync` by only calling
`zil_free_lwb` for lwb's that do not have their `lwb_buf` pointer set.
All lwb's not in the `LWB_STATE_DONE` state will have a non-null value
for this pointer; the pointer is only cleared in
`zil_lwb_flush_vdevs_done`, at which point the lwb's state will be
changed to `LWB_STATE_DONE`.
This race *is* present in `zil_suspend`, leading to this bug.
At first glance, it would appear as though this would not be true
because `zil_suspend` will call `zil_commit`, just like `zil_close`, but
the problem is that `zil_suspend` will set the zilog's `zl_suspend`
field prior to calling `zil_commit`. Further, in `zil_commit`, if
`zl_suspend` is set, `zil_commit` will take a special branch of logic
and use `txg_wait_synced` instead of performing the normal `zil_commit`
logic.
This call to `txg_wait_synced` might be good enough for the data to
reach disk safely before it returns, but it does not ensure that all
outstanding lwb's reach the `LWB_STATE_DONE` state before it returns.
This is because, if there's an lwb "stuck" in
`zil_commit_waiter_timeout`, waiting for it's lwb to timeout, it will
maintain a non-null value for it's `lwb_buf` field and thus `zil_sync`
will not free that lwb. Thus, even though the lwb's data is already on
disk, the lwb will be left lingering, waiting on the CV, and will
eventually timeout and be issued to disk even though the write is
unnecessary.
So, after `zil_commit` is called from `zil_suspend`, we incorrectly
assume that there are not outstanding lwb's, and proceed to free all
lwb's found on the zilog's lwb list. As a result, we free the lwb that
will later be used `zil_commit_waiter_timeout`.
SOLUTION
========
The solution to this, is to ensure all outstanding lwb's complete before
calling `zil_free_lwb` via `zil_destroy` in `zil_suspend`. This patch
accomplishes this goal by forcing the normal `zil_commit` logic when
called from `zil_sync`.
Now, `zil_suspend` will call `zil_commit_impl` which will always use the
normal logic of waiting/issuing lwb's to disk before it returns. As a
result, any lwb's outstanding when `zil_commit_impl` is called will be
guaranteed to reach the `LWB_STATE_DONE` state by the time it returns.
Further, no new lwb's will be created via `zil_commit` since the zilog's
`zl_suspend` flag will be set. This will force all new callers of
`zil_commit` to use `txg_wait_synced` instead of creating and issuing
new lwb's.
Thus, all lwb's left on the zilog's lwb list when `zil_destroy` is
called will be in the `LWB_STATE_DONE` state, and we'll avoid this race
condition.
OpenZFS-issue: https://www.illumos.org/issues/8909
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/ece62b6f8d
Closes #6940
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extends vdev_id to support a new slot type, ses, for SCSI Enclosure
Services. With slot type ses, the disk slot numbers are determined by
using the device slot number reported by sg_ses for the device with
matching SAS address, found by querying all available enclosures.
This is primarily of use on systems with a deficient driver omitting
support for bay_identifier in /sys/devices. In my testing, I found that
the existing slot types of port and id were not stable across disk
replacement, so an alternative was required.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Simon Guest <[email protected]>
Closes #6956
|
|
|
|
|
|
| |
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: DHE <[email protected]>
Closes #6894
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, scrubs and resilvers can take an extremely
long time to complete. This is largely due to the fact
that zfs scans process pools in logical order, as
determined by each block's bookmark. This makes sense
from a simplicity perspective, but blocks in zfs are
often scattered randomly across disks, particularly
due to zfs's copy-on-write mechanisms.
This patch improves performance by splitting scrubs
and resilvers into a metadata scanning phase and an IO
issuing phase. The metadata scan reads through the
structure of the pool and gathers an in-memory queue
of I/Os, sorted by size and offset on disk. The issuing
phase will then issue the scrub I/Os as sequentially as
possible, greatly improving performance.
This patch also updates and cleans up some of the scan
code which has not been updated in several years.
Reviewed-by: Brian Behlendorf <[email protected]>
Authored-by: Saso Kiselkov <[email protected]>
Authored-by: Alek Pinchuk <[email protected]>
Authored-by: Tom Caputi <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #3625
Closes #6256
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zfs-import-{cache,scan}.service must complete before any mounting of
filesystems can occur. To simplify this dependency, create a target
that is reached After (in the systemd sense) the pool is imported.
Additionally, recommend that legacy zfs mounts use the option
x-systemd.requires=zfs-import.target
to codify this requirement.
Reviewed-by: Fabian Grünbichler <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Antonio Russo <[email protected]>
Closes #6764
|
|
|
|
|
|
|
|
|
|
|
| |
Update zfs module parameters man5 with missing parameter details
for multiple tunings.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Alex Braunegg <[email protected]>
Closes #6785
|
|
|
|
|
|
|
|
|
|
|
|
| |
The 'zpool status' command supports the -P option for printing full
path names. It does not support the -p parsable option for printing
exact values.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: loli10K <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6792
Closes #6794
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current make recipe for mancheck silently ignores errors. Correct
the recipe so errors cause the mancheck recipe fail.
The zpool reopen command in the zpool.8 manpage had a bullet list
without an .El.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes #6790
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Additionally add four new tests:
* zpool_events_clear: verify 'zpool events -c' functionality
* zpool_events_cliargs: verify command line options and arguments
* zpool_events_follow: verify 'zpool events -f'
* zpool_events_poolname: verify events filtering by pool name
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #3285
Closes #6762
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
8558 lwp_create() returns EAGAIN on system with more than 80K ZFS filesystems
On a system with more than 80K ZFS filesystems, we've seen cases
where lwp_create() will start to fail by returning EAGAIN. The
problem being, for each of those 80K ZFS filesystems, a taskq will
be created for each dataset as part of the ZIL for each dataset.
Porting Notes:
- The new nomem taskq kstat was dropped.
- Added module options and documentation for new tunings
zfs_zil_clean_taskq_nthr_pct, zfs_zil_clean_taskq_minalloc,
zfs_zil_clean_taskq_maxalloc, and zfs_sync_taskq_batch_pct.
Reviewed by: George Wilson <[email protected]>
Reviewed by: Sebastien Roy <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Authored by: Prakash Surya <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Chris Dunlop <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8558
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/216d772
8602 remove unused "dp_early_sync_tasks" field from "dsl_pool" structure
Reviewed by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Authored by: Prakash Surya <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Chris Dunlop <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8602
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/2bcb545
Closes #6779
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added -n flag to zpool reopen that allows a running scrub
operation to continue if there is a device with Dirty Time Log.
By default if a component device has a DTL and zpool reopen
is executed all running scan operations will be restarted.
Added functional tests for `zpool reopen`
Tests covers following scenarios:
* `zpool reopen` without arguments,
* `zpool reopen` with pool name as argument,
* `zpool reopen` while scrubbing,
* `zpool reopen -n` while scrubbing,
* `zpool reopen -n` while resilvering,
* `zpool reopen` with bad arguments.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tom Caputi <[email protected]>
Signed-off-by: Arkadiusz Bubała <[email protected]>
Closes #6076
Closes #6746
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only place vn_rename and vn_remove are used is when writing
out an updated pool configuration file. By truncating the file
instead of renaming and removing it we can avoid having to implement
these interfaces entirely. Functionally an empty cache file is
treated the same as a missing cache file. This is particularly
advantageous because the Linux kernel has never provided a way
to reliably implement vn_rename and vn_remove.
The cachefile_004_pos.ksh test case was updated to understand
that an empty cache file is the same as a missing one.
The zfs-import-* systemd service files were not updated to use
ConditionFileNotEmpty in place of ConditionPathExists. This
means that after exporting all pools and rebooting new pools
will not the scanned for on the next boot. This small change
should not impact normal usage since pools are not exported
as part of a normal shutdown.
Documentation was updated accordingly.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Arkadiusz Bubała <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes zfsonlinux/spl#648
Closes #6753
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* PBKDF2 implementation changed to OpenSSL implementation.
* HKDF implementation moved to its own file and tests
added to ensure correctness.
* Removed libzfs's now unnecessary dependency on libzpool
and libicp.
* Ztest can now create and test encrypted datasets. This is
currently disabled until issue #6526 is resolved, but
otherwise functions as advertised.
* Several small bug fixes discovered after enabling ztest
to run on encrypted datasets.
* Fixed coverity defects added by the encryption patch.
* Updated man pages for encrypted send / receive behavior.
* Fixed a bug where encrypted datasets could receive
DRR_WRITE_EMBEDDED records.
* Minor code cleanups / consolidation.
Signed-off-by: Tom Caputi <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's often useful to have access to txg history for debugging
purposes. This patch changes the default from 0 to 100 TXGs
worth of history preserved.
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Reviewed by: Prakash Surya <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Alek Pinchuk <[email protected]>
Closes #6691
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new make 'mancheck' target which uses mandoc -Tlint to verify
manpage files: currently only zfs(8), zpool(8) zdb(8) and zgenhostid(8)
are supported.
Additionally fix some outstanding manpage formatting issues.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #6646
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3796 listsnapshots not documented in zpool man page
Authored by: George Melikov <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: Yuri Pankov <[email protected]>
Approved by: Dan McDonald <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
Ported-by: George Melikov [email protected]
OpenZFS-issue: https://www.illumos.org/issues/8435
OpenZFS-commit: openzfs/openzfs@a058d1c
Porting notes: OpenZFS review applied,
some ZoL changes were reverted.
See https://github.com/openzfs/openzfs/pull/415
|
|
|
|
|
|
|
|
|
| |
This leverages the functionality introduced in cf7684b to expose
verbose, dry-run and parsable 'zfs send' options for bookmarks.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #3666
Closes #6601
|
|
|
|
|
|
|
|
|
|
|
| |
compression=lz4 depends on the lz4 feature being enabled, while
compression=on will let ZFS use either lzjb or lz4 where appropriate.
It also allows the documentation to not go out of date if/when ZFS
picks a new default in the future.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Mike Swanson <[email protected]>
Closes #6614
|
|
|
|
|
|
|
|
|
|
| |
Corrected indent of the note located at the bottom of the options for
zfs send as well as remove an extra whitespace
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: bunder2015 <[email protected]>
Closes #6590
|
|
|
|
|
|
| |
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Melikov <[email protected]>
Closes #6553
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Yuri Pankov <[email protected]>
Reviewed by: Robert Mustacchi <[email protected]>
Approved by: Richard Lowe <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8547
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c66b804
Closes #6549
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Alek Pinchuk <[email protected]>
Reviewed by: George Melikov <[email protected]>
Reviewed by: Brian Behlendorf <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Alek Pinchuk <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8414
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c29616076
Closes #6538
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes several issues discovered after
the encryption patch was merged:
* Fixed a bug where encrypted datasets could attempt
to receive embedded data records.
* Fixed a bug where dirty records created by the recv
code wasn't properly setting the dr_raw flag.
* Fixed a typo where a dmu_tx_commit() was changed to
dmu_tx_abort()
* Fixed a few error handling bugs unrelated to the
encryption patch in dmu_recv_stream()
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #6512
Closes #6524
Closes #6545
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Removed zpios kmod, utility, headers and man page.
* Removed unused scripts zpios-profile/*, zpios-test/*,
zpool-config/*, smb.sh, zpios-sanity.sh, zpios-survey.sh,
zpios.sh, and zpool-create.sh.
* Removed zfs-script-config.sh.in. When building 'make' generates
a common.sh with in-tree path information from the common.sh.in
template. This file and sourced by the test scripts and used
for in-tree testing, it is not included in the packages. When
building packages 'make install' uses the same template to
create a new common.sh which is appropriate for the packaging.
* Removed unused functions/variables from scripts/common.sh.in.
Only minimal path information and configuration environment
variables remain.
* Removed unused scripts from scripts/ directory.
* Remaining shell scripts in the scripts directory updated to
cleanly pass shellcheck and added to checked scripts.
* Renamed tests/test-runner/cmd/ to tests/test-runner/bin/ to
match install location name.
* Removed last traces of the --enable-debug-dmu-tx configure
options which was retired some time ago.
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6509
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With HPE hardware and hpsa-driven SAS adapters, only a single phy is
reported, but no individual per-port phys (ie. no phy* entry below
port_dir), which breaks topology detection in the current sas_handler
code. Instead, slot information can be derived directly from the port
number. This change implements a new slot keyword "port" similar to
"id" and "lun", and assumes a default phy/port of 0 if no individual
phy entry can be found. It allows to use the "sas_direct" topology with
current HPE Dxxxx and Apollo 45xx JBODs.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Daniel Kobras <[email protected]>
Closes #6484
|
|
|
|
|
|
|
|
|
|
|
| |
Added a 'corrupt' error option that will flip a bit in the data
after a read operation. This is useful for generating checksum
errors at the device layer (in a mirror config for example). It
is also used to validate the diagnosis of checksum errors from
the zfs diagnosis engine.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #6345
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change incorporates three major pieces:
The first change is a keystore that manages wrapping
and encryption keys for encrypted datasets. These
commands mostly involve manipulating the new
DSL Crypto Key ZAP Objects that live in the MOS. Each
encrypted dataset has its own DSL Crypto Key that is
protected with a user's key. This level of indirection
allows users to change their keys without re-encrypting
their entire datasets. The change implements the new
subcommands "zfs load-key", "zfs unload-key" and
"zfs change-key" which allow the user to manage their
encryption keys and settings. In addition, several new
flags and properties have been added to allow dataset
creation and to make mounting and unmounting more
convenient.
The second piece of this patch provides the ability to
encrypt, decyrpt, and authenticate protected datasets.
Each object set maintains a Merkel tree of Message
Authentication Codes that protect the lower layers,
similarly to how checksums are maintained. This part
impacts the zio layer, which handles the actual
encryption and generation of MACs, as well as the ARC
and DMU, which need to be able to handle encrypted
buffers and protected data.
The last addition is the ability to do raw, encrypted
sends and receives. The idea here is to send raw
encrypted and compressed data and receive it exactly
as is on a backup system. This means that the dataset
on the receiving system is protected using the same
user key that is in use on the sending side. By doing
so, datasets can be efficiently backed up to an
untrusted system without fear of data being
compromised.
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #494
Closes #5769
|
|
|
|
|
|
|
|
|
|
| |
* ztest.1 man page: fix typo
* zfs-module-parameters.5 man page: fix grammar
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Fabian Grünbichler <[email protected]>
Closes #6492
|
|
|
|
|
|
|
|
| |
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: bunder2015 <[email protected]>
Closes #6409
Closes #6410
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Redefine the SET_ERROR macro in terms of __dprintf() so the error
return codes get logged as both tracepoint events (if tracepoints are
enabled) and as ZFS debug log entries. This also allows us to use
the same definition of SET_ERROR() in kernel and user space.
Define a new debug flag ZFS_DEBUG_SET_ERROR=512 that may be bitwise
or'd into zfs_flags. Setting this flag enables both dprintf() and
SET_ERROR() messages in the debug log. That is, setting
ZFS_DEBUG_SET_ERROR and ZFS_DEBUG_DPRINTF|ZFS_DEBUG_SET_ERROR are
equivalent (this was done for sake of simplicity). Leaving
ZFS_DEBUG_SET_ERROR unset suppresses the SET_ERROR() messages which
helps avoid cluttering up the logs.
To enable SET_ERROR() logging, run:
echo 1 > /sys/module/zfs/parameters/zfs_dbgmsg_enable
echo 512 > /sys/module/zfs/parameters/zfs_flags
Remove the zfs_set_error_class tracepoints event class since
SET_ERROR() now uses __dprintf(). This sacrifices a bit of
granularity when selecting individual tracepoint events to enable but
it makes the code simpler.
Include file, function, and line number information in debug log
entries. The information is now added to the message buffer in
__dprintf() and as a result the zfs_dprintf_class tracepoints event
class was changed from a 4 parameter interface to a single parameter.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ned Bass <[email protected]>
Closes #6400
|
|
|
|
|
|
|
|
|
| |
The userobj_accounting feature described in the zpool-features.5
man page was incorrectly indented. Fix it.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6402
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Turning the multihost property on requires that a hostid be set to allow
ZFS to determine when a foreign system is attemping to import a pool.
The error message instructing the user to set a hostid refers to
genhostid(1).
Genhostid(1) is not available on SUSE Linux. This commit adds a script
modeled after genhostid(1) for those users.
Zgenhostid checks for an /etc/hostid file; if it does not exist, it
creates one and stores a value. If the user has provided a hostid as an
argument, that value is used. Otherwise, a random hostid is generated
and stored.
This differs from the CENTOS 6/7 versions of genhostid, which overwrite
the /etc/hostid file even though their manpages state otherwise.
A man page for zgenhostid is added. The one for genhostid is in (1), but
I put zgenhostid in (8) because I believe it's more appropriate.
The mmp tests are modified to use zgenhostid to set the hostid instead
of using the spl_hostid module parameter. zgenhostid will not replace
an existing /etc/hostid file, so new mmp_clear_hostid calls are
required.
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Olaf Faaland <[email protected]>
Closes #6358
Closes #6379
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zpool iostat/status -c is supposed to be restricted
by its search path, but currently isn't. To prevent
arbitrary scripts from being executed, disallow '/'
from commands.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Ned Bass <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes #6353
Closes #6359
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Use nested [] notation to denote optional script list elements
- Fix space before comma after smarctl(8)
- Fix typo and formatting error in reference to -v option
- Fix spelling errors
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Ned Bass <[email protected]>
Closes #6370
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add multihost=on|off pool property to control MMP. When enabled
a new thread writes uberblocks to the last slot in each label, at a
set frequency, to indicate to other hosts the pool is actively imported.
These uberblocks are the last synced uberblock with an updated
timestamp. Property defaults to off.
During tryimport, find the "best" uberblock (newest txg and timestamp)
repeatedly, checking for change in the found uberblock. Include the
results of the activity test in the config returned by tryimport.
These results are reported to user in "zpool import".
Allow the user to control the period between MMP writes, and the
duration of the activity test on import, via a new module parameter
zfs_multihost_interval. The period is specified in milliseconds. The
activity test duration is calculated from this value, and from the
mmp_delay in the "best" uberblock found initially.
Add a kstat interface to export statistics about Multiple Modifier
Protection (MMP) updates. Include the last synced txg number, the
timestamp, the delay since the last MMP update, the VDEV GUID, the VDEV
label that received the last MMP update, and the VDEV path. Abbreviated
output below.
$ cat /proc/spl/kstat/zfs/mypool/multihost
31 0 0x01 10 880 105092382393521 105144180101111
txg timestamp mmp_delay vdev_guid vdev_label vdev_path
20468 261337 250274925 68396651780 3 /dev/sda
20468 261339 252023374 6267402363293 1 /dev/sdc
20468 261340 252000858 6698080955233 1 /dev/sdx
20468 261341 251980635 783892869810 2 /dev/sdy
20468 261342 253385953 8923255792467 3 /dev/sdd
20468 261344 253336622 042125143176 0 /dev/sdab
20468 261345 253310522 1200778101278 2 /dev/sde
20468 261346 253286429 0950576198362 2 /dev/sdt
20468 261347 253261545 96209817917 3 /dev/sds
20468 261349 253238188 8555725937673 3 /dev/sdb
Add a new tunable zfs_multihost_history to specify the number of MMP
updates to store history for. By default it is set to zero meaning that
no MMP statistics are stored.
When using ztest to generate activity, for automated tests of the MMP
function, some test functions interfere with the test. For example, the
pool is exported to run zdb and then imported again. Add a new ztest
function, "-M", to alter ztest behavior to prevent this.
Add new tests to verify the new functionality. Tests provided by
Giuseppe Di Natale.
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Ned Bass <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Olaf Faaland <[email protected]>
Closes #745
Closes #6279
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The volmode property may be set to control the visibility of ZVOL
block devices.
This allow switching ZVOL between three modes:
full - existing fully functional behaviour (default)
dev - hide partitions on ZVOL block devices
none - not exposing volumes outside ZFS
Additionally the new zvol_volmode module parameter can be used to
control the default behaviour.
This functionality can be used, for instance, on "backup" pools to
avoid cluttering /dev with unneeded zd* devices.
Original-patch-by: mav <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: loli10K <[email protected]>
Signed-off-by: loli10K <[email protected]>
FreeBSD-commit: https://github.com/freebsd/freebsd/commit/dd28e6bb
Closes #1796
Closes #3438
Closes #6233
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Alex Reece <[email protected]>
Reviewed by: Yuri Pankov <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8067
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/8173085
Closes #6319
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, there is no way to pause a scrub. Pausing may
be useful when the pool is busy with other I/O to preserve
bandwidth.
This patch adds the ability to pause and resume scrubbing.
This is achieved by maintaining a persistent on-disk scrub state.
While the state is 'paused' we do not scrub any more blocks.
We do however perform regular scan housekeeping such as
freeing async destroyed and deadlist blocks while paused.
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Thomas Caputi <[email protected]>
Reviewed-by: Serapheim Dimitropoulos <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alek Pinchuk <[email protected]>
Closes #6167
|
|
|
|
|
|
|
|
|
|
| |
* Fixed some typos
* Additional description for some commands arguments
* Text reworked to be in sync with OpenZFS
* Added Linux as .Os type
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6282
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Fixed some typos
* Additional description for some commands arguments
* `listsnapshots` remained
* Text reworked to be in sync with OpenZFS
* Added Linux as .Os type
* Updated `zpool events` section.
* Updated `zpool iostat|status -c` sections
* Added zed(8) reference to SEE ALSO
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Melikov <[email protected]>
Closes #6245
|
|
|
|
|
|
|
|
|
|
|
| |
In the original form of device error injection, it was an all or nothing
situation. To help simulate intermittent error conditions, you can now
specify a real number percentage value. This is also very useful for our
ZFS fault diagnosis testing and for injecting intermittent errors during
load testing.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #6227
|
|
|
|
|
|
|
|
|
| |
In arc_evict_state() we start pruning when arc_dnode_size >
arc_dnode_limit, i.e. arc_dnode_limit is a ceiling rather than a
floor.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chris Dunlop <[email protected]>
Closes #6228
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- After some ZIL changes 6 years ago zil_slog_limit got partially broken
due to zl_itx_list_sz not updated when async itx'es upgraded to sync.
Actually because of other changes about that time zl_itx_list_sz is not
really required to implement the functionality, so this patch removes
some unneeded broken code and variables.
- Original idea of zil_slog_limit was to reduce chance of SLOG abuse by
single heavy logger, that increased latency for other (more latency critical)
loggers, by pushing heavy log out into the main pool instead of SLOG. Beside
huge latency increase for heavy writers, this implementation caused double
write of all data, since the log records were explicitly prepared for SLOG.
Since we now have I/O scheduler, I've found it can be much more efficient
to reduce priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE
to ZIO_PRIORITY_ASYNC_WRITE, while still leave them on SLOG.
- Existing ZIL implementation had problem with space efficiency when it
has to write large chunks of data into log blocks of limited size. In some
cases efficiency stopped to almost as low as 50%. In case of ZIL stored on
spinning rust, that also reduced log write speed in half, since head had to
uselessly fly over allocated but not written areas. This change improves
the situation by offloading problematic operations from z*_log_write() to
zil_lwb_commit(), which knows real situation of log blocks allocation and
can split large requests into pieces much more efficiently. Also as side
effect it removes one of two data copy operations done by ZIL code WR_COPIED
case.
- While there, untangle and unify code of z*_log_write() functions.
Also zfs_log_write() alike to zvol_log_write() can now handle writes crossing
block boundary, that may also improve efficiency if ZPL is made to do that.
Sponsored by: iXsystems, Inc.
Authored by: Alexander Motin <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Prakash Surya <[email protected]>
Reviewed by: Andriy Gapon <[email protected]>
Reviewed by: Steven Hartland <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/7578
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/aeb13ac
Closes #6191
|