| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, there is no way to pause a scrub. Pausing may
be useful when the pool is busy with other I/O to preserve
bandwidth.
This patch adds the ability to pause and resume scrubbing.
This is achieved by maintaining a persistent on-disk scrub state.
While the state is 'paused' we do not scrub any more blocks.
We do however perform regular scan housekeeping such as
freeing async destroyed and deadlist blocks while paused.
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Thomas Caputi <[email protected]>
Reviewed-by: Serapheim Dimitropoulos <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alek Pinchuk <[email protected]>
Closes #6167
|
|
|
|
|
|
|
|
|
|
|
|
| |
On the single core machine the system may hang when the
spa_namespare_lock acquisition fails in the zvol_first_open
function. It returns -ERESTARTSYS error what causes the
endless loop in __blkdev_get function.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Arkadiusz Bubała <[email protected]>
Closes #6283
Closes #6312
|
|
|
|
|
|
|
|
|
|
| |
Needed for PATH variable to be passed into su. The
posix* tests were fixed, but they need further investigation
before they can be enabled.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: George Melikov <[email protected]>
Closes #6303
|
|
|
|
|
|
|
|
|
| |
Musl libc's <stdio.h> doesn't include <stdarg.h>, which cause
`va_start` and `va_end` end up being undefined symbols.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Leorize <[email protected]>
Closes #6310
|
|
|
|
|
|
|
|
|
| |
Clang doesn't support `/` as comment in assembly, this patch replaces
them with `#`.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Leorize <[email protected]>
Closes #6311
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Matthew Ahrens <[email protected]>
Reviewed by: Prakash Surya <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
The problem is that zfs_get_data() supplies a stale zgd_bp to
dmu_sync(), which we then nopwrite against.
zfs_get_data() doesn't hold any DMU-related locks, so after it
copies db_blkptr to zgd_bp, dbuf_write_ready() could change
db_blkptr, and dbuf_write_done() could remove the dirty record.
dmu_sync() then sees the stale BP and that the dbuf it not dirty,
so it is eligible for nop-writing.
The fix is for dmu_sync() to copy db_blkptr to zgd_bp after
acquiring the db_mtx. We could still see a stale db_blkptr,
but if it is stale then the dirty record will still exist and
thus we won't attempt to nopwrite.
OpenZFS-issue: https://www.illumos.org/issues/8378
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/3127742
Closes #6293
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Andriy Gapon <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
The existing kernel-side code only provides a method to rollback to a
latest snapshot, whatever it happens to be at the time when the rollback
is actually done. That could be unsafe or confusing in environments
where concurrent DSL changes are possible as the resulting state could
correspond to a newer or older snapshot than the originally requested
one.
This change allows to amend that method such that the rollback is
performed only when the latest snapshot has a specific name. That is,
if a new snapshot is concurrently created or the target snapshot is
destroyed, then no rollback is done and EXDEV error is returned.
New libzfs_core function lzc_rollback_to() is provided for the new
functionality. libzfs is changed to use lzc_rollback_to() to implement
zfs rollback command.
Perhaps we should return different errors to distinguish the case where
the desired snapshot exists but it's not the latest snapshot and the
case where the desired snapshot does not exist.
OpenZFS-issue: https://www.illumos.org/issues/7600
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/3d645eb
Closes #6292
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Andriy Gapon <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/7910
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/cb6af4b
Closes #6291
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Marcel Telka <[email protected]>
Reviewed by: Vitaliy Gusev <[email protected]>
Approved by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8418
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/e09ba01
Closes #6305
|
|
|
|
|
|
| |
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Melikov <[email protected]>
Closes #6298
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now test runner will always exit(0).
It's helpful to have zfs-tests.sh provide different
exit values depending on if everything passed or not.
We can then use common shell cmds to run tests until failure.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Alek Pinchuk <[email protected]>
Closes #6285
|
|
|
|
|
|
|
|
|
|
|
| |
Reorder operations in _endlog so failure messages get
printed prior to performing callbacks and cleanup. This
helps clarify why a test failed and places the message
closer to the point of incident in the resulting logs.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes #6281
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fdopendir()
Authored by: Sowrabha Gopal <[email protected]>
Reviewed by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Yuri Pankov <[email protected]>
Reviewed by: Igor Kozhukhov <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
dir_is_empty_readdir() immediately returns if fdopendir() fails.
We should close dirfd when that happens.
OpenZFS-issue: https://www.illumos.org/issues/8430
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/e165e20
Closes #6289
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Andriy Gapon <[email protected]>
Reviewed by: Igor Kozhukhov <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Alek Pinchuk <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8416
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/589c189
Closes #6288
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Andriy Gapon <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8426
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/37359a6
Closes #6287
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Matthew Ahrens <[email protected]>
Reviewed by: Paul Dagnelie <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
The problem is that when dsl_bookmark_destroy_check() is
executed from open context (the pre-check), it fills in
dbda_success based on the existence of the bookmark. But
the bookmark (or containing filesystem as in this case)
can be destroyed before we get to syncing context. When
we re-run dsl_bookmark_destroy_check() in syncing context,
it will not add the deleted bookmark to dbda_success,
intending for dsl_bookmark_destroy_sync() to not process
it. But because the bookmark is still in dbda_success from
the open-context call, we do try to destroy it.
The fix is that dsl_bookmark_destroy_check() should not
modify dbda_success when called from open context.
OpenZFS-issue: https://www.illumos.org/issues/8377
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/b0b6fe3
Closes #6286
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resolves issues discovered when porting to OpenZFS.
* Lint warnings.
* Made dnode_move_impl() large dnode aware. This
functionality is currently unused on Linux.
Reviewed-by: Ned Bass <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #6262
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make zfs_arc_meta_limit_percent and zfs_arc_dnode_limit_percent behave
as you would expect from zfs-module-parameters.5.
- recalculate arc_meta_limit if zfs_arc_meta_limit_percent changes
- recalculate arc_dnode_limit if zfs_arc_dnode_limit_percent changes
- correctly set arc_meta_limit and arc_dnode_limit if zfs_arc_max or
zfs_arc_meta_min changes
Reviewed-by: Tim Chase <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chris Dunlop <[email protected]>
Closes #6269
|
|
|
|
|
|
|
|
|
|
| |
* Fixed some typos
* Additional description for some commands arguments
* Text reworked to be in sync with OpenZFS
* Added Linux as .Os type
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6282
|
|
|
|
|
|
|
|
|
|
|
| |
GCC 7.1 with will warn when we're not checking the snprintf()
return code in cases where the buffer could be truncated. This
patch either checks the snprintf return code (where applicable),
or simply disables the warnings (ztest.c).
Reviewed-by: Chunwei Chen <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #6253
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Fixed some typos
* Additional description for some commands arguments
* `listsnapshots` remained
* Text reworked to be in sync with OpenZFS
* Added Linux as .Os type
* Updated `zpool events` section.
* Updated `zpool iostat|status -c` sections
* Added zed(8) reference to SEE ALSO
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Melikov <[email protected]>
Closes #6245
|
|
|
|
|
|
|
|
|
|
|
|
| |
On RHEL 7.4, include/linux/bio.h now includes a macro for
bio_set_op_attrs that conflicts with the ifndef in ZFS
include/linux/blkdev_compat.h. This patch fixes the build.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #6234
Closes #6271
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 8542ef8 allowed optional IOs to be aggregated beyond
the specified aggregation limit. Since the aggregation limit
was also used to enforce the maximum block size, setting
`zfs_vdev_aggregation_limit=16777216` could result in an
attempt to allocate an ABD larger than 16M.
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6259
Closes #6270
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Under Linux the existence of a block device in /etc/fstab is
not sufficient to prevent the use of the force flag. Without
the force flag a warning will be printed that the device has
a filesystem of a given type. Providing the force option
will overwrite that filesystem as long as it is not actively
mounted.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Tested-by: bunder2015 <[email protected]>
Signed-off-by: bunder2015 <[email protected]>
Closes #6267
Closes #6272
|
|
|
|
|
|
|
|
|
|
| |
Use zv_state_lock to protect all members of zvol_state structure, add
relevant ASSERT()s. Take zv_suspend_lock before zv_state_lock, do not
hold zv_state_lock across suspend/resume.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Boris Protopopov <[email protected]>
Closes #6226
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Andriy Gapon <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Saso Kiselkov <[email protected]>
Approved by: Dan McDonald <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/5220
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/403a8da
Closes #6260
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Andrew Stormont <[email protected]>
Reviewed by: Andriy Gapon <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Dan McDonald <[email protected]>
Approved by: Dan McDonald <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8264
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/a4b8c9a
Closes #6254
|
|
|
|
|
|
|
|
|
|
| |
zfs_clone_010_pos.ksh: line 234: ZFS_MAXPROPLEN: arithmetic syntax error
Reviewed-by: Kash Pande <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: bunder2015 <[email protected]>
Closes #6268
|
|
|
|
|
|
|
|
| |
In bqueue_dequeue(), call cv_signal() with bq_lock held.
Re-enable rsend_009_pos to test the fix.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Boris Protopopov <[email protected]>
Closes #5887
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Andrew Stormont <[email protected]>
Reviewed by: Marcel Telka <[email protected]>
Reviewed by: Toomas Soome <[email protected]>
Approved by: Dan McDonald <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8331
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/4f4378c
Closes #6255
|
|
|
|
|
|
|
|
|
|
|
|
| |
This prints dashes instead of zeros for zero latency values in
'zpool iostat -p'. You'll get zero latencies reported when the
disk is idle, but technically a zero latency is invalid, since you
can't measure the latency of doing nothing.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #6210
|
|
|
|
|
|
|
|
|
|
|
| |
In zfs/dmu_object and icp/core/kcf_sched, the CPU_SEQID macro
should be surrounded by `kpreempt_disable` and `kpreempt_enable`
calls to avoid a Linux kernel BUG warning. These code paths use
the cpuid to minimize lock contention and is is safe to reschedule
the process to a different processor at any time.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Morgan Jones <[email protected]>
Closes #6239
|
|
|
|
|
|
|
|
|
|
|
| |
In the original form of device error injection, it was an all or nothing
situation. To help simulate intermittent error conditions, you can now
specify a real number percentage value. This is also very useful for our
ZFS fault diagnosis testing and for injecting intermittent errors during
load testing.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #6227
|
|
|
|
|
|
|
|
| |
Add links for information about the ZFS buildbot options
to the contributing guidelines and PR template.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes #6235
|
|
|
|
|
|
|
|
| |
Use queue_flag_set_unlocked() in zvol_alloc().
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Boris Protopopov <[email protected]>
Issue #6226
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
5559ba0 added zv_state_lock to protect zvol_state_t internal data:
this, however, doesn't guard zv->zv_open_count and
zv->zv_disk->private_data in zvol_remove_minors_impl().
Fix this by taking zv->zv_state_lock before we check its zv_open_count.
P1 (z_zvol) P2 (systemd-udevd)
--- ---
zvol_remove_minors_impl()
: zv->zv_open_count==0
zvol_open()
->mutex_enter(zv_state_lock)
: zv->zv_open_count++
->mutex_exit(zv_state_lock)
->mutex_enter(zv->zv_state_lock)
->zvol_remove(zv)
->mutex_exit(zv->zv_state_lock)
: zv->zv_disk->private_data = NULL
->zvol_free()
-->ASSERT(zv->zv_open_count==0) *
zvol_release()
: zv = disk->private_data
->ASSERT(zv && zv->zv_open_count>0) *
--- ---
* ASSERT() fails
Reviewed by: Boris Protopopov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]
Signed-off-by: loli10K <[email protected]>
Closes #6213
|
|
|
|
|
|
|
|
|
| |
In arc_evict_state() we start pruning when arc_dnode_size >
arc_dnode_limit, i.e. arc_dnode_limit is a ceiling rather than a
floor.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chris Dunlop <[email protected]>
Closes #6228
|
|
|
|
|
| |
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This continues what was started in
0eef1bde31d67091d3deed23fe2394f5a8bf2276 by fully converting zvols
to avoid unnecessary dnode_hold() calls. This saves a small amount
of CPU time and slightly improves latencies of operations on zvols.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #6058
|
|
|
|
|
|
|
|
|
|
| |
Cleanup zpool_import_all_001_pos to no longer use devices.
The test is meant to test zpool import -a and by no longer
requiring devices, a number of dependencies are no longer
necessary.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes #6198
|
|
|
|
|
|
|
|
| |
Buildbots and zfs-tests regularly see 7 kilobytes of stack
usage with this function. Convert self-calls to iterations
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: DHE <[email protected]>
Closes #6219
|
|
|
|
|
|
|
|
|
| |
The log function log_must_busy was added in commit e623aea2 for
this purpose. Update destroy_pool to use it.
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6217
|
|
|
|
|
|
|
|
|
| |
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: DHE <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Jack Draak <[email protected]>
Signed-off-by: Kash Pande <[email protected]>
Closes #6203
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Paul Dagnelie <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Kash Pande <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
The send size estimate for a zvol can be too low, if the size of the
record headers (dmu_replay_record_t's) is a significant portion of the
size. This is typically the case when the data is highly compressible,
especially with embedded blocks.
The problem is that dmu_adjust_send_estimate_for_indirects() assumes
that blocks are the size of the "recordsize" property (128KB). However,
for zvols, the blocks are the size of the "volblocksize" property (8KB).
Therefore, we estimate that there will be 16x less record headers than
there really will be.
The fix is to check the type of the object set (whether it is a zvol or
not) and pick the appropriate property. In addition, while we are at it,
we also add the size of the BEGIN and END records to the estimate.
OpenZFS-issue: https://www.illumos.org/issues/8056
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/faf09cd
Closes #6205
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Matthew Ahrens <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Paul Dagnelie <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
dbuf_evict_notify() holds the dbuf_evict_lock while checking if it should
do the eviction itself (because the evict thread is not able to keep up).
This can result in massive lock contention. It isn't necessary to hold
the lock, because if we make the wrong choice occasionally, nothing bad
will happen. This commit results in a ~60% performance improvement for
ARC-cached sequential reads.
OpenZFS-issue: https://www.illumos.org/issues/8156
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/f73e5d9
Closes #6204
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dmu_object_alloc() is single-threaded, so when multiple threads are
creating files in a single filesystem, they spend a lot of time waiting
for the os_obj_lock. To improve performance of multi-threaded file
creation, we must make dmu_object_alloc() typically not grab any
filesystem-wide locks.
The solution is to have a "next object to allocate" for each CPU. Each
of these "next object"s is in a different block of the dnode object, so
that concurrent allocation holds dnodes in different dbufs. When a
thread's "next object" reaches the end of a chunk of objects (by default
4 blocks worth -- 128 dnodes), it will be reset to the per-objset
os_obj_next, which will be increased by a chunk of objects (128). Only
when manipulating the os_obj_next will we need to grab the os_obj_lock.
This decreases lock contention dramatically, because each thread only
needs to grab the os_obj_lock briefly, once per 128 allocations.
This results in a 70% performance improvement to multi-threaded object
creation (where each thread is creating objects in its own directory),
from 67,000/sec to 115,000/sec, with 8 CPUs.
Work sponsored by Intel Corp.
Authored by: Matthew Ahrens <[email protected]>
Reviewed-by: Ned Bass <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Matthew Ahrens <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8199
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/374
Closes #4703
Closes #6117
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- After some ZIL changes 6 years ago zil_slog_limit got partially broken
due to zl_itx_list_sz not updated when async itx'es upgraded to sync.
Actually because of other changes about that time zl_itx_list_sz is not
really required to implement the functionality, so this patch removes
some unneeded broken code and variables.
- Original idea of zil_slog_limit was to reduce chance of SLOG abuse by
single heavy logger, that increased latency for other (more latency critical)
loggers, by pushing heavy log out into the main pool instead of SLOG. Beside
huge latency increase for heavy writers, this implementation caused double
write of all data, since the log records were explicitly prepared for SLOG.
Since we now have I/O scheduler, I've found it can be much more efficient
to reduce priority of heavy logger SLOG writes from ZIO_PRIORITY_SYNC_WRITE
to ZIO_PRIORITY_ASYNC_WRITE, while still leave them on SLOG.
- Existing ZIL implementation had problem with space efficiency when it
has to write large chunks of data into log blocks of limited size. In some
cases efficiency stopped to almost as low as 50%. In case of ZIL stored on
spinning rust, that also reduced log write speed in half, since head had to
uselessly fly over allocated but not written areas. This change improves
the situation by offloading problematic operations from z*_log_write() to
zil_lwb_commit(), which knows real situation of log blocks allocation and
can split large requests into pieces much more efficiently. Also as side
effect it removes one of two data copy operations done by ZIL code WR_COPIED
case.
- While there, untangle and unify code of z*_log_write() functions.
Also zfs_log_write() alike to zvol_log_write() can now handle writes crossing
block boundary, that may also improve efficiency if ZPL is made to do that.
Sponsored by: iXsystems, Inc.
Authored by: Alexander Motin <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Prakash Surya <[email protected]>
Reviewed by: Andriy Gapon <[email protected]>
Reviewed by: Steven Hartland <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/7578
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/aeb13ac
Closes #6191
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Matthew Ahrens <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
When writing pre-compressed buffers, arc_write() requires that
the compression algorithm used to compress the buffer matches
the compression algorithm requested by the zio_prop_t, which is
set by dmu_write_policy(). This makes dmu_write_policy() and its
callers a bit more complicated.
We simplify this by making arc_write() trust the caller to supply
the type of pre-compressed buffer that it wants to write,
and override the compression setting in the zio_prop_t.
OpenZFS-issue: https://www.illumos.org/issues/8155
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/b55ff58
Closes #6200
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit torvalds/linux@9e8925b6 allowed for kernels to be built
without support for mandatory locking (MS_MANDLOCK). This will
result in 'zfs mount' failing when the nbmand=on property is set
if the kernel is built without CONFIG_MANDATORY_FILE_LOCKING.
Unfortunately we can not reliably detect prior to the mount(2) system
call if the kernel was built with this support. The best we can do
is check if the mount failed with EPERM and if we passed 'mand'
as a mount option and then print a more useful error message. e.g.
filesystem 'tank/fs' has the 'nbmand=on' property set, this mount
option may be disabled in your kernel. Use 'zfs set nbmand=off'
to disable this option and try to mount the filesystem again.
Additionally, switch the default error message case to use
strerror() to produce a more human readable message.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4729
Closes #6199
|
|
|
|
|
|
|
|
|
|
| |
zpool_create_024_pos, zvol_misc_002_pos, write_dirs_002_pos are slow
on the buildbot 32-bit builder. Skip the test cases for now on 32-bit
builders.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes #6195
|