Commit message
If you've got multiple scrubs/resilvers going, it's rather helpful
to know which pool each scan line refers to.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #12674
The fnvlist versions of the functions are fatal if they fail,
saving each caller from having to check the result.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Igor Kozhukhov <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
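
The difference is easiest to see side by side. A minimal sketch of the
two styles using libnvpair; the property name and values are illustrative:

```c
#include <errno.h>
#include <libnvpair.h>

/* Plain nvlist API: every call returns an error that must be checked. */
static int
build_props_checked(nvlist_t **out)
{
	nvlist_t *props;

	if (nvlist_alloc(&props, NV_UNIQUE_NAME, 0) != 0)
		return (ENOMEM);
	if (nvlist_add_uint64(props, "recordsize", 131072) != 0) {
		nvlist_free(props);
		return (EINVAL);
	}
	*out = props;
	return (0);
}

/* fnvlist API: fatal on failure, so no per-call error handling. */
static nvlist_t *
build_props_fatal(void)
{
	nvlist_t *props = fnvlist_alloc();

	fnvlist_add_uint64(props, "recordsize", 131072);
	return (props);
}
```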
Remove code duplication by moving code responsible for partial block
zeroing to a separate function: dnode_partial_zero().
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #12627
Make the main dmu_buf_hold_array() function non-static.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Pawel Jakub Dawidek <[email protected]>
Closes #12628
Lustre makes light use of the zfs_refcount interfaces, which
isn't a problem when using a non-debug build of OpenZFS. However,
when debugging is enabled the required symbols are not exported.
Reviewed-by: Olaf Faaland <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #12613
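
A sketch of the fix's shape: on Linux, a module's symbols are only
visible to other modules (such as Lustre) when explicitly exported.
The exact set of zfs_refcount functions exported is an assumption here:

```c
#include <sys/zfs_refcount.h>

/*
 * Debug builds turn the zfs_refcount macros into real functions, so
 * those functions must be exported for out-of-tree modules to link.
 * This list is illustrative, not the complete exported set.
 */
EXPORT_SYMBOL(zfs_refcount_create);
EXPORT_SYMBOL(zfs_refcount_destroy);
EXPORT_SYMBOL(zfs_refcount_add);
EXPORT_SYMBOL(zfs_refcount_remove);
```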
refcount_add_many(foo,N) is not the same as
for (i=0; i < N; i++) { refcount_add(foo); }
Unfortunately, this is only actually true with debug kernels and
reference_tracking_enable=1.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #12589
Closes #12602
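
A sketch of why the two forms differ under reference tracking; the
holder and count are illustrative:

```c
static void
refcount_shapes(zfs_refcount_t *rc, const void *holder, uint64_t n)
{
	uint64_t i;

	/* One tracked entry with count n; must be removed the same way. */
	zfs_refcount_add_many(rc, n, holder);
	zfs_refcount_remove_many(rc, n, holder);

	/*
	 * n distinct tracked entries of count 1 each. With a debug
	 * kernel and reference_tracking_enable=1 these are recorded
	 * differently from the pair above, so the add and remove
	 * styles must match.
	 */
	for (i = 0; i < n; i++)
		zfs_refcount_add(rc, holder);
	for (i = 0; i < n; i++)
		zfs_refcount_remove(rc, holder);
}
```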
When you create a pool, zfs writes vd->vdev_enc_sysfs_path with the
enclosure sysfs path to the fault LEDs, like:
vdev_enc_sysfs_path = /sys/class/enclosure/0:0:1:0/SLOT8
However, this enclosure path doesn't get updated on successive imports
even if the enclosure path to the disk changes. This patch fixes the issue.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #11950
Closes #12095
Commit 0c03d21ac99ebdbe left in a redundant if condition while
removing some code. Just remove it.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: John Kennedy <[email protected]>
Signed-off-by: Attila Fülöp <[email protected]>
Closes #12598
Currently, dmu_read_uio_dnode can read 64K of a requested 1M in one
loop, get EFAULT back from zfs_uiomove() (because the iovec only holds
64k), and return EFAULT, which turns into EAGAIN on the way out. EAGAIN
gets interpreted as "I didn't read anything", the caller tries again
without consuming the 64k we already read, and we're stuck.
This apparently works on newer kernels because the caller that breaks
on older Linux kernels (by happily passing along a 1M read request
with a 64k iovec) only requests 64k at a time there.
With this, we now won't return EFAULT if we got a partial read.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #12370
Closes #12509
Closes #12516
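
A hypothetical sketch (not the actual diff) of the shape of the fix
inside the dmu_read_uio_dnode() copy loop; start_offset, tocpy, bufoff,
and abd are illustrative locals:

```c
err = zfs_uiomove((char *)abd_to_buf(abd) + bufoff, tocpy,
    UIO_READ, uio);
if (err == EFAULT && zfs_uio_offset(uio) != start_offset) {
	/*
	 * We faulted after copying some data: report the partial
	 * read as success so the caller consumes it and retries
	 * for the remainder, rather than spinning on EAGAIN.
	 */
	err = 0;
	break;
}
```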
In case an ARC buffer is allocated only on L2ARC, and there are
underlying errors in a pool with the cache device in a faulty state, a
panic can occur in arc_read_done()->arc_hdr_destroy()->
arc_hdr_l2hdr_destroy()->arc_hdr_clear_flags() when trying to free
the ARC buffer.
Fix this by discarding the buffer's identity in arc_hdr_destroy(), in
case the buffer is not empty, before calling arc_hdr_l2hdr_destroy().
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #12392
As of the Linux 5.9 kernel a fallthrough macro has been added which
should be used to annotate all intentional fallthrough paths. Once
all of the kernel code paths have been updated to use fallthrough,
the -Wimplicit-fallthrough option will become the default. To
avoid warnings in the OpenZFS code base when this happens, apply
the fallthrough macro.
Additional reading: https://lwn.net/Articles/794944/
Reviewed-by: Tony Nguyen <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #12441
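
In use, the macro reads like the sketch below. The suffix-parsing
switch is an illustrative example; the local definition is included
only to keep the sketch self-contained (the real macro comes from the
OpenZFS headers):

```c
#if defined(__GNUC__) && !defined(fallthrough)
#define	fallthrough	__attribute__((__fallthrough__))
#endif

static int
size_suffix_shift(char c)
{
	int shift = 0;

	switch (c) {
	case 'g':
		shift += 10;
		fallthrough;	/* intentional: gigabytes imply megabytes */
	case 'm':
		shift += 10;
		fallthrough;	/* intentional: megabytes imply kilobytes */
	case 'k':
		shift += 10;
		break;
	default:
		break;
	}
	return (shift);
}
```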
Userland figures out which encryption-root keys are required to load,
and issues ZFS_IOC_LOAD_KEY.
The tail section of spa_keystore_load_wkey() will call
zvol_create_minors() on the encryption-root object.
Any clones of the encrypted zvol will not be plumbed. This commit
adds additional logic to detect whether a zvol is encrypted and has
clones, and then adds those to the list of zvols on which to call
zvol_create_minors().
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #12471
Errors in zil_lwb_write_done() are not propagated to
zil_lwb_flush_vdevs_done() which can result in zil_commit_impl()
not returning an error to applications even when zfs was not able
to write data to the disk.
Remove the ZIO_FLAG_DONT_PROPAGATE flag from zio_rewrite() to
allow errors to propagate and consolidate the error handling for
flush and write errors to a single location (rather than having
error handling split between the "write done" and "flush done"
handlers).
Reviewed-by: George Wilson <[email protected]>
Reviewed-by: Prakash Surya <[email protected]>
Signed-off-by: Arun KV <[email protected]>
Closes #12391
Closes #12443
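
A hypothetical sketch of the mechanism: without ZIO_FLAG_DONT_PROPAGATE
on the lwb write zio, a child's io_error is folded into its parent when
it completes, so a single "flush done" handler can observe both kinds
of failure. The handler and helper names below are illustrative:

```c
static void
lwb_flush_vdevs_done_sketch(zio_t *zio)
{
	/*
	 * With the flag removed from the zio_rewrite() call, a failed
	 * lwb write now propagates into zio->io_error here, alongside
	 * flush errors, so one error path covers both.
	 */
	if (zio->io_error != 0)
		handle_zil_commit_failure(zio->io_error); /* hypothetical */
}
```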
The block pointer verification check in arc_read() should also
cover embedded block pointers. While highly unlikely, accessing
a damaged block pointer can result in panic. To further harden
the code extend the existing check to include embedded block
pointers and add a comment explaining the rationale for this
sanity check. Lastly, correct a flaw in zfs_blkptr_verify()
so the error count is checked even when checking an untrusted
config to verify the non-pool-specific portions of a block
pointer.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Tony Nguyen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #12535
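
A loose sketch of the hardened check at the top of arc_read(); the
verify flag and error path follow the zfs_blkptr_verify() interface
only approximately, and spa, bp, rc, and the done label belong to the
surrounding function:

```c
/*
 * Verify the block pointer before using anything in it. Embedded
 * BPs carry their payload inline, so a damaged embedded BP is just
 * as dangerous as a damaged normal one and must be checked too.
 */
if (!zfs_blkptr_verify(spa, bp, B_FALSE, BLK_VERIFY_LOG)) {
	rc = SET_ERROR(ECKSUM);
	goto done;
}
```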
Allow the kernel to send snapshot mount/unmount events to zed, and
to send symlink create/remove events for zvol plumbing
(/dev/run/dsk/zvol/$pool/$zvol -> /dev/diskX).
If zed misses the ENODEV, all errors after it are EINVAL, so treat
any error as a kernel module failure.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #12416
When zfs_send_corrupt_data is set, use the TRAVERSE_HARD flag,
so traverse_visitbp() will not fail with ECKSUM if a block pointer
cannot be read, but rather will continue and send the objects it can.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: John Kennedy <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Sponsored-By: Klara Inc.
Sponsored-By: WHC Online Solutions Inc.
Closes #12541
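
A sketch of where the flag lands; the surrounding flag set is
illustrative:

```c
int flags = TRAVERSE_PRE | TRAVERSE_PREFETCH_METADATA;

/*
 * With zfs_send_corrupt_data set, keep traversing past unreadable
 * block pointers instead of failing the whole send with ECKSUM.
 */
if (zfs_send_corrupt_data)
	flags |= TRAVERSE_HARD;
```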
Unfortunately, there was an overzealous assertion that was (in pretty
specific circumstances) false, causing failure. This assertion was
added in error, so we're removing it.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: George Wilson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #9897
Closes #12020
Closes #12246
We round up the psize to the nearest multiple of the asize or to the
lsize, whichever is smaller. Once that's done, we allocate a new
buffer of the appropriate size, zero the tail, and copy the data
into it. This adds a small performance cost to these kinds of writes,
but fixes the bookkeeping problems.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Co-authored-by: Matthew Ahrens <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes #12522
Closes #8462
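
The arithmetic, as a sketch. P2ROUNDUP and MIN are the standard
sys/sysmacros.h helpers; psize, lsize, ashift, and src are
illustrative, with asize treated as 1 << ashift:

```c
/* Round psize up to an asize multiple, capped at lsize. */
uint64_t rounded = MIN(P2ROUNDUP(psize, 1ULL << ashift), lsize);

/* Copy into a right-sized buffer and zero the tail. */
abd_t *padded = abd_alloc(rounded, B_FALSE);
abd_copy(padded, src, psize);
abd_zero_off(padded, psize, rounded - psize);
```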
Previously, zpool-iostat did not display any data regarding rebuild I/Os
in either the latency/size histograms (-w/-l/-r) or the queue data (-q).
This fix essentially utilizes the existing infrastructure for tracking
rebuild queue data and displays this data in the proper places within
zpool-iostat's output.
Signed-off-by: Trevor Bautista <[email protected]>
Signed-off-by: Trevor Bautista <[email protected]>
Co-authored-by: Trevor Bautista <[email protected]>
Reviewed-by: Richard Elling <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
benchmark_raidz() allocates a row to benchmark parity calculation and
reconstruction. In the latter case, the parity blocks are left
uninitialized, leading to reports from KMSAN.
Initialize parity blocks to 0xAA as we do for the data earlier in the
function. This does not affect the selected RAID-Z implementation on
any of several systems tested.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Mark Johnston <[email protected]>
Closes #12473
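
A sketch of the initialization using the classic raidz_map_t layout
(the field names are assumptions about this era of the code):

```c
int c;

/*
 * Parity columns come first in the map; give them the same 0xAA
 * pattern the data columns get, so KMSAN considers them initialized.
 */
for (c = 0; c < rm->rm_firstdatacol; c++)
	memset(abd_to_buf(rm->rm_col[c].rc_abd), 0xAA,
	    rm->rm_col[c].rc_size);
```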
When a header is allocated for full overwrite it is a waste of time
to allocate b_pabd/b_rabd for it, since arc_write() will free them
without ever being touched. If it is a read or a partial overwrite
then arc_read() and arc_hdr_decrypt() allocate them explicitly.
Reduced memory allocation in user threads also reduces ARC eviction
throttling there, proportionally increasing it in ZIO threads, which
is not good. To minimize or even avoid that, introduce an ARC
allocation reserve, allowing certain arc_get_data_abd() callers to
keep allocating a bit longer in situations where user threads will
already be throttled.
Reviewed-by: George Wilson <[email protected]>
Reviewed-by: Mark Maybee <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes #12398
It is very expensive and not informative to call multilist_is_empty()
for each arc_change_state() on debug builds to check for a condition
that is impossible. Instead, implement a special index function for
the arc_l2c_only->arcs_list multilists that panics on any attempt to
use it.
Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes #12421
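
A sketch of the trap-style index function; the exact name and message
are illustrative:

```c
static unsigned int
arc_state_l2c_multilist_index_func(multilist_t *ml, void *obj)
{
	/*
	 * arc_l2c_only headers must never be inserted into these
	 * multilists, so any use of the index function is a bug.
	 */
	panic("Header %p insert into arc_l2c_only %p", obj, ml);
}
```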
Instead of clearing stats inside arc_buf_alloc_impl() do it inside
arc_hdr_alloc() and arc_release(). It fixes statistics being wiped
every time a new dbuf is filled from the ARC.
Remove b_l1hdr.b_l2_hits. L2ARC hits are accounted at b_l2hdr.b_hits.
Since the hits are accounted under hash lock, replace atomics with
simple increments.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Wilson <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes #12422
vq_lock is already too congested for two more operations per I/O.
Instead of dropping and reacquiring it inside vdev_queue_aggregate()
delegate the zio_vdev_io_bypass() and zio_execute() calls for parent
I/Os to the callers, which drop the lock anyway to execute the new I/O.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes #12297
Use atomic_load_64() for zfs_refcount_count() to prevent torn reads
on 32-bit platforms. On 64-bit ones it should not change anything.
When built with ZFS_DEBUG but running without tracking enabled, use
atomics instead of mutexes, the same as for builds without ZFS_DEBUG.
Since rc_tracked can't change live, we can check it without a lock.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes #12420
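
The core of the change is small; a sketch of the untracked fast path:

```c
int64_t
zfs_refcount_count(zfs_refcount_t *rc)
{
	/*
	 * A plain 64-bit load can tear on 32-bit platforms; the
	 * atomic load costs nothing extra on 64-bit ones.
	 */
	return (atomic_load_64(&rc->rc_count));
}
```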
Before OpenZFS 2.0, trying to set the FreeBSD sysctl vfs.zfs.arc_max
to a disallowed value would return an error.
Since the switch, it instead only generates a WARN_IF_TUNING_IGNORED
warning. Keep the ability to set the sysctls specifically to 0, even
though that is less than the minimum, because some tests depend on
this. Also lost was the ability to set vfs.zfs.arc_max to a value less
than the default vfs.zfs.arc_min at boot time. Restore this as well.
Reviewed-by: Tony Nguyen <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Closes #12161
Run the arc_evict thread at a higher priority, nice=0, to give it
more CPU time, which can improve performance for workloads with heavy
ARC eviction activity.
On mixed read/write and sequential read workloads, I've seen between
10-40% better performance.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Closes #12397
It is wrong for arc_write_ready() to use zfs_abd_scatter_enabled to
decide whether to reallocate/copy the buffer, because the answer is
OS-specific and depends on the buffer size. Instead, use
abd_size_alloc_linear(), moved into a public header.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Closes #12425
Commit 5dbf6c5a66d did not address these format specifier warnings
since they were introduced by an unrelated commit which had not
been rebased on 5dbf6c5a66d when merged. Fix them.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #12435
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Tony Nguyen <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #12357
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Issue #12314
Closes #12419
In l2arc_add_vdev() first decide whether the device is eligible for
L2ARC rebuild or whole device trim and then add it to the list of cache
devices. Otherwise l2arc_feed_thread() might already start writing on
the device, invalidating previous content, since l2ad_hand = l2ad_start.
However, l2arc_rebuild_vdev() needs the device present in the cache
device list to figure out its l2arc_dev_t. Fix this by moving most of
l2arc_rebuild_vdev() into a new function, l2arc_rebuild_dev(), which
does not need to search the cache device list.
In contrast to l2arc_add_vdev(), we do not have to worry about
l2arc_feed_thread() invalidating previous content when onlining a
cache device. The device parameters (l2ad*) are not cleared when
offlining the device, and writing new buffers will not invalidate
all previous content. In the worst case, only buffers that have not
had their log block written to the device will be lost.
Retire persist_l2arc_00{4,5,8} tests since they cover code already
covered by the remaining ones. Test persist_l2arc_006 is renamed to
persist_l2arc_004 and persist_l2arc_007 is renamed to persist_l2arc_005.
Fix a typo in persist_l2arc_004, and remove an assertion that is not
always true from l2arc_arcstats_pos. Also update an assertion in
persist_l2arc_005 and explain why in a comment.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #12365
These were mostly used to annotate do {} while(0)s
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Issue #12201
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Issue #12201
This includes a simplification of mkbusy and format correctness in zhack
and ztest
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Issue #12201
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Issue #12201
It seems nothing ensures that this array is zeroed when a dnode is
freshly allocated, so in principle it retains the values from the
previous allocation. In practice it seems to be the case that the
fields should end up zeroed, but we can zero the array anyway for
consistency.
This was found using KMSAN.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Mark Johnston <[email protected]>
Closes #12383
When logging a TX_WRITE record in the case where file data has to be
copied from the DMU, we pad the log record size to a multiple of 8
bytes. In this case, any padding bytes should be zeroed, otherwise the
contents of uninitialized memory are written to the ZIL.
This was found using KMSAN.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Mark Johnston <[email protected]>
Closes #12383
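
A sketch of the fix's shape; the destination pointer and length names
are illustrative:

```c
static void
zil_copy_and_pad(char *dst, const char *src, size_t dlen)
{
	size_t padded = P2ROUNDUP(dlen, sizeof (uint64_t));

	memcpy(dst, src, dlen);
	/* Zero the pad bytes so stale memory never reaches the ZIL. */
	if (padded != dlen)
		memset(dst + dlen, 0, padded - dlen);
}
```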
When allocating a record, we round up the allocation size to a multiple
of 8. In this case, any padding bytes should be zeroed, otherwise the
contents of uninitialized memory are written to the ZIL.
This was found using KMSAN.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Mark Johnston <[email protected]>
Closes #12383
When logging TX_SETATTR, we could otherwise fail to initialize part of
the corresponding ZIL record depending on which fields are present in
the xvattr. Initialize the creation time and the AV scan timestamp to
zero so that uninitialized bytes are not written to the ZIL.
This was found using KMSAN.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Mark Johnston <[email protected]>
Closes #12383
spa_prop_find() may fail to find the specified property, in which case
it suppresses ENOENT from zap_lookup(). In this case, the return value
is left uninitialized, so spa_autoreplace was being initialized using an
uninitialized stack variable.
This was found using KMSAN. It appears to be a regression from commit
9eb7b46ed0, which removed the initialization of "autoreplace" from the
definition.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Mark Johnston <[email protected]>
Closes #12383
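
A sketch of the defensive pattern, following the interface named above:
since spa_prop_find() leaves the out-parameter untouched when the
property is absent, the caller must seed it:

```c
uint64_t autoreplace = 0;	/* seed: the lookup may not set it */

(void) spa_prop_find(spa, ZPOOL_PROP_AUTOREPLACE, &autoreplace);
spa->spa_autoreplace = (autoreplace != 0);
```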
After 1325434b, we can in certain circumstances end up calling
spa_update_dspace with vd->vdev_mg NULL, which ends poorly during
vdev removal.
So let's not do that further space adjustment when we can't.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #12380
Closes #12428
Remove mc_lock use from metaslab_class_throttle_*(). The math there
is based on refcounts and so atomic, so the only race possible there
is between zfs_refcount_count() and zfs_refcount_add(). But in most
cases metaslab_class_throttle_reserve() is called with the allocator
lock held, which covers the race. In cases where the lock is not
held, GANG_ALLOCATION() or METASLAB_MUST_RESERVE are set, and so we
do not use zfs_refcount_count(). And even if we assume some other,
non-existing scenario, the worst that may happen from this race is
that a few more I/Os get to allocation earlier, which is not a problem.
Move locks and data of different allocators into different cache
lines to avoid false sharing. Group spa_alloc_* arrays together
into single array of aligned struct spa_alloc spa_allocs. Align
struct metaslab_class_allocator.
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Don Brady <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes #12314
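
The grouping described above looks roughly like this, with the
alignment annotation following the kernel's cache-line attribute
convention:

```c
/*
 * One allocator's lock and queue share a cache line with nothing
 * else, so allocators no longer false-share with each other.
 */
typedef struct spa_alloc {
	kmutex_t	spaa_lock;
	avl_tree_t	spaa_tree;
} ____cacheline_aligned spa_alloc_t;
```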
* Add Module Parameters Regarding Log Size Limit
zfs_wrlog_data_max
The upper limit of TX_WRITE log data. Once it is reached,
write operations are blocked until log data is cleared out
after a txg sync. It only counts TX_WRITE logs with WR_COPIED
or WR_NEED_COPY records.
Reviewed-by: Prakash Surya <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: jxdking <[email protected]>
Closes #12284
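
A hypothetical sketch of the throttle point; apart from
zfs_wrlog_data_max, the accounting names are assumptions:

```c
/*
 * On transaction assignment: if accumulated TX_WRITE log data
 * (WR_COPIED/WR_NEED_COPY) exceeds the limit, push the caller onto
 * the wait path until a txg sync drains the accounting.
 */
if (aggsum_value(&dp->dp_wrlog_total) > zfs_wrlog_data_max)
	return (SET_ERROR(ERESTART));
```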
Remove unneeded global, practically constant, state pointer variables
(arc_anon, arc_mru, etc.), replacing them with macros of real state
variables addresses (&ARC_anon, &ARC_mru, etc.).
Change ARC_EVICT_ALL from -1ULL to UINT64_MAX, so it no longer
requires special handling in the inner loop of ARC reclamation.
Correspondingly, change the bytes argument of arc_evict_state()
from int64_t to uint64_t.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Closes #12348
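
The replacement described above, sketched for two of the states:

```c
arc_state_t ARC_anon;
arc_state_t ARC_mru;

/*
 * The old globals were pointers that always held these addresses;
 * macros over the real state variables remove a level of
 * indirection without changing any call sites.
 */
#define	arc_anon	(&ARC_anon)
#define	arc_mru		(&ARC_mru)
```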
Ensure all calls to bqueue_init() have a corresponding call to
bqueue_destroy().
Reviewed-by: Paul Dagnelie <[email protected]>
Co-authored-by: Brian Behlendorf <[email protected]>
Signed-off-by: Jorgen Lundman <[email protected]>
Closes #12118
* zio: avoid callback typecasting
* zil: avoid zil_itxg_clean() callback typecasting
* zpl: decouple zpl_readpage() into two separate callbacks
* nvpair: explicitly declare callbacks for xdr_array()
* linux/zfs_nvops: don't use external iput() as a callback
* zcp_synctask: don't use fnvlist_free() as a callback
* zvol: don't use ops->zv_free() as a callback for taskq_dispatch()
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Mark Maybee <[email protected]>
Signed-off-by: Alexander Lobakin <[email protected]>
Closes #12260
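
The iput() item is representative of the whole series; a sketch of
the pattern:

```c
/* A thunk whose type matches task_func_t exactly. */
static void
zfs_iput_async_task(void *arg)
{
	iput((struct inode *)arg);
}

static void
dispatch_iput(taskq_t *tq, struct inode *ip)
{
	/*
	 * Before: taskq_dispatch(tq, (task_func_t *)iput, ip, TQ_SLEEP);
	 * Casting iput() breaks CFI and -Wcast-function-type; the
	 * correctly typed wrapper avoids the cast entirely.
	 */
	(void) taskq_dispatch(tq, zfs_iput_async_task, ip, TQ_SLEEP);
}
```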
We don't use or need the pool name or value source in the zvol tasks.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #12361
Most of the CPU time in dsl_dir_diduse_space() and
dsl_dir_transfer_space() is dd_lock overhead and time spent in
dmu_buf_will_dirty(). Calling them one after another wastes time
and creates even more contention.
Doing that twice for each rewritten block within dbuf_write_done()
via dsl_dataset_block_kill() and dsl_dataset_block_born() created one
of the biggest CPU overheads in case of small blocks rewrite.
dsl_dir_diduse_transfer_space() combines functionality of these two
functions for cases where it is needed, but without double overhead,
practically for the cost of dsl_dir_diduse_space() or even cheaper.
While there, optimize dsl_dir_phys() calls in dsl_dir_diduse_space()
and dsl_dir_transfer_space(). It seems Clang detects some aliasing
there and repeats the dd->dd_dbuf->db_data dereference multiple
times, increasing dd_lock scope and contention.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Author: Ryan Moeller <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes #12300
* Tinker with slop space accounting with dedup
Do not include the deduplicated space usage in the slop space
reservation, it leads to surprising outcomes.
* Update spa_dedup_dspace sometimes
Sometimes, we get into spa_get_slop_space() with
spa_dedup_dspace=~0ULL, AKA "unset", while spa_dspace is correctly set.
So call the code to update it before we use it if we hit that case.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Mark Maybee <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #12271