| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Fix a misreport in 'zdb -d' where it falsely marked
BRT objects as leaked.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Yuxin Wang <[email protected]>
Closes #15882
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds the zed_notify_ntfy() function and hooks it
into zed_notify(). This will allow ZED to send notifications
to ntfy.sh or a self-hosted Ntfy service, which can be received
on a desktop or mobile device. It is configured with ZED_NTFY_TOPIC,
ZED_NTFY_URL, and ZED_NTFY_ACCESS_TOKEN variables in zed.rc.
Reviewed-by: @classabbyamp
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Dex Wood <[email protected]>
Closes #15584
|
|
|
|
|
|
|
|
|
| |
Because "filesystem" and "volume" are just too long!
Sponsored-by: https://despairlabs.com/sponsor/
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes #15864
(cherry picked from commit a5a725440bcb2f4c4554be3e489f911e3dd60412)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When very large pools are present, it can be laborious to find
reasons for why a pool is degraded and/or where an unhealthy vdev
is. This option filters out vdevs that are ONLINE and with no errors
to make it easier to see where the issues are. Root and parents of
unhealthy vdevs will always be printed.
Testing:
ZFS errors and drive failures for multiple vdevs were simulated with
zinject.
Sample vdev listings with '-e' option
- All vdevs healthy
NAME STATE READ WRITE CKSUM
iron5 ONLINE 0 0 0
- ZFS errors
NAME STATE READ WRITE CKSUM
iron5 ONLINE 0 0 0
raidz2-5 ONLINE 1 0 0
L23 ONLINE 1 0 0
L24 ONLINE 1 0 0
L37 ONLINE 1 0 0
- Vdev faulted
NAME STATE READ WRITE CKSUM
iron5 DEGRADED 0 0 0
raidz2-6 DEGRADED 0 0 0
L67 FAULTED 0 0 0 too many errors
- Vdev faults and data errors
NAME STATE READ WRITE CKSUM
iron5 DEGRADED 0 0 0
raidz2-1 DEGRADED 0 0 0
L2 FAULTED 0 0 0 too many errors
raidz2-5 ONLINE 1 0 0
L23 ONLINE 1 0 0
L24 ONLINE 1 0 0
L37 ONLINE 1 0 0
raidz2-6 DEGRADED 0 0 0
L67 FAULTED 0 0 0 too many errors
- Vdev missing
NAME STATE READ WRITE CKSUM
iron5 DEGRADED 0 0 0
raidz2-6 DEGRADED 0 0 0
L67 UNAVAIL 3 1 0
- Slow devices when -s provided with -e
NAME STATE READ WRITE CKSUM SLOW
iron5 DEGRADED 0 0 0 -
raidz2-5 DEGRADED 0 0 0 -
L10 FAULTED 0 0 0 0 external device fault
L51 ONLINE 0 0 0 14
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Cameron Harr <[email protected]>
Closes #15769
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace ENCLO_US_RE with ENCLO_SU_RE in the name of the variable.
Note this changes the user-visible string in zed.rc, thus might
break current users with the wrong string, but it's ~2 months
since zfs-2.2.0 tag is out, thus should not be widespread yet.
Mechanical change:
$ grep -rl ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT
cmd/zed/zed.d/zed.rc
cmd/zed/zed.d/statechange-slot_off.sh
$ sed -i 's/ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT/<linebreak>
ZED_POWER_OFF_ENCLOSURE_SLOT_ON_FAULT/g' \
cmd/zed/zed.d/zed.rc \
cmd/zed/zed.d/statechange-slot_off.sh
$ grep -rl ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT
$
Fixes 11fbcacf37d1a66c7a40bb8920c70ce9a87270ea
("zed: Add zedlet to power off slot when drive is faulted")
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Mauricio Faria de Oliveira <[email protected]>
Closes #15651
|
|
|
|
|
|
|
|
|
|
| |
- Mark some parameters to zpool_power*() as unused.
- Add a stub zpool_disk_wait().
Fixes: a9520e6e5 ("zpool: Add slot power control, print power status")
Signed-off-by: Mark Johnston <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add `zpool` flags to control the slot power to drives. This assumes
your SAS or NVMe enclosure supports slot power control via sysfs.
The new `--power` flag is added to `zpool offline|online|clear`:
zpool offline --power <pool> <device> Turn off device slot power
zpool online --power <pool> <device> Turn on device slot power
zpool clear --power <pool> [device] Turn on device slot power
If the ZPOOL_AUTO_POWER_ON_SLOT env var is set, then the '--power'
option is automatically implied for `zpool online` and `zpool clear`
and does not need to be passed.
zpool status also gets a --power option to print the slot power status.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Mart Frauenlob <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #15662
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There have been rare cases where the VDEV_ENC_SYSFS_PATH value that zed
gets passed is stale. To mitigate this, dynamically check the sysfs
path at the time of zed event processing, and use the dynamic value if
possible. Note that there will be other times when we can not
dynamically detect the sysfs path (like if a disk disappears) and have
to rely on the old value for things like turning on the fault LED. That
is to say, we can't just blindly use the dynamic path in every case.
Also:
- Add enclosure sysfs entry when running 'zpool add'
- Fix 'slot' and 'enc' zpool.d scripts for nvme
Reviewed-by: Don Brady <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #15462
|
|
|
|
|
|
|
|
|
| |
list, status and iostat all display the -T timestamp before the header,
but wait showed it after. Make it be like the others.
Reported-by: Kyle Evans <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes #15825
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running zfs share -a resetting the exports.d/zfs.exports makes
sense the get a clean state.
Truncating was also called with zfs mount which would not populate the
file again.
Add test to verify shares persist after mount -a.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Stefan Lendl <[email protected]>
Closes #15607
Closes #15660
|
|
|
|
|
|
|
|
|
|
| |
zdb -R with :d tries to use gzip decompression 9 times per size.
There's absolutely no reason for that, they're all the same
decompressor.
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #15726
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Profiling zdb -vvvvv on datasets with a lot of zstd blocks, we find
ourselves spending quite a lot of time on malloc/free, because we
allocate a 16M abd each call, and never free it, so we're leaking
16M per call as well.
This seems sub-optimal. So let's just keep the buffer around and
reuse it.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Signed-off-by: Rich Ercolani <[email protected]>
Closes #15721
|
|
|
|
|
|
|
|
|
|
| |
Block pointers are not encrypted in TX_WRITE and TX_CLONE_RANGE
records, so we can dump them, that may be useful for debugging.
Related to #15543.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #15629
|
|
|
|
|
|
|
|
|
| |
Bug introduced in 213d6829673.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Warner Losh <[email protected]>
Signed-off-by: Martin Matuska <[email protected]>
Closes #15606
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zdb with '-e' or exported zpool doesn't work along with
'-O' and '-r' options as we process them before '-e' has
been processed.
Below errors are seen:
~> zdb -e pool-mds65/mdt65 -O oi.9/0x200000009:0x0:0x0
failed to hold dataset 'pool-mds65/mdt65': No such file or directory
~> zdb -e pool-oss0/ost0 -r file1 /tmp/filecopy1 -p.
failed to hold dataset 'pool-oss0/ost0': No such file or directory
zdb: internal error: No such file or directory
We need to make sure to process '-O|-r' options after the
'-e' option has been processed, which imports the pool to
the namespace if it's not in the cachefile.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Akash B <[email protected]>
Closes #15532
|
|
|
|
|
|
|
|
|
|
| |
Same idea as the dedup stats, but for block cloning.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Kay Pedersen <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes #15541
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zpool_create_features_007_pos only tested for compat-2020 feature
set. It would be useful to test for all known features sets. If
any additional feature is found enabled that is not present in
compatibility list or feature set, it should be caught and
reported earlier.
This commit also removes encryption from openzfsonosx-1.8.1
compatibility list. Encryption enables bookmark_v2, since it is
a dependency of encryption, but not listed in openzfsonoxx-1.8.1
compatibility list.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Umer Saleem <[email protected]>
Closes #15505
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PR#15459 add all read-only compatible zpool features to grub2
compatibility list. 'obsolete_counts' is a read-only features that
depends on 'device_removal' feature which is not read-only and
is marked as ZFEATURE_FLAG_MOS. Creating a pool with grub2
compatibility enables 'device_removal' feature as well, which is
not desired.
This commit removes the 'obsolete_counts' feature from
grub2 compatibility list, as GRUB only supports read-only
compatible features.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Umer Saleem <[email protected]>
Closes #15499
|
|
|
|
|
|
|
|
|
|
|
| |
GRUB opens the boot pool in read-only mode. All read-only
compatible features for zpool can be enabled and added to
grub2 compatibility, as GRUB does not open the boot-pool
for write.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Umer Saleem <[email protected]>
Closes #15459
|
|
|
|
|
|
|
|
| |
The first occurrence should be "ARC prefetch data accesses:"
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: ofthesun9 <[email protected]>
Closes #15427
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The change is simple -- restore the original code so that the VDEV
path is updated when using by-id paths. The more challenging part
was to devise a second ZTS test, that would test auto-replace for
'by-id' and help prevent a future regression.
With that new test, we can now do an A|B test with , and without,
the fix to confirm that auto-replace for by-id paths works. The
existing auto-replace test, functional/fault/auto_replace_001_pos,
will confirm that we didn't break auto-replace for 'by-vdev' paths.
In the original functional/fault/auto_replace_001_pos test, the disk
wipe (using dd) was not effective in removing the partitioning since
the kernel was never informed of the wipe.
Added a call to wipefs(8) so that the kernel is informed and ZED will
re-partition the device.
Added a validation step that the re-partitioning occurred by
confirming that the GPT partition UUID changes.
Sponsored-By: OpenDrives Inc.
Sponsored-By: Klara Inc.
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #15363
|
|
|
|
|
|
|
|
|
|
| |
Have libzfs call a special `zfs_prepare_disk` script before a disk is
included into the pool. The user can edit this script to add things
like a disk firmware update or a disk health check. Use of the script
is totally optional. See the zfs_prepare_disk manpage for full details.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #15243
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if a cachefile is passed to zpool import, the cached config
is mostly offered as-is to ZFS_IOC_POOL_TRYIMPORT->spa_tryimport(), and
the results are taken as the canonical pool config and handed back to
ZFS_IOC_POOL_IMPORT.
In the course of its operation, spa_load() will inspect the pool and
build a new config from what it finds on disk. However, it then
regenerates a new config ready to import, and so rightly sets the hostid
and hostname for the local host in the config it returns.
Because of this, the "require force" checks always decide the pool is
exported and last touched by the local host, even if this is not true,
which is possible in a HA environment when MMP is not enabled. The pool
may be imported on another head, but the import checks still pass here,
so the pool ends up imported on both.
(This doesn't happen when a cachefile isn't used, because the pool
config is discovered in userspace in zpool_find_import(), and that does
find the on-disk hostid and hostname correctly).
Since the systemd zfs-import-cache.service unit uses cachefile imports,
this can lead to a system returning after a crash with a "valid"
cachefile on disk and automatically, quietly, importing a pool that has
already been taken up by a secondary head.
This commit causes the on-disk hostid and hostname to be included in the
ZPOOL_CONFIG_LOAD_INFO item in the returned config, and then changes the
"force" checks for zpool import to use them if present.
This method should give no change in behaviour for old userspace on new
kernels (they won't know to look for the new config items) and for new
userspace on old kernels (the won't find the new config items).
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Closes #15290
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds '-u' flag for zfs set operation. With this flag,
mountpoint, sharenfs and sharesmb properties can be updated
without actually mounting or sharing the dataset.
Previously, if dataset was unmounted, and mountpoint property was
updated, dataset was not mounted after the update. This behavior
is changed in #15240. We mount the dataset whenever mountpoint
property is updated, regardless if it's mounted or not.
To provide the user with option to keep the dataset unmounted and
still update the mountpoint without mounting the dataset, '-u'
flag can be used.
If any of mountpoint, sharenfs or sharesmb properties are updated
with '-u' flag, the property is set to desired value but the
operation to (re/un)mount and/or (re/un)share the dataset is not
performed and dataset remains as it was before.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Umer Saleem <[email protected]>
Closes #15322
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For sharesmb and sharenfs properties, the status of setting the
property is tied with whether we succeed to share the dataset or
not. In case sharing the dataset is not successful, this is
treated as overall failure of setting the property. In this case,
if we check the property after the failure, it is set to on.
This commit updates this behavior and the status of setting the
share properties is not returned as failure, when we fail to
share the dataset.
For sharenfs property, if access list is provided, the syntax
errors in access list/host adresses are not validated until after
setting the property during postfix phase while trying to
share the dataset. This is not correct, since the property has
already been set when we reach there.
Syntax errors in access list/host addresses are validated while
validating the property list, before setting the property and
failure is returned to user in this case when there are errors
in access list.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Ameer Hamza <[email protected]>
Signed-off-by: Umer Saleem <[email protected]>
Closes #15240
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are some inconsistencies in the handling of mountpoint
property. This commit updates the behavior and makes it
consistent.
If mountpoint property is set when dataset is unmounted, this
would update the mountpoint property. The mountpoint could be
valid or invalid in this case. Setting the mountpoint property
would result in success in this case. Dataset would still be
unmounted here.
On the other hand, if dataset is mounted and mountpoint
property is updated to something invalid where mount cannot be
successful, for example, setting the mountpoint inside a readonly
directory. This would unmount the dataset, set the mountpoint
property to requested value and tries to mount the dataset. The
mount operation returns error and this error is treated as
overall failure of setting the property while the property is
actually set.
To make the behavior consistent in case dataset is mounted or
unmounted, we should try to mount the dataset whenever mountpoint
property is updated. This would result in mounting the datasets
if canmount property is set to on, regardless if the dataset was
previously unmounted.
The failure in mount operation while setting the mountpoint
property should not be treated as failure, since the property is
actually set now to user requested value.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Ameer Hamza <[email protected]>
Signed-off-by: Umer Saleem <[email protected]>
Closes #15240
|
|
|
|
|
|
|
|
|
|
| |
Commit 8af1104f does not actually store the ashift of cache devices in
their label. However, in order to facilitate reporting the ashift
through zdb, we enable this in the present commit. We also document
how the retrieval of the ashift is done.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Amanakis <[email protected]>
Closes #15331
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The problem that was occurring is basically that a device was removed
by ztest and replaced with another device. It was then reguided. The
import then failed because there were two possible imports with the
same name; one with the new guid, and one with the old. This can
happen because the label writes from the device removal/replacement
can be subject to ztest's error injection.
The other ways to fix this would be to change the error injection to
not trigger on removals (which may not be technically feasible), or
to change the import code to not report configurations that are so
short on devices (which would potentially have unpleasant end-user
effects when trying to recover from data losses/device configuration
issues).
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes #15298
|
|
|
|
|
|
|
|
|
|
| |
While working on similar patches for zfs and zvol in #15153 I've
forgot about ztest. Update it also so that we test the same code
paths as use in production.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #15301
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'program help subcommand' is a reasonably common pattern for
multifunction command-line programs. This commit adds support for that
style to the zpool and zfs commands.
When run as 'zpool help [<topic>]' or 'zfs help [<topic>]', executes the
'man' program on the PATH with the most likely manpage name for the
requested topic: "zpool-<topic>" or "zfs-<topic>" for subcommands, or
"zpool<topic>" or "zfs<topic>" for the "concepts" and "props" topics.
If no topic is supplied, uses the top "zpool" or "zfs" pages.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Kay Pedersen <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes #15288
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is an occasional ztest failure that looks like ztest: attach
(/var/tmp/zloop-run/ztest.13a 570425344, draid1-1-0 532152320, 1)
returned 22, expected 95. This is because the value that we return
is EINVAL, but expected_error is set incorrectly.
Change the expected_error value to match both the comment and the
actual error value.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes #15295
|
|
|
|
|
|
|
|
| |
Allow zed to autoreplace vdevs marked as REMOVED. Also update
statechange-led zedlet to toggle fault LEDs for REMOVED vdevs.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #15281
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For zpool import and zpool split, zpool_enable_datasets is called
to mount and share all datasets in a pool. If there is an error
while mounting or sharing any dataset in the pool, the status of
import or split is reported as failure. However, the changes do
show up in zpool list.
This commit updates the error reporting in zpool import and zpool
split path. More descriptive messages are shown to user in case
there is an error during mount or share. Errors in mount or share
do not effect the overall status of zpool import and zpool split.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Umer Saleem <[email protected]>
Closes #15216
|
|
|
|
|
|
|
| |
Reviewed-by: Don Brady <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Serapheim Dimitropoulos <[email protected]>
Closes #15220
|
|
|
|
|
|
|
|
|
|
|
| |
The statechange-slot_off.sh zedlet which was added in #15200
needed to be installed so it's included by the packages.
Additional testing has also shown that multiple retries are
often needed for the script to operate reliably.
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #15210
|
|
|
|
|
|
|
|
|
|
|
| |
If ZED_POWER_OFF_ENCLOUSRE_SLOT_ON_FAULT is enabled in zed.rc, then
power off the drive's slot in the enclosure if it becomes FAULTED.
This can help silence misbehaving drives. This assumes your drive
enclosure fully supports slot power control via sysfs.
Reviewed-by: @AllKind
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #15200
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous patch #14841 appeared to have significant flaw, causing
deadlocks if zl_get_data callback got blocked waiting for TXG sync. I
already handled some of such cases in the original patch, but issue
#14982 shown cases that were impossible to solve in that design.
This patch fixes the problem by postponing log blocks allocation till
the very end, just before the zios issue, leaving nothing blocking after
that point to cause deadlocks. Before that point though any sleeps are
now allowed, not causing sync thread blockage. This require slightly
more complicated lwb state machine to allocate blocks and issue zios
in proper order. But with removal of special early issue workarounds
the new code is much cleaner now, and should even be more efficient.
Since this patch uses null zios between write, I've found that null
zios do not wait for logical children ready status in zio_ready(),
that makes parent write to proceed prematurely, producing incorrect
log blocks. Added ZIO_CHILD_LOGICAL_BIT to zio_wait_for_children()
fixes it.
Reviewed-by: Rob Norris <[email protected]>
Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: George Wilson <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #15122
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This gives `zdb -b` support for clone blocks.
Previously, it didn't know what clones were, so would count their space
allocation multiple times and then report leaked space (or, in debug,
would assert trying to claim blocks a second time).
This commit fixes those bugs, and reports the number of clones and the
space "used" (saved) by them.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Kay Pedersen <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Sponsored-By: OpenDrives Inc.
Sponsored-By: Klara Inc.
Closes #15123
|
|
|
|
|
|
|
|
|
|
|
| |
For large JBODs the log message "zfs_iter_vdev: no match" can
account for the bulk of the log messages (over 70%). Since this
message is purely informational and not that useful we remove it.
Reviewed-by: Olaf Faaland <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #15086
Closes #15094
|
|
|
|
|
|
|
|
|
|
| |
We would see zed assert on one of our systems if we powered off a
slot. Further examination showed zfs_retire_recv() was reporting
a GUID of 0, which in turn would return a NULL nvlist. Add
in a check for a zero GUID.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #15084
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Scan process may skip blocks based on their birth time, DVA, etc.
Traditionally those blocks were accounted as issued, that caused
reporting of hugely over-inflated numbers, having nothing to do
with actual disk I/O. This change utilizes never used field in
struct dsl_scan_phys to account such skipped bytes, allowing to
report how much data were actually scrubbed/resilvered and what
is the actual I/O speed. While formally it is an on-disk format
change, it should be compatible both ways, so should not need a
feature flag.
This should partially address the same issue as c85ac731a0e, but
from a different perspective, complementing it.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Akash B <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #15007
|
|
|
|
|
|
|
|
|
|
| |
This patch changes the passing of "size" to snprintf
from hard-coded (openended) to sizeof(errbuf). This
is bringing to standard with rest of the code where-
ever 'errbuf' is used.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arshad Hussain <[email protected]>
Closes #15003
|
|
|
|
|
|
|
| |
Reviewed-by: Tino Reichardt <[email protected]>
Reviewed-by: Rob Norris <[email protected]>
Signed-off-by: Mateusz Piotrowski <[email protected]>
Sponsored-by: Klara Inc.
Closes #15014
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was a vdev level read cache, designed to aggregate many small
reads by speculatively issuing bigger reads instead and caching
the result. But since it has almost no idea about what is going
on with exception of ZIO_FLAG_DONT_CACHE flag set by higher layers,
it was found to make more harm than good, for which reason it was
disabled for the past 12 years. These days we have much better
instruments to enlarge the I/Os, such as speculative and prescient
prefetches, I/O scheduler, I/O aggregation etc.
Besides just the dead code removal this removes one extra mutex
lock/unlock per write inside vdev_cache_write(), not otherwise
disabled and trying to do some work.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14953
|
|
|
|
|
|
|
|
|
|
|
| |
- Do not report L2ARC as FAULTED in presence of in-flight writes.
- Report read and write I/Os, bytes and errors.
- Remove few numbers not important to average user.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #12304
Closes #14946
|
|
|
|
|
|
|
|
|
|
|
| |
... instead of list_head() + list_remove(). On FreeBSD the list
functions are not inlined, so in addition to more compact code
this also saves another function call.
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14955
|
|
|
|
|
|
|
|
|
|
|
| |
This is more-or-less like `zfs send`, but specifying the snapshot by its
objset id for situations where it can't be referenced any other way.
Sponsored-By: Klara, Inc.
Reviewed-by: Tino Reichardt <[email protected]>
Reviewed-by: WHR <[email protected]>
Signed-off-by: Rob Norris <[email protected]>
Closes #14642
|
|
|
|
|
|
| |
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Mike Swanson <[email protected]>
Closes #14902
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GRUB2 is compatible with all "read-only compatible" features,
so it is safe to add new features of this type to the grub2
compatibility list. We generally want to include all compatible
features, to minimize the differences between grub2-compatible
pools and no-compatibility pools.
Adding new properties `livelist` and `zpool_checkpoint` accordingly.
Also adding them to the man page which references this file as an
example, for consistency.
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Colm Buckley <[email protected]>
Closes #14893
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements a binary search algorithm for B-Trees that reduces
branching to the absolute minimum necessary for a binary search
algorithm. It also enables the compiler to inline the comparator to
ensure that the only slowdown when doing binary search is from waiting
for memory accesses. Additionally, it instructs the compiler to unroll
the loop, which gives an additional 40% improve with Clang and 8%
improvement with GCC.
Consumers must opt into using the faster algorithm. At present, only
B-Trees used inside kernel code have been modified to use the faster
algorithm.
Micro-benchmarks suggest that this can improve binary search performance
by up to 3.5 times when compiling with Clang 16 and up to 1.9 times when
compiling with GCC 12.2.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #14866
|