| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
Adds a devid for nvme devices. This is very similar to how the
other 'bus' (scsi|sata|usb) devids are generated. The devid
resides in a name/value pair in the leaf vdevs in a zpool config.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Richard Elling <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #7356
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When given an empty string as a rootds value, bootcfg -C fails with
the error message 'could not set nextboot: '' is an invalid name'.
This should be allowed because it represents clearing the nextboot
configuration.
Authored by: Paul Dagnelie <[email protected]>
Reviewed by: Chris Williamson <[email protected]>
Reviewed by: Sebastien Roy <[email protected]>
Reviewed by: Igor Kozhukhov <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/9193
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/504645d227
Closes #7230
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This treats /dev/nvme.. devices the same way as /dev/sd... devices. The
motivation behind this is that whole disk detection did not work on nvme
SSDs without that, because it DKC_UNKNOWN was returned for such devices.
Perhaps there should be a separate DKC_ type for this, but I don't know
enough about the code to know the implications of that.
Reviewed-by: Don Brady <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: timor <[email protected]>
Closes #7304
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The changes piggyback JSON output support on top of channel programs
(#6558). This way the JSON output support is targeted to scripting
use cases and is easily maintainable since it really only touches
one function (zfs_do_channel_program()).
This patch ports Joyent's JSON nvlist library from illumos to enable
easy JSON printing of channel program output nvlist. To keep the
delta small I also took advantage of the fact that printing in
zfs_do_channel_program() was almost always done before exiting
the program.
Reviewed by: Matt Ahrens <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Richard Elling <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alek Pinchuk <[email protected]>
Closes #7281
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the pool is suspended, record whether it was due to an I/O error or
due to MMP writes failing to succeed within the required time.
Change spa_suspended from uint8_t to zio_suspend_reason_t to store the
reason.
When userspace queries pool status via spa_tryimport(), report the
reason the pool was suspended in a new key,
ZPOOL_CONFIG_SUSPENDED_REASON.
In libzfs, when interpreting the returned config nvlist, report
suspension due to MMP with a new pool status enum value,
ZPOOL_STATUS_IO_FAILURE_MMP.
In status_callback(), which generates and emits the message when 'zpool
status' is executed, add a case to print an appropriate message for the
new pool status enum value.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Olaf Faaland <[email protected]>
Closes #7296
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Historically a dynamic misc minor number was registered for the
/dev/zfs device in order to prevent minor number collisions. This
was fine but it prevented us from being able to use the kernel
module auto-loaded which requires a known reserved value.
Resolve this issue by adding a configure test to find an available
misc minor number which can then be used in MODULE_ALIAS_MISCDEV at
build time. By adding this alias the zfs kmod is added to the list
of known static-nodes and the systemd-tmpfiles-setup-dev service
will create a /dev/zfs character device at boot time.
This in turn allows us to update the 90-zfs.rules file to make it
aware this is a static node. The upshot of this is that whenever
a process (zpool, zfs, zed) opens the /dev/zfs the kmods will be
automatic loaded. This even works for unprivileged users so there
is no longer a need to manually load the modules at boot time.
As an additional bonus the zed now no longer needs to start after
the zfs-import.service since it will trigger the module load.
In the unlikely event the minor number we selected conflicts with
another out of tree unregistered minor number the code falls back
to dynamically allocating it. In this case the modules again
must be manually loaded.
Note that due to the change in the method of registering the minor
number the zimport.sh test case may incorrectly fail when the
static node for the installed packages is created instead of the
dynamic one. This issue will only transiently impact zimport.sh
for this single commit when we transition and are mixing and
matching methods.
Reviewed-by: Fabian Grünbichler <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
TEST_ZIMPORT_SKIP="yes"
Closes #7287
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
get_format_prompt_string() and zpool_state_to_name() return
a string literal which is read-only, thus they should return
`const char*`.
zpool_get_prop_string() returns a non-const string after
successful nv-lookup, and returns a string literal otherwise.
Since this function is designed to be used for read-only purpose,
the return type should also be `const char*`.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tomohiro Kusumi <[email protected]>
Closes #7285
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1) The Coverity Scan reports some issues for the project
quota patch, including:
1.1) zfs_prop_get_userquota() directly uses the const quota
type value as the condition check by wrong.
1.2) dmu_objset_userquota_get_ids() may cause dnode::dn_newgid
to be overwritten by dnode::dn->dn_oldprojid.
2) This patch fixes related issues. It also enhances the logic
for zfs_project_item_alloc() to avoid buffer overflow.
3) Skip project quota ability check if does not change project
quota related things (id or flag). Otherwise, it will cause
chattr (for other non project quota flags) operation failed
if project quota disabled.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Fan Yong <[email protected]>
Closes #7251
Closes #7265
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change implements 'zfs send -b' which can be used to send only
received property values whether or not they are overridden by local
settings.
This can be very useful during "restore" operations from a backup pool
because it allows to send only the property values originally sent
from the backup source, even though they were later modified on the
destination either by a 'zfs set' operation, explicit 'zfs inherit' or
overridden during the receive process via 'zfs receive -o|-x'.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7156
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current design of ZFS encryption only allows a dataset to
have one DSL Crypto Key at a time. As a result, it is important
that the zfs receive code ensures that only one key can be in use
at a time for a given DSL Directory. zfs receive -F complicates
this, since the new dataset is received as a clone of the existing
one so that an atomic switch can be done at the end. To prevent
confusion about which dataset is actually encrypted a check was
added to ensure that encrypted datasets cannot use zfs recv -F to
completely replace existing datasets. Unfortunately, the check did
not take into account unencrypted datasets being overriden by
encrypted ones as a case.
Along the same lines, the code also failed to ensure that raw
recieves could not be done on top of existing unencrypted
datasets, which causes amny problems since the new stream cannot
be decrypted.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7199
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because resuming from a token requires "guid" -> "snapshot" mapping
we have to walk the whole dataset hierarchy to find the right snapshot
to send; when both source and destination exists, for an incremental
resumable stream, libzfs gets confused and picks up the wrong snapshot
to send from: this results in attempting to send
"destination@snap1 -> source@snap2"
instead of
"source@snap1 -> source@snap2"
which fails with a "Invalid cross-device link" error (EXDEV).
Fix this by adjusting the logic behind dataset traversal in
zfs_iter_children() to pick the right snapshot to send from.
Additionally update dry-run 'zfs send -t' to print its output to
stderr: this is consistent with other dry-run commands.
Patch Notes:
Reconciled differences between OpenZFS and
aee1dd4d983c64db3c3155290d48f05243e85709.
Authored by: loli10K <[email protected]>
Reviewed by: Paul Dagnelie <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Brian Behlendorf <[email protected]>
Approved by: Hans Rosenfeld <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8940
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/9f7867c206
Closes #7171
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Project quota is a new ZFS system space/object usage accounting
and enforcement mechanism. Similar as user/group quota, project
quota is another dimension of system quota. It bases on the new
object attribute - project ID.
Project ID is a numerical value to indicate to which project an
object belongs. An object only can belong to one project though
you (the object owner or privileged user) can change the object
project ID via 'chattr -p' or 'zfs project [-s] -p' explicitly.
The object also can inherit the project ID from its parent when
created if the parent has the project inherit flag (that can be
set via 'chattr +P' or 'zfs project -s [-p]').
By accounting the spaces/objects belong to the same project, we
can know how many spaces/objects used by the project. And if we
set the upper limit then we can control the spaces/objects that
are consumed by such project. It is useful when multiple groups
and users cooperate for the same project, or a user/group needs
to participate in multiple projects.
Support the following commands and functionalities:
zfs set projectquota@project
zfs set projectobjquota@project
zfs get projectquota@project
zfs get projectobjquota@project
zfs get projectused@project
zfs get projectobjused@project
zfs projectspace
zfs allow projectquota
zfs allow projectobjquota
zfs allow projectused
zfs allow projectobjused
zfs unallow projectquota
zfs unallow projectobjquota
zfs unallow projectused
zfs unallow projectobjused
chattr +/-P
chattr -p project_id
lsattr -p
This patch also supports tree quota based on the project quota via
"zfs project" commands set as following:
zfs project [-d|-r] <file|directory ...>
zfs project -C [-k] [-r] <file|directory ...>
zfs project -c [-0] [-d|-r] [-p id] <file|directory ...>
zfs project [-p id] [-r] [-s] <file|directory ...>
For "df [-i] $DIR" command, if we set INHERIT (project ID) flag on
the $DIR, then the proejct [obj]quota and [obj]used values for the
$DIR's project ID will be shown as the total/free (avail) resource.
Keep the same behavior as EXT4/XFS does.
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by Ned Bass <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Fan Yong <[email protected]>
TEST_ZIMPORT_POOLS="zol-0.6.1 zol-0.6.2 master"
Change-Id: Ib4f0544602e03fb61fd46a849d7ba51a6005693c
Closes #6290
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Receiving an incremental stream after an interrupted "zfs receive -s"
fails with the message "dataset is busy": this is because we still have
the hidden clone ../%recv from the resumable receive.
Improve the error message suggesting the existence of a partially
complete resumable stream from "zfs receive -s" which can be either
aborted ("zfs receive -A") or resumed ("zfs send -t").
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7129
Closes #7154
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
8520 lzc_rollback_to should support rolling back to origin
7198 libzfs should gracefully handle EINVAL from lzc_rollback
lzc_rollback_to() should support rolling back to a clone's origin.
The current checks in zfs_ioc_rollback() would not allow that
because the origin snapshot belongs to a different filesystem.
The overly restrictive check was in introduced in 7600, but it
was not a regression as none of the existing tools provided a
way to rollback to the origin.
Authored by: Andriy Gapon <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8520
OpenZFS-issue: https://www.illumos.org/issues/7198
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/78a5a1a25a
Closes #7150
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: Chris Williamson <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Ported-by: Don Brady <[email protected]>
We want to be able to run channel programs outside of synching
context. This would greatly improve performance for channel programs
that just gather information, as they won't have to wait for synching
context anymore.
=== What is implemented?
This feature introduces the following:
- A new command line flag in "zfs program" to specify our intention
to run in open context. (The -n option)
- A new flag/option within the channel program ioctl which selects
the context.
- Appropriate error handling whenever we try a channel program in
open-context that contains zfs.sync* expressions.
- Documentation for the new feature in the manual pages.
=== How do we handle zfs.sync functions in open context?
When such a function is found by the interpreter and we are running
in open context we abort the script and we spit out a descriptive
runtime error. For example, given the script below ...
arg = ...
fs = arg["argv"][1]
err = zfs.sync.destroy(fs)
msg = "destroying " .. fs .. " err=" .. err
return msg
if we run it in open context, we will get back the following error:
Channel program execution failed:
[string "channel program"]:3: running functions from the zfs.sync
submodule requires passing sync=TRUE to lzc_channel_program()
(i.e. do not specify the "-n" command line argument)
stack traceback:
[C]: in function 'destroy'
[string "channel program"]:3: in main chunk
=== What about testing?
We've introduced new wrappers for all channel program tests that
run each channel program as both (startard & open-context) and
expect the appropriate behavior depending on the program using
the zfs.sync module.
OpenZFS-issue: https://www.illumos.org/issues/8677
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/17a49e15
Closes #6558
|
|
|
|
|
|
|
| |
Add test coverage for lua libraries
Remove dead code in Lua implementation
Signed-off-by: Don Brady <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Chris Williamson <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: John Kennedy <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Ported-by: Don Brady <[email protected]>
ZFS channel programs should be able to create snapshots.
In addition to the base snapshot functionality, this entails extra
logic to handle edge cases which were formerly not possible, such as
creating then destroying a snapshot in the same transaction sync.
OpenZFS-issue: https://www.illumos.org/issues/8600
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/68089b8b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Chris Williamson <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: John Kennedy <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
Ported-by: Don Brady <[email protected]>
Ported-by: John Kennedy <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/7431
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/dfc11533
Porting Notes:
* The CLI long option arguments for '-t' and '-m' don't parse on linux
* Switched from kmem_alloc to vmem_alloc in zcp_lua_alloc
* Lua implementation is built as its own module (zlua.ko)
* Lua headers consumed directly by zfs code moved to 'include/sys/lua/'
* There is no native setjmp/longjump available in stock Linux kernel.
Brought over implementations from illumos and FreeBSD
* The get_temporary_prop() was adapted due to VFS platform differences
* Use of inline functions in lua parser to reduce stack usage per C call
* Skip some ZFS Test Suite ZCP tests on sparc64 to avoid stack overflow
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The on-disk format for encrypted datasets protects not only
the encrypted and authenticated blocks themselves, but also
the order and interpretation of these blocks. In order to
make this work while maintaining the ability to do raw
sends, the indirect bps maintain a secure checksum of all
the MACs in the block below it along with a few other
fields that determine how the data is interpreted.
Unfortunately, the current on-disk format erroneously
includes some fields which are not portable and thus cannot
support raw sends. It is not possible to easily work around
this issue due to a separate and much smaller bug which
causes indirect blocks for encrypted dnodes to not be
compressed, which conflicts with the previous bug. In
addition, the current code generates incompatible on-disk
formats on big endian and little endian systems due to an
issue with how block pointers are authenticated. Finally,
raw send streams do not currently include dn_maxblkid when
sending both the metadnode and normal dnodes which are
needed in order to ensure that we are correctly maintaining
the portable objset MAC.
This patch zero's out the offending fields when computing
the bp MAC and ensures that these MACs are always
calculated in little endian order (regardless of the host
system's byte order). This patch also registers an errata
for the old on-disk format, which we detect by adding a
"version" field to newly created DSL Crypto Keys. We allow
datasets without a version (version 0) to only be mounted
for read so that they can easily be migrated. We also now
include dn_maxblkid in raw send streams to ensure the MAC
can be maintained correctly.
This patch also contains minor bug fixes and cleanups.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #6845
Closes #6864
Closes #7052
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When used in conjunction with one of '-e' or '-d' zfs receive options
none of the properties requested to be set (-o) are actually applied:
this is caused by a wrong assumption made about the toplevel dataset
in zfs_receive_one().
Fix this by correctly detecting the toplevel dataset.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #7088
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we know which devices have the pool we are looking for, sometime
it's better if we can directly pass those device paths to zpool import
instead of letting it to search through all unrelated stuff, which might
take a lot of time if you have hundreds of disks.
This patch allows option -d <dev_path> to zpool import. You can have
multiple pairs of -d <dev_path>, and zpool import will only search
through those devices. For example:
zpool import -d /dev/sda -d /dev/sdb
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes #7077
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
usr/src/uts/common/sys/fs/zfs.h
Change ZPROP_INVAL and ZPROP_CONT from macros to enum values. Clang
and GCC both prefer to use unsigned ints to store enums. That was
causing tautological comparison warnings (and likely eliminating
error handling code at compile time) whenever a zfs_prop_t or
zpool_prop_t was compared to ZPROP_INVAL or ZPROP_CONT. Making the
error flags be explicity enum values forces the enum types to be
signed.
ZPROP_INVAL was also compared against two different enum types. I
had to change its name to ZPOOL_PROP_INVAL whenever its compared to
a zpool_prop_t. There are still some places where ZPROP_INVAL or
ZPROP_CONT is compared to a plain int, in code that doesn't know
whether the int is storing a zfs_prop_t or a zpool_prop_t.
usr/src/uts/common/fs/zfs/spa.c
s/ZPROP_INVAL/ZPOOL_PROP_INVAL/
Authored by: Alan Somers <[email protected]>
Approved by: Gordon Ross <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Igor Kozhukhov <[email protected]>
Reviewed by: George Melikov <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8652
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/c2de80dc74
Closes #7061
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"replacing" vdevs
Add "spare" and "replacing" to the list of interior vdev types in
zpool_vdev_is_interior(), alongside the existing "mirror" and "raidz".
This fixes running "zinject -d" and "zpool clear" on spare and replacing
vdevs.
Authored by: Alan Somers <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: George Melikov <[email protected]>
Approved by: Gordon Ross <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8641
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/9a36801382
Closes #7060
|
|
|
|
|
|
|
|
|
|
|
|
| |
For ztest, which is solely for testing, using a pseudo random
is entirely reasonable. Using /dev/urandom ensures the system
entropy pool doesn't get depleted thus stalling the testing.
This is a particular problem when testing in VMs.
Reviewed-by: Tim Chase <[email protected]>
Reviewed by: Thomas Caputi <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7017
Closes #7036
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ungracefully
Authored by: Yuri Pankov <[email protected]>
Reviewed by: Toomas Soome <[email protected]>
Reviewed by: Andy Stormont <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8898
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/9fa2266d9a
Closes #7031
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Yuri Pankov <[email protected]>
Reviewed by: Toomas Soome <[email protected]>
Reviewed by: Igor Kozhukhov <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8897
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/9a551dd645
Closes #7030
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When --enable-asan is provided to configure then build all user
space components with fsanitize=address. For kernel support
use the Linux KASAN feature instead.
https://github.com/google/sanitizers/wiki/AddressSanitizer
When using gcc version 4.8 any test case which intentionally
generates a core dump will fail when using --enable-asan.
The default behavior is to disable core dumps and only newer
versions allow this behavior to be controled at run time with
the ASAN_OPTIONS environment variable.
Additionally, this patch includes some build system cleanup.
* Rules.am updated to set the minimum AM_CFLAGS, AM_CPPFLAGS,
and AM_LDFLAGS. Any additional flags should be added on a
per-Makefile basic. The --enable-debug and --enable-asan
options apply to all user space binaries and libraries.
* Compiler checks consolidated in always-compiler-options.m4
and renamed for consistency.
* -fstack-check compiler flag was removed, this functionality
is provided by asan when configured with --enable-asan.
* Split DEBUG_CFLAGS in to DEBUG_CFLAGS, DEBUG_CPPFLAGS, and
DEBUG_LDFLAGS.
* Moved default kernel build flags in to module/Makefile.in and
split in to ZFS_MODULE_CFLAGS and ZFS_MODULE_CPPFLAGS. These
flags are set with the standard ccflags-y kbuild mechanism.
* -Wframe-larger-than checks applied only to binaries or
libraries which include source files which are built in
both user space and kernel space. This restriction is
relaxed for user space only utilities.
* -Wno-unused-but-set-variable applied only to libzfs and
libzpool. The remaining warnings are the result of an
ASSERT using a variable when is always declared.
* -D_POSIX_PTHREAD_SEMANTICS and -D__EXTENSIONS__ dropped
because they are Solaris specific and thus not needed.
* Ensure $GDB is defined as gdb by default in zloop.sh.
Signed-off-by: DHE <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7027
|
|
|
|
|
|
|
|
|
|
|
| |
The functionality provided by this header is not required by any
of the ZFS user space code. Minimal functionality was provided
in commit c28a677 which added include/sys/frame.h.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6960
Closes #6972
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Prakash Surya <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Prakash Surya <[email protected]>
Problem
=======
The current implementation of zil_commit() can introduce significant
latency, beyond what is inherent due to the latency of the underlying
storage. The additional latency comes from two main problems:
1. When there's outstanding ZIL blocks being written (i.e. there's
already a "writer thread" in progress), then any new calls to
zil_commit() will block waiting for the currently oustanding ZIL
blocks to complete. The blocks written for each "writer thread" is
coined a "batch", and there can only ever be a single "batch" being
written at a time. When a batch is being written, any new ZIL
transactions will have to wait for the next batch to be written,
which won't occur until the current batch finishes.
As a result, the underlying storage may not be used as efficiently
as possible. While "new" threads enter zil_commit() and are blocked
waiting for the next batch, it's possible that the underlying
storage isn't fully utilized by the current batch of ZIL blocks. In
that case, it'd be better to allow these new threads to generate
(and issue) a new ZIL block, such that it could be serviced by the
underlying storage concurrently with the other ZIL blocks that are
being serviced.
2. Any call to zil_commit() must wait for all ZIL blocks in its "batch"
to complete, prior to zil_commit() returning. The size of any given
batch is proportional to the number of ZIL transaction in the queue
at the time that the batch starts processing the queue; which
doesn't occur until the previous batch completes. Thus, if there's a
lot of transactions in the queue, the batch could be composed of
many ZIL blocks, and each call to zil_commit() will have to wait for
all of these writes to complete (even if the thread calling
zil_commit() only cared about one of the transactions in the batch).
To further complicate the situation, these two issues result in the
following side effect:
3. If a given batch takes longer to complete than normal, this results
in larger batch sizes, which then take longer to complete and
further drive up the latency of zil_commit(). This can occur for a
number of reasons, including (but not limited to): transient changes
in the workload, and storage latency irregularites.
Solution
========
The solution attempted by this change has the following goals:
1. no on-disk changes; maintain current on-disk format.
2. modify the "batch size" to be equal to the "ZIL block size".
3. allow new batches to be generated and issued to disk, while there's
already batches being serviced by the disk.
4. allow zil_commit() to wait for as few ZIL blocks as possible.
5. use as few ZIL blocks as possible, for the same amount of ZIL
transactions, without introducing significant latency to any
individual ZIL transaction. i.e. use fewer, but larger, ZIL blocks.
In theory, with these goals met, the new allgorithm will allow the
following improvements:
1. new ZIL blocks can be generated and issued, while there's already
oustanding ZIL blocks being serviced by the storage.
2. the latency of zil_commit() should be proportional to the underlying
storage latency, rather than the incoming synchronous workload.
Porting Notes
=============
Due to the changes made in commit 119a394ab0, the lifetime of an itx
structure differs than in OpenZFS. Specifically, the itx structure is
kept around until the data associated with the itx is considered to be
safe on disk; this is so that the itx's callback can be called after the
data is committed to stable storage. Since OpenZFS doesn't have this itx
callback mechanism, it's able to destroy the itx structure immediately
after the itx is committed to an lwb (before the lwb is written to
disk).
To support this difference, and to ensure the itx's callbacks can still
be called after the itx's data is on disk, a few changes had to be made:
* A list of itxs was added to the lwb structure. This list contains
all of the itxs that have been committed to the lwb, such that the
callbacks for these itxs can be called from zil_lwb_flush_vdevs_done(),
after the data for the itxs is committed to disk.
* A list of itxs was added on the stack of the zil_process_commit_list()
function; the "nolwb_itxs" list. In some circumstances, an itx may
not be committed to an lwb (e.g. if allocating the "next" ZIL block
on disk fails), so this list is used to keep track of which itxs
fall into this state, such that their callbacks can be called after
the ZIL's writer pipeline is "stalled".
* The logic to actually call the itx's callback was moved into the
zil_itx_destroy() function. Since all consumers of zil_itx_destroy()
were effectively performing the same logic (i.e. if callback is
non-null, call the callback), it seemed like useful code cleanup to
consolidate this logic into a single function.
Additionally, the existing Linux tracepoint infrastructure dealing with
the ZIL's probes and structures had to be updated to reflect these code
changes. Specifically:
* The "zil__cw1" and "zil__cw2" probes were removed, so they had to be
removed from "trace_zil.h" as well.
* Some of the zilog structure's fields were removed, which affected
the tracepoint definitions of the structure.
* New tracepoints had to be added for the following 3 new probes:
* zil__process__commit__itx
* zil__process__normal__itx
* zil__commit__io__error
OpenZFS-issue: https://www.illumos.org/issues/8585
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/5d95a3a
Closes #6566
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resolve new warnings and errors from cppcheck v1.80.
* [lib/libshare/libshare.c:543]: (warning)
Possible null pointer dereference: protocol
* [lib/libzfs/libzfs_dataset.c:2323]: (warning)
Possible null pointer dereference: srctype
* [lib/libzfs/libzfs_import.c:318]: (error)
Uninitialized variable: link
* [module/zfs/abd.c:353]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:353]: (error) Uninitialized variable: i
* [module/zfs/abd.c:385]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:385]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: i
* [module/zfs/abd.c:553]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:763]: (error) Uninitialized variable: i
* [module/zfs/abd.c:763]: (error) Uninitialized variable: sg
* [module/zfs/abd.c:305]: (error) Uninitialized variable: tmp_page
* [module/zfs/zpl_xattr.c:342]: (warning)
Possible null pointer dereference: value
* [module/zfs/zvol.c:208]: (error) Uninitialized variable: p
Convert the following suppression to inline.
* [module/zfs/zfs_vnops.c:840]: (error)
Possible null pointer dereference: aiov
Exclude HAVE_UIO_ZEROCOPY and HAVE_DNLC from analysis since
these macro's will never be defined until this functionality
is implemented.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6879
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, scrubs and resilvers can take an extremely
long time to complete. This is largely due to the fact
that zfs scans process pools in logical order, as
determined by each block's bookmark. This makes sense
from a simplicity perspective, but blocks in zfs are
often scattered randomly across disks, particularly
due to zfs's copy-on-write mechanisms.
This patch improves performance by splitting scrubs
and resilvers into a metadata scanning phase and an IO
issuing phase. The metadata scan reads through the
structure of the pool and gathers an in-memory queue
of I/Os, sorted by size and offset on disk. The issuing
phase will then issue the scrub I/Os as sequentially as
possible, greatly improving performance.
This patch also updates and cleans up some of the scan
code which has not been updated in several years.
Reviewed-by: Brian Behlendorf <[email protected]>
Authored-by: Saso Kiselkov <[email protected]>
Authored-by: Alek Pinchuk <[email protected]>
Authored-by: Tom Caputi <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #3625
Closes #6256
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`zpool status` normally aligns NAME/STATE/etc columns:
NAME STATE READ WRITE CKSUM
dummy ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
/tmp/dummy-long-1.bin ONLINE 0 0 0
/tmp/dummy-long-2.bin ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
/tmp/dummy-long-3.bin ONLINE 0 0 0
/tmp/dummy-long-4.bin ONLINE 0 0 0
However, if the zpool name is longer than the zvol names, alignment
issues arise:
NAME STATE READ WRITE CKSUM
dummy-very-very-long-zpool-name ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
/tmp/dummy-1.bin ONLINE 0 0 0
/tmp/dummy-2.bin ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
/tmp/dummy-3.bin ONLINE 0 0 0
/tmp/dummy-4.bin ONLINE 0 0 0
`zpool iostat` and `zpool import` are also affected:
capacity operations bandwidth
pool alloc free read write read write
---------- ----- ----- ----- ----- ----- -----
dummy 104K 1.97G 0 0 152 9.84K
dummy-very-very-long-zpool-name 152K 1.97G 0 1 144 13.1K
---------- ----- ----- ----- ----- ----- -----
dummy-very-very-long-zpool-name ONLINE
mirror-0 ONLINE
/tmp/dummy-1.bin ONLINE
/tmp/dummy-2.bin ONLINE
mirror-1 ONLINE
/tmp/dummy-3.bin ONLINE
/tmp/dummy-4.bin ONLINE
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Gaydarov <[email protected]>
Closes #6786
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Porting Notes:
- The OpenZFS patch added nicenum_scale() and nicenum() to a
library not used by ZFS. Rather than pull in a new dependency
the version of nicenum in lib/libzpool/util.c was simply
replaced with the new one.
Reviewed by: Sebastian Wiedenroth <[email protected]>
Reviewed by: Robert Mustacchi <[email protected]>
Reviewed by: Yuri Pankov <[email protected]>
Approved by: Dan McDonald <[email protected]>
Authored by: Jason King <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/640
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0a055120
Closes #6796
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added -n flag to zpool reopen that allows a running scrub
operation to continue if there is a device with Dirty Time Log.
By default if a component device has a DTL and zpool reopen
is executed all running scan operations will be restarted.
Added functional tests for `zpool reopen`
Tests covers following scenarios:
* `zpool reopen` without arguments,
* `zpool reopen` with pool name as argument,
* `zpool reopen` while scrubbing,
* `zpool reopen -n` while scrubbing,
* `zpool reopen -n` while resilvering,
* `zpool reopen` with bad arguments.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tom Caputi <[email protected]>
Signed-off-by: Arkadiusz Bubała <[email protected]>
Closes #6076
Closes #6746
|
|
|
|
|
|
|
|
| |
Add get functions to match existing ones.
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: John Ramsden <[email protected]>
Closes #6308
|
|
|
|
|
|
|
|
|
|
| |
Currently the function documentation states that two strings are
allocated, this is outdated. Only one char ** parameter is passed
into the function now, clearly only a pointer to a single string
is returned and needs to be free'd.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tobin C. Harding <[email protected]>
Closes #6754
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This PR includes fixes for bugs and documentation issues found
after the encryption patch was merged and general code improvements
for long-term maintainability.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Issue #6526
Closes #6639
Closes #6703
Cloese #6706
Closes #6714
Closes #6595
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* PBKDF2 implementation changed to OpenSSL implementation.
* HKDF implementation moved to its own file and tests
added to ensure correctness.
* Removed libzfs's now unnecessary dependency on libzpool
and libicp.
* Ztest can now create and test encrypted datasets. This is
currently disabled until issue #6526 is resolved, but
otherwise functions as advertised.
* Several small bug fixes discovered after enabling ztest
to run on encrypted datasets.
* Fixed coverity defects added by the encryption patch.
* Updated man pages for encrypted send / receive behavior.
* Fixed a bug where encrypted datasets could receive
DRR_WRITE_EMBEDDED records.
* Minor code cleanups / consolidation.
Signed-off-by: Tom Caputi <[email protected]>
|
| |
| |
| |
| |
| |
| |
| | |
CPPASCOMPILE and LTCPPASCOMPILE all include DEFAULT_INCLUDES,
hence it's unnecessary to add the includes again.
Signed-off-by: Pengcheng Xu <[email protected]>
|
|/
|
|
|
|
|
|
| |
It's important to respect the user's CFLAGS as mismatched -mcpu
will directly result in the assembler not able to produce correct
code. Fixes #6733.
Signed-off-by: Pengcheng Xu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because resuming from a token requires "guid" -> "snapshot" mapping
we have to walk the whole dataset hierarchy to find the right snapshot
to send; when both source and destination exists, for an incremental
resumable stream, libzfs gets confused and picks up the wrong snapshot
to send from: this results in attempting to send
"destination@snap1 -> source@snap2"
instead of
"source@snap1 -> source@snap2"
which fails with a "Invalid cross-device link" error (EXDEV).
Fix this by adjusting the logic behind dataset traversal in
zfs_iter_children() to pick the right snapshot to send from.
Additionally update dry-run 'zfs send -t' to print its output to
stderr: this is consistent with other dry-run commands.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #6618
Closes #6619
Closes #6623
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add 'zfs bookmark' coverage (zfs_bookmark_cliargs)
* Add OpenZFS 8166 coverage (zpool_scrub_offline_device)
* Fix "busy" zfs_mount_remount failures
* Fix bootfs_003_pos, bootfs_004_neg, zdb_005_pos local cleanup
* Update usage of $KEEP variable, add get_all_pools() function
* Enable history_008_pos and rsend_019_pos (non-32bit builders)
* Enable zfs_copies_005_neg, update local cleanup
* Fix zfs_send_007_pos (large_dnode + OpenZFS 8199)
* Fix rollback_003_pos (use dataset name, not mountpoint, to unmount)
* Update default_raidz_setup() to work properly with more than 3 disks
* Use $TEST_BASE_DIR instead of hardcoded (/var)/tmp for file VDEVs
* Update usage of /dev/random to /dev/urandom
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Issue #6086
Closes #5658
Closes #6143
Closes #6421
Closes #6627
Closes #6632
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ZFS buildbot STYLE builder was moved to Ubuntu 17.04
which has a newer version of cppcheck. Handle the
new cppcheck errors.
uu_* functions removed in this commit were unused
and effectively dead code. They are now retired.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Giuseppe Di Natale <[email protected]>
Closes #6653
|
|
|
|
|
|
|
|
|
| |
FRU and LIBTOPO support are illumos only features that will not be ported to
Linux and make the code more complicated than necessary. This commit
makes way for further cleanups of the zed/FMA code.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: David Quigley <[email protected]>
Closes #6641
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change allow (ref)reservation to be set larger than the current
ZVOL size: this is safe as we normally set refreservation > volsize
at ZVOL creation time when we account for metadata.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #2468
Closes #6610
|
|
|
|
|
|
|
|
|
| |
This leverages the functionality introduced in cf7684b to expose
verbose, dry-run and parsable 'zfs send' options for bookmarks.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #3666
Closes #6601
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a small wrapper around libzfs_core`lzc_send_space() to libzfs so
that every legacy ZFS_IOC_SEND consumer, along with their userland
counterpart estimate_ioctl(), can leverage ZFS_IOC_SEND_SPACE to
request send space estimation.
The legacy functionality in zfs_ioc_send() is left untouched for
compatibility purposes.
Reviewed by: Thomas Caputi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #6029
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Steve Dougherty <[email protected]>
Reviewed by: Josef 'Jeff' Sipek <[email protected]>
Reviewed by: Andy Stormont <[email protected]>
Reviewed by: Garrett D'Amore <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Reviewed-by: Don Brady <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Giuseppe Di Natale <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/6447
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/759e89b
Closes #6581
|
|
|
|
|
|
|
|
|
|
|
|
| |
pthread_attr_(get/set)affinity_np() is glibc-only. This commit
disable the code path that use those functions in non-glibc
system. Fixes the following when building with musl:
libzfs.so: undefined reference to`pthread_attr_setaffinity_np'
libzfs.so: undefined reference to`pthread_attr_getaffinity_np'
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Leorize <[email protected]>
Closes #6571
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes several issues discovered after
the encryption patch was merged:
* Fixed a bug where encrypted datasets could attempt
to receive embedded data records.
* Fixed a bug where dirty records created by the recv
code wasn't properly setting the dr_raw flag.
* Fixed a typo where a dmu_tx_commit() was changed to
dmu_tx_abort()
* Fixed a few error handling bugs unrelated to the
encryption patch in dmu_recv_stream()
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #6512
Closes #6524
Closes #6545
|