| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ZFS_IOC_POOL_TRYIMPORT ioctl returns an nvlist from the kernel to a
preallocated buffer in userland. Userland must guess how large the
buffer should be. If it undersizes it, it must reallocate and try
again. That can cost a lot of time for large pools.
OpenZFS commit 28b40c8a6e3 set the guess at "zc.zc_nvlist_conf_size * 4"
without explanation. On my system, that is too small. From experiment,
x 32 is a better multiplier. But I don't know how to calculate it
theoretically.
Sponsored by: Axcient
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Signed-off-by: Alan Somers <[email protected]>
Closes #11469
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Read all labels in parallel instead of sequentially.
Originally committed as
https://cgit.freebsd.org/src/commit/?id=b49e9abcf44cafaf5cfad7029c9a6adbb28346e8
Obtained from: FreeBSD
Sponsored by: Spectra Logic, Axcient
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Signed-off-by: Alan Somers <[email protected]>
Closes #11467
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zpool_read_label doesn't need the full labels including uberblocks. It
only needs the vdev_phys_t. This reduces by half the amount of data
read to check for a label, speeding up "zpool import", "zpool
labelclear", etc.
Originally committed as
https://cgit.freebsd.org/src/commit/?id=63f8025d6acab1b334373ddd33f940a69b3b54cc
Obtained from: FreeBSD
Sponsored by: Spectra Logic Corp, Axcient
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Signed-off-by: Alan Somers <[email protected]>
Closes #11467
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In FreeBSD the struct uio was just a typedef to uio_t. In order to
extend this struct, outside of the definition for the struct uio, the
struct uio has been embedded inside of a uio_t struct.
Also renamed all the uio_* interfaces to be zfs_uio_* to make it clear
this is a ZFS interface.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Brian Atkinson <[email protected]>
Closes #11438
|
|
|
|
|
|
|
|
| |
Use verified variants of nvlist/nvpair functions where applicable.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11460
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In `zpool_find_config()`, the `pools` nvlist is leaked. Part of it (a
sub-nvlist) is returned in `*configp`, but the callers also leak that.
Additionally, in `zdb.c:main()`, the `searchdirs` is leaked.
The leaks were detected by ASAN (`configure --enable-asan`).
This commit resolves the leaks.
Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #11396
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of the 5.10 kernel the generic splice compatibility code has been
removed. All filesystems are now responsible for registering a
->splice_read and ->splice_write callback to support this operation.
The good news is the VFS provided generic_file_splice_read() and
iter_file_splice_write() callbacks can be used provided the ->iter_read
and ->iter_write callback support pipes. However, this is currently
not the case and only iovecs and bvecs (not pipes) are ever attached
to the uio structure.
This commit changes that by allowing full iov_iter structures to be
attached to uios. Ever since the 4.9 kernel the iov_iter structure
has supported iovecs, kvecs, bvevs, and pipes so it's desirable to
pass the entire thing when possible. In conjunction with this the
uio helper functions (i.e uiomove(), uiocopy(), etc) have been
updated to understand the new UIO_ITER type.
Note that using the kernel provided uio_iter interfaces allowed the
existing Linux specific uio handling code to be simplified. When
there's no longer a need to support kernel's older than 4.9, then
it will be possible to remove the iovec and bvec members from the
uio structure and always use a uio_iter. Until then we need to
maintain all of the existing types for older kernels.
Some additional refactoring and cleanup was included in this change:
- Added checks to configure to detect available iov_iter interfaces.
Some are available all the way back to the 3.10 kernel and are used
when available. In particular, uio_prefaultpages() now always uses
iov_iter_fault_in_readable() which is available for all supported
kernels.
- The unused UIO_USERISPACE type has been removed. It is no longer
needed now that the uio_seg enum is platform specific.
- Moved zfs_uio.c from the zcommon.ko module to the Linux specific
platform code for the zfs.ko module. This gets it out of libzfs
where it was never needed and keeps this Linux specific code out
of the common sources.
- Removed unnecessary O_APPEND handling from zfs_iter_write(), this
is redundant and O_APPEND is already handled in zfs_write();
Reviewed-by: Colin Ian King <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11351
|
|
|
|
|
|
|
|
|
|
|
|
| |
Building libzfs with gcc on FreeBSD failed because gcc is picky about
the order of keywords in declarations with __thread, whereas clang is
more relaxed.
https://gcc.gnu.org/onlinedocs/gcc/Thread-Local.html
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Ryan Libby <[email protected]>
Closes #11331
|
|
|
|
|
|
|
|
|
|
|
|
| |
`zpool create -n` fails to list cache and spare vdevs.
`zpool add -n` fails to list spare devices.
`zpool split -n` fails to list `special` and `dedup` labels.
`zpool add -n` and `zpool split -n` shouldn't list hole devices.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Attila Fülöp <[email protected]>
Closes #11122
Closes #11167
|
|
|
|
|
|
|
|
|
| |
We do not build libnvpair.pc. Moreover, it is automatically pulled in
by libzfs.pc, so no additional specific dependency is required.
Reviewed by: Toomas Soome <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Antonio Russo <[email protected]>
Closes #11227
|
|
|
|
|
|
|
| |
The ABI should be included when generating the `make dist` tarball
since it's required by the `make checkabi` target.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11225
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a snapshot of the current ABI using libabigail-1.7-2. The
included ABI passes `make checkabi` for CentOS 7, Fedora 33,
Debian 10, and Ubuntu 20.04. This covers a fairly wide range
of glibc, gcc, and libabigail versions plus other changes which
are platform specific.
Reviewed-by: Antonio Russo <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11144
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Provide two make targets: checkabi and storeabi.
storeabi uses libabigail to generate a reference copy of the ABI for the
public libraries.
checkabi compares such a reference to the compiled version, failing if
they are not compatible. No ABI is generated for libzpool.so, it is
only used by ztest and zdb and not external consumers.
Co-authored-by: Brian Behlendorf <[email protected]>
Signed-off-by: Antonio Russo <[email protected]>
Closes #11144
|
|
|
|
|
|
|
|
|
|
| |
zpool_expand_proplist() now ignores pl_fixed if its new literal
argument is true. The rest is a consequence of needing to pass
that down.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ahelenia Ziemiao?=~Dska <[email protected]>
Closes #11202
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a new top-level vdev type called dRAID, which stands
for Distributed parity RAID. This pool configuration allows all dRAID
vdevs to participate when rebuilding to a distributed hot spare device.
This can substantially reduce the total time required to restore full
parity to pool with a failed device.
A dRAID pool can be created using the new top-level `draid` type.
Like `raidz`, the desired redundancy is specified after the type:
`draid[1,2,3]`. No additional information is required to create the
pool and reasonable default values will be chosen based on the number
of child vdevs in the dRAID vdev.
zpool create <pool> draid[1,2,3] <vdevs...>
Unlike raidz, additional optional dRAID configuration values can be
provided as part of the draid type as colon separated values. This
allows administrators to fully specify a layout for either performance
or capacity reasons. The supported options include:
zpool create <pool> \
draid[<parity>][:<data>d][:<children>c][:<spares>s] \
<vdevs...>
- draid[parity] - Parity level (default 1)
- draid[:<data>d] - Data devices per group (default 8)
- draid[:<children>c] - Expected number of child vdevs
- draid[:<spares>s] - Distributed hot spares (default 0)
Abbreviated example `zpool status` output for a 68 disk dRAID pool
with two distributed spares using special allocation classes.
```
pool: tank
state: ONLINE
config:
NAME STATE READ WRITE CKSUM
slag7 ONLINE 0 0 0
draid2:8d:68c:2s-0 ONLINE 0 0 0
L0 ONLINE 0 0 0
L1 ONLINE 0 0 0
...
U25 ONLINE 0 0 0
U26 ONLINE 0 0 0
spare-53 ONLINE 0 0 0
U27 ONLINE 0 0 0
draid2-0-0 ONLINE 0 0 0
U28 ONLINE 0 0 0
U29 ONLINE 0 0 0
...
U42 ONLINE 0 0 0
U43 ONLINE 0 0 0
special
mirror-1 ONLINE 0 0 0
L5 ONLINE 0 0 0
U5 ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
L6 ONLINE 0 0 0
U6 ONLINE 0 0 0
spares
draid2-0-0 INUSE currently in use
draid2-0-1 AVAIL
```
When adding test coverage for the new dRAID vdev type the following
options were added to the ztest command. These options are leverages
by zloop.sh to test a wide range of dRAID configurations.
-K draid|raidz|random - kind of RAID to test
-D <value> - dRAID data drives per group
-S <value> - dRAID distributed hot spares
-R <value> - RAID parity (raidz or dRAID)
The zpool_create, zpool_import, redundancy, replacement and fault
test groups have all been updated provide test coverage for the
dRAID feature.
Co-authored-by: Isaac Huang <[email protected]>
Co-authored-by: Mark Maybee <[email protected]>
Co-authored-by: Don Brady <[email protected]>
Co-authored-by: Matthew Ahrens <[email protected]>
Co-authored-by: Brian Behlendorf <[email protected]>
Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #10102
|
|
|
|
|
|
|
|
|
|
|
| |
This shows up when compiling freebsd-head on amd64 using gcc-6.4.
The lib32 compat build ends up tripping over this assumption.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: adrian chadd <[email protected]>
Closes #11068
Closes #11069
|
|
|
|
|
|
|
|
|
|
|
| |
Bump library SOVERSION under Linux to match FreeBSD's.
Additionally, this bump properly accounts for the ABI changes relative
to ZoL 0.8.5 for the Linux build.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Antonio Russo <[email protected]>
Issue #11144
|
|
|
|
|
|
|
|
|
|
| |
The original xuio zero copy functionality has always been unused
on Linux and FreeBSD. Remove this disabled code to avoid any
confusion and improve readability.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #11124
|
|
|
|
|
|
|
|
| |
Refer to the correct section or alternative for FreeBSD and Linux.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11132
|
|
|
|
|
|
|
|
|
|
| |
The zfs_fsync, zfs_read, and zfs_write function are almost identical
between Linux and FreeBSD. With a little refactoring they can be
moved to the common code which is what is done by this commit.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #11078
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The acltype property is currently hidden on FreeBSD and does not
reflect the NFSv4 style ZFS ACLs used on the platform. This makes it
difficult to observe that a pool imported from FreeBSD on Linux has a
different type of ACL that is being ignored, and vice versa.
Add an nfsv4 acltype and expose the property on FreeBSD.
Make the default acltype nfsv4 on FreeBSD.
Setting acltype to an unhanded style is treated the same as setting
it to off. The ACLs will not be removed, but they will be ignored.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10520
|
|
|
|
|
|
|
|
|
|
|
| |
In FreeBSD, there are three compile environments that are supported:
user land, the kernel and the bootloader / standalone. Adjust the
headers to compile in the standalone environment. Limit kernel-only
items from view when _STANDALONE is defined.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Warner Losh <[email protected]>
Closes #10998
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is a common mistake to have failed to autoload the module due to
permission issues when running a ZFS command as a user. "Operation
not permitted" is an unhelpfully vague error message.
Use a thread-local message buffer to format a nicer error message.
We can infer that loading the kernel module failed if the module is
not loaded. This can be extended with heuristics for other errors
in the future.
While looking at this stuff, remove an unused thread-local message
buffer found in libspl and remove some inaccurate verbiage from the
comment on libzfs_load_module.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11033
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change updates the documentation to refer to the project
as OpenZFS instead ZFS on Linux. Web links have been updated
to refer to https://github.com/openzfs/zfs. The extraneous
zfsonlinux.org web links in the ZED and SPL sources have been
dropped.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11007
|
|
|
|
|
|
|
|
|
|
|
| |
fixup of 196bee4
On gcc (GCC) 9.2.1 20190827 (Red Hat 9.2.1-1), the code removed
caused `-Wmaybe-uninitialized` errors.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Christian Schwarz <[email protected]>
Closes #11021
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When resuming an interrupted ZFS send stream that creates a new dataset
with the same name as an existing dataset, if the existing dataset is
accessed after the failed receive, then after the subsequent successful
receive it will return EIO. This happens because nothing mounts the new
dataset, leaving the old, no longer valid dataset still mounted.
This commit fixes zfs receive to always unmount and remount the
destination, regardless of whether the stream is a new stream or a
resumed stream.
Sponsored by: Axcient
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Alan Somers <[email protected]>
External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=249579
Closes #10995
Closes #10999
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the /etc/exports.d directory does not exist, then we should only
create it when we're performing an action which already requires root
privileges.
This commit moves the directory creation to the enable/disable code
path which ensures that we have the appropriate privileges.
Reviewed-by: Richard Elling <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Wilson <[email protected]>
Closes #10785
Closes #10934
|
|
|
|
|
|
|
|
|
|
| |
The procfs_list interface is required by several kstats. Implement
this functionality for FreeBSD to provide access to these kstats.
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #10890
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resolves FreeBSD Coverity defect:
CID 1432398: Resource leaks (RESOURCE_LEAK)
libzfs: don't leak hdl if there is an error reading env var
Resolves FreeBSD Coverity defect:
CID 1432395: Resource leaks (RESOURCE_LEAK)
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Closes #10882
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When expanding a device zfs needs to rescan the partition table to
get the correct size. This can only happen when we're in the kernel
and requires the device to be closed. As part of the rescan, udev is
notified and the device links are removed and recreated. This leave a
window where the vdev code may try to reopen the device before udev
has recreated the link. If that happens, then the pool may end up in
a suspended state.
To correct this, we leverage the BLKPG_RESIZE_PARTITION ioctl which
allows the partition information to be modified even while it's in use.
This ioctl also does not remove the device link associated with the zfs
data partition so it eliminates the race condition that can occur in
the kernel.
Reviewed-by: Pavel Zakharov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Wilson <[email protected]>
Closes #10897
|
|
|
|
|
|
|
|
| |
A small bug did slip into initial libzfsbootenv; while storing nvlist
in nvlist, we should make sure the bootenv is using VB_NVLIST format.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Toomas Soome <[email protected]>
Closes #10937
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nvlist does allow us to support different data types and systems.
To encapsulate user data to/from nvlist, the libzfsbootenv library is
provided.
Reviewed-by: Arvind Sankar <[email protected]>
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Igor Kozhukhov <[email protected]>
Signed-off-by: Toomas Soome <[email protected]>
Closes #10774
|
|
|
|
|
|
|
|
|
|
|
|
| |
libzutil depends on libnvpair, but this dependency is undeclared in the
build system. Therefore it isn't possible to make a new command that
depends on libzutil, but does not (directly) depend on libnvpair.
This commit makes this dependency explicit.
Reviewed-by: Brian Behlendorf <[email protected]>
Reivewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #10915
|
|
|
|
|
|
|
|
|
| |
The pbkdf2iters property is an iteration counter
and should be displayed as plain number rather
than in binary unit.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Fabio Buso <[email protected]>
Closes #10871
|
|
|
|
|
|
|
|
|
|
| |
On musl libc, zfs failed to compile due to the missing <fcntl.h>
include, which is required for `open()` per POSIX.
This commit add the missing <fcntl.h> include.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Hiếu Lê <[email protected]>
Closes #10880
|
|
|
|
|
|
|
|
|
|
|
|
| |
Several of the listed library dependencies are not relevant on FreeBSD.
Have ./configure save libraries that are found via pkg-config as
${LIB}_PC and use the configured automake variables instead of hard
coded names so we only get what was actually needed.
While here, update the URL to point at the OpenZFS Github repo.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10869
|
|
|
|
|
|
|
|
|
|
|
| |
Adding an #ifdef __FreeBSD__ to a FreeBSD-specific header may seem odd,
but these headers are used on non-FreeBSD systems during the bootstrap
tools phase.
Originally submitted downstream as https://reviews.freebsd.org/D26193
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alex Richardson <[email protected]>
Closes #10863
|
|
|
|
|
|
|
|
|
|
|
| |
Those macros are also defined by the compiler-provided float.h which
will be included later on (at least in the FreeBSD buildworld case) and
triggers these -Werror warnings. Including <float.h> first and only
defining the macros when DBL_DIG/FLT_DIG is missing fixes this problem.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alex Richardson <[email protected]>
Closes #10864
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow to rename file systems without remounting if it is possible.
It is possible for file systems with 'mountpoint' property set to
'legacy' or 'none' - we don't have to change mount directory for them.
Currently such file systems are unmounted on rename and not even
mounted back.
This introduces layering violation, as we need to update
'f_mntfromname' field in statfs structure related to mountpoint (for
the dataset we are renaming and all its children).
In my opinion it is worth it, as it allow to update FreeBSD in even
cleaner way - in ZFS-only configuration root file system is ZFS file
system with 'mountpoint' property set to 'legacy'. If root dataset is
named system/rootfs, we can snapshot it (system/rootfs@upgrade), clone
it (system/oldrootfs), update FreeBSD and if it doesn't boot we can
boot back from system/oldrootfs and rename it back to system/rootfs
while it is mounted as /. Before it was not possible, because
unmounting / was not possible.
Authored by: Pawel Jakub Dawidek <[email protected]>
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported by: Matt Macy <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10839
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD has the concept of jails, a precursor to Solaris's zones, which
can be mapped to the required zones interface with relative ease. The
previous ZFS implementation in FreeBSD did so, and we should continue
to provide an appropriate implementation in OpenZFS as well.
Move lib/libspl/zone.c into platform code and adopt the correct
implementation for FreeBSD.
While here, prune unused code.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10851
|
|
|
|
|
|
|
|
|
| |
The matching ioctl is DIOCGMEDIASIZE.
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Alex Richardson <[email protected]>
Closes #10818
|
|
|
|
|
|
|
| |
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10820
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Originally we asserted that all reads are less than SPA_MAXBLOCKSIZE
However, nvlists are not ZFS records, and are not limited to
SPA_MAXBLOCKSIZE.
Add a new environment variable, ZFS_SENDRECV_MAX_NVLIST, to allow the
user to specify the maximum size of the nvlist that can be sent or
received.
Default value: 4 * SPA_MAXBLOCKSIZE (64 MB)
Modify libzfs send routines to return a useful error if the send stream
will generate an nvlist that is beyond the maximum size.
Modify libzfs recv routines to add an explicit error message if the
nvlist is too large, rather than abort()ing.
Move the change the assert() to only trigger on data records
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Closes #9616
|
|
|
|
|
|
|
|
|
|
|
| |
For Linux, when zfs is compiled as an in kernel static variant
and the in kernel zstd library is compiled statically into the kernel
a symbol collision will occur. This wrapper header renames all
of the relevant zstd functions to avoid this problem.
Reviewed-by: Kjeld Schouten <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Closes #10775
|
|
|
|
|
|
|
|
|
| |
Increase the size of DDT_NAMELEN and MNT_LINE_MAX to appease GCC
snprintf truncation warnings.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chris McDonough <[email protected]>
Closes #10712
Closes #10766
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the current way CFLAGS are modified in libzstd, CFLAGS passed on
the make command line will cause the CFLAGS in the Makefile for zstd.c
to be discarded, but not AM_CFLAGS. This causes a smaller frame size
limit to be used, and the build fails.
We don't need to worry about stack frame sizes in userspace. Drop the
extra flags.
Reviewed-by: Kjeld Schouten <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10773
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many modern devices use physical allocation units that are much
larger than the minimum logical allocation size accessible by
external commands. Two prevalent examples of this are 512e disk
drives (512b logical sector, 4K physical sector) and flash devices
(512b logical sector, 4K or larger allocation block size, and 128k
or larger erase block size). Operations that modify less than the
physical sector size result in a costly read-modify-write or garbage
collection sequence on these devices.
Simply exporting the true physical sector of the device to ZFS would
yield optimal performance, but has two serious drawbacks:
1. Existing pools created with devices that have different logical
and physical block sizes, but were configured to use the logical
block size (e.g. because the OS version used for pool construction
reported the logical block size instead of the physical block
size) will suddenly find that the vdev allocation size has
increased. This can be easily tolerated for active members of
the array, but ZFS would prevent replacement of a vdev with
another identical device because it now appears that the smaller
allocation size required by the pool is not supported by the new
device.
2. The device's physical block size may be too large to be supported
by ZFS. The optimal allocation size for the vdev may be quite
large. For example, a RAID controller may export a vdev that
requires read-modify-write cycles unless accessed using 64k
aligned/sized requests. ZFS currently has an 8k minimum block
size limit.
Reporting both the logical and physical allocation sizes for vdevs
solves these problems. A device may be used so long as the logical
block size is compatible with the configuration. By comparing the
logical and physical block sizes, new configurations can be optimized
and administrators can be notified of any existing pools that are
sub-optimal.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Co-authored-by: Matthew Macy <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #10619
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This PR adds two new compression types, based on ZStandard:
- zstd: A basic ZStandard compression algorithm Available compression.
Levels for zstd are zstd-1 through zstd-19, where the compression
increases with every level, but speed decreases.
- zstd-fast: A faster version of the ZStandard compression algorithm
zstd-fast is basically a "negative" level of zstd. The compression
decreases with every level, but speed increases.
Available compression levels for zstd-fast:
- zstd-fast-1 through zstd-fast-10
- zstd-fast-20 through zstd-fast-100 (in increments of 10)
- zstd-fast-500 and zstd-fast-1000
For more information check the man page.
Implementation details:
Rather than treat each level of zstd as a different algorithm (as was
done historically with gzip), the block pointer `enum zio_compress`
value is simply zstd for all levels, including zstd-fast, since they all
use the same decompression function.
The compress= property (a 64bit unsigned integer) uses the lower 7 bits
to store the compression algorithm (matching the number of bits used in
a block pointer, as the 8th bit was borrowed for embedded block
pointers). The upper bits are used to store the compression level.
It is necessary to be able to determine what compression level was used
when later reading a block back, so the concept used in LZ4, where the
first 32bits of the on-disk value are the size of the compressed data
(since the allocation is rounded up to the nearest ashift), was
extended, and we store the version of ZSTD and the level as well as the
compressed size. This value is returned when decompressing a block, so
that if the block needs to be recompressed (L2ARC, nop-write, etc), that
the same parameters will be used to result in the matching checksum.
All of the internal ZFS code ( `arc_buf_hdr_t`, `objset_t`,
`zio_prop_t`, etc.) uses the separated _compress and _complevel
variables. Only the properties ZAP contains the combined/bit-shifted
value. The combined value is split when the compression_changed_cb()
callback is called, and sets both objset members (os_compress and
os_complevel).
The userspace tools all use the combined/bit-shifted value.
Additional notes:
zdb can now also decode the ZSTD compression header (flag -Z) and
inspect the size, version and compression level saved in that header.
For each record, if it is ZSTD compressed, the parameters of the decoded
compression header get printed.
ZSTD is included with all current tests and new tests are added
as-needed.
Per-dataset feature flags now get activated when the property is set.
If a compression algorithm requires a feature flag, zfs activates the
feature when the property is set, rather than waiting for the first
block to be born. This is currently only used by zstd but can be
extended as needed.
Portions-Sponsored-By: The FreeBSD Foundation
Co-authored-by: Allan Jude <[email protected]>
Co-authored-by: Brian Behlendorf <[email protected]>
Co-authored-by: Sebastian Gottschall <[email protected]>
Co-authored-by: Kjeld Schouten-Lebbing <[email protected]>
Co-authored-by: Michael Niewöhner <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Allan Jude <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Kjeld Schouten-Lebbing <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
Closes #6247
Closes #9024
Closes #10277
Closes #10278
|
|
|
|
|
|
|
|
|
| |
FreeBSD numbers `ZFS_IOC_*` starting at 0, so pick a different
sentinel value to avoid unintentionally messing with
`ZFS_IOC_POOL_CREATE` ioctls.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #10729
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise compiler errors with:
```
libzfs_pool.c:449:1: error: 'zpool_is_bootable'
defined but not used [-Werror=unused-function]
```
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Melikov <[email protected]>
Closes #10734
|