| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Closes #11993
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also always free tmp2 at the end
Before:
nabijaczleweli@tarta:~/uwu$ valgrind --leak-check=full ./blergh
==8947== Memcheck, a memory error detector
==8947== Using Valgrind-3.14.0 and LibVEX
==8947== Command: ./blergh
==8947==
(null)
==8947==
==8947== HEAP SUMMARY:
==8947== in use at exit: 23 bytes in 1 blocks
==8947== total heap usage: 3 allocs, 2 frees, 1,147 bytes allocated
==8947==
==8947== 23 bytes in 1 blocks are definitely lost in loss record 1 of 1
==8947== at 0x483577F: malloc (vg_replace_malloc.c:299)
==8947== by 0x48D74B7: vasprintf (vasprintf.c:73)
==8947== by 0x48B7833: asprintf (asprintf.c:35)
==8947== by 0x401258: zfs_get_enclosure_sysfs_path
(zutil_device_path_os.c:191)
==8947== by 0x401482: main (blergh.c:107)
==8947==
==8947== LEAK SUMMARY:
==8947== definitely lost: 23 bytes in 1 blocks
==8947== indirectly lost: 0 bytes in 0 blocks
==8947== possibly lost: 0 bytes in 0 blocks
==8947== still reachable: 0 bytes in 0 blocks
==8947== suppressed: 0 bytes in 0 blocks
==8947==
==8947== For counts of detected and suppressed errors, rerun with: -v
==8947== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
nabijaczleweli@tarta:~/uwu$ sed -n 191p zutil_device_path_os.c
tmpsize = asprintf(&tmp1, "/sys/block/%s/device", dev_name);
After:
nabijaczleweli@tarta:~/uwu$ valgrind --leak-check=full ./blergh
==9512== Memcheck, a memory error detector
==9512== Using Valgrind-3.14.0 and LibVEX
==9512== Command: ./blergh
==9512==
(null)
==9512==
==9512== HEAP SUMMARY:
==9512== in use at exit: 0 bytes in 0 blocks
==9512== total heap usage: 3 allocs, 3 frees, 1,147 bytes allocated
==9512==
==9512== All heap blocks were freed -- no leaks are possible
==9512==
==9512== For counts of detected and suppressed errors, rerun with: -v
==9512== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Closes #11993
|
|
|
|
|
|
| |
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Closes #11993
|
|
|
|
|
|
|
|
| |
For example, this would happily return "/dev/(null)" for /dev/sda1
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Closes #11935
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As found by
git grep -E '(open|setmntent|pipe2?)\(' |
grep -vE '((zfs|zpool)_|fd|dl|lzc_re|pidfile_|g_)open\('
FreeBSD's pidfile_open() says nothing about the flags of the files it
opens, but we can't do anything about it anyway; the implementation does
open all files with O_CLOEXEC
Consider this output with zpool.d/media appended with
"pid=$$; (ls -l /proc/$pid/fd > /dev/tty)":
$ /sbin/zpool iostat -vc media
lrwx------ 0 -> /dev/pts/0
l-wx------ 1 -> 'pipe:[3278500]'
l-wx------ 2 -> /dev/null
lrwx------ 3 -> /dev/zfs
lr-x------ 4 -> /proc/31895/mounts
lrwx------ 5 -> /dev/zfs
lr-x------ 10 -> /usr/lib/zfs-linux/zpool.d/media
vs
$ ./zpool iostat -vc vendor,upath,iostat,media
lrwx------ 0 -> /dev/pts/0
l-wx------ 1 -> 'pipe:[3279887]'
l-wx------ 2 -> /dev/null
lr-x------ 10 -> /usr/lib/zfs-linux/zpool.d/media
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Closes #11866
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zpool list, which is the only user, would mistakenly try to parse the
empty string as the interval in this case:
$ zpool list "a"
cannot open 'a': no such pool
$ zpool list ""
interval cannot be zero
usage: <usage string follows>
which is now symmetric with zpool get:
$ zpool list ""
cannot open '': name must begin with a letter
Avoid breaking the "interval cannot be zero" string.
There simply isn't a need for this, and it's user-facing.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Ahelenia Ziemiańska <[email protected]>
Closes #11841
Closes #11843
|
|
|
|
|
|
|
|
|
|
| |
Correct an assortment of typos throughout the code base.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Andrea Gelmini <[email protected]>
Closes #11774
|
|
|
|
|
|
| |
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Andrea Gelmini <[email protected]>
Closes #11775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Importing a pool using the cachefile is ideal to reduce the time
required to import a pool. However, if the devices associated with
a pool in the cachefile have changed, then the import would fail.
This can easily be corrected by doing a normal import which would
then read the pool configuration from the labels.
The goal of this change is make importing using a cachefile more
resilient and auto-correcting. This is accomplished by having
the cachefile import logic automatically fallback to reading the
labels of the devices similar to a normal import. The main difference
between the fallback logic and a normal import is that the cachefile
import logic will only look at the device directories that were
originally used when the cachefile was populated. Additionally,
the fallback logic will always import by guid to ensure that only
the pools in the cachefile would be imported.
External-issue: DLPX-71980
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: George Wilson <[email protected]>
Closes #11716
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A multpathed disk will have several 'underlying' paths to the disk. For
example, multipath disk 'dm-0' may be made up of paths:
/dev/{sda,sdb,sdc,sdd}. On many enclosures those underlying sysfs
paths will have a symlink back to their enclosure device entry
(like 'enclosure_device0/slot1'). This is used by the
statechange-led.sh script to set/clear the fault LED for a disk, and
by 'zpool status -c'.
However, on some enclosures, those underlying paths may not all have
symlinks back to the enclosure device. Maybe only two out of four
of them might.
This patch updates zfs_get_enclosure_sysfs_path() to favor returning
paths that have symlinks back to their enclosure devices, rather
than just returning the first path.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #11617
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Property to allow sets of features to be specified; for compatibility
with specific versions / releases / external systems. Influences
the behavior of 'zpool upgrade' and 'zpool create'. Initial man
page changes and test cases included.
Brief synopsis:
zpool create -o compatibility=off|legacy|file[,file...] pool vdev...
compatibility = off : disable compatibility mode (enable all features)
compatibility = legacy : request that no features be enabled
compatibility = file[,file...] : read features from specified files.
Only features present in *all* files will be enabled on the
resulting pool. Filenames may be absolute, or relative to
/etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d (/etc
checked first).
Only affects zpool create, zpool upgrade and zpool status.
ABI changes in libzfs:
* New function "zpool_load_compat" to load and parse compat sets.
* Add "zpool_compat_status_t" typedef for compatibility parse status.
* Add ZPOOL_PROP_COMPATIBILITY to the pool properties enum
* Add ZPOOL_STATUS_COMPATIBILITY_ERR to the pool status enum
An initial set of base compatibility sets are included in
cmd/zpool/compatibility.d, and the Makefile for cmd/zpool is
modified to install these in $pkgdatadir/compatibility.d and to
create symbolic links to a reasonable set of aliases.
Reviewed-by: ericloewe
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Colm Buckley <[email protected]>
Closes #11468
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order for cppcheck to perform a proper analysis it needs to be
aware of how the sources are compiled (source files, include
paths/files, extra defines, etc). All the needed information is
available from the Makefiles and can be leveraged with a generic
cppcheck Makefile target. So let's add one.
Additional minor changes:
* Removing the cppcheck-suppressions.txt file. With cppcheck 2.3
and these changes it appears to no longer be needed. Some inline
suppressions were also removed since they appear not to be
needed. We can add them back if it turns out they're needed
for older versions of cppcheck.
* Added the ax_count_cpus m4 macro to detect at configure time how
many processors are available in order to run multiple cppcheck
jobs. This value is also now used as a replacement for nproc
when executing the kernel interface checks.
* "PHONY =" line moved in to the Rules.am file which is included
at the top of all Makefile.am's. This is just convenient becase
it allows us to use the += syntax to add phony targets.
* One upside of this integration worth mentioning is it now allows
`make cppcheck` to be run in any directory to check that subtree.
* For the moment, cppcheck is not run against the FreeBSD specific
kernel sources. The cppcheck-FreeBSD target will need to be
implemented and testing on FreeBSD to support this.
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11508
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By default, FreeBSD does not allow zpools to be backed by zvols (that
can be changed with the "vfs.zfs.vol.recursive" sysctl). When that
sysctl is set to 0, the kernel does not attempt to read zvols when
looking for vdevs. But the zpool command still does. This change brings
the zpool command into line with the kernel's behavior. It speeds "zpool
import" when an already imported pool has many zvols, or a zvol with
many snapshots.
https://svnweb.freebsd.org/base?view=revision&revision=357235
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=241083
https://reviews.freebsd.org/D22077
Obtained from: FreeBSD
Reported by: Martin Birgmeier <[email protected]>
Sponsored by: Axcient
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Alan Somers <[email protected]>
Closes #11502
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Read all labels in parallel instead of sequentially.
Originally committed as
https://cgit.freebsd.org/src/commit/?id=b49e9abcf44cafaf5cfad7029c9a6adbb28346e8
Obtained from: FreeBSD
Sponsored by: Spectra Logic, Axcient
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Signed-off-by: Alan Somers <[email protected]>
Closes #11467
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zpool_read_label doesn't need the full labels including uberblocks. It
only needs the vdev_phys_t. This reduces by half the amount of data
read to check for a label, speeding up "zpool import", "zpool
labelclear", etc.
Originally committed as
https://cgit.freebsd.org/src/commit/?id=63f8025d6acab1b334373ddd33f940a69b3b54cc
Obtained from: FreeBSD
Sponsored by: Spectra Logic Corp, Axcient
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Signed-off-by: Alan Somers <[email protected]>
Closes #11467
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In `zpool_find_config()`, the `pools` nvlist is leaked. Part of it (a
sub-nvlist) is returned in `*configp`, but the callers also leak that.
Additionally, in `zdb.c:main()`, the `searchdirs` is leaked.
The leaks were detected by ASAN (`configure --enable-asan`).
This commit resolves the leaks.
Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #11396
|
|
|
|
|
|
|
|
|
|
|
| |
This shows up when compiling freebsd-head on amd64 using gcc-6.4.
The lib32 compat build ends up tripping over this assumption.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: adrian chadd <[email protected]>
Closes #11068
Closes #11069
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change updates the documentation to refer to the project
as OpenZFS instead ZFS on Linux. Web links have been updated
to refer to https://github.com/openzfs/zfs. The extraneous
zfsonlinux.org web links in the ZED and SPL sources have been
dropped.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11007
|
|
|
|
|
|
|
|
|
|
|
|
| |
libzutil depends on libnvpair, but this dependency is undeclared in the
build system. Therefore it isn't possible to make a new command that
depends on libzutil, but does not (directly) depend on libnvpair.
This commit makes this dependency explicit.
Reviewed-by: Brian Behlendorf <[email protected]>
Reivewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #10915
|
|
|
|
|
|
|
|
|
| |
Update the zfs commands such that they're backwards compatible with
the version of ZFS is the base FreeBSD.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #10542
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libtool stores absolute paths in the dependency_libs component of the
.la files. If the Makefile for a dependent library refers to the
libraries by relative path, some libraries end up duplicated on the link
command line.
As an example, libzfs specifies libzfs_core, libnvpair and libuutil as
dependencies to be linked in. The .la file for libzfs_core also
specifies libnvpair, but using an absolute path, with the result that
libnvpair is present twice in the linker command line for producing
libzfs.
While the only thing this causes is to slightly slow down the linking,
we can avoid it by using absolute paths everywhere, including for
convenience libraries just for consistency.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10538
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libzutil is currently statically linked into libzfs, libzfs_core and
libzpool. Avoid the unnecessary duplication by removing it from libzfs
and libzpool, and adding libzfs_core to libzpool.
Remove a few unnecessary dependencies:
- libuutil from libzfs_core
- libtirpc from libspl
- keep only libcrypto in libzfs, as we don't use any functions from
libssl
- librt is only used for clock_gettime, however on modern systems that's
in libc rather than librt. Add a configure check to see if we actually
need librt
- libdl from raidz_test
Add a few missing dependencies:
- zlib to libefi and libzfs
- libuuid to zpool, and libuuid and libudev to zed
- libnvpair uses assertions, so add assert.c to provide aok and
libspl_assertf
Sort the LDADD for programs so that libraries that satisfy dependencies
come at the end rather than the beginning of the linker command line.
Revamp the configure tests for libaries to use FIND_SYSTEM_LIBRARY
instead. This can take advantage of pkg-config, and it also avoids
polluting LIBS.
List all the required dependencies in the pkgconfig files, and move the
one for libzfs_core into the latter's directory. Install pkgconfig files
in $(libdir)/pkgconfig on linux and $(prefix)/libdata/pkgconfig on
FreeBSD, instead of /usr/share/pkgconfig, as the more correct location
for library .pc files.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10538
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reduce the usage of EXTRA_DIST. If files are conditionally included in
_SOURCES, _HEADERS etc, automake is smart enough to dist all files that
could possibly be included, but this does not apply to EXTRA_DIST,
resulting in make dist depending on the configuration.
Add some files that were missing altogether in various Makefile's.
The changes to disted files in this commit (excluding deleted files):
+./cmd/zed/agents/README.md
+./etc/init.d/README.md
+./lib/libspl/os/freebsd/getexecname.c
+./lib/libspl/os/freebsd/gethostid.c
+./lib/libspl/os/freebsd/getmntany.c
+./lib/libspl/os/freebsd/mnttab.c
-./lib/libzfs/libzfs_core.pc
-./lib/libzfs/libzfs.pc
+./lib/libzfs/os/freebsd/libzfs_compat.c
+./lib/libzfs/os/freebsd/libzfs_fsshare.c
+./lib/libzfs/os/freebsd/libzfs_ioctl_compat.c
+./lib/libzfs/os/freebsd/libzfs_zmount.c
+./lib/libzutil/os/freebsd/zutil_compat.c
+./lib/libzutil/os/freebsd/zutil_device_path_os.c
+./lib/libzutil/os/freebsd/zutil_import_os.c
+./module/lua/README.zfs
+./module/os/linux/spl/README.md
+./tests/README.md
+./tests/zfs-tests/tests/functional/cli_root/zfs_clone/zfs_clone_rm_nested.ksh
+./tests/zfs-tests/tests/functional/cli_root/zfs_send/zfs_send_encrypted_unloaded.ksh
+./tests/zfs-tests/tests/functional/inheritance/README.config
+./tests/zfs-tests/tests/functional/inheritance/README.state
+./tests/zfs-tests/tests/functional/rsend/rsend_016_neg.ksh
+./tests/zfs-tests/tests/perf/fio/sequential_readwrite.fio
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10501
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These terms reinforce the incorrect notion that black is bad and white
is good.
Replace this language with more specific terms which are also more clear
and don't rely on metaphor. Specifically:
* When vdevs are specified on the command line, they are the "selected"
vdevs.
* Entries in /dev/ which should not be considered as possible disks are
"excluded" devices.
Reviewed-by: John Kennedy <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Wilson <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #10457
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The horrible effects of human slavery continue to impact society. The
casual use of the term "slave" in computer software is an unnecessary
reference to a painful human experience.
This commit removes all possible references to the term "slave".
Implementation notes:
The zpool.d/slaves script is renamed to dm-deps, which uses the same
terminology as `dmsetup deps`.
References to the `/sys/class/block/$dev/slaves` directory remain. This
directory name is determined by the Linux kernel. Although
`dmsetup deps` provides the same information, it unfortunately requires
elevated privileges, whereas the `/sys/...` directory is world-readable.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #10435
|
|
|
|
|
|
|
|
|
|
| |
cmd/zpool and lib/libzutil Makefile's use -I., which won't work with a
VPATH build. Replace it with -I$(srcdir) instead.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arvind Sankar <[email protected]>
Closes #10379
Closes #10421
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the FreeBSD platform code to the OpenZFS repository. As of this
commit the source can be compiled and tested on FreeBSD 11 and 12.
Subsequent commits are now required to compile on FreeBSD and Linux.
Additionally, they must pass the ZFS Test Suite on FreeBSD which is
being run by the CI. As of this commit 1230 tests pass on FreeBSD
and there are no unexpected failures.
Reviewed-by: Sean Eric Fagan <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Co-authored-by: Ryan Moeller <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #898
Closes #8987
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD needs a wrapper for handling zfs_cmd ioctls.
In libzfs this is handled by zfs_ioctl. However, here
we need to wrap the call directly.
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9511
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Linux the full path preceding devices is stripped when formatting
vdev names. On FreeBSD we only want to strip "/dev/". Hide the
implementation details of path stripping behind zfs_strip_path().
Make zfs_strip_partition_path() static in Linux implementation while
here, since it is never used outside of the file it is defined in.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #9565
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD has no analog. Buffered block devices were removed a decade
plus ago.
Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9508
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 0.7.0, zpool import would unconditionally block on udev for 30
seconds. This introduced a regression in initramfs environments that
lack udev (particularly mdev based environments), yet use a zfs userland
tools intended for the system that had been built against udev. Gentoo's
genkernel is the main example, although custom user initramfs
environments would be similarly impacted unless special builds of the
ZFS userland utilities were done for them. Such environments already
have their own mechanisms for blocking until device nodes are ready
(such as genkernel's scandelay parameter), so it is unnecessary for
zpool import to block on a non-existent udev until a timeout is reached
inside of them.
Rather than trying to intelligently determine whether udev is available
on the system to avoid unnecessarily blocking in such environments, it
seems best to just allow the environment to override the timeout. I
propose that we add an environment variable called
ZPOOL_IMPORT_UDEV_TIMEOUT_MS. Setting it to 0 would restore the 0.6.x
behavior that was more desirable in mdev based initramfs environments.
This allows the system user land utilities to be reused when building
mdev-based initramfs archives.
Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Georgy Yakovlev <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #9436
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements a new tree structure for ZFS, and uses it to
store range trees more efficiently.
The new structure is approximately a B-tree, though there are some
small differences from the usual characterizations. The tree has core
nodes and leaf nodes; each contain data elements, which the elements
in the core nodes acting as separators between its children. The
difference between core and leaf nodes is that the core nodes have an
array of children, while leaf nodes don't. Every node in the tree may
be only partially full; in most cases, they are all at least 50% full
(in terms of element count) except for the root node, which can be
less full. Underfull nodes will steal from their neighbors or merge to
remain full enough, while overfull nodes will split in two. The data
elements are contained in tree-controlled buffers; they are copied
into these on insertion, and overwritten on deletion. This means that
the elements are not independently allocated, which reduces overhead,
but also means they can't be shared between trees (and also that
pointers to them are only valid until a side-effectful tree operation
occurs). The overhead varies based on how dense the tree is, but is
usually on the order of about 50% of the element size; the per-node
overheads are very small, and so don't make a significant difference.
The trees can accept arbitrary records; they accept a size and a
comparator to allow them to be used for a variety of purposes.
The new trees replace the AVL trees used in the range trees today.
Currently, the range_seg_t structure contains three 8 byte integers
of payload and two 24 byte avl_tree_node_ts to handle its storage in
both an offset-sorted tree and a size-sorted tree (total size: 64
bytes). In the new model, the range seg structures are usually two 4
byte integers, but a separate one needs to exist for the size-sorted
and offset-sorted tree. Between the raw size, the 50% overhead, and
the double storage, the new btrees are expected to use 8*1.5*2 = 24
bytes per record, or 33.3% as much memory as the AVL trees (this is
for the purposes of storing metaslab range trees; for other purposes,
like scrubs, they use ~50% as much memory).
We reduced the size of the payload in the range segments by teaching
range trees about starting offsets and shifts; since metaslabs have a
fixed starting offset, and they all operate in terms of disk sectors,
we can store the ranges using 4-byte integers as long as the size of
the metaslab divided by the sector size is less than 2^32. For 512-byte
sectors, this is a 2^41 (or 2TB) metaslab, which with the default
settings corresponds to a 256PB disk. 4k sector disks can handle
metaslabs up to 2^46 bytes, or 2^63 byte disks. Since we do not
anticipate disks of this size in the near future, there should be
almost no cases where metaslabs need 64-byte integers to store their
ranges. We do still have the capability to store 64-byte integer ranges
to account for cases where we are storing per-vdev (or per-dnode) trees,
which could reasonably go above the limits discussed. We also do not
store fill information in the compact version of the node, since it
is only used for sorted scrub.
We also optimized the metaslab loading process in various other ways
to offset some inefficiencies in the btree model. While individual
operations (find, insert, remove_from) are faster for the btree than
they are for the avl tree, remove usually requires a find operation,
while in the AVL tree model the element itself suffices. Some clever
changes actually caused an overall speedup in metaslab loading; we use
approximately 40% less cpu to load metaslabs in our tests on Illumos.
Another memory and performance optimization was achieved by changing
what is stored in the size-sorted trees. When a disk is heavily
fragmented, the df algorithm used by default in ZFS will almost always
find a number of small regions in its initial cursor-based search; it
will usually only fall back to the size-sorted tree to find larger
regions. If we increase the size of the cursor-based search slightly,
and don't store segments that are smaller than a tunable size floor
in the size-sorted tree, we can further cut memory usage down to
below 20% of what the AVL trees store. This also results in further
reductions in CPU time spent loading metaslabs.
The 16KiB size floor was chosen because it results in substantial memory
usage reduction while not usually resulting in situations where we can't
find an appropriate chunk with the cursor and are forced to use an
oversized chunk from the size-sorted tree. In addition, even if we do
have to use an oversized chunk from the size-sorted tree, the chunk
would be too small to use for ZIL allocations, so it isn't as big of a
loss as it might otherwise be. And often, more small allocations will
follow the initial one, and the cursor search will now find the
remainder of the chunk we didn't use all of and use it for subsequent
allocations. Practical testing has shown little or no change in
fragmentation as a result of this change.
If the size-sorted tree becomes empty while the offset sorted one still
has entries, it will load all the entries from the offset sorted tree
and disregard the size floor until it is unloaded again. This operation
occurs rarely with the default setting, only on incredibly thoroughly
fragmented pools.
There are some other small changes to zdb to teach it to handle btrees,
but nothing major.
Reviewed-by: George Wilson <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed by: Sebastien Roy [email protected]
Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes #9181
|
|
|
|
|
|
|
|
| |
Factor Linux specific functionality out of libzutil.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #9356
|
|
|
|
|
|
|
|
|
| |
Factor Linux specific pieces out of libspl.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Sean Eric Fagan <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9336
|
|
|
|
|
|
|
|
|
|
|
| |
Factor Linux specific functions out of the zpool command.
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Sean Eric Fagan <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9333
|
|
|
|
|
|
|
| |
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Andrea Gelmini <[email protected]>
Closes #9237
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Preferentially sort by the full path name instead of GUID when determining
which device links to use. This helps ensure that the pool vdevs are named
consistently when multiple links for a device appear in the same directory.
For example, the /dev/disk/by-id/scsi* and /dev/disk/by-id/wwn* links.
Reviewed-by: Alek Pinchuk <[email protected]>
Reviewed-by: Richard Elling <[email protected]>
Authored-by: Brian Behlendorf <[email protected]>
Signed-off-by: Kash Pande <[email protected]>
Closes #8108
Closes #8440
|
|
|
|
|
|
|
|
|
|
| |
ZFS should be able to build without libudev installed. The recent
change for libzutil inadvertently broke that. Make the libudev code
conditional in zutil_import.c to resolve the build failure.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #8097
|
|
Adds a libzutil for utility functions that are common to libzfs and
libzpool consumers (most of what was in libzfs_import.c). This
removes the need for utilities to link against both libzpool and
libzfs.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #8050
|