| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move platform specific Linux headers under include/os/linux/.
Update the build system accordingly to detect the platform.
This lays some of the initial groundwork to supporting building
for other platforms.
As part of this change it was necessary to create both a user
and kernel space sys/simd.h header which can be included in
either context. No functional change, the source has been
refactored and the relevant #include's updated.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Igor Kozhukhov <[email protected]>
Signed-off-by: Matthew Macy <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #9198
|
|
|
|
|
|
|
| |
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Andrea Gelmini <[email protected]>
Closes #9232
|
|
|
|
|
|
|
|
|
|
| |
Resolve an assortment of style inconsistencies including
use of white space, typos, capitalization, and line wrapping.
There is no functional change.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #9030
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Restore the SIMD optimization for 4.19.38 LTS, 4.14.120 LTS,
and 5.0 and newer kernels. This is accomplished by leveraging
the fact that by definition dedicated kernel threads never need
to concern themselves with saving and restoring the user FPU state.
Therefore, they may use the FPU as long as we can guarantee user
tasks always restore their FPU state before context switching back
to user space.
For the 5.0 and 5.1 kernels disabling preemption and local
interrupts is sufficient to allow the FPU to be used. All non-kernel
threads will restore the preserved user FPU state.
For 5.2 and latter kernels the user FPU state restoration will be
skipped if the kernel determines the registers have not changed.
Therefore, for these kernels we need to perform the additional
step of saving and restoring the FPU registers. Invalidating the
per-cpu global tracking the FPU state would force a restore but
that functionality is private to the core x86 FPU implementation
and unavailable.
In practice, restricting SIMD to kernel threads is not a major
restriction for ZFS. The vast majority of SIMD operations are
already performed by the IO pipeline. The remaining cases are
relatively infrequent and can be handled by the generic code
without significant impact. The two most noteworthy cases are:
1) Decrypting the wrapping key for an encrypted dataset,
i.e. `zfs load-key`. All other encryption and decryption
operations will use the SIMD optimized implementations.
2) Generating the payload checksums for a `zfs send` stream.
In order to avoid making any changes to the higher layers of ZFS
all of the `*_get_ops()` functions were updated to take in to
consideration the calling context. This allows for the fastest
implementation to be used as appropriate (see kfpu_allowed()).
The only other notable instance of SIMD operations being used
outside a kernel thread was at module load time. This code
was moved in to a taskq in order to accommodate the new kernel
thread restriction.
Finally, a few other modifications were made in order to further
harden this code and facilitate testing. They include updating
each implementations operations structure to be declared as a
constant. And allowing "cycle" to be set when selecting the
preferred ops in the kernel as well as user space.
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #8754
Closes #8793
Closes #8965
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit ensures make(1) targets that build .deb packages fail if
alien(1) can't convert all .rpm files; additionally it also updates
the zfs-dracut package name which was changed to "noarch" in ca4e5a7.
Reviewed-by: Neal Gompa <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Olaf Faaland <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #8990
Closes #8991
|
|
|
|
|
|
|
|
|
|
|
| |
Don't require Python at configure/build unless building pyzfs.
Move ZFS_AC_PYTHON_MODULE to always-pyzfs.m4 where it is used.
Make test syntax more consistent.
Sponsored by: iXsystems, Inc.
Reviewed-by: Neal Gompa <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #8895
|
|
|
|
|
|
|
|
| |
If ZFS is built with enable_linux_builtin, it seems to be possible
to compile the kernel with TRIM_UNUSED_KSYM.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Torsten Wörtwein <[email protected]>
Closes #8820
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, --without-python would cause ./configure to fail. Now it is
able to proceed, and the Python scripts will not be built.
Use portable parameter expansion matching instead of nonstandard
substring matching to detect the Python version. This test is
duplicated in several places, so define a function for it.
Don't assume the full path to binaries, since different platforms do
install things in different places. Use AC_CHECK_PROGS instead.
When building without Python, also build without pyzfs.
Sponsored by: iXsystems, Inc.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Eli Schwartz <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #8809
Closes #8731
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Per suggestion from @behlendorf in #8777, remove vn_set_fs_pwd() and
vn_set_pwd() which are only used in zfs_ioctl.c:_init() while loading
zfs.ko.
The rest of initialization functions being called here after cwd set
to / don't depend on cwd of the process except for spa_config_load().
spa_config_load() uses a relative path ".//etc/zfs/zpool.cache" when
`rootdir` is non-NULL, which is "/etc/zfs/zpool.cache" given cwd is /,
so just unconditionally use the absolute path without "./", so that
`vn_set_pwd("/")` as well as the entire functions can be removed.
This is also what FreeBSD does.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Tomohiro Kusumi <[email protected]>
Closes #8826
|
|
|
|
|
|
|
|
|
|
|
|
| |
"whether ->count_objects callback exists" test failed with
"error: error" message for using an incomplete function shrinker_cb().
This is caused by torvalds/linux@83da1bed86. It's configurable,
but we would want to be able to compile with default kbuild setting.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Tomohiro Kusumi <[email protected]>
Closes #8776
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This failed on 5.2-rc1 with "error: unknown" message, for set_fs_pwd()
not being visible in both const and non-const tests.
This is caused by torvalds/linux@83da1bed86. It's configurable,
but we would want to be able to compile with default kbuild setting.
set_fs_pwd() has never been exported with exception of some distro
kernels, and set_fs_pwd() wasn't used in ZoL to begin with. The test
result was used for a spl function vn_set_fs_pwd().
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Tomohiro Kusumi <[email protected]>
Closes #8777
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
kstrtoul() exists only after torvalds/linux@33ee3b2e2eb9 in 2.6.39.
Use strict_strtoul() if kstrtoul() doesn't exist.
Note that strict_strtoul() has existed as an alias for kstrtoul()
for a while, but removed in torvalds/linux@3db2e9cdc085.
It looks like RHEL6 (2.6.32 based) has backported kstrtoul(),
and this caused build CI to pass compilation test.
It should fail on vanilla < 2.6.39 kernels or distro kernels without
backport as reported in #8760.
--
# grep "kstrtoul(" /lib/modules/2.6.32-754.12.1.el6.x86_64/build/ \
include/linux/kernel.h >/dev/null
# echo $?
0
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Tomohiro Kusumi <[email protected]>
Closes #8760
Closes #8761
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In `config/kernel-timer.m4` refactor slightly to check more generally
for the new `timer_setup()` APIs, but also check the callback signature
because some kernels (notably 4.14) have the new `timer_setup()` API but
use the old callback signature. Also add a check for a `flags` member in
`struct timer_list`, which was added in 4.1-rc8.
Add compatibility shims to `include/spl/sys/timer.h` to allow using the
new timer APIs with the only two caveats being that the callback
argument type must be declared as `spl_timer_list_t` and an explicit
assignment is required to get the timer variable for the `timer_of()`
macro. So the callback would look like this:
```c
__cv_wakeup(spl_timer_list_t t)
{
struct timer_list *tmr = (struct timer_list *)t;
struct thing *parent = from_timer(parent, tmr,
parent_timer_field);
... /* do stuff with parent */
```
Make some minor changes to `spl-condvar.c` and `spl-taskq.c` to use the
new timer APIs instead of conditional code.
Reviewed-by: Tomohiro Kusumi <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rafael Kitover <[email protected]>
Closes #8647
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Linux kernel commit ca79b0c211af63fa3276f0e3fd7dd9ada2439839
"mm: convert totalram_pages and totalhigh_pages variables to atomic"
replaced `totalhigh_pages` with an inline function `totalhigh_pages()`.
This broke compilation on IA32, etc, as ZoL uses `totalhigh_pages`
on archs with highmem. Confirmed on Fedora 30 (5.0.9-301.fc30.i686).
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tomohiro Kusumi <[email protected]>
Closes #8677
Closes #8701
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`make distclean` removes an empty file config/config.rpath.
Avoid that by adding some text.
Also see e1245d83e9("Prevent `make distclean` removing 0 sized file").
--
# find . -size 0
./config/config.rpath
# ./autogen.sh && ./configure
# git diff
# make distclean
# git diff
diff --git a/config/config.rpath b/config/config.rpath
deleted file mode 100644
index e69de29bb..000000000
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tomohiro Kusumi <[email protected]>
Closes #8665
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Detect in autoconf whether `-lintl` and possibly `-liconv` are necessary
for translation functions like `gettext()`.
The actual autoconf code is just:
```
AM_ICONV
AM_GNU_GETTEXT([external])
LIBS="$LIBS $LTLIBINTL $LTLIBICONV"
```
References:
https://www.gnu.org/software/gettext/manual/html_node/AM_005fGNU_005fGETTEXT.html
https://www.gnu.org/software/gettext/manual/html_node/AM_005fICONV.html
The reason to check for `libiconv` and add it separately is that this is
sometimes necessary if users are linking statically.
The `config/*.m4` files were added by running `gettextize` and removing
everything else.
The empty file `config/config.rpath` is necessary to avoid an error with
some versions of autotools, see:
http://ramblingfoo.blogspot.com/2007/07/required-file-configrpath-not-found.html
The `config.rpath` copied by `gettextize` does not currently work, there
is some kind of missing interaction with `libtool` and it tries to apply
`libtool` flags to the compiler.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Signed-off-by: Rafael Kitover <[email protected]>
Closes #8554
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By default, depending on the version, gcc can reuse the frame pointer register.
This is a micro-optimization that might help on some very old x86 processors.
However, it also makes dynamic tracing less useful because the stacks cannot
be easily observed.
This rule change instructs gcc to use the -fno-omit-frame-pointer option
when compiling.
Reviewed-by: Olaf Faaland <[email protected]>
Reviewed by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Elling <[email protected]>
Closes #8617
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
UNMAP/TRIM support is a frequently-requested feature to help
prevent performance from degrading on SSDs and on various other
SAN-like storage back-ends. By issuing UNMAP/TRIM commands for
sectors which are no longer allocated the underlying device can
often more efficiently manage itself.
This TRIM implementation is modeled on the `zpool initialize`
feature which writes a pattern to all unallocated space in the
pool. The new `zpool trim` command uses the same vdev_xlate()
code to calculate what sectors are unallocated, the same per-
vdev TRIM thread model and locking, and the same basic CLI for
a consistent user experience. The core difference is that
instead of writing a pattern it will issue UNMAP/TRIM commands
for those extents.
The zio pipeline was updated to accommodate this by adding a new
ZIO_TYPE_TRIM type and associated spa taskq. This new type makes
is straight forward to add the platform specific TRIM/UNMAP calls
to vdev_disk.c and vdev_file.c. These new ZIO_TYPE_TRIM zios are
handled largely the same way as ZIO_TYPE_READs or ZIO_TYPE_WRITEs.
This makes it possible to largely avoid changing the pipieline,
one exception is that TRIM zio's may exceed the 16M block size
limit since they contain no data.
In addition to the manual `zpool trim` command, a background
automatic TRIM was added and is controlled by the 'autotrim'
property. It relies on the exact same infrastructure as the
manual TRIM. However, instead of relying on the extents in a
metaslab's ms_allocatable range tree, a ms_trim tree is kept
per metaslab. When 'autotrim=on', ranges added back to the
ms_allocatable tree are also added to the ms_free tree. The
ms_free tree is then periodically consumed by an autotrim
thread which systematically walks a top level vdev's metaslabs.
Since the automatic TRIM will skip ranges it considers too small
there is value in occasionally running a full `zpool trim`. This
may occur when the freed blocks are small and not enough time
was allowed to aggregate them. An automatic TRIM and a manual
`zpool trim` may be run concurrently, in which case the automatic
TRIM will yield to the manual TRIM.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Tim Chase <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: George Wilson <[email protected]>
Reviewed-by: Serapheim Dimitropoulos <[email protected]>
Contributions-by: Saso Kiselkov <[email protected]>
Contributions-by: Tim Chase <[email protected]>
Contributions-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #8419
Closes #598
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a few issues when detecting which kernel_fpu functions
are available.
- Use kernel_fpu_begin() if it's exported on newer kernels.
- Use ZFS_LINUX_TRY_COMPILE_SYMBOL() to choose the right kernel_fpu
function when using --enable-linux-builtin.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #8259
Closes #8363
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improve the autoconf code for finding libtirpc and do not assume the
headers are in /usr/include/tirpc.
Also remove this assumption from the `rpc/xdr.h` header in libspl and
use the same `#include_next` mechanism that is used for other libspl
headers.
Include pkg.m4 from pkg-config in config/ for PKG_CHECK_MODULES(), the
file license allows this.
Include ax_save_flags.m4 and ax_restore_flags.m4 from autoconf-archive,
the file licenses are compatible. Use the 2012 versions so as not rely
on a more recent autoconf feature AS_VAR_COPY(), which breaks some build
slaves.
Add new macro library `config/find_system_library.m4` which defines the
FIND_SYSTEM_LIBRARY() macro which is a convenience wrapper over using
PKG_CHECK_MODULES() with a fallback to standard library locations and
some sanity checks.
The parameters are:
```
FIND_SYSTEM_LIBRARY(VARIABLE-PREFIX, MODULE, HEADER, HEADER-PREFIXES,
LIBRARY, FUNCTIONS, [ACTION-IF-FOUND], [ACTION-IF-NOT-FOUND])
```
`HEADER-PREFIXES` and `FUNCTIONS` are comma-separated m4 lists.
For libtirpc we are using:
```
FIND_SYSTEM_LIBRARY(LIBTIRPC, [libtirpc], [rpc/xdr.h], [tirpc], [tirpc],
[xdrmem_create], [], [...])
```
The headers are first checked for without the prefixes and then with.
This system works with pkg-config and falls back on checking standard
header/library locations, it can be easily overridden by the user by
setting the `PREFIX_CFLAGS` and `PREFIX_LIBS` variables which are
automatically added to the `./configure --help` output.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Rafael Kitover <[email protected]>
Closes #7422
Closes #8313
|
|
|
|
|
|
|
|
|
|
|
| |
The Linux 5.0 kernel updated the bio_set_dev() macro so it calls the
GPL-only bio_associate_blkg() symbol thus inadvertently converting
the entire macro. Provide a minimal version which always assigns the
request queue's root_blkg to the bio.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #8287
|
|
|
|
|
|
|
|
|
|
| |
The 5.0 kernel no longer exports the functions we need to do vector
(SSE/SSE2/SSE3/AVX...) instructions. Disable vector-based checksum
algorithms when building against those kernels.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #8259
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
totalram_pages() was converted to an atomic variable in 5.0:
https://patchwork.kernel.org/patch/10652795/
Its value should now be read though the totalram_pages() helper
function.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #8263
|
|
|
|
|
|
|
|
| |
access_ok no longer needs a 'type' parameter in the 5.0 kernel.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #8261
|
|
|
|
|
|
|
|
|
| |
Newer kernels remove current_kernel_time64(). Use
ktime_get_coarse_real_ts64() in its place.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #8258
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Almost all of the Python code in the respository has been updated
to be compatibile with Python 2.6, Python 3.4, or newer. The only
exceptions are arc_summery3.py which requires Python 3, and pyzfs
which requires at least Python 2.7. This allows us to maintain a
single version of the code and support most default versions of
python. This change does the following:
* Sets the default shebang for all Python scripts to python3. If
only Python 2 is available, then at install time scripts which
are compatible with Python 2 will have their shebangs replaced
with /usr/bin/python. This is done for compatibility until
Python 2 goes end of life. Since only the installed versions
are changed this means Python 3 must be installed on the system
for test-runner when testing in-tree.
* Added --with-python=<2|3|3.4,etc> configure option which sets
the PYTHON environment variable to target a specific python
version. By default the newest installed version of Python
will be used or the preferred distribution version when
creating pacakges.
* Fixed --enable-pyzfs configure checks so they are run when
--enable-pyzfs=check and --enable-pyzfs=yes.
* Enabled pyzfs for Python 3.4 and newer, which is now supported.
* Renamed pyzfs package to python<VERSION>-pyzfs and updated to
install in the appropriate site location. For example, when
building with --with-python=3.4 a python34-pyzfs will be
created which installs in /usr/lib/python3.4/site-packages/.
* Renamed the following python scripts according to the Fedora
guidance for packaging utilities in /bin
- dbufstat.py -> dbufstat
- arcstat.py -> arcstat
- arc_summary.py -> arc_summary
- arc_summary3.py -> arc_summary3
* Updated python-cffi package name. On CentOS 6, CentOS 7, and
Amazon Linux it's called python-cffi, not python2-cffi. For
Python3 it's called python3-cffi or python3x-cffi.
* Install one version of arc_summary. Depending on the version
of Python available install either arc_summary2 or arc_summary3
as arc_summary. The user output is only slightly different.
Reviewed-by: John Ramsden <[email protected]>
Reviewed-by: Neal Gompa <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #8096
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This partially reverts commit 8005ca4 by moving the strlcat()
and strlcpy() compatibility implementations back to their original
location.
In addition, these two functions were added to the AC_CHECK_FUNCS
macro. When these functions are available from the C library,
HAVE_STRLCAT and HAVE_STRLCPY will be defined and library version
used. Otherwise the compatibility version is built.
Reviewed-by: Sebastian Gottschall <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #8157
Closes #8202
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Macro ZFS_MINOR, introduced in commit a6cc9756 to record the chosen
static minor number for /dev/zfs, conflicts with an existing macro
in Lustre. The lustre macro (along with _MAJOR, _PATCH, _FIX) is
used to record the zfsonlinux version Lustre is being built against.
Since the Lustre macro came first, and is used in past versions of
lustre at least going back to 2.10, it makes sense to rename the
macro in ZFS instead of doing so in Lustre which would require
backporting the patch.
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Olaf Faaland <[email protected]>
Closes #8195
|
|
|
|
|
|
|
|
|
|
| |
This fixes the build when cross-compiling, where the preprocessor might
be prefixed.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Ben Wolsieffer <[email protected]>
Closes #8180
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure that the _unitdir, _presetdir, _modulesloaddir, and
_systemdgeneratordir macros are always defined. If not set
them to the expected default values. Pass all of these options
to ./configure and package the resulting files in those locations.
Additionally, set __brp_mangle_shebangs_exclude_from until the
conversion to Python 3 is complete so they may be built cleanly
under mock.
Reviewed-by: Neal Gompa <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7567
Closes #8119
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When handling a 32-bit statfs() system call the returned fields,
although 64-bit in the kernel, must be limited to 32-bits or an
EOVERFLOW error will be returned.
This is less of an issue for block counts since the default
reported block size in 128KiB. But since it is possible to
set a smaller block size, these values will be scaled as
needed to fit in a 32-bit unsigned long.
Unlike most other filesystems the total possible file counts
are more likely to overflow because they are calculated based
on the available free space in the pool. In order to prevent
this the reported value must be capped at 2^32-1. This is
only for statfs(2) reporting, there are no changes to the
internal ZFS limits.
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #7927
Closes #7122
Closes #7937
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Direct IO via the O_DIRECT flag was originally introduced in XFS by
IRIX for database workloads. Its purpose was to allow the database
to bypass the page and buffer caches to prevent unnecessary IO
operations (e.g. readahead) while preventing contention for system
memory between the database and kernel caches.
On Illumos, there is a library function called directio(3C) that
allows user space to provide a hint to the file system that Direct IO
is useful, but the file system is free to ignore it. The semantics
are also entirely a file system decision. Those that do not
implement it return ENOTTY.
Since the semantics were never defined in any standard, O_DIRECT is
implemented such that it conforms to the behavior described in the
Linux open(2) man page as follows.
1. Minimize cache effects of the I/O.
By design the ARC is already scan-resistant which helps mitigate
the need for special O_DIRECT handling. Data which is only
accessed once will be the first to be evicted from the cache.
This behavior is in consistent with Illumos and FreeBSD.
Future performance work may wish to investigate the benefits of
immediately evicting data from the cache which has been read or
written with the O_DIRECT flag. Functionally this behavior is
very similar to applying the 'primarycache=metadata' property
per open file.
2. O_DIRECT _MAY_ impose restrictions on IO alignment and length.
No additional alignment or length restrictions are imposed.
3. O_DIRECT _MAY_ perform unbuffered IO operations directly
between user memory and block device.
No unbuffered IO operations are currently supported. In order
to support features such as transparent compression, encryption,
and checksumming a copy must be made to transform the data.
4. O_DIRECT _MAY_ imply O_DSYNC (XFS).
O_DIRECT does not imply O_DSYNC for ZFS. Callers must provide
O_DSYNC to request synchronous semantics.
5. O_DIRECT _MAY_ disable file locking that serializes IO
operations. Applications should avoid mixing O_DIRECT
and normal IO or mmap(2) IO to the same file. This is
particularly true for overlapping regions.
All I/O in ZFS is locked for correctness and this locking is not
disabled by O_DIRECT. However, concurrently mixing O_DIRECT,
mmap(2), and normal I/O on the same file is not recommended.
This change is implemented by layering the aops->direct_IO operations
on the existing AIO operations. Code already existed in ZFS on Linux
for bypassing the page cache when O_DIRECT is specified.
References:
* http://xfs.org/docs/xfsdocs-xml-dev/XFS_User_Guide/tmp/en-US/html/ch02s09.html
* https://blogs.oracle.com/roch/entry/zfs_and_directio
* https://ext4.wiki.kernel.org/index.php/Clarifying_Direct_IO's_Semantics
* https://illumos.org/man/3c/directio
Reviewed-by: Richard Elling <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #224
Closes #7823
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Add two new module parameters to icp (icp_aes_impl, icp_gcm_impl)
that control the crypto implementation. At the moment there is a
choice between generic and aesni (on platforms that support it).
- This enables support for AES-NI and PCLMULQDQ-NI on AMD Family
15h (bulldozer) and newer CPUs (zen).
- Modify aes_key_t to track what implementation it was generated
with as key schedules generated with various implementations
are not necessarily interchangable.
Reviewed by: Gvozden Neskovic <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tom Caputi <[email protected]>
Reviewed-by: Richard Laager <[email protected]>
Signed-off-by: Nathaniel R. Lewis <[email protected]>
Closes #7102
Closes #7103
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
While the autoexpand property may seem like a small feature it
depends on a significant amount of system infrastructure. Enough
of that infrastructure is now in place that with a few modifications
for Linux it can be supported.
Auto-expand works as follows; when a block device is modified
(re-sized, closed after being open r/w, etc) a change uevent is
generated for udev. The ZED, which is monitoring udev events,
passes the change event along to zfs_deliver_dle() if the disk
or partition contains a zfs_member as identified by blkid.
From here the device is matched against all imported pool vdevs
using the vdev_guid which was read from the label by blkid. If
a match is found the ZED reopens the pool vdev. This re-opening
is important because it allows the vdev to be briefly closed so
the disk partition table can be re-read. Otherwise, it wouldn't
be possible to report the maximum possible expansion size.
Finally, if the property autoexpand=on a vdev expansion will be
attempted. After performing some sanity checks on the disk to
verify that it is safe to expand, the primary partition (-part1)
will be expanded and the partition table updated. The partition
is then re-opened (again) to detect the updated size which allows
the new capacity to be used.
In order to make all of the above possible the following changes
were required:
* Updated the zpool_expand_001_pos and zpool_expand_003_pos tests.
These tests now create a pool which is layered on a loopback,
scsi_debug, and file vdev. This allows for testing of non-
partitioned block device (loopback), a partition block device
(scsi_debug), and a file which does not receive udev change
events. This provided for better test coverage, and by removing
the layering on ZFS volumes there issues surrounding layering
one pool on another are avoided.
* zpool_find_vdev_by_physpath() updated to accept a vdev guid.
This allows for matching by guid rather than path which is a
more reliable way for the ZED to reference a vdev.
* Fixed zfs_zevent_wait() signal handling which could result
in the ZED spinning when a signal was not handled.
* Removed vdev_disk_rrpart() functionality which can be abandoned
in favor of kernel provided blkdev_reread_part() function.
* Added a rwlock which is held as a writer while a disk is being
reopened. This is important to prevent errors from occurring
for any configuration related IOs which bypass the SCL_ZIO lock.
The zpool_reopen_007_pos.ksh test case was added to verify IO
error are never observed when reopening. This is not expected
to impact IO performance.
Additional fixes which aren't critical but were discovered and
resolved in the course of developing this functionality.
* Added PHYS_PATH="/dev/zvol/dataset" to the vdev configuration for
ZFS volumes. This is as good as a unique physical path, while the
volumes are not used in the test cases anymore for other reasons
this improvement was included.
Reviewed by: Richard Elling <[email protected]>
Signed-off-by: Sara Hartse <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #120
Closes #2437
Closes #5771
Closes #7366
Closes #7582
Closes #7629
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The blk_queue_stackable() function was replaced in the 4.14 kernel
by queue_is_rq_based(), commit torvalds/linux@5fdee212. This change
resulted in the default elevator being used which can negatively
impact performance.
Rather than adding additional compatibility code to detect the
new interface unconditionally attempt to set the elevator. Since
we expect this to fail for block devices without an elevator the
error message has been moved in to zfs_dbgmsg().
Finally, it was observed that the elevator_change() was removed
from the 4.12 kernel, commit torvalds/linux@c033269. Update the
comment to clearly specify which are expected to export the
elevator_change() symbol.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7645
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit torvalds/linux@95582b0 changes the inode i_atime, i_mtime,
and i_ctime members form timespec's to timespec64's to make them
2038 safe. As part of this change the current_time() function was
also updated to return the timespec64 type.
Resolve this issue by introducing a new inode_timespec_t type which
is defined to match the timespec type used by the inode. It should
be used when working with inode timestamps to ensure matching types.
The timestruc_t type under Illumos was used in a similar fashion but
was specified to always be a timespec_t. Rather than incorrectly
define this type all timespec_t types have been replaced by the new
inode_timespec_t type.
Finally, the kernel and user space 'sys/time.h' headers were aligned
with each other. They define as appropriate for the context several
constants as macros and include static inline implementation of
gethrestime(), gethrestime_sec(), and gethrtime().
Reviewed-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7643
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added support for the bops->check_events() interface which was
added in the 2.6.38 kernel to replace bops->media_changed().
Fully implementing this functionality allows the volume resize
code to rely on revalidate_disk(), which is the preferred
mechanism, and removes the need to use check_disk_size_change().
In order for bops->check_events() to lookup the zvol_state_t
stored in the disk->private_data the zvol_state_lock needs to
be held. Since the check events interface may poll the mutex
has been converted to a rwlock for better concurrently. The
rwlock need only be taken as a writer in the zvol_free() path
when disk->private_data is set to NULL.
The configure checks for the block_device_operations structure
were consolidated in a single kernel-block-device-operations.m4
file.
The ZFS_AC_KERNEL_BDEV_BLOCK_DEVICE_OPERATIONS configure checks
and assoicated dead code was removed. This interface was added
to the 2.6.28 kernel which predates the oldest supported 2.6.32
kernel and will therefore always be available.
Updated maximum Linux version in META file. The 4.17 kernel
was released on 2018-06-03 and ZoL is compatible with the
finalized kernel.
Reviewed-by: Boris Protopopov <[email protected]>
Reviewed-by: Sara Hartse <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7611
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zpool and zed place scripts in subdirectories of libexecdir. Some
distributions locate architecture independent scripts in other locations
(e.g. Debian). To avoid these paths getting out of sync, centralize the
definitions.
Build zfs-test's default.cfg by Makefile. Use the new directory
logic building tests/zfs-tests/include/default.cfg.in.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Antonio Russo <[email protected]>
Closes #7597
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Minimal changes required to integrate the SPL sources in to the
ZFS repository build infrastructure and packaging.
Build system and packaging:
* Renamed SPL_* autoconf m4 macros to ZFS_*.
* Removed redundant SPL_* autoconf m4 macros.
* Updated the RPM spec files to remove SPL package dependency.
* The zfs package obsoletes the spl package, and the zfs-kmod
package obsoletes the spl-kmod package.
* The zfs-kmod-devel* packages were updated to add compatibility
symlinks under /usr/src/spl-x.y.z until all dependent packages
can be updated. They will be removed in a future release.
* Updated copy-builtin script for in-kernel builds.
* Updated DKMS package to include the spl.ko.
* Updated stale AUTHORS file to include all contributors.
* Updated stale COPYRIGHT and included the SPL as an exception.
* Renamed README.markdown to README.md
* Renamed OPENSOLARIS.LICENSE to LICENSE.
* Renamed DISCLAIMER to NOTICE.
Required code changes:
* Removed redundant HAVE_SPL macro.
* Removed _BOOT from nvpairs since it doesn't apply for Linux.
* Initial header cleanup (removal of empty headers, refactoring).
* Remove SPL repository clone/build from zimport.sh.
* Use of DEFINE_RATELIMIT_STATE and DEFINE_SPINLOCK removed due
to build issues when forcing C99 compilation.
* Replaced legacy ACCESS_ONCE with READ_ONCE.
* Include needed headers for `current` and `EXPORT_SYMBOL`.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Olaf Faaland <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Pavel Zakharov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
TEST_ZIMPORT_SKIP="yes"
Closes #7556
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Merge a minimal version of the zfsonlinux/spl repository in to the
zfsonlinux/zfs repository. Care was taken to prevent file conflicts
when merging and to preserve the spl repository history. The spl
kernel module remains under the GPLv2 license as documented by the
additional THIRDPARTYLICENSE.gplv2 file.
Signed-off-by: Brian Behlendorf <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This commit removes everything from the repository except the core
SPL implementation for Linux. Those files which remain have been
moved to non-conflicting locations to facilitate the merge.
The README.md and associated files have been updated accordingly.
Signed-off-by: Brian Behlendorf <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Always invoke the SPL_AC_DEBUG* macro's when running configure
so RPM_DEFINE_COMMON is correctly expanded. A similar change
was already applied to ZFS.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #703
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
With rpm-software-management/rpm@5e94633 a package version containing
invalid characters (most commonly a double '-') causes the kmod package
generation to terminate with an error. This change takes advantage of
the newly introduced rpm macro "_wrong_version_format_terminate_build"
to allow kmod packages to be built.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #691
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Split the kernel interface configure checks in to seperate m4
macro files. This is intended to facilitate moving the spl
source code in to the zfs repository.
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #682
|
| |
| |
| |
| |
| |
| |
| |
| | |
Add missing helper function cv_timedwait_io(), it should be used
when waiting on IO with a specified timeout.
Reviewed-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #674
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When building packages on Debian-based systems specify the target
architecture used by 'alien' to convert .rpm packages into .deb: this
avoids detecting an incorrect value which results in the following
errors:
<package>.aarch64.rpm is for architecture aarch64 ; the package cannot be built on this system
<package>.armv7l.rpm is for architecture armel ; the package cannot be built on this system
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes zfsonlinux/zfs#7046
Closes #678
|
| |
| |
| |
| |
| |
| |
| |
| | |
Use timer_setup() macro and new timeout function definition.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #670
Closes #671
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The kernel_read & kernel_write functions have always wrapped the
vfs_read & vfs_write functions respectively. However, they could
not be used by vn_rdwr() since the offset wasn't passed as a
pointer. This prevented us from being able to properly update
the file offset.
Linux 4.14 unexported vfs_read & vfs_write but also changed the
signature of kernel_read & kernel_write to provide the needed
functionality. Use these updated functions when available.
Reviewed-by: Pritam Baral <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #656
Closes #667
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Both vn_rename and vn_remove have been historically problematic
to implement reliably. Rather than fixing them yet again they
are being removed.
Reviewed-by: Arkadiusz Bubala <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #648
Closes #661
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
* config/deb.am: Enable building DKMS packages for Debian
* rpm/generic/spl-dkms.spec.in: Adjust spec to be Debian-compatible
* Condition kernel-devel Requires to RPM distros
* Ensure that --rpm_safe_upgrade isn't used on non-RPM distros
* config/deb.am: Drop CONFIG_KERNEL and CONFIG_USER guards
* Makefile.am: Add pkg-dkms target
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Neal Gompa <[email protected]>
Closes #657
|