| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Other (all?) Linux filesystems seem to return -EPERM instead of -EACCESS
when trying to set FS_APPEND_FL or FS_IMMUTABLE_FL without the
CAP_LINUX_IMMUTABLE capability. This was detected by generic/545 test
in the fstest suite.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Luis Henriques <[email protected]>
Closes #11791
|
|
|
|
|
|
| |
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Andrea Gelmini <[email protected]>
Closes #11775
|
|
|
|
|
|
|
|
|
|
|
|
| |
It used to be required to pass a enum km_type to kmap_atomic() and
kunmap_atomic(), however this is no longer necessary and the wrappers
zfs_k(un)map_atomic removed these. This is confusing in the ABD code as
the struct abd_iter member iter_km no longer exists and the wrapper
macros simply compile them out.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Adam Moss <[email protected]>
Signed-off-by: Brian Atkinson <[email protected]>
Closes #11768
|
|
|
|
|
|
|
|
|
|
|
| |
The BIO_MAX_PAGES macro is being retired in favor of a bio_max_segs()
function that implements the typical MIN(x,y) logic used throughout the
kernel for bounding the allocation, and also the new implementation is
intended to be signed-safe (which the former was not).
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Coleman Kane <[email protected]>
Closes #11765
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Linux 5.12, the filesystem API was modified to support ipmapped
mounts by adding a "struct user_namespace *" parameter to a number
functions and VFS handlers. This change adds the needed autoconf
macros to detect the new interfaces and updates the code appropriately.
This change does not add support for idmapped mounts, instead it
preserves the existing behavior by passing the initial user namespace
where needed. A subsequent commit will be required to add support
for idmapped mounted.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Co-authored-by: Brian Behlendorf <[email protected]>
Signed-off-by: Coleman Kane <[email protected]>
Closes #11712
|
|
|
|
|
|
|
| |
Avoids tripping on asserts when doing pool recovery.
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Mateusz Guzik <[email protected]>
Closes #11739
|
|
|
|
|
|
|
|
|
|
|
| |
Don't handle (incorrectly) kmem_zalloc() failure. With KM_SLEEP,
will never return NULL.
Free the data allocated for non-virtual kstats when deleting the object.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11767
|
|
|
|
|
|
|
|
|
|
|
| |
zhold() wraps igrab() on Linux, and igrab() may fail when the inode
is in the process of being deleted. This means zhold() must only be
called when a reference exists and therefore it cannot be deleted.
This is the case for all existing consumers so add a VERIFY and a
comment explaining this requirement.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Adam Moss <[email protected]>
Closes #11704
|
|
|
|
|
|
|
|
|
|
|
|
| |
To make use of zfs_refcount_held tunable it should be a module
parameter in open-zfs. Also, since the macros will auto-generate OS
specific tunables, removed the existing zfs_refcount_held reference
in module/os/freebsd/zfs/sysctl_os.c.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Allan Jude <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #11753
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add parsing of the rewind options.
When I was upstreaming the change [1], I omitted the part where we
detect that the pool should be rewind. When the FreeBSD repo has
synced with the OpenZFS, this part of the code was removed.
[1] FreeBSD repo: 277f38abffc6a8160b5044128b5b2c620fbb970c
[2] OpenZFS repo: f2c027bd6a003ec5793f8716e6189c389c60f47a
External-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=254152
Originally reviewed by: tsoome, allanjude
Originally reviewed by: kevans (ok from high-level overview)
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Mariusz Zaborski <[email protected]>
Closes #11730
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resolve some oddities in zfsdev_close() which could result in a
panic and were not present in the equivalent function for Linux.
- Remove unused definition ZFS_MIN_MINOR
- FreeBSD: Simplify zfsdev state destruction
- Assert zs_minor is valid in zfsdev_close
- Make locking around zfsdev state match Linux
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11720
|
|
|
|
|
|
|
|
|
|
|
| |
This will allow platforms to implement it as they see fit, in particular
in a different manner than rrm locks.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matt Macy <[email protected]>
Signed-off-by: Mateusz Guzik <[email protected]>
Closes #11153
|
|
|
|
|
|
|
|
| |
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matt Macy <[email protected]>
Signed-off-by: Mateusz Guzik <[email protected]>
Closes #11153
|
|
|
|
|
|
|
|
|
|
|
| |
They are not very useful and hard to implement in the rms routine
the code is about to start using.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matt Macy <[email protected]>
Signed-off-by: Mateusz Guzik <[email protected]>
Closes #11153
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. even up ifdefs
2. drop the arguably useless teardown lock asserts -- nothing else
checks for it
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matt Macy <[email protected]>
Signed-off-by: Mateusz Guzik <[email protected]>
Closes #11153
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zil_replaying(zil, tx) has the side-effect of informing the ZIL that an
entry has been replayed in the (still open) tx. The ZIL uses that
information to record the replay progress in the ZIL header when that
tx's txg syncs.
ZPL log entries are not idempotent and logically dependent and thus
calling zil_replaying() is necessary for correctness.
For ZVOLs the question of correctness is more nuanced: ZVOL logs only
TX_WRITE and TX_TRUNCATE, both of which are idempotent. Logical
dependencies between two records exist only if the write or discard
request had sync semantics or if the ranges affected by the records
overlap.
Thus, at a first glance, it would be correct to restart replay from
the beginning if we crash before replay completes. But this does not
address the following scenario:
Assume one log record per LWB.
The chain on disk is
HDR -> 1:W(1, "A") -> 2:W(1, "B") -> 3:W(2, "X") -> 4:W(3, "Z")
where N:W(O, C) represents log entry number N which is a TX_WRITE of C
to offset A.
We replay 1, 2 and 3 in one txg, sync that txg, then crash.
Bit flips corrupt 2, 3, and 4.
We come up again and restart replay from the beginning because
we did not call zil_replaying() during replay.
We replay 1 again, then interpret 2's invalid checksum as the end
of the ZIL chain and call replay done.
The replayed zvol content is "AX".
If we had called zil_replaying() the HDR would have pointed to 3
and our resumed replay would not have replayed anything because
3 was corrupted, resulting in zvol content "BX".
If 3 logically depends on 2 then the replay corrupted the ZVOL_OBJ's
contents.
This patch adds the zil_replaying() calls to the replay functions.
Since the callbacks in the replay function need the zilog_t* pointer
so that they can call zil_replaying() we open the ZIL while
replaying in zvol_create_minor(). We also verify that replay has
been done when on-demand-opening the ZIL on the first modifying
bio.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Christian Schwarz <[email protected]>
Closes #11667
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ZFS_READONLY represents the "DOS R/O" attribute.
When that flag is set, we should behave as if write access
were not granted by anything in the ACL. In particular:
We _must_ allow writes after opening the file r/w, then
setting the DOS R/O attribute, and writing some more.
(Similar to how you can write after fchmod(fd, 0444).)
Restore these semantics which were lost on FreeBSD when refactoring
zfs_write. To my knowledge Linux does not actually expose this flag,
but we'll need it to eventually so I've added the supporting checks.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11693
|
|
|
|
|
|
|
|
|
| |
When populating a ZIL destination buffer ensure it is always
zeroed before its contents are constructed.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Tom Caputi <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11687
|
|
|
|
|
|
|
|
|
|
|
|
| |
The spl_kmem_alloc showed up in some flamegraphs in a single-threaded
4k sync write workload at 85k IOPS on an
Intel(R) Xeon(R) Silver 4215 CPU @ 2.50GHz.
Certainly not a huge win but I believe the change is clean and
easy to maintain down the road.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Christian Schwarz <[email protected]>
Closes #11666
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The function has three similar pieces of code: for read-behind pages,
requested pages and read-ahead pages. All three pieces had an
assert to ensure that the page is not mapped. Later the assert was
relaxed to require that the page is not mapped for writing. But that
was done in two places out of three. This change fixes the third piece,
read-ahead.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Andriy Gapon <[email protected]>
Closes #11654
|
|
|
|
|
|
|
|
|
|
| |
The struct bio member bi_disk was moved underneath a new member named
bi_bdev. So all attempts to reference bio->bi_disk need to now become
bio->bi_bdev->bd_disk.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Coleman Kane <[email protected]>
Closes #11639
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Linux increase the maximum allowed size of the src nvlist which
can be passed to the /dev/zfs ioctl. Originally, this was set
to a maximum of KMALLOC_MAX_SIZE (4M) because it was kmalloc'd.
Since that time it's been converted to a vmalloc so that's no
longer a hard limit, and it's desirable for `zfs send/recv` to
allow larger nvlists so more snapshots can be sent at once.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #6572
Closes #11638
|
|
|
|
|
|
|
|
|
| |
Making uio_impl.h the common header interface between Linux and FreeBSD
so both OS's can share a common header file. This also helps reduce code
duplication for zfs_uio_t for each OS.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Brian Atkinson <[email protected]>
Closes #11622
|
|
|
|
|
|
|
| |
Add zfs_racct_* interfaces for platform-dependent read/write accounting.
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11613
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
First, the crypto request completion handler contains a bug in that it
fails to reset fs_done correctly after the request is completed. This
is only a problem for asynchronous drivers. Second, some hardware
drivers have input constraints which ZFS does not satisfy. For
instance, ccp(4) apparently requires the AAD length for AES-GCM to be a
multiple of the cipher block size, and with qat(4) the AES-GCM AAD
length may not be longer than 240 bytes. FreeBSD's generic crypto
framework doesn't have a mechanism to automatically fall back to a
software implementation if a hardware driver cannot process a request,
and ZFS does not tolerate such errors.
The plan is to implement such a fallback mechanism, but with FreeBSD
13.0 approaching we should simply disable the use hardware drivers for
now.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Mark Johnston <[email protected]>
Closes #11612
|
|
|
|
|
|
|
|
|
|
| |
Remove function that become unused after refactoring in
e2af2acce3436acdb2b35fdc7c9de1a30ea85514.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Libby <[email protected]>
Closes #11614
|
|
|
|
|
|
|
|
|
|
|
| |
zfs_znode_update_vfs is a more platform-agnostic name than
zfs_inode_update. Besides that, the function's prototype is moved to
include/sys/zfs_znode.h as the function is also used in common code.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ka Ho Ng <[email protected]>
Sponsored by: The FreeBSD Foundation
Closes #11580
|
|
|
|
|
|
|
|
|
|
| |
The first time through the loop prevdb and prevhdl are NULL. They
are then both set, but only prevdb is checked. Add an ASSERT to
make it clear that prevhdl must be set when prevdb is.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Kleber <[email protected]>
Closes #10754
Closes #11575
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
`__vdev_disk_physio()` uses `abd_nr_pages_off()` to allocate a bio with
a sufficient number of iovec's to process this zio (i.e.
`nr_iovecs`/`bi_max_vecs`). If there are not enough iovec's in the bio,
then additional bio's will be allocated. However, this is a sub-optimal
code path. In particular, it requires several abd calls (to
`abd_nr_pages_off()` and `abd_bio_map_off()`) which will have to walk
the constituents of the ABD (the pages or the gang children) because
they are looking for offsets > 0.
For gang ABD's, `abd_nr_pages_off()` returns the number of iovec's
needed for the first constituent, rather than the sum of all
constituents (within the requested range). This always under-estimates
the required number of iovec's, which causes us to always need several
bio's. The end result is that `__vdev_disk_physio()` is usually O(n^2)
for gang ABD's (and occasionally O(n^3), when more than 16 bio's are
needed).
This commit fixes `abd_nr_pages_off()`'s handling of gang ABD's, to
correctly determine how many iovec's are needed, by adding up the number
of iovec's for each of the gang children in the requested range.
Reviewed-by: Mark Maybee <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Brian Atkinson <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #11536
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a race condition in zfs_zrele_async when we are checking if
we would be the one to evict an inode. This can lead to a txg sync
deadlock.
Instead of calling into iput directly, we attempt to perform the atomic
decrement ourselves, unless that would set the i_count value to zero.
In that case, we dispatch a call to iput to run later, to prevent a
deadlock from occurring.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes #11527
Closes #11530
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order for cppcheck to perform a proper analysis it needs to be
aware of how the sources are compiled (source files, include
paths/files, extra defines, etc). All the needed information is
available from the Makefiles and can be leveraged with a generic
cppcheck Makefile target. So let's add one.
Additional minor changes:
* Removing the cppcheck-suppressions.txt file. With cppcheck 2.3
and these changes it appears to no longer be needed. Some inline
suppressions were also removed since they appear not to be
needed. We can add them back if it turns out they're needed
for older versions of cppcheck.
* Added the ax_count_cpus m4 macro to detect at configure time how
many processors are available in order to run multiple cppcheck
jobs. This value is also now used as a replacement for nproc
when executing the kernel interface checks.
* "PHONY =" line moved in to the Rules.am file which is included
at the top of all Makefile.am's. This is just convenient becase
it allows us to use the += syntax to add phony targets.
* One upside of this integration worth mentioning is it now allows
`make cppcheck` to be run in any directory to check that subtree.
* For the moment, cppcheck is not run against the FreeBSD specific
kernel sources. The cppcheck-FreeBSD target will need to be
implemented and testing on FreeBSD to support this.
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11508
|
|
|
|
|
|
|
|
|
| |
Identical condition and return expression 'rc', return value is
always 0.
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11508
|
|
|
|
|
|
|
|
|
| |
The ASSERT that the passed pointer isn't NULL appears after the
pointer has already been dereferenced. Remove the redundant check.
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11508
|
|
|
|
|
|
|
|
|
| |
Like any other thread created by thread_create() we need to call
thread_exit() to properly clean it up. In particular, this ensures the
tsd hash for the thread is cleared.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #11512
|
|
|
|
|
|
|
|
|
| |
Set VIRF_MOUNTPOINT flag on snapshot mountpoint.
Authored-by: Mateusz Guzik <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11458
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As part of commit 1c2358c1 the custom uio_prefaultpages() code
was removed in favor of using the generic kernel provided
iov_iter_fault_in_readable() interface. Unfortunately, it
turns out that up until the Linux 4.7 kernel the function would
only ever fault in the first iovec of the iov_iter. The result
being uiomove_iov() may hang waiting for the page.
This commit effectively restores the custom uio_prefaultpages()
pages code for Linux 4.9 and earlier kernels which contain the
troublesome version of iov_iter_fault_in_readable().
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11463
Closes #11484
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In FreeBSD the struct uio was just a typedef to uio_t. In order to
extend this struct, outside of the definition for the struct uio, the
struct uio has been embedded inside of a uio_t struct.
Also renamed all the uio_* interfaces to be zfs_uio_* to make it clear
this is a ZFS interface.
Reviewed-by: Ryan Moeller <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Brian Atkinson <[email protected]>
Closes #11438
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The `abd_get_offset_*()` routines create an abd_t that references
another abd_t, and doesn't allocate any pages/buffers of its own. In
some workloads, these routines may be called frequently, to create many
abd_t's representing small pieces of a single large abd_t. In
particular, the upcoming RAIDZ Expansion project makes heavy use of
these routines.
This commit adds the ability for the caller to allocate and provide the
abd_t struct to a variant of `abd_get_offset_*()`. This eliminates the
cost of allocating the abd_t and performing the accounting associated
with it (`abdstat_struct_size`). The RAIDZ/DRAID code uses this for
the `rc_abd`, which references the zio's abd. The upcoming RAIDZ
Expansion project will leverage this infrastructure to increase
performance of reads post-expansion by around 50%.
Additionally, some of the interfaces around creating and destroying
abd_t's are cleaned up. Most significantly, the distinction between
`abd_put()` and `abd_free()` is eliminated; all types of abd_t's are
now disposed of with `abd_free()`.
Reviewed-by: Brian Atkinson <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Issue #8853
Closes #11439
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Virtuozzo 7 kernels starting 3.10.0-1127.18.2.vz7.163.46
have the following configuration:
* no HAVE_VFS_RW_ITERATE
* HAVE_VFS_DIRECT_IO_ITER_RW_OFFSET
=> let's add implementation of zpl_direct_IO() via
zpl_aio_{read,write}() in this case.
https://bugs.openvz.org/browse/OVZ-7243
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Konstantin Khorenko <[email protected]>
Closes #11410
Closes #11411
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of 5.11 the blk_register_region() and blk_unregister_region()
functions have been retired. This isn't a problem since add_disk()
has implicitly allocated minor numbers for a very long time.
Reviewed-by: Rafael Kitover <[email protected]>
Reviewed-by: Coleman Kane <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11387
Closes #11390
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both revalidate_disk_size() and revalidate_disk() have been removed.
Functionally this isn't a problem because we only relied on these
functions to call zvol_revalidate_disk() for us and to perform any
additional handling which might be needed for that kernel version.
When neither are available we know there's no additional handling
needed and we can directly call zvol_revalidate_disk().
Reviewed-by: Rafael Kitover <[email protected]>
Reviewed-by: Coleman Kane <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11387
Closes #11390
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bd_contains member was removed from the block_device structure.
Callers needing to determine if a vdev is a whole block device should
use the new bdev_whole() wrapper. For older kernels we provide our
own bdev_whole() wrapper which relies on bd_contains for compatibility.
Reviewed-by: Rafael Kitover <[email protected]>
Reviewed-by: Coleman Kane <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11387
Closes #11390
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The generic IO accounting functions have been removed in favor of the
bio_start_io_acct() and bio_end_io_acct() functions which provide a
better interface. These new functions were introduced in the 5.8
kernels but it wasn't until the 5.11 kernel that the previous generic
IO accounting interfaces were removed.
This commit updates the blk_generic_*_io_acct() wrappers to provide
and interface similar to the updated kernel interface. It's slightly
different because for older kernels we need to pass the request queue
as well as the bio.
Reviewed-by: Rafael Kitover <[email protected]>
Reviewed-by: Coleman Kane <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11387
Closes #11390
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The lookup_bdev() function has been updated to require a dev_t
be passed as the second argument. This is actually pretty nice
since the major number stored in the dev_t was the only part we
were interested in. This allows to us avoid handling the bdev
entirely. The vdev_lookup_bdev() wrapper was updated to emulate
the behavior of the new lookup_bdev() for all supported kernels.
Reviewed-by: Rafael Kitover <[email protected]>
Reviewed-by: Coleman Kane <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11387
Closes #11390
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 1c2358c12 restructured this code and introduced a warning
about the variable maybe not being initialized. This cannot happen
with the updated code but we should initialize the variable anyway
to silence the warning.
zpl_file.c: In function ‘zpl_iter_write’:
zpl_file.c:324:9: warning: ‘count’ may be used uninitialized
in this function [-Wmaybe-uninitialized]
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11373
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's no need to call iov_iter_advance() in zpl_iter_read().
This was preserved from the previous code where it wasn't needed
but also didn't cause any problems. Now that the iter functions
also handle pipes that's no longer the case. When fully reading a
pipe buffer iov_iter_advance() may results in the pipe buf release
function being called which will not be registered resulting in
a NULL dereference.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11375
Closes #11378
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 59b68723 added a configure check for 5.10, which removed
revalidate_disk(), and conditionally replaced it's usage with a call to
the new revalidate_disk_size() function. However, the old function also
invoked the device's registered callback, in our case
zvol_revalidate_disk(). This commit adds a call to zvol_revalidate_disk()
in zvol_update_volsize() to make sure the code path stays the same.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Michael D Labriola <[email protected]>
Closes #11358
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of the 5.10 kernel the generic splice compatibility code has been
removed. All filesystems are now responsible for registering a
->splice_read and ->splice_write callback to support this operation.
The good news is the VFS provided generic_file_splice_read() and
iter_file_splice_write() callbacks can be used provided the ->iter_read
and ->iter_write callback support pipes. However, this is currently
not the case and only iovecs and bvecs (not pipes) are ever attached
to the uio structure.
This commit changes that by allowing full iov_iter structures to be
attached to uios. Ever since the 4.9 kernel the iov_iter structure
has supported iovecs, kvecs, bvevs, and pipes so it's desirable to
pass the entire thing when possible. In conjunction with this the
uio helper functions (i.e uiomove(), uiocopy(), etc) have been
updated to understand the new UIO_ITER type.
Note that using the kernel provided uio_iter interfaces allowed the
existing Linux specific uio handling code to be simplified. When
there's no longer a need to support kernel's older than 4.9, then
it will be possible to remove the iovec and bvec members from the
uio structure and always use a uio_iter. Until then we need to
maintain all of the existing types for older kernels.
Some additional refactoring and cleanup was included in this change:
- Added checks to configure to detect available iov_iter interfaces.
Some are available all the way back to the 3.10 kernel and are used
when available. In particular, uio_prefaultpages() now always uses
iov_iter_fault_in_readable() which is available for all supported
kernels.
- The unused UIO_USERISPACE type has been removed. It is no longer
needed now that the uio_seg enum is platform specific.
- Moved zfs_uio.c from the zcommon.ko module to the Linux specific
platform code for the zfs.ko module. This gets it out of libzfs
where it was never needed and keeps this Linux specific code out
of the common sources.
- Removed unnecessary O_APPEND handling from zfs_iter_write(), this
is redundant and O_APPEND is already handled in zfs_write();
Reviewed-by: Colin Ian King <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11351
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vfs.zfs.arc_no_grow_shift has an invalid type (15) and this causes
py-sysctl to format it as a bytearray when it should be an integer.
"U" is not a valid format, it should be "I" and the type should match
the variable type, int. We can return EINVAL if the value is set below
zero.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11318
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resolve an uninitialized variable warning when compiling.
In function ‘zfs_domount’:
warning: ‘root_inode’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
sb->s_root = d_make_root(root_inode);
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #11306
|