| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
The timestamp_truncate() function was added, it replaces the existing
timespec64_trunc() function. This change renames our wrapper function
to be consistent with the upstream name and updates the compatibility
code for older kernels accordingly.
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #9956
Closes #9961
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The proc_ops structure was introduced to replace the use of of the
file_operations structure when registering proc handlers. This
change creates a new kstat_proc_op_t typedef for compatibility
which can be used to pass around the correct structure.
This change additionally adds the 'const' keyword to all of the
existing proc operations structures.
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #9961
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previous code used 4 atomics to do aggsum_flush_bucket() and 2 more to
re-borrow after the flush. But since asc_borrowed and asc_delta are
accessed only while holding asc_lock, it makes no any sense to modify
as_lower_bound and as_upper_bound in multiple steps. Instead of that
the new code uses only 2 atomics in all the cases, one per as_*_bound
variable. I think even that is overkill, simple atomic store and
load could be used here, since all modifications are done under the
as_lock, but there are no such primitives in ZFS code now.
While there, make borrow code consider previous borrow value, so that
on mixed request patterns reduce chance of needing to borrow again if
much larger request follows tiny one that needed borrow.
Also reduce as_numbuckets from uint64_t to u_int. It makes no sense
to use so large division operation on every aggsum_add().
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes #9930
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move db_link into the same cache line as db_blkid and db_level.
It allows significantly reduce avl_add() time in dbuf_create() on
systems with large RAM and huge number of dbufs per dnode.
Avoid few accesses to dbuf_caches[].size, which is highly congested
under high IOPS and never stays in cache for a long time. Use local
value we are receiving from zfs_refcount_add_many() any way.
Remove cache_size_bytes_max bump from dbuf_evict_one(). I don't see
a point to do it on dbuf eviction after we done it on insertion in
dbuf_rele_and_unlock().
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored-By: iXsystems, Inc.
Closes #9931
|
|
|
|
|
|
|
|
|
| |
Additionally pull in state machine comments about
upcoming async cow work.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9902
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This replaces the placeholder ZFS_PROP_PRIVATE with ZFS_PROP_ACLMODE,
matching what is done in the NFSv4 ACLs PR (#9709).
On FreeBSD we hide ZFS_PROP_ACLTYPE, while on Linux we hide
ZFS_PROP_ACLMODE.
The tests already assume this arrangement.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #9913
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we finish a zfs receive, dmu_recv_end_sync() calls
zvol_create_minors(async=TRUE). This kicks off some other threads that
create the minor device nodes (in /dev/zvol/poolname/...). These async
threads call zvol_prefetch_minors_impl() and zvol_create_minor(), which
both call dmu_objset_own(), which puts a "long hold" on the dataset.
Since the zvol minor node creation is asynchronous, this can happen
after the `ZFS_IOC_RECV[_NEW]` ioctl and `zfs receive` process have
completed.
After the first receive ioctl has completed, userland may attempt to do
another receive into the same dataset (e.g. the next incremental
stream). This second receive and the asynchronous minor node creation
can interfere with one another in several different ways, because they
both require exclusive access to the dataset:
1. When the second receive is finishing up, dmu_recv_end_check() does
dsl_dataset_handoff_check(), which can fail with EBUSY if the async
minor node creation already has a "long hold" on this dataset. This
causes the 2nd receive to fail.
2. The async udev rule can fail if zvol_id and/or systemd-udevd try to
open the device while the the second receive's async attempt at minor
node creation owns the dataset (via zvol_prefetch_minors_impl). This
causes the minor node (/dev/zd*) to exist, but the udev-generated
/dev/zvol/... to not exist.
3. The async minor node creation can silently fail with EBUSY if the
first receive's zvol_create_minor() trys to own the dataset while the
second receive's zvol_prefetch_minors_impl already owns the dataset.
To address these problems, this change synchronously creates the minor
node. To avoid the lock ordering problems that the asynchrony was
introduced to fix (see #3681), we create the minor nodes from open
context, with no locks held, rather than from syncing contex as was
originally done.
Implementation notes:
We generally do not need to traverse children or prefetch anything (e.g.
when running the recv, snapshot, create, or clone subcommands of zfs).
We only need recursion when importing/opening a pool and when loading
encryption keys. The existing recursive, asynchronous, prefetching code
is preserved for use in these cases.
Channel programs may need to create zvol minor nodes, when creating a
snapshot of a zvol with the snapdev property set. We figure out what
snapshots are created when running the LUA program in syncing context.
In this case we need to remember what snapshots were created, and then
try to create their minor nodes from open context, after the LUA code
has completed.
There are additional zvol use cases that asynchronously own the dataset,
which can cause similar problems. E.g. changing the volmode or snapdev
properties. These are less problematic because they are not recursive
and don't touch datasets that are not involved in the operation, there
is still potential for interference with subsequent operations. In the
future, these cases should be similarly converted to create the zvol
minor node synchronously from open context.
The async tasks of removing and renaming minors do not own the objset,
so they do not have this problem. However, it may make sense to also
convert these operations to happen synchronously from open context, in
the future.
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Prakash Surya <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
External-issue: DLPX-65948
Closes #7863
Closes #9885
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implements the RAID-Z function using AltiVec SIMD.
This is basically the NEON code translated to AltiVec.
Note that the 'fletcher' algorithm requires 64-bits
operations, and the initial implementations of AltiVec
(PPC74xx a.k.a. G4, PPC970 a.k.a. G5) only has up to
32-bits operations, so no 'fletcher'.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Romain Dolbeau <[email protected]>
Closes #9539
|
|
|
|
|
|
|
|
|
| |
This adds support in channel programs to inherit properties analogous
to `zfs inherit` by adding `zfs.sync.inherit` and `zfs.check.inherit`
functions to the ZFS LUA API.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Jason King <[email protected]>
Closes #9738
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the handling for errata #4 has two issues which allow
the checks for this issue to be bypassed using resumable sends.
The first issue is that drc->drc_fromsnapobj is not set in the
resuming code as it is in the non-resuming code. This causes
dsl_crypto_recv_key_check() to skip its checks for the
from_ivset_guid. The second issue is that resumable sends do not
clean up their on-disk state if they fail the checks in
dmu_recv_stream() that happen before any data is received.
As a result of these two bugs, a user can attempt a resumable send
of a dataset without a from_ivset_guid. This will fail the initial
dmu_recv_stream() checks, leaving a valid resume state. The send
can then be resumed, which skips those checks, allowing the receive
to be completed.
This commit fixes these issues by setting drc->drc_fromsnapobj in
the resuming receive path and by ensuring that resumablereceives
are properly cleaned up if they fail the initial dmu_recv_stream()
checks.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #9818
Closes #9829
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zfs_mount_at() mounts a dataset at an arbitrary mountpoint rather than
at the configured mountpoint. This may be used by consumers that wish to
temporarily expose a dataset at another mountpoint without altering
dataset/pool properties.
This will be used by FreeBSD's libbe be_mount(), which mounts a boot
environment at an arbitrary mountpoint.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Kyle Evans <[email protected]>
Closes #9833
|
|
|
|
|
|
|
|
|
|
|
| |
Update the project website links contained in to repository to
reference the secure https://zfsonlinux.org address.
Reviewed-By: Richard Laager <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Garrett Fields <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #9837
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit adds the --saved (-S) to the 'zfs send' command.
This flag allows a user to send a partially received dataset,
which can be useful when migrating a backup server to new
hardware. This flag is compatible with resumable receives, so
even if the saved send is interrupted, it can be resumed.
The flag does not require any user / kernel ABI changes or any
new feature flags in the send stream format.
Reviewed-by: Paul Dagnelie <[email protected]>
Reviewed-by: Alek Pinchuk <[email protected]>
Reviewed-by: Paul Zuchowski <[email protected]>
Reviewed-by: Christian Schwarz <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #9007
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of libzfs.h doesn't provide names for the parameters
in its signatures. These few functions included them. That
wouldn't be a problem, per se, but the 'lines' parameter
conflicts with the 'lines' #define from terminfo's term.h,
present for at least a decade. This makes it difficult to
compile code making use of both ZFS and terminfo.
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Signed-off-by: Nick Black <[email protected]>
Closes #9821
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For dedup, special and log devices "zpool add -n" does not print
correctly their vdev type:
~# zpool add -n pool dedup /tmp/dedup special /tmp/special log /tmp/log
would update 'pool' to the following configuration:
pool
/tmp/normal
/tmp/dedup
/tmp/special
/tmp/log
This could lead storage administrators to modify their ZFS pools to
unexpected and unintended vdev configurations.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #9783
Closes #9390
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the ZFS_COLOR env variable is set, then use ANSI color
output in zpool status:
- Column headers are bold
- Degraded or offline pools/vdevs are yellow
- Non-zero error counters and faulted vdevs/pools are red
- The 'status:' and 'action:' sections are yellow if they're
displaying a warning.
This also includes a new 'faketty' function in libtest.shlib that is
compatible with FreeBSD (code provided by @freqlabs).
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Ryan Moeller <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #9340
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD's vfs currently doesn't permit file systems
to do their own locking. To avoid having to have
duplicate zfs functions with and without locking add
locking here. With luck these changes can be removed
in the future.
Reviewed-by: Sean Eric Fagan <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9715
|
|
|
|
|
|
|
|
|
|
|
| |
The quota functions are common to all implementations and can be
moved to common code. As a simplification they were moved to the
Linux platform code in the initial refactoring.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #9710
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the 'zfs jail/unjail' subcommands along with the relevant
documentation from FreeBSD. This feature is not supported on
Linux and still requires the match kernel ioctls which will
be included when the FreeBSD platform code is integrated.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #9686
|
|
|
|
|
|
|
|
|
|
| |
Change many of the znops routines to take a znode rather
than an inode so that zfs_replay code can be largely shared
and in the future the much of the znops code may be shared.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9708
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fxsave and xsave require the target address to be 16-/64-byte aligned.
kmalloc(_node) does not (yet) offer such fine-grained control over
alignment[0,1], even though it does "the right thing" most of the time
for power-of-2 sizes. unfortunately, alignment is completely off when
using certain debugging or hardening features/configs, such as KASAN,
slub_debug=Z or the not-yet-upstream SLAB_CANARY.
Use alloc_pages_node() instead which allows us to allocate page-aligned
memory. Since fpregs_state is padded to a full page anyway, and this
code is only relevant for x86 which has 4k pages, this approach should
not allocate any unnecessary memory but still guarantee the needed
alignment.
0: https://lwn.net/Articles/787740/
1: https://lore.kernel.org/linux-block/[email protected]/
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Fabian Grünbichler <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #9608
Closes #9674
|
|
|
|
|
|
|
|
| |
The zfsvfs->z_sb field is Linux specified and should be abstracted.
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9697
|
|
|
|
|
|
|
|
|
| |
This change allows us to align the code dump logic across platforms.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Don Brady <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9691
|
|
|
|
|
|
|
|
|
|
| |
Followup for #9671, remove stale comment.
Reviewed-by: Kjeld Schouten <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Issue #9671
Closes #9682
|
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD uses its own crypto framework in-kernel which, at this time,
has no EDONR implementation.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #9664
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD requires three additional ioctls, they are ZFS_IOC_NEXTBOOT,
ZFS_IOC_JAIL, and ZFS_IOC_UNJAIL. These have been added after the
Linux-specific ioctls. The range 0x80-0xFF has been reserved for
future optional platform-specific ioctls. Any platform may choose
to implement these as appropriate.
None of the existing ioctl numbers have been changed to maintain
compatibility. For Linux no vectors have been registered for the
new ioctls and they are reported as unsupported.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9667
|
|
|
|
|
|
|
|
|
| |
Update zfs_deadman_failmode to use the ZFS_MODULE_PARAM_CALL
wrapper, and split the common and platform specific portions.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9670
|
|
|
|
|
|
|
|
|
|
|
| |
Remove the ASSERTV macro and handle suppressing unused
compiler warnings for variables only in ASSERTs using the
__attribute__((unused)) compiler annotation. The annotation
is understood by both gcc and clang.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9671
|
|
|
|
|
|
|
|
|
|
|
|
| |
- on Linux move Linux specific headers to zfs_context_os.h
- on FreeBSD move FreeBSD specific definitions to zfs_context_os.h
- remove duplicate tsd_ definitions
- remove unused AT_TYPE
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Don Brady <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9668
|
|
|
|
|
|
|
|
|
| |
Moving qsort to the platform header allows each platform to
provide an appropriate sorting implementation.
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9663
|
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD needs to cope with multiple version of the zfs_cmd_t
structure. Allowing the platform code to pre and post
process the cmd structure makes it possible to work with
legacy tooling.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9624
|
|
|
|
|
|
|
| |
Linux and FreeBSD use different names for suid / setuid.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9632
|
|
|
|
|
|
|
|
| |
Minimal compatibility changes for FreeBSD.
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9631
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD needs to be able to pass the jail id to the jail/unjail ioctls
and the struct file in the device structure is unused.
Reviewed-by: Kjeld Schouten <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9625
|
|
|
|
|
|
|
|
|
|
| |
- move linux/ includes to platform headers
- add void * io_bio to zio for tracking the underlying bio
- add freebsd specific fields to abd_scatter
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9615
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As described in commit f81d5ef6 the zfs_vdev_elevator module
option is being removed. Users who require this functionality
should update their systems to set the disk scheduler using a
udev rule.
Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #8664
Closes #9417
Closes #9609
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a device is participating in an active resilver, then it will have a
non-empty DTL. Operations like vdev_{open,reopen,probe}() can cause the
resilver to be restarted (or deferred to be restarted later), which is
unnecessary if the DTL is still covered by the current scan range. This
is similar to the logic in vdev_dtl_should_excise() where the DTL can
only be excised if it's max txg is in the resilvered range.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: John Gallagher <[email protected]>
Reviewed-by: Kjeld Schouten <[email protected]>
Signed-off-by: John Poduska <[email protected]>
Issue #840
Closes #9155
Closes #9378
Closes #9551
Closes #9588
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Provide a common zfs_file_* interface which can be implemented on all
platforms to perform normal file access from either the kernel module
or the libzpool library.
This allows all non-portable vnode_t usage in the common code to be
replaced by the new portable zfs_file_t. The associated vnode and
kobj compatibility functions, types, and macros have been removed
from the SPL. Moving forward, vnodes should only be used in platform
specific code when provided by the native operating system.
Reviewed-by: Sean Eric Fagan <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Igor Kozhukhov <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9556
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reinstate the zpl_revalidate() functionality to resolve a regression
where dentries for open files during a rollback are not invalidated.
The unrelated functionality for automatically unmounting .zfs/snapshots
was not reverted. Nor was the addition of shrink_dcache_sb() to the
zfs_resume_fs() function.
This issue was not immediately caught by the CI because the test case
intended to catch it was included in the list of ZTS tests which may
occasionally fail for unrelated reasons. Remove all of the rollback
tests from this list to help identify the frequency of any spurious
failures.
The rollback_003_pos.ksh test case exposes a real issue with the
long standing code which needs to be investigated. Regardless,
it has been enable with a small workaround in the test case itself.
Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: Pavel Snajdr <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #9587
Closes #9592
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a new KMC_KVMEM flag was added to enforce use of the
kvmalloc allocator in kmem_cache_create even for large blocks, which
may also increase performance in some specific cases (e.g. zstd), too.
Default to KVMEM instead of VMEM in spl_kmem_cache_create.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
Closes #9034
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements use of kvmalloc for GFP_KERNEL allocations, which
may increase performance if the allocator is able to allocate physical
memory, if kvmalloc is available as a public kernel interface (since
v4.12). Otherwise it will simply fall back to virtual memory (vmalloc).
Also fix vmem_alloc implementation which can lead to slow allocations
since the first attempt with kmalloc does not make use of the noretry
flag but tells the linux kernel to retry several times before it fails.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Matt Ahrens <[email protected]>
Signed-off-by: Sebastian Gottschall <[email protected]>
Signed-off-by: Michael Niewöhner <[email protected]>
Closes #9034
|
|
|
|
|
|
|
|
|
|
| |
FreeBSD needs a wrapper for handling zfs_cmd ioctls.
In libzfs this is handled by zfs_ioctl. However, here
we need to wrap the call directly.
Reviewed-by: Allan Jude <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9511
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Increase the minimum supported kernel version from 2.6.32 to 3.10.
This removes support for the following Linux enterprise distributions.
Distribution | Kernel | End of Life
---------------- | ------ | -------------
Ubuntu 12.04 LTS | 3.2 | Apr 28, 2017
SLES 11 | 3.0 | Mar 32, 2019
RHEL / CentOS 6 | 2.6.32 | Nov 30, 2020
The following changes were made as part of removing support.
* Updated `configure` to enforce a minimum kernel version as
specified in the META file (Linux-Minimum: 3.10).
configure: error:
*** Cannot build against kernel version 2.6.32.
*** The minimum supported kernel version is 3.10.
* Removed all `configure` kABI checks and matching C code for
interfaces which solely predate the Linux 3.10 kernel.
* Updated all `configure` kABI checks to fail when an interface is
missing which was in the 3.10 kernel up to the latest 5.1 kernel.
Removed the HAVE_* preprocessor defines for these checks and
updated the code to unconditionally use the verified interface.
* Inverted the detection logic in several kABI checks to match
the new interface as it appears in 3.10 and newer and not the
legacy interface.
* Consolidated the following checks in to individual files. Due
the large number of changes in the checks it made sense to handle
this now. It would be desirable to group other related checks in
the same fashion, but this as left as future work.
- config/kernel-blkdev.m4 - Block device kABI checks
- config/kernel-blk-queue.m4 - Block queue kABI checks
- config/kernel-bio.m4 - Bio interface kABI checks
* Removed the kABI checks for sops->nr_cached_objects() and
sops->free_cached_objects(). These interfaces are currently unused.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #9566
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Linux the full path preceding devices is stripped when formatting
vdev names. On FreeBSD we only want to strip "/dev/". Hide the
implementation details of path stripping behind zfs_strip_path().
Make zfs_strip_partition_path() static in Linux implementation while
here, since it is never used outside of the file it is defined in.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #9565
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the need for zpl_revalidate altogether.
There were 3 main reasons why we used d_revalidate:
1. periodic automounted snapshots umount deferral
2. negative dentries created before snapshot rollback
3. stale inodes referenced by dentry cache after snapshot rollback
Periodic snapshots deferral solution introduces zfs_exit_fs function,
which is called as a part of ZFS_EXIT(zfsvfs_t) macro.
Negative dentries and stale inodes are solved by flushing the dcache
for the particular dataset on zfs_resume_fs call.
This patch also removes now unused HAVE_S_D_OP configure test.
Reviewed-by: Aleksa Sarai <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pavel Snajdr <[email protected]>
Closes #8774
Closes #9549
|
|
|
|
|
|
|
| |
This adds basic support for RISC-V, specifically RV64G.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Romain Dolbeau <[email protected]>
Closes #9540
|
|
|
|
|
|
|
|
|
| |
Some of the znode fields are different and functions
consuming an inode don't exist on FreeBSD.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Jorgen Lundman <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9536
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds some new DTRACE_PROBE* endpoints so that we can observe taskq
latencies on a system. Additionally, a new "taskqlatency.bt" script is
added to do this observation via "bpftrace". Lastly, a "zfs-trace.sh"
script is added to wrap "bpftrace" with the proper options required to
run and use "taskqlatency.bt".
For example, with these changes in place, a user can run the following:
$ cd ./contrib/bpftrace
$ sudo ./zfs-trace.sh taskqlatency.bt
Attaching 6 probes...
^C
Here's some example output, showing latency information for time spent
executing the taskq entry's function:
@exec_lat_us[dp_sync_taskq, userquota_updates_task]:
[2, 4) 5 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[4, 8) 0 | |
[8, 16) 1 |@@@@@@@@@@ |
[16, 32) 2 |@@@@@@@@@@@@@@@@@@@@ |
@exec_lat_us[z_wr_int_h, zio_execute]:
[8, 16) 16 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16, 32) 2 |@@@@@@ |
@exec_lat_us[z_wr_iss_h, zio_execute]:
[16, 32) 4 |@@@@@@@@@@@@@@@@ |
[32, 64) 13 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[64, 128) 1 |@@@@ |
@exec_lat_us[z_ioctl_int, zio_execute]:
[2, 4) 1 |@@@@ |
[4, 8) 11 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[8, 16) 8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
@exec_lat_us[dp_sync_taskq, sync_dnodes_task]:
[2, 4) 1 |@@@@@@ |
[4, 8) 7 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[8, 16) 8 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16, 32) 2 |@@@@@@@@@@@@@ |
[32, 64) 4 |@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[64, 128) 1 |@@@@@@ |
[128, 256) 0 | |
[256, 512) 1 |@@@@@@
Here's some example output, showing latency information for time spent
waiting on the taskq, prior to starting execution of entry's function:
@queue_lat_us[dp_sync_taskq]:
[2, 4) 1 |@@@@ |
[4, 8) 7 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[8, 16) 2 |@@@@@@@@ |
[16, 32) 3 |@@@@@@@@@@@@@ |
[32, 64) 12 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[64, 128) 6 |@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[128, 256) 0 | |
[256, 512) 1 |@@@@ |
@queue_lat_us[z_wr_iss]:
[4, 8) 4 |@@@@ |
[8, 16) 13 |@@@@@@@@@@@@@@@ |
[16, 32) 6 |@@@@@@@ |
[32, 64) 2 |@@ |
[64, 128) 12 |@@@@@@@@@@@@@@ |
[128, 256) 15 |@@@@@@@@@@@@@@@@@@ |
[256, 512) 33 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[512, 1K) 27 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[1K, 2K) 7 |@@@@@@@@ |
[2K, 4K) 14 |@@@@@@@@@@@@@@@@ |
[4K, 8K) 14 |@@@@@@@@@@@@@@@@ |
[8K, 16K) 23 |@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[16K, 32K) 43 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
@queue_lat_us[z_wr_int]:
[2, 4) 10 |@@@@@ |
[4, 8) 71 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[8, 16) 88 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[16, 32) 50 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[32, 64) 65 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ |
[64, 128) 43 |@@@@@@@@@@@@@@@@@@@@@@@@@ |
[128, 256) 19 |@@@@@@@@@@@ |
[256, 512) 3 |@ |
[512, 1K) 1 | |
Reviewed by: Brad Lewis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Prakash Surya <[email protected]>
Closes #9525
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change modifies some of the infrastructure for enabling the use of
the DTRACE_PROBE* macros, such that we can use tehm in the "spl" module.
Currently, when the DTRACE_PROBE* macros are used, they get expanded to
create new functions, and these dynamically generated functions become
part of the "zfs" module.
Since the "spl" module does not depend on the "zfs" module, the use of
DTRACE_PROBE* in the "spl" module would result in undefined symbols
being used in the "spl" module. Specifically, DTRACE_PROBE* would turn
into a function call, and the function being called would be a symbol
only contained in the "zfs" module; which results in a linker and/or
runtime error.
Thus, this change adds the necessary logic to the "spl" module, to
mirror the tracing functionality available to the "zfs" module. After
this change, we'll have a "trace_zfs.h" header file which defines the
probes available only to the "zfs" module, and a "trace_spl.h" header
file which defines the probes available only to the "spl" module.
Reviewed by: Brad Lewis <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Prakash Surya <[email protected]>
Closes #9525
|
|
|
|
|
|
|
|
|
|
| |
MODULE_VERSION is already defined on FreeBSD. Wrap all of the
used MODULE_* macros for the sake of consistency and portability.
Add a user space noop version to reduce the need for _KERNEL ifdefs.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Matt Macy <[email protected]>
Closes #9542
|