| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
5117 space map reallocation can cause corruption
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Sebastien Roy <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/projects/illumos-gate/issues/5117
https://github.com/illumos/illumos-gate/commit/e503a68
Ported by: Turbo Fredriksson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2662
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a non-ZAP object is passed to zap_lockdir() it will be treated
as a valid ZAP object. This can result in zap_lockdir() attempting
to read what it believes are leaf blocks from invalid disk locations.
The SCSI layer will eventually generate errors for these bogus IOs
but the caller will hang in zap_get_leaf_byblk().
The good news is that this is a situation which cannot occur unless the
pool has been damaged. The bad news is that there are reports from
both FreeBSD and Solaris of damaged pools. Specifically, there are
normal files in the filesystem which reference another normal file
as their parent.
Since pools like this are known to exist the zap_lockdir() function
has been updated to verify the type of the object. If a non-ZAP
object has been passed in, EINVAL will be returned immediately.
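The guard described above can be sketched as follows. This is a minimal illustration, not the actual zap_lockdir() code: the types obj_type_t and obj_info_t and the function lockdir_checked() are hypothetical stand-ins for the real DMU structures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified object types; the real DMU has many more. */
typedef enum {
	OBJ_PLAIN_FILE,
	OBJ_DIRECTORY_ZAP,
	OBJ_MASTER_NODE_ZAP
} obj_type_t;

typedef struct {
	obj_type_t type;
} obj_info_t;

/* Return 1 when the object type is backed by a ZAP, 0 otherwise. */
static int
obj_is_zap(const obj_info_t *oi)
{
	return (oi->type == OBJ_DIRECTORY_ZAP ||
	    oi->type == OBJ_MASTER_NODE_ZAP);
}

/*
 * Reject non-ZAP objects up front instead of interpreting their
 * contents as ZAP leaf block pointers.
 */
static int
lockdir_checked(const obj_info_t *oi)
{
	if (oi == NULL || !obj_is_zap(oi))
		return (EINVAL);
	/* ... a real implementation would lock and read the ZAP here ... */
	return (0);
}
```

Failing early with EINVAL lets the caller see an error instead of hanging on I/O issued to bogus disk locations.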
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #2597
Issue #2602
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nfsd uses do_readv_writev() to implement fops->read and fops->write.
do_readv_writev() will attempt to read/write using fops->aio_read and
fops->aio_write, but it will fall back to fops->read and fops->write when
AIO is not available. However, the fallback will perform a call for each
individual data page. Since our default recordsize is 128KB, sequential
operations on NFS will generate 32 DMU transactions where only 1
transaction was needed. That was unnecessary overhead and we implement
fops->aio_read and fops->aio_write to eliminate it.
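The figure of 32 is consistent with a 4KB page size, which is an assumption here since the message does not state it. A small sketch of the arithmetic, using a hypothetical helper calls_per_record():

```c
#include <stddef.h>

/* Number of per-page calls needed to cover one record (rounded up). */
static size_t
calls_per_record(size_t recordsize, size_t pagesize)
{
	return ((recordsize + pagesize - 1) / pagesize);
}
```

With a 128KB record and 4KB pages this yields 32 calls, each of which would become its own DMU transaction in the fallback path.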
ZFS originated in OpenSolaris, where the AIO API is entirely implemented
in userland's libc by intelligently mapping its calls to VOP_WRITE, VOP_READ
and VOP_FSYNC. Linux implements AIO inside the kernel itself. Linux
filesystems therefore must implement their own AIO logic and nearly all
of them implement fops->aio_write synchronously. Consequently, they do
not implement aio_fsync(). However, since the ZPL works by mapping
Linux's VFS calls to the functions implementing Illumos' VFS operations,
we instead implement AIO in the kernel by mapping the operations to the
VOP_READ, VOP_WRITE and VOP_FSYNC equivalents. We therefore implement
fops->aio_fsync.
One might be inclined to make our fops->aio_write implementation
synchronous to make software that expects this behavior safe. However,
there are several reasons not to do this:
1. Other platforms do not implement aio_write() synchronously and since
the majority of userland software using AIO should be cross platform,
expectations of synchronous behavior should not be a problem.
2. We would hurt the performance of programs that use POSIX interfaces
properly while simultaneously encouraging the creation of more
non-compliant software.
3. The broader community concluded that userland software should be
patched to properly use POSIX interfaces instead of implementing hacks
in filesystems to cater to broken software. This concept is best
described as the O_PONIES debate.
4. Making an asynchronous write synchronous is a non sequitur.
Any software that depends on synchronous aio_write behavior risks losing
up to zfs_txg_timeout seconds of data (5 seconds by default) on ZFSOnLinux
in the event of a kernel panic / system failure. This seems like
a reasonable consequence of using non-compliant software.
It should be noted that this is also a problem in the kernel itself,
where nfsd does not pass O_SYNC on files that it opens and instead
relies on an open()/write()/close() to enforce synchronous behavior when
the flush is only guaranteed on last close.
Exporting any filesystem that does not implement AIO via NFS risks data
loss in the event of a kernel panic / system failure when something else
is also accessing the file. Exporting any file system that implements
AIO the way this patch does bears similar risk. However, it seems
reasonable to forgo crippling our AIO implementation in favor of
developing patches to fix this problem in Linux's nfsd for the reasons
stated earlier. In the interim, the risk will remain. Failing to
implement AIO will not change the problem that nfsd created, so there is
no reason for nfsd's mistake to block our implementation of AIO.
It also should be noted that `aio_cancel()` will always return
`AIO_NOTCANCELED` under this implementation. It is possible to implement
aio_cancel by deferring work to taskqs and use `kiocb_set_cancel_fn()`
to set a callback function for cancelling work sent to taskqs, but the
simpler approach is allowed by the specification:
```
Which operations are cancelable is implementation-defined.
```
http://pubs.opengroup.org/onlinepubs/009695399/functions/aio_cancel.html
The only programs on my system that are capable of using `aio_cancel()`
are QEMU, beecrypt and fio, according to a recursive grep of my
system's `/usr/src/debug`. That suggests that `aio_cancel()` users are
rare. Implementing aio_cancel() is left to a future date when it is
clear that there are consumers that benefit from its implementation to
justify the work.
Lastly, it is important to know that handling of the iovec updates differs
between Illumos and Linux in the implementation of read/write. On Linux,
it is the VFS' responsibility while on Illumos, it is the filesystem's
responsibility. We take the intermediate solution of copying the iovec
so that the ZFS code can update it like on Solaris while leaving the
originals alone. This imposes some overhead. We could always revisit
this should profiling show that the allocations are a problem.
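The copy-then-update approach might look roughly like this in userland C. The helpers iov_dup() and iov_advance() are hypothetical names for illustration; the kernel code would use kernel allocators, not malloc().

```c
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Duplicate an iovec array so the callee may advance iov_base and
 * shrink iov_len without touching the caller's copy. Returns NULL on
 * allocation failure; the caller frees the result with free().
 */
static struct iovec *
iov_dup(const struct iovec *iov, int iovcnt)
{
	struct iovec *copy;

	if (iovcnt <= 0)
		return (NULL);
	copy = malloc(sizeof (*copy) * (size_t)iovcnt);
	if (copy != NULL)
		memcpy(copy, iov, sizeof (*copy) * (size_t)iovcnt);
	return (copy);
}

/* Consume n bytes from the first element, as a filesystem might. */
static void
iov_advance(struct iovec *iov, size_t n)
{
	if (n > iov->iov_len)
		n = iov->iov_len;
	iov->iov_base = (char *)iov->iov_base + n;
	iov->iov_len -= n;
}
```

The allocation in iov_dup() is the overhead the message refers to; only the array of descriptors is copied, never the data buffers themselves.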
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #223
Closes #2373
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When com.delphix:spacemap_histogram is disabled, the value of
fragmentation was printing as 18446744073709551615 (UINT64_MAX),
when it should print as '-'.
The issue was caused by a small mistake during the merge of
"4980 metaslabs should have a fragmentation metric."
upstream: https://github.com/illumos/illumos-gate/commit/2e4c998
ZoL: https://github.com/zfsonlinux/zfs/commit/f3a7f66
The problem is in zpool_get_prop_literal, where the handling of the
pool property ZPOOL_PROP_FRAGMENTATION was added to the wrong
section. In particular, ZPOOL_PROP_FRAGMENTATION should not be in
the section where zpool_get_state(zhp) == POOL_STATE_UNAVAIL, but
lower down after it's already been determined that the pool is in
fact available, which is where upstream illumos correctly has had
it.
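The intended display behavior can be sketched as below. This is an illustration only, not the zpool_get_prop_literal() code: format_fragmentation() is a hypothetical helper, and the percent suffix on the numeric case is an assumption.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format a fragmentation value for display. UINT64_MAX is treated as
 * the "no value available" sentinel and rendered as "-".
 */
static void
format_fragmentation(uint64_t frag, char *buf, size_t len)
{
	if (frag == UINT64_MAX)
		(void) snprintf(buf, len, "-");
	else
		(void) snprintf(buf, len, "%" PRIu64 "%%", frag);
}
```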
Thanks to lundman for helping to track down this bug.
Signed-off-by: Jorgen Lundman <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2664
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewed by: George Wilson <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Paul Dagnelie <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Saso Kiselkov <[email protected]>
Approved by: Rich Lowe <[email protected]>
References:
https://www.illumos.org/issues/5049
https://github.com/illumos/illumos-gate/commit/2986efa
Ported-by: Brian Behlendorf <[email protected]>
Closes #2636
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit should prevent a deadlock on dp_config_rwlock when
running `zfs rename` by ensuring zvol_rename_minors() is not
called under this lock.
Signed-off-by: Stanislav Seletskiy <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2652.
Closes #2525.
|
|
|
|
|
|
|
|
|
|
| |
The zfs-import-cache.service and zfs-import-scan.service should
be started after cryptsetup to ensure all LUKS devices have
been opened.
Signed-off-by: alteriks <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1474
|
|
|
|
|
|
|
|
|
| |
This gives a huge performance improvement in operations with deduped
datasets, especially when the bottleneck is the amount of RAM
available for zfs.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2639
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change mount code to diagnose filesystem versions that
are not supported by the current implementation.
Change upgrade code to do likewise and refuse to upgrade
a pool if any filesystems on it are a version which is
not supported by the current implementation.
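A minimal sketch of both checks, assuming a hypothetical constant SUPPORTED_FS_VERSION and hypothetical helper names; the real mount and upgrade paths are more involved.

```c
#include <errno.h>
#include <stdint.h>

/* Highest filesystem version this hypothetical implementation supports. */
#define	SUPPORTED_FS_VERSION	5ULL

/* Mount-time check: diagnose a version the implementation cannot handle. */
static int
check_fs_version(uint64_t version)
{
	if (version == 0 || version > SUPPORTED_FS_VERSION)
		return (ENOTSUP);
	return (0);
}

/* Upgrade-time check: refuse if any filesystem has an unsupported version. */
static int
check_all_fs_versions(const uint64_t *versions, int count)
{
	int i, err;

	for (i = 0; i < count; i++) {
		err = check_fs_version(versions[i]);
		if (err != 0)
			return (err);
	}
	return (0);
}
```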
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Dan Swartzendruber <[email protected]>
Closes: #2616
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the ZED_EMAIL_INTERVAL_SECS="3600" option is set in the zed.rc
configuration file, notification emails should be rate limited.
Rate limiting is accomplished by maintaining a colon delimited state
file which includes the device name. Unfortunately there are valid
device names which include a colon and therefore prevent the rate
limiting from working properly. For this reason the delimiter has
been changed to a semicolon.
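To illustrate why the delimiter matters, here is a sketch that splits a record at a semicolon. The record layout "<device>;<timestamp>" and the helper split_record() are assumptions made for this example; the actual state file format is not described in this message.

```c
#include <string.h>

/*
 * Split a record of the hypothetical form "<device>;<timestamp>" at its
 * last semicolon. On success, copies the device name into dev and
 * returns a pointer to the timestamp text; returns NULL if malformed.
 */
static const char *
split_record(const char *line, char *dev, size_t devlen)
{
	const char *sep = strrchr(line, ';');
	size_t n;

	if (sep == NULL)
		return (NULL);
	n = (size_t)(sep - line);
	if (n == 0 || n >= devlen)
		return (NULL);
	memcpy(dev, line, n);
	dev[n] = '\0';
	return (sep + 1);
}
```

A device name containing colons passes through intact because the colon is no longer a field separator.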
Signed-off-by: louwrentius <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chris Dunlap <[email protected]>
Closes #2645
|
|
|
|
|
|
|
|
|
|
|
|
| |
Change the startup mode of ZED to non-forking. While systemd can
track processes that detach from the terminal just fine, running
processes in non-forking mode is the preferred mode of operation.
Also remove user/group definitions as root/root is the default.
Signed-off-by: Chris Dunlap <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2252
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a set of minor cleanup changes related to zed logging:
- Remove the program identity prefix from messages written to stderr
since systemd already prepends this output with the program name.
- Replace the copy of the program identity string with a ptr reference.
- Replace "pid" with "PID" for consistency in comments & strings.
- Rename the zed_log.c struct _ctx component "level" to "priority".
- Add the LOG_PID option for messages written to syslog.
Signed-off-by: Chris Dunlap <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #2252
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When zed is started as a forking daemon (the default),
a race-condition exists where the parent process can terminate before
the pidfile has been created by the grandchild process. When invoked
as a Type=forking systemd service, this can result in the following:
systemd[1]: Starting ZFS Event Daemon (zed)...
systemd[1]: PID file /var/run/zed.pid not readable (yet?) after start.
This commit adds a daemonize pipe to allow the grandchild process to
signal the parent process that initialization is complete (and the
pidfile has been created). The parent process will wait for this
notification before exiting.
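The pipe-based handshake can be sketched as follows. This is a simplified illustration, not the zed code: it uses a single fork instead of the double fork of a full daemonization, and start_and_wait_for_child() is a hypothetical name.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that performs its initialization and then writes one
 * byte to a pipe; the parent blocks in read() until that byte arrives.
 * Returns 0 in the parent once the child has signaled, -1 on error.
 */
static int
start_and_wait_for_child(void (*child_init)(void))
{
	int fds[2];
	pid_t pid;
	char c;
	ssize_t n;

	if (pipe(fds) < 0)
		return (-1);
	pid = fork();
	if (pid < 0) {
		(void) close(fds[0]);
		(void) close(fds[1]);
		return (-1);
	}
	if (pid == 0) {
		(void) close(fds[0]);
		if (child_init != NULL)
			child_init();	/* e.g. create the pidfile */
		if (write(fds[1], "", 1) != 1)	/* signal: init done */
			_exit(1);
		(void) close(fds[1]);
		_exit(0);
	}
	(void) close(fds[1]);
	n = read(fds[0], &c, 1);	/* blocks until child writes or exits */
	(void) close(fds[0]);
	(void) waitpid(pid, NULL, 0);	/* sketch only: reap the child */
	return (n == 1 ? 0 : -1);
}
```

Because the parent closes its write end, read() returns 0 if the child dies before signaling, so the parent never waits forever.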
Signed-off-by: Chris Dunlap <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #2252
|
|
|
|
|
|
|
| |
An email address in the AUTHORS file was missing its trailing >.
This patch fixes that typo.
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Providing a pkg-config file makes it easy for 3rd party applications
to link against the libzfs libraries. It also allows the libzfs
developers to modify the list of required libraries and cflags
without breaking existing applications.
The following example illustrates how pkg-config can be used:
    cc -o myapp myapp.c `pkg-config --cflags --libs libzfs`

    /*
     * myapp.c
     */
    #include <libzfs.h>

    int main(void)
    {
            libzfs_handle_t *hdl;

            /* Open a library handle and release it again. */
            hdl = libzfs_init();
            if (hdl)
                    libzfs_fini(hdl);
            return (0);
    }
Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes: #585
|
|
|
|
|
|
|
|
|
|
|
| |
The HAVE_IOCTL_* configure checks were originally added for
compatibility with an ancient version of glibc. This support
and additional complexity is no longer needed and is therefore
being removed.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Turbo Fredriksson <[email protected]>
Closes #585
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4970 need controls on i/o issued by zpool import -XF
4971 zpool import -T should accept hex values
4972 zpool import -T implies extreme rewind, and thus a scrub
4973 spa_load_retry retries the same txg
4974 spa_load_verify() reads all data twice
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Dan McDonald <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
References:
https://www.illumos.org/issues/4970
https://www.illumos.org/issues/4971
https://www.illumos.org/issues/4972
https://www.illumos.org/issues/4973
https://www.illumos.org/issues/4974
https://github.com/illumos/illumos-gate/commit/e42d205
Notes:
This set of patches adds a set of tunable parameters for the
"extreme rewind" mode of pool import which allows control over
the traversal performed during such an import.
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2598
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
5034 ARC's buf_hash_table is too small
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Saso Kiselkov <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Approved by: Gordon Ross <[email protected]>
References:
https://www.illumos.org/issues/5034
https://github.com/illumos/illumos-gate/commit/63e911b
Ported-by: Brian Behlendorf <[email protected]>
Closes #2615
|
|
|
|
|
|
|
|
|
|
|
| |
Change efi_rescan() to loop 10 times instead of 5 on EBUSY and
to sleep at the end of each loop. This helps with some instances
where the kernel does not reload the partition table fast enough
for ZFS to detect.
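The retry pattern can be sketched generically as below. The helper retry_on_ebusy() and the example operation busy_then_ok() are hypothetical; the actual efi_rescan() issues an ioctl, and the delay used there is not stated in this message.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Call op(arg) up to max_tries times while it fails with EBUSY,
 * sleeping delay_us microseconds after each busy attempt. Returns 0 on
 * success or the last errno-style error code.
 */
static int
retry_on_ebusy(int (*op)(void *), void *arg, int max_tries,
    unsigned int delay_us)
{
	int err = EBUSY;
	int i;

	for (i = 0; i < max_tries && err == EBUSY; i++) {
		err = op(arg);
		if (err == EBUSY)
			(void) usleep(delay_us);
	}
	return (err);
}

/* Example operation: reports EBUSY until *remaining reaches zero. */
static int
busy_then_ok(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return (EBUSY);
	}
	return (0);
}
```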
Signed-off-by: Andrew Hamilton <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2493
|
|
|
|
|
|
|
|
|
|
| |
Some nvlist_t could be leaked in error handling paths.
Also make sure the cb argument to zfs_zevent_post() cannot
be NULL.
Signed-off-by: Isaac Huang <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2158
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4631 zvol_get_stats triggering too many reads
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Sebastien Roy <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/4631
https://github.com/illumos/illumos-gate/commit/bbfa8ea
Ported-by: Boris Protopopov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2612
Closes #2480
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Intel DC S3500 and Intel DC S3700 are optimized to handle 4KB
sectors well despite their 8KB page sizes, so we move them to a new
category for enterprise drives where they will receive ashift=12. They
are joined by the Intel 730 series, which uses the same disk controller,
as well as a SanDisk enterprise drive. I obtained the drive IDs for
these two with the drive_id utility. The drive ID for the 240GB
Intel 730 model was extrapolated from the drive ID for the 480GB model.
Lastly, we also add some Western Digital mobile drives. ryuo in
#zfsonlinux on freenode obtained "ATA WDC WD2500BEVT-0" from
running drive_id on his own hardware. The additional drives in that
family were extrapolated from that identifier.
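For reference, ashift is the base-2 logarithm of the sector size, so 4KB sectors correspond to ashift=12. A small sketch with a hypothetical helper ashift_for_sector_size():

```c
#include <stdint.h>

/*
 * Return log2 of a power-of-two sector size (the "ashift"), or -1 if
 * the size is zero or not a power of two.
 */
static int
ashift_for_sector_size(uint64_t size)
{
	int shift = 0;

	if (size == 0 || (size & (size - 1)) != 0)
		return (-1);
	while ((size >>= 1) != 0)
		shift++;
	return (shift);
}
```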
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2601
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Illumos 4982 added code to metaslab_fragmentation() to proactively update
space maps when the spacemap_histogram feature is enabled. This should
only happen when the pool is writeable.
References:
https://www.illumos.org/issues/4982
https://github.com/illumos/illumos-gate/commit/2e4c998
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2595
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4976 zfs should only avoid writing to a failing non-redundant top-level vdev
4978 ztest fails in get_metaslab_refcount()
4979 extend free space histogram to device and pool
4980 metaslabs should have a fragmentation metric
4981 remove fragmented ops vector from block allocator
4982 space_map object should proactively upgrade when feature is enabled
4983 need to collect metaslab information via mdb
4984 device selection should use fragmentation metric
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
References:
https://www.illumos.org/issues/4976
https://www.illumos.org/issues/4978
https://www.illumos.org/issues/4979
https://www.illumos.org/issues/4980
https://www.illumos.org/issues/4981
https://www.illumos.org/issues/4982
https://www.illumos.org/issues/4983
https://www.illumos.org/issues/4984
https://github.com/illumos/illumos-gate/commit/2e4c998
Notes:
The "zdb -M" option has been re-tasked to display the new metaslab
fragmentation metric and the new "zdb -I" option is used to control
the maximum number of in-flight I/Os.
The new fragmentation metric is derived from the space map histogram
which has been rolled up to the vdev and pool level and is presented
to the user via "zpool list".
Add a number of module parameters related to the new metaslab weighting
logic.
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2595
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new 'overlay' property (default 'off') that controls whether the
filesystem should be mounted even if the mountpoint is busy or if it
should fail with a 'mountpoint not empty'.
Doing overlay mounts is the default mount behavior on Linux, but not
in ZFS. It has been decided that following the ZFS behavior should
be the default, but this property allows site administrators to
override this decision on a per-dataset basis.
Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes: #2503
|
|
|
|
|
|
|
|
|
|
|
|
| |
We should have included sys/taskq.h directly because we use the taskq
code here. Instead, the files that needed sys/taskq.h relied on
sys/kmem.h, which happened to include sys/taskq.h. sys/kmem.h no
longer does this, so we must add the include as we should
have done in the first place.
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2411
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove lines that contain only a hyphen (match '^-$' instead of '-').
I had a root fs with a hyphen in the name (fedora/ROOT/Fedora20-Dev);
it was not detected because sed eliminated that line of output from
'zpool list -Ho bootfs'.
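The difference between the two patterns can be shown with plain string functions. This is an analogy in C, not the script itself; both helper names are hypothetical.

```c
#include <string.h>

/* Whole-line match, equivalent in spirit to the pattern '^-$'. */
static int
is_unset_marker(const char *line)
{
	return (strcmp(line, "-") == 0);
}

/* Substring match, equivalent in spirit to the pattern '-'. */
static int
contains_hyphen(const char *line)
{
	return (strchr(line, '-') != NULL);
}
```

Only the whole-line match distinguishes the "-" placeholder from a dataset name that merely contains a hyphen.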
Signed-off-by: Evan Susarret <[email protected]>
Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2196
|
|
|
|
|
|
|
|
|
| |
Add #ifndef PAGESIZE to avoid redefinition warning on platforms
where this value is already provided.
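A guard of this kind might look like the sketch below. The fallback expression using sysconf() is an assumption for illustration, and page_size() is a hypothetical helper.

```c
#include <unistd.h>

/*
 * Define PAGESIZE only when the platform headers have not already
 * done so. The fallback expression is an assumption for this sketch.
 */
#ifndef PAGESIZE
#define	PAGESIZE	(sysconf(_SC_PAGESIZE))
#endif

static long
page_size(void)
{
	return ((long)PAGESIZE);
}
```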
Signed-off-by: Alec Salazar <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2588
|
|
|
|
|
|
|
|
|
|
| |
Most of the code base already uses va_list, which is specified by
ISO C. gcc/glibc provides 'typedef __gnuc_va_list va_list', and
when not using gcc/glibc we can't expect to find __gnuc_va_list.
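A portable variadic function needs only the standard type from <stdarg.h>. The helper format_msg() below is a hypothetical example, not code from the tree.

```c
#include <stdarg.h>
#include <stdio.h>

/* Format into buf using the portable va_list type from <stdarg.h>. */
static int
format_msg(char *buf, size_t len, const char *fmt, ...)
{
	va_list ap;	/* standard; __gnuc_va_list is a gcc/glibc internal */
	int n;

	va_start(ap, fmt);
	n = vsnprintf(buf, len, fmt, ap);
	va_end(ap);
	return (n);
}
```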
Signed-off-by: Alec Salazar <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2588
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 7973e46 which brings the basic flow of the
code back in line with the other ZFS implementations. This
was possible due to the following related changes.
e89260a Directory xattr znodes hold a reference on their parent
6f9548c Fix deadlock in zfs_zget()
0a50679 Add zfs_iput_async() interface
4dd1893 Avoid 128K kmem allocations in mzap_upgrade()
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #457
Closes #2058
Closes #2128
Closes #2240
|
|
|
|
|
|
|
|
|
|
|
| |
Handle all iputs in zfs_purgedir() and zfs_inode_destroy()
asynchronously to prevent deadlocks. When the iputs are allowed
to run synchronously in the destroy call path deadlocks between
xattr directory inodes and their parent file inodes are possible.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes #457
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As originally implemented the mzap_upgrade() function will
perform up to SPA_MAXBLOCKSIZE allocations using kmem_alloc().
These large allocations can potentially block indefinitely
if contiguous memory is not available. Since this allocation
is done under the zap->zap_rwlock it can appear as if there is
a deadlock in zap_lockdir(). This is shown below.
The optimal fix for this would be to rework mzap_upgrade()
such that no large allocations are required. This could be
done but it would result in us diverging further from the other
implementations. Therefore I've opted against doing this
unless it becomes absolutely necessary.
Instead mzap_upgrade() has been updated to use zio_buf_alloc()
which can reliably provide buffers of up to SPA_MAXBLOCKSIZE.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Close #2580
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As part of commit e8b96c6 the search zio used by the
vdev_queue_io_to_issue() function was moved to the heap
to minimize stack usage. Functionally this is fine, but
to maximize performance it's best to minimize the number
of dynamic allocations.
To avoid this allocation temporary space for the search
zio has been reserved in the vdev_queue structure. All
access must be serialized through the vq_lock.
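The idea of reserving scratch space in the queue structure can be sketched as follows. The types io_t and queue_t and both functions are hypothetical, pared-down stand-ins; a pthread mutex plays the role of vq_lock.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-ins for the zio and vdev_queue structures. */
typedef struct {
	uint64_t offset;
} io_t;

typedef struct {
	pthread_mutex_t lock;	/* plays the role of vq_lock */
	io_t search;		/* scratch key; valid only while locked */
	uint64_t last_offset;
} queue_t;

static void
queue_init(queue_t *q)
{
	(void) pthread_mutex_init(&q->lock, NULL);
	q->search.offset = 0;
	q->last_offset = 0;
}

/* Build a search key in the embedded scratch space, under the lock. */
static uint64_t
queue_advance(queue_t *q, uint64_t delta)
{
	uint64_t off;

	(void) pthread_mutex_lock(&q->lock);
	q->search.offset = q->last_offset + delta;	/* no heap allocation */
	/* ... a real queue would look up the nearest I/O with q->search ... */
	q->last_offset = q->search.offset;
	off = q->search.offset;
	(void) pthread_mutex_unlock(&q->lock);
	return (off);
}
```

Since every caller shares the one embedded key, using it outside the lock would race; that is the serialization requirement the message mentions.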
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ned Bass <[email protected]>
Closes #2572
|
|
|
|
|
|
|
|
|
|
|
| |
The dsl_dataset_rollback_check() function is executed in the
txg_sync context. To prevent a potential deadlock due to direct
memory reclaim it must use KM_PUSHPAGE. This was introduced by
the recent 'zfs bookmark' features, commit da53684.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Eric Dillmann <[email protected]>
Closes #2569
|
|
|
|
|
|
|
|
|
|
| |
These can be manually installed as needed by end users. They
have been added to the repository so they can be kept up to date
with the latest code.
Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1588
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4914 zfs on-disk bookmark structure should be named *_phys_t
Reviewed by: George Wilson <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Richard Lowe <[email protected]>
Reviewed by: Saso Kiselkov <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
References:
https://www.illumos.org/issues/4914
https://github.com/illumos/illumos-gate/commit/7802d7b
Porting notes:
There were a number of zfsonlinux-specific uses of zbookmark_t which
needed to be updated. This should reduce the likelihood of further
problems like issue #2094 from occurring.
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2558
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4881 zfs send performance degradation when embedded block pointers
are encountered
Reviewed by: George Wilson <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/4881
https://github.com/illumos/illumos-gate/commit/06315b7
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2547
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4897 Space accounting mismatch in L2ARC/zpool
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Boris Protopopov <[email protected]>
Approved by: Dan McDonald <[email protected]>
From the illumos issue tracker:
L2ARC vdev space usage statistics are calculated as the delta
between the maximum and minimum vdev offset ever written to
by the L2ARC fill thread, but do not inform the user of how
much space in between these two offsets is actually taken up by
cached buffers. This fix changes that so that vdev space usage
stats on L2ARC devices accurately track the volume of buffers
stored on them, allowing users to see the exact L2ARC usage in
"zpool iostat -v".
References:
https://www.illumos.org/issues/4897
https://github.com/illumos/illumos-gate/commit/3038a2b
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2555
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4390 i/o errors when deleting filesystem/zvol can lead to space map corruption
Reviewed by: George Wilson <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Dan McDonald <[email protected]>
Reviewed by: Saso Kiselkov <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/4390
https://github.com/illumos/illumos-gate/commit/7fd05ac
Porting notes:
Previous stack-reduction efforts in traverse_visitb() caused a fair
number of un-mergable pieces of code. This patch should reduce its
stack footprint a bit more.
The new local bptree_entry_phys_t in bptree_add() is dynamically-allocated
using kmem_zalloc() for the purpose of stack reduction.
The new global zfs_free_leak_on_eio has been defined as an integer
rather than a boolean_t as was the case with the related zfs_recover
global. Also, zfs_free_leak_on_eio's definition has been inserted into
zfs_debug.c for consistency with the existing definition of zfs_recover.
Illumos placed it in spa_misc.c.
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2545
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4757 ZFS embedded-data block pointers ("zero block compression")
4913 zfs release should not be subject to space checks
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Max Grossman <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Dan McDonald <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/4757
https://www.illumos.org/issues/4913
https://github.com/illumos/illumos-gate/commit/5d7b4d4
Porting notes:
For compatibility with the fastpath code the zio_done() function
needed to be updated. Because embedded-data block pointers do
not require DVAs to be allocated the associated vdevs will not
be marked and therefore should not be unmarked.
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2544
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reviewed by: George Wilson <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Dan McDonald <[email protected]>
Approved by: Richard Lowe <[email protected]>
Description from Matt Ahrens's bug report at Delphix:
Add a new zfs property, "redundant_metadata" which can have values
"all" or "most". The default will be "all", which is the current
behavior. Setting to "most" will cause us to only store 1 copy of
level-1 indirect blocks of user data files.
Additional notes:
The new man page section for this property states
"The exact behavior of which metadata blocks
are stored redundantly may change in future releases."
and:
"When set to most, ZFS stores an extra copy of most types of
metadata. This can improve performance of random writes,
because less metadata must be written."
The current implementation is as described above in Matt's blog.
It is controlled by a new global integer
"zfs_redundant_metadata_most_ditto_level", currently initialized
to 2. When "redundant_metadata" is set to "most", only indirect
blocks of the specified level and higher will have additional ditto
blocks created.
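The level comparison can be sketched as below. This models only the rule for indirect blocks of user data as described in these notes; the enum, the variable most_ditto_level, and the function name are hypothetical simplifications.

```c
/* Hypothetical stand-in for the property values. */
typedef enum {
	REDUNDANT_METADATA_ALL,
	REDUNDANT_METADATA_MOST
} redundant_metadata_t;

/* Mirrors the described tunable, which is initialized to 2. */
static int most_ditto_level = 2;

/*
 * Decide whether an indirect block of a user data file at the given
 * level (1 = lowest indirect level) gets an additional ditto copy.
 */
static int
indirect_gets_extra_copy(redundant_metadata_t mode, int level)
{
	if (mode == REDUNDANT_METADATA_ALL)
		return (1);
	return (level >= most_ditto_level);
}
```

With the default threshold of 2, level-1 indirect blocks are stored once under "most", matching the description from the bug report.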
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2542
|
|
|
|
|
|
|
|
|
|
| |
As of a recent group of Illumos/Delphix updates, zed needs libzfs_core
in order to resolve lzc_get_bookmarks() and likely other functions
going forward.
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2534
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4754 io issued to near-full luns even after setting noalloc threshold
4755 mg_alloc_failures is no longer needed
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Dan McDonald <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/4754
https://www.illumos.org/issues/4755
https://github.com/illumos/illumos-gate/commit/b6240e8
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2533
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4374 dn_free_ranges should use range_tree_t
Reviewed by: George Wilson <[email protected]>
Reviewed by: Max Grossman <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Garrett D'Amore <[email protected]>
Reviewed by: Dan McDonald <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/4374
https://github.com/illumos/illumos-gate/commit/bf16b11
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2531
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4369 implement zfs bookmarks
4368 zfs send filesystems from readonly pools
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
References:
https://www.illumos.org/issues/4369
https://www.illumos.org/issues/4368
https://github.com/illumos/illumos-gate/commit/78f1710
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2530
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4370 avoid transmitting holes during zfs send
4371 DMU code clean up
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Josef 'Jeff' Sipek <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
References:
https://www.illumos.org/issues/4370
https://www.illumos.org/issues/4371
https://github.com/illumos/illumos-gate/commit/43466aa
Ported by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2529
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4171 clean up spa_feature_*() interfaces
4172 implement extensible_dataset feature for use by other zpool features
Reviewed by: Max Grossman <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Jerry Jelinek <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
References:
https://www.illumos.org/issues/4171
https://www.illumos.org/issues/4172
https://github.com/illumos/illumos-gate/commit/2acef22
Ported-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2528
|
|
|
|
|
|
|
|
| |
Support for ZFS has now been merged in to both blkid and grub.
Therefore, there is no longer a need to carry these stale
patches in the ZFS source tree.
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This functionality is already available in 'zfs get'. Providing
it for 'zpool get' is useful and good for consistency.
Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes: #2522
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is in no way complete; most of it came from trial and error and
some from deducing what they could mean. It needs more information from
someone who knows the code better. But this is a start and
it lays the basic structure for adding this additional detail.
Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Prakash Surya <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #2357
|