| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use the standard Linux MODULE_VERSION macro to expose the installed
zavl, znvpair, zunicode, zcommon, zfs, and zpios module versions.
This will also automatically add a checksum of the .c files and
headers in "srcversion". See:
/sys/module/zavl/version
/sys/module/zavl/srcversion
/sys/module/znvpair/version
/sys/module/znvpair/srcversion
/sys/module/zunicode/version
/sys/module/zunicode/srcversion
/sys/module/zcommon/version
/sys/module/zcommon/srcversion
/sys/module/zfs/version
/sys/module/zfs/srcversion
/sys/module/zpios/version
/sys/module/zpios/srcversion
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1923
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4045 zfs write throttle & i/o scheduler performance work
1. The ZFS i/o scheduler (vdev_queue.c) now divides i/os into 5 classes: sync
read, sync write, async read, async write, and scrub/resilver. The scheduler
issues a number of concurrent i/os from each class to the device. Once a class
has been selected, an i/o is selected from this class using either an elevator
algorithem (async, scrub classes) or FIFO (sync classes). The number of
concurrent async write i/os is tuned dynamically based on i/o load, to achieve
good sync i/o latency when there is not a high load of writes, and good write
throughput when there is. See the block comment in vdev_queue.c (reproduced
below) for more details.
2. The write throttle (dsl_pool_tempreserve_space() and
txg_constrain_throughput()) is rewritten to produce much more consistent delays
when under constant load. The new write throttle is based on the amount of
dirty data, rather than guesses about future performance of the system. When
there is a lot of dirty data, each transaction (e.g. write() syscall) will be
delayed by the same small amount. This eliminates the "brick wall of wait"
that the old write throttle could hit, causing all transactions to wait several
seconds until the next txg opens. One of the keys to the new write throttle is
decrementing the amount of dirty data as i/o completes, rather than at the end
of spa_sync(). Note that the write throttle is only applied once the i/o
scheduler is issuing the maximum number of outstanding async writes. See the
block comments in dsl_pool.c and above dmu_tx_delay() (reproduced below) for
more details.
This diff has several other effects, including:
* the commonly-tuned global variable zfs_vdev_max_pending has been removed;
use per-class zfs_vdev_*_max_active values or zfs_vdev_max_active instead.
* the size of each txg (meaning the amount of dirty data written, and thus the
time it takes to write out) is now controlled differently. There is no longer
an explicit time goal; the primary determinant is amount of dirty data.
Systems that are under light or medium load will now often see that a txg is
always syncing, but the impact to performance (e.g. read latency) is minimal.
Tune zfs_dirty_data_max and zfs_dirty_data_sync to control this.
* zio_taskq_batch_pct = 75 -- Only use 75% of all CPUs for compression,
checksum, etc. This improves latency by not allowing these CPU-intensive tasks
to consume all CPU (on machines with at least 4 CPU's; the percentage is
rounded up).
--matt
APPENDIX: problems with the current i/o scheduler
The current ZFS i/o scheduler (vdev_queue.c) is deadline based. The problem
with this is that if there are always i/os pending, then certain classes of
i/os can see very long delays.
For example, if there are always synchronous reads outstanding, then no async
writes will be serviced until they become "past due". One symptom of this
situation is that each pass of the txg sync takes at least several seconds
(typically 3 seconds).
If many i/os become "past due" (their deadline is in the past), then we must
service all of these overdue i/os before any new i/os. This happens when we
enqueue a batch of async writes for the txg sync, with deadlines 2.5 seconds in
the future. If we can't complete all the i/os in 2.5 seconds (e.g. because
there were always reads pending), then these i/os will become past due. Now we
must service all the "async" writes (which could be hundreds of megabytes)
before we service any reads, introducing considerable latency to synchronous
i/os (reads or ZIL writes).
Notes on porting to ZFS on Linux:
- zio_t gained new members io_physdone and io_phys_children. Because
object caches in the Linux port call the constructor only once at
allocation time, objects may contain residual data when retrieved
from the cache. Therefore zio_create() was updated to zero out the two
new fields.
- vdev_mirror_pending() relied on the depth of the per-vdev pending queue
(vq->vq_pending_tree) to select the least-busy leaf vdev to read from.
This tree has been replaced by vq->vq_active_tree which is now used
for the same purpose.
- vdev_queue_init() used the value of zfs_vdev_max_pending to determine
the number of vdev I/O buffers to pre-allocate. That global no longer
exists, so we instead use the sum of the *_max_active values for each of
the five I/O classes described above.
- The Illumos implementation of dmu_tx_delay() delays a transaction by
sleeping in condition variable embedded in the thread
(curthread->t_delay_cv). We do not have an equivalent CV to use in
Linux, so this change replaced the delay logic with a wrapper called
zfs_sleep_until(). This wrapper could be adopted upstream and in other
downstream ports to abstract away operating system-specific delay logic.
- These tunables are added as module parameters, and descriptions added
to the zfs-module-parameters.5 man page.
spa_asize_inflation
zfs_deadman_synctime_ms
zfs_vdev_max_active
zfs_vdev_async_write_active_min_dirty_percent
zfs_vdev_async_write_active_max_dirty_percent
zfs_vdev_async_read_max_active
zfs_vdev_async_read_min_active
zfs_vdev_async_write_max_active
zfs_vdev_async_write_min_active
zfs_vdev_scrub_max_active
zfs_vdev_scrub_min_active
zfs_vdev_sync_read_max_active
zfs_vdev_sync_read_min_active
zfs_vdev_sync_write_max_active
zfs_vdev_sync_write_min_active
zfs_dirty_data_max_percent
zfs_delay_min_dirty_percent
zfs_dirty_data_max_max_percent
zfs_dirty_data_max
zfs_dirty_data_max_max
zfs_dirty_data_sync
zfs_delay_scale
The latter four have type unsigned long, whereas they are uint64_t in
Illumos. This accommodates Linux's module_param() supported types, but
means they may overflow on 32-bit architectures.
The values zfs_dirty_data_max and zfs_dirty_data_max_max are the most
likely to overflow on 32-bit systems, since they express physical RAM
sizes in bytes. In fact, Illumos initializes zfs_dirty_data_max_max to
2^32 which does overflow. To resolve that, this port instead initializes
it in arc_init() to 25% of physical RAM, and adds the tunable
zfs_dirty_data_max_max_percent to override that percentage. While this
solution doesn't completely avoid the overflow issue, it should be a
reasonable default for most systems, and the minority of affected
systems can work around the issue by overriding the defaults.
- Fixed reversed logic in comment above zfs_delay_scale declaration.
- Clarified comments in vdev_queue.c regarding when per-queue minimums take
effect.
- Replaced dmu_tx_write_limit in the dmu_tx kstat file
with dmu_tx_dirty_delay and dmu_tx_dirty_over_max. The first counts
how many times a transaction has been delayed because the pool dirty
data has exceeded zfs_delay_min_dirty_percent. The latter counts how
many times the pool dirty data has exceeded zfs_dirty_data_max (which
we expect to never happen).
- The original patch would have regressed the bug fixed in
zfsonlinux/zfs@c418410, which prevented users from setting the
zfs_vdev_aggregation_limit tuning larger than SPA_MAXBLOCKSIZE.
A similar fix is added to vdev_queue_aggregate().
- In vdev_queue_io_to_issue(), dynamically allocate 'zio_t search' on the
heap instead of the stack. In Linux we can't afford such large
structures on the stack.
Reviewed by: George Wilson <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Ned Bass <[email protected]>
Reviewed by: Brendan Gregg <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
References:
http://www.illumos.org/issues/4045
illumos/illumos-gate@69962b5647e4a8b9b14998733b765925381b727e
Ported-by: Ned Bass <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1913
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix a lock contention issue by allowing threads not holding
ZPL locks to block when waiting to assign a transaction.
Porting Notes:
zfs_putpage() still uses TXG_NOWAIT, unlike the upstream version. This
case may be a contention point just like zfs_write(), however it is not
safe to block here since it may be called during memory reclaim.
Reviewed by: George Wilson <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Dan McDonald <[email protected]>
Reviewed by: Boris Protopopov <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/4347
illumos/illumos-gate@e722410c49fe67cbf0f639cbcc288bd6cbcf7dd1
Ported-by: Ned Bass <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As part of zfs_sb_teardown() there is an assertion that all inodes
which are part of the zsb->z_all_znodes list have at least one
reference on them. This is always true for the standard unmount
case but there are two other cases where it is not strictly true.
* zfs_ioc_rollback() - This is the most common case and it results
from the fact that we aren't unmounting the filesystem. During a
normal unmount the MS_ACTIVE flag will be cleared on the super block
causing iput_final() to evict the inode when its reference count
drops to zero. However, during a rollback MS_ACTIVE remains set
since we're rolling back a live filesystem and need to preserve the
existing super block. This allows inodes with a zero reference count
to stay in the cache thereby violating the assertion.
* destroy_inode() / zfs_sb_teardown() - There exists a small race
between dropping the last reference on an inode and removing it from
the zsb->z_all_znodes list. This is unlikely to occur but could also
trigger the assertion which is incorrect. The inode may safely have
a zero reference count in this case.
Since allowing a zero reference count on the inode is expected and
safe for both of these cases the simplest thing to do is remove the
ASSERT. This code is only enabled for default builds so removing
this entirely is a very safe change.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chris Dunlop <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Closes #1417
Closes #1536
|
|
|
|
|
|
|
|
|
|
| |
This should hopefully catch the rest of the allocations in the
user hold/release processing that were missed by commit
65c67ea86e9f112177f1ad32de8e780f10798a64.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1852
Closes #1855
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, using msync() results in the following code path:
sys_msync -> zpl_fsync -> filemap_write_and_wait_range -> zpl_writepages -> write_cache_pages -> zpl_putpage
In such a code path, zil_commit() is called as part of zpl_putpage().
This means that for each page, the write is handed to the DMU, the ZIL
is committed, and only then do we move on to the next page. As one might
imagine, this results in atrocious performance where there is a large
number of pages to write: instead of committing a batch of N writes,
we do N commits containing one page each. In some extreme cases this
can result in msync() being ~700 times slower than it should be, as well
as very inefficient use of ZIL resources.
This patch fixes this issue by making sure that the requested writes
are batched and then committed only once. Unfortunately, the
implementation is somewhat non-trivial because there is no way to run
write_cache_pages in SYNC mode (so that we get all pages) without
making it wait on the writeback tag for each page.
The solution implemented here is composed of two parts:
- I added a new callback system to the ZIL, which allows the caller to
be notified when its ITX gets written to stable storage. One nice
thing is that the callback is called not only in zil_commit() but
in zil_sync() as well, which means that the caller doesn't have to
care whether the write ended up in the ZIL or the DMU: it will get
notified as soon as it's safe, period. This is an improvement over
dmu_tx_callback_register() that was used previously, which only
supports DMU writes. The rationale for this change is to allow
zpl_putpage() to be notified when a ZIL commit is completed without
having to block on zil_commit() itself.
- zpl_writepages() now calls write_cache_pages in non-SYNC mode, which
will prevent (1) write_cache_pages from blocking, and (2) zpl_putpage
from issuing ZIL commits. zpl_writepages() will issue the commit
itself instead of relying on zpl_putpage() to do it, thus nicely
batching the writes. Note, however, that we still have to call
write_cache_pages() again in SYNC mode because there is an edge case
documented in the implementation of write_cache_pages() whereas it
will not give us all dirty pages when running in non-SYNC mode. Thus
we need to run it at least once in SYNC mode to make sure we honor
persistency guarantees. This only happens when the pages are
modified at the same time msync() is running, which should be rare.
In most cases there won't be any additional pages and this second
call will do nothing.
Note that this change also fixes a bug related to #907 whereas calling
msync() on pages that were already handed over to the DMU in a previous
writepages() call would make msync() block until the next TXG sync
instead of returning as soon as the ZIL commit is complete. The new
callback system fixes that problem.
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1849
Closes #907
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because ZFS bypasses the page cache we don't inherit per-task I/O
accounting for free. However, the Linux kernel does provide helper
functions allow us to perform our own accounting. These are most
commonly used for direct IO which also bypasses the page cache, but
they can be used for the common read/write call paths as well.
Signed-off-by: Pavel Snajdr <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #313
Closes #1275
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4322 ZFS deadlock on dp_config_rwlock
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Ilya Usvyatsky <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/4322
illumos/illumos-gate@c50d56f667f119d78fa3d94d6bef2c298ba556f6
Ported by: Chris Dunlop <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1886
|
|
|
|
|
|
|
|
| |
Under Linux this restriction does not apply because we have access
to all the required devices.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1631
|
|
|
|
|
|
|
|
| |
Properly initialize SELinux xattrs for all inode types. The
initial implementation accidentally only did this for files.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1832
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During pool import stack overflows may still occur due to the
potentially deep recursion of traverse_visitbp(). This is most
likely to occur when additional layers are added to the block
device stack such as DM multipath. To minimize the stack usage
for this call path the following changes were made:
1) Added the keywork 'noinline' to the vdev_*_map_alloc() functions
to prevent them from being inlined by gcc. This reduced the
stack usage of vdev_raidz_io_start() from 208 to 128 bytes, and
vdev_mirror_io_start() from 144 to 128 bytes.
2) The 'saved_poolname' charater array in zfsdev_ioctl() was moved
from the stack to the heap. This reduced the stack usage of
zfsdev_ioctl() from 368 to 112 bytes.
3) The major saving came from slimming down traverse_visitbp() from
from 224 to 144 bytes. Since this function is called recursively
the 80 bytes saved per invokation adds up. The following changes
were made:
a) The 'hard' local variable was replaced by a TD_HARD() macro.
b) The 'pd' local variable was replaced by 'td->td_pfd' references.
c) The zbookmark_t was moved to the heap. This does cost us an
additional memory allocation per recursion by that cost should
still be minimal. The cost could be further reduced by adding
a dedicated zbookmark_t slab cache.
d) The variable declarations in 'if (BP_GET_LEVEL()) { }' were
restructured to use the minimum amount of stack. This includes
removing the 'cbp' local variable.
Overall for the offending use case roughly 1584 of total stack space
has been saved. This is enough to avoid overflowing the stack on
stock kernels with 8k stacks. See #1778 for additional details.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ned Bass <[email protected]>
Closes #1778
|
|
|
|
|
|
|
|
|
|
| |
Commit 95fd54a1c5b93bb2aa3e7dffc28c784b1e21a8bb restructured the
hold/release processing and moved some of the work into the sync task.
A number of nvlist allocations now need to use KM_PUSHPAGE.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1852
Closes #1855
|
|
|
|
|
|
|
|
|
|
|
| |
The Illumos #3875 patch reverted a part of ZoL's 7b3e34b which added
special-case error handling for zfs_rezget(). The error handling dealt
with the case in which an all-ones object number ended up being passed
to dnode_hold() and causing an EINVAL to be returned from zfs_rezget().
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1859
Closes #1861
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the current snapshot automount implementation, it is possible for
multiple mounts to attempted concurrently. Only one of the mounts will
succeed and the other will fail. The failed mounts will cause an EREMOTE
to be propagated back to the application.
This commit works around the problem by adding a new exit status,
MOUNT_BUSY to the mount.zfs program which is used when the underlying
mount(2) call returns EBUSY. The zfs code detects this condition and
treats it as if the mount had succeeded.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1819
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The required Posix ACL interfaces are only available for kernels
with CONFIG_FS_POSIX_ACL defined. Therefore, only enable Posix
ACL support for these kernels. All major distribution kernels
enable CONFIG_FS_POSIX_ACL by default.
If your kernel does not support Posix ACLs the following warning
will be printed at ZFS module load time.
"ZFS: Posix ACLs disabled by kernel"
Signed-off-by: Massimo Maggi <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1825
|
|
|
|
|
|
|
|
|
| |
References:
delphix/delphix-os@931c8aaab74b6412933d299890894262e2ef8380
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1650
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A couple of kmem_alloc() allocations were using KM_SLEEP in
the sync thread context. These were accidentally introduced
by the recent set of Illumos patches. The solution is to
switch to KM_PUSHPAGE.
dsl_dataset_promote_sync() -> promote_hold() -> snaplist_make() ->
kmem_alloc(sizeof (*snap), KM_SLEEP);
dsl_dataset_user_hold_sync() -> dsl_onexit_hold_cleanup() ->
kmem_alloc(sizeof (*ca), KM_SLEEP)
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
| |
3995 Memory leak of compressed buffers in l2arc_write_done
References:
https://illumos.org/issues/3995
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #1688
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4168 ztest assertion failure in dbuf_undirty
4169 verbatim import causes zdb to segfault
4170 zhack leaves pool in ACTIVE state
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Eric Schrock <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/4168
https://www.illumos.org/issues/4169
https://www.illumos.org/issues/4170
illumos/illumos-gate@7fdd916c474ea52896c671bbe7b56ba34a1ca132
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4082 zfs receive gets EFBIG from dmu_tx_hold_free()
Reviewed by: Eric Schrock <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/issues/4082
illumos/illumos-gate@5253393b09789ec67bec153b866d7285a1cf1645
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3954 metaslabs continue to load even after hitting zfs_mg_alloc_failure limit
4080 zpool clear fails to clear pool
4081 need zfs_mg_noalloc_threshold
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/issues/3954
https://www.illumos.org/issues/4080
https://www.illumos.org/issues/4081
illumos/illumos-gate@22e30981d82a0b6dc89253596ededafae8655e00
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4046 dsl_dataset_t ds_dir->dd_lock is highly contended
Reviewed by: Eric Schrock <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
References:
https://www.illumos.org/issues/4046
illumos/illumos-gate@b62969f868a827f0823a084bc0af9c7d8b76c659
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
Porting notes:
1. This commit removed dsl_dataset_namelen in Illumos, but that
appears to have been removed from ZFSOnLinux in an earlier commit.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
4047 panic from dbuf_free_range() from dmu_free_object() while
doing zfs receive
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/4047
illumos/illumos-gate@713d6c208802cfbb806329ec0d154b641b80c355
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
Porting notes:
1. The exported symbol dmu_free_object() was renamed to
dmu_free_long_object() in Illumos.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3996 want a libzfs_core API to rollback to latest snapshot
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Andy Stormont <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/issues/3996
illumos/illumos-gate@a7027df17fad220a20367b9d1eb251bc6300d203
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3956 ::vdev -r should work with pipelines
3957 ztest should update the cachefile before killing itself
3958 multiple scans can lead to partial resilvering
3959 ddt entries are not always resilvered
3960 dsl_scan can skip over dedup-ed blocks if physical birth != logical birth
3961 freed gang blocks are not resilvered and can cause pool to suspend
3962 ztest should print out zfs debug buffer before exiting
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/issues/3956
https://www.illumos.org/issues/3957
https://www.illumos.org/issues/3958
https://www.illumos.org/issues/3959
https://www.illumos.org/issues/3960
https://www.illumos.org/issues/3961
https://www.illumos.org/issues/3962
illumos/illumos-gate@b4952e17e8858d3225793b28788278de9fe6038d
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Porting notes:
1. zfs_dbgmsg_print() is only used in userland. Since we do not have
mdb on Linux, it does not make sense to make it available in the
kernel. This means that a build failure will occur if any future
kernel patch depends on it. However, that is unlikely given that
this functionality was added to support zdb.
2. zfs_dbgmsg_print() is only invoked for -VVV or greater log levels.
This preserves the existing behavior of minimal noise when running
with -V, and -VV.
3. In vdev_config_generate() the call to nvlist_alloc() was not
changed to fnvlist_alloc() because we must pass KM_PUSHPAGE in
the txg_sync context.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3949 ztest fault injection should avoid resilvering devices
3950 ztest: deadman fires when we're doing a scan
3951 ztest hang when running dedup test
3952 ztest: ztest_reguid test and ztest_fault_inject don't place nice together
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/issues/3949
https://www.illumos.org/issues/3950
https://www.illumos.org/issues/3951
https://www.illumos.org/issues/3952
illumos/illumos-gate@2c1e2b44148432fb7a509dd216a99299b6740250
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
Porting notes:
1. The deadman thread was removed from ztest during the original
port because it depended on Solaris thr_create() interface.
This functionality should be reintroduced using the more
portable pthreads.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3955 ztest failure: assertion refcount_count(&tx->tx_space_written) +
delta <= tx->tx_space_towrite
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/issues/3955
illumos/illumos-gate@be9000cc677e0a8d04e5be45c61d7370fc8c7b54
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3973 zfs_ioc_rename alters passed in zc->zc_name
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Christopher Siden <[email protected]>
References:
https://www.illumos.org/issues/3973
illumos/illumos-gate@a0c1127b147dc6a0372b141deb8c0c2b0195b8ea
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3834 incremental replication of 'holey' file systems is slow
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/issues/3834
illumos/illumos-gate@ca48f36f20f6098ceb19d5b084b6b3d4b8eca9fa
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3836 zio_free() can be processed immediately in the common case
Reviewed by: George Wilson <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/3836
illumos/illumos-gate@9cb154a3c9f170904dce9bad5bd5a7d256b922a4
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3112 ztest does not honor ZFS_DEBUG
3113 ztest should use watchpoints to protect frozen arc bufs
3114 some leaked nvlists in zfsdev_ioctl
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Matt Amdur <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Approved by: Eric Schrock <[email protected]>
References:
https://www.illumos.org/issues/3112
https://www.illumos.org/issues/3113
https://www.illumos.org/issues/3114
illumos/illumos-gate@cd1c8b85eb30b568e9816221430c479ace7a559d
The /proc/self/cmd watchpoint interface is specific to Solaris.
Therefore, the #3113 implementation was reworked to use the more
portable mprotect(2) system call. When the pages are watched they
are marked read-only for protection. Any write to the protected
address range immediately trigger a SIGSEGV. The pages are marked
writable again when they are unwatched.
Ported-by: Brian Behlendorf <[email protected]>
Issue #1489
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3236 zio nop-write
Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
References:
illumos/illumos-gate@80901aea8e78a2c20751f61f01bebd1d5b5c2ba5
https://www.illumos.org/issues/3236
Porting Notes
1. This patch is being merged dispite an increased instance of
https://www.illumos.org/issues/3113 being triggered by ztest.
Ported-by: Brian Behlendorf <[email protected]>
Issue #1489
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3875 panic in zfs_root() after failed rollback
Reviewed by: Jerry Jelinek <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Gordon Ross <[email protected]>
References:
https://www.illumos.org/issues/3875
illumos/illumos-gate@91948b51b8e978ddc88a36b2bc3ae83c20cdc9aa
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3888 zfs recv -F should destroy any snapshots created since
the incremental source
Reviewed by: George Wilson <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Peng Dai <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/issues/3888
illumos/illumos-gate@34f2f8cf94052481c81be2e134b94a00b501bf21
Porting notes:
1. Commit 1fde1e37208c2f56c72c70a06676676f04b65998 wrapped a
declaration in dsl_dataset_modified_since_lastsnap in ASSERTV().
The ASSERTV() and local variable have been removed to avoid an
unused variable warning.
Signed-off-by: Brian Behlendorf <[email protected]>
Ported-by: Richard Yao <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3894 zfs should not allow snapshot of inconsistent dataset
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Gordon Ross <[email protected]>
References:
https://www.illumos.org/issues/3894
illumos/illumos-gate@ca48f36f20f6098ceb19d5b084b6b3d4b8eca9fa
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3829 fix for 3740 changed behavior of zfs destroy/hold/release ioctl
Reviewed by: Matt Amdur <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/issues/3829
illumos/illumos-gate@bb6e70758d0c30c09f148026d6e686e21cfc8d18
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3740 Poor ZFS send / receive performance due to snapshot
hold / release processing
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Christopher Siden <[email protected]>
References:
https://www.illumos.org/issues/3740
illumos/illumos-gate@a7a845e4bf22fd1b2a284729ccd95c7370a0438c
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
Porting notes:
1. 13fe019870c8779bf2f5b3ff731b512cf89133ef introduced a merge conflict
in dsl_dataset_user_release_tmp where some variables were moved
outside of the preprocessor directive.
2. dea9dfefdd747534b3846845629d2200f0616dad made the previous merge
conflict worse by switching KM_SLEEP to KM_PUSHPAGE. This is notable
because this commit refactors the code, adding a new KM_SLEEP
allocation. It is not clear to me whether this should be converted
to KM_PUSHPAGE.
3. We had a merge conflict in libzfs_sendrecv.c because of copyright
notices.
4. Several small C99 compatibility fixed were made.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3744 zfs shouldn't ignore errors unmounting snapshots
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Christopher Siden <[email protected]>
References:
https://www.illumos.org/issues/3744
illumos/illumos-gate@fc7a6e3fefc649cb65c8e2a35d194781445008b0
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
Porting notes:
1. There is no clear way to distinguish between a failure when we
tried to unmount the snapdir of a zvol (which does not exist)
and the failure when we try to unmount a snapdir of a dataset,
so the changes to zfs_unmount_snap() were dropped in favor of
an altered Linux function that unconditionally returns 0.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3743 zfs needs a refcount audit
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Eric Schrock <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Christopher Siden <[email protected]>
References:
https://www.illumos.org/issues/3743
illumos/illumos-gate@b287be1ba86043996f49b1cc34c80cc620f9b841
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3742 zfs comments need cleaner, more consistent style
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Eric Schrock <[email protected]>
Approved by: Christopher Siden <[email protected]>
References:
https://www.illumos.org/issues/3742
illumos/illumos-gate@f7170741490edba9d1d9c697c177c887172bc741
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
Porting notes:
1. The change to zfs_vfsops.c was dropped because it involves
zfs_mount_label_policy, which does not exist in the Linux port.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3741 zfs needs better comments
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Eric Schrock <[email protected]>
Approved by: Christopher Siden <[email protected]>
References:
https://www.illumos.org/issues/3741
illumos/illumos-gate@3e30c24aeefdee1631958ecf17f18da671781956
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3699 zfs hold or release of a non-existent snapshot does not output error
3739 cannot set zfs quota or reservation on pool version < 22
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Eric Shrock <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/3699
https://www.illumos.org/issues/3739
illumos/illumos-gate@013023d4ed2f6d0cf75380ec686a4aac392b4e43
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3582 zfs_delay() should support a variable resolution
3584 DTrace sdt probes for ZFS txg states
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Dan McDonald <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
References:
https://www.illumos.org/issues/3582
illumos/illumos-gate@0689f76
Ported by: Ned Bass <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
| |
References:
illumos/illumos-gate@44bffe012cad6481c82ad67bacd6b40bd29def2b
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
References:
illumos/illumos-gate@d39ee142a97a7c58f60f7b52c62409f2ff64b234
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
Porting notes:
1. This commit was so old that only two lines applied to the modern
code base.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3642 dsl_scan_active() should not issue I/O to determine if async
destroying is active
3643 txg_delay should not hold the tc_lock
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Approved by: Gordon Ross <[email protected]>
References:
https://www.illumos.org/issues/3642
https://www.illumos.org/issues/3643
illumos/illumos-gate@4a92375985c37d61406d66cd2b10ee642eb1f5e7
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
Porting Notes:
1. The alignment assumptions for the tx_cpu structure assume that
a kmutex_t is 8 bytes. This isn't true under Linux but tc_pad[]
was adjusted anyway for consistency since this structure was
never carefully aligned in ZoL. If careful alignment does impact
performance significantly this should be reworked to be portable.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3645 dmu_send_impl: possibilty of pool hold leak
3692 Panic on zfs receive of a recursive deduplicated stream
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: Dan McDonald <[email protected]>
Approved by: Richard Lowe <[email protected]>
References:
https://www.illumos.org/issues/3645
https://www.illumos.org/issues/3692
illumos/illumos-gate@de8d9cff565e928d0ace86f3ea0e2b15094d61df
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1792
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3598 want to dtrace when errors are generated in zfs
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
References:
https://www.illumos.org/issues/3598
illumos/illumos-gate@be6fd75a69ae679453d9cda5bff3326111e6d1ca
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
Porting notes:
1. include/sys/zfs_context.h has been modified to render some new
macros inert until dtrace is available on Linux.
2. Linux-specific changes have been adapted to use SET_ERROR().
3. I'm NOT happy about this change. It does nothing but ugly
up the code under Linux. Unfortunately we need to take it to
avoid more merge conflicts in the future. -Brian
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3517 importing pool with autoreplace=on and "hole" vdevs crashes syseventd
Reviewed by: Albert Lee <[email protected]>
Reviewed by: Jeffry Molanus <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Christopher Siden <[email protected]>
References:
https://www.illumos.org/issues/3517
illumos/illumos-gate@efb4a871d8fd510a833bdca610528dde5ed69e42
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3603 panic from bpobj_enqueue_subobj()
3604 zdb should print bpobjs more verbosely
3871 GCC 4.5.3 does not like issue 3604 patch
Reviewed by: Henrik Mattson <[email protected]>
Reviewed by: Adam Leventhal <[email protected]>
Reviewed by: Christopher Siden <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Garrett D'Amore <[email protected]>
Approved by: Dan McDonald <[email protected]>
References:
https://www.illumos.org/issues/3603
https://www.illumos.org/issues/3604
https://www.illumos.org/issues/3871
illumos/illumos-gate@d04756377ddd1cf28ebcf652541094e17b03c889
Ported-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #1775
Note that the patch from Illumos issue 3871 is not accepted into Illumos
at the time of this writing. It is something that I wrote when porting
this. Documentation is in the Illumos issue.
|