|
|
|
|
|
|
|
|
|
| |
CID 176037: Uninitialized scalar variable
This patch fixes an uninitialized variable defect caught by
coverity and introduced in 69830602
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7667
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, there is a bug where older send streams without the
DMU_BACKUP_FEATURE_LARGE_DNODE flag are not handled correctly.
The code in receive_object() fails to handle cases where
drro->drr_dn_slots is set to 0, which is always the case when the
sending code does not support this feature flag. This patch fixes
the issue by ensuring that a value of 0 is treated as
DNODE_MIN_SLOTS.
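The fallback described above can be sketched as a small helper. This is only an illustration, not the code from receive_object(): the helper name effective_dn_slots() is hypothetical, and the value of DNODE_MIN_SLOTS used here is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: the smallest legal slot count is 1. */
#define	DNODE_MIN_SLOTS	1

/*
 * Hypothetical helper: pick the slot count for a received object record.
 * A drr_dn_slots of 0 means the sender predates the large dnode feature,
 * so fall back to the minimum instead of using 0 directly.
 */
static uint8_t
effective_dn_slots(uint8_t drr_dn_slots)
{
	return (drr_dn_slots != 0 ? drr_dn_slots : DNODE_MIN_SLOTS);
}
```

Streams from newer senders pass through unchanged; only the zero case is remapped.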
Tested-by: DHE <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7617
Closes #7662
|
|
|
|
|
|
| |
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes #7661
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes two problems with the encryption code. First, the
current code does not correctly prohibit the DMU from updating
dn_maxblkid during object truncation within a raw receive. This
usually only causes issues when the truncating DRR_FREE record is
aggregated with DRR_FREE records later in the receive, so it is
relatively hard to hit.
Second, this patch fixes a security issue where reading blocks
within an encrypted object did not guarantee that the dnode block
itself had ever been verified against its MAC. Usually the
verification happened anyway when the bonus buffer was read, but
some use cases (notably zvols) might never perform the check.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7632
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Details about the motivation of this feature and its usage can
be found in this blogpost:
https://sdimitro.github.io/post/zpool-checkpoint/
A lightning talk of this feature can be found here:
https://www.youtube.com/watch?v=fPQA8K40jAM
Implementation details can be found in the big block comment of
spa_checkpoint.c
Side-changes that are relevant to this commit but not explained
elsewhere:
* renames members of "struct metaslab" trees to be shorter without
losing meaning
* space_map_{alloc,truncate}() accept a block size as a
parameter. The reason is that in the current state all space
maps that we allocate through the DMU use a global tunable
(space_map_blksz) which defaults to 4KB. This is ok for metaslab
space maps in terms of bandwidth since they are scattered all
over the disk. But for other space maps this default is probably
not what we want. Examples are device removal's vdev_obsolete_sm
or vdev_checkpoint_sm from this review. Both of these have a
1:1 relationship with each vdev and could benefit from a bigger
block size.
Porting notes:
* The part of dsl_scan_sync() which handles async destroys has
been moved into the new dsl_process_async_destroys() function.
* Remove "VERIFY(!(flags & FWRITE))" in "kernel.c" so zhack can write
to block device backed pools.
* ZTS:
* Fix get_txg() in zpool_sync_001_pos due to "checkpoint_txg".
* Don't use large dd block sizes on /dev/urandom under Linux in
checkpoint_capacity.
* Adopt Delphix-OS's setting of 4 (spa_asize_inflation =
SPA_DVAS_PER_BP + 1) for the checkpoint_capacity test to speed
its attempts to fill the pool
* Create the base and nested pools with sync=disabled to speed up
the "setup" phase.
* Clear labels in test pool between checkpoint tests to avoid
duplicate pool issues.
* The import_rewind_device_replaced test has been marked as "known
to fail" for the reasons listed in its DISCLAIMER.
* New module parameters:
zfs_spa_discard_memory_limit,
zfs_remove_max_bytes_pause (not documented - debugging only)
vdev_max_ms_count (formerly metaslabs_per_vdev)
vdev_min_ms_count
Authored by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: John Kennedy <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Brian Behlendorf <[email protected]>
Approved by: Richard Lowe <[email protected]>
Ported-by: Tim Chase <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
OpenZFS-issue: https://illumos.org/issues/9166
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7159fdb8
Closes #7570
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ms_shift can be incorrectly changed in MOS config for
indirect vdevs that have been historically expanded
According to spa_config_update() we expect new vdevs to have
vdev_ms_array equal to 0 and then we go ahead and set their metaslab
size. The problem is that indirect vdevs also have vdev_ms_array == 0
because their metaslabs are destroyed once their removal is done.
As a result, a vdev that was expanded and then removed may have its
ms_shift changed if another vdev was added after its removal.
Fortunately this behavior does not cause any type of crash or bad
behavior in the kernel but it can confuse zdb and anyone doing any kind
of analysis of the history of the pools.
Authored by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: John Kennedy <[email protected]>
Reviewed by: Prashanth Sreenivasa <[email protected]>
Reviewed by: Brian Behlendorf <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Ported-by: Tim Chase <[email protected]>
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/651
OpenZFS-issue: https://illumos.org/issues/9591a
External-issue: DLPX-58879
Closes #7644
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For zio taskq's which have multiple instances (e.g. z_rd_int_0,
z_rd_int_1, etc), each one has a unique name (the _0, _1, _2 suffix).
This makes performance analysis more difficult, because by default,
`perf` includes the thread name (which is the same as the taskq name) in
the stack trace. This means that we get 8 different stacks, all of
which are doing the same thing, but are executed from different taskq's.
We should remove the suffix of the taskq name, so that all the
read-interrupt threads are named z_rd_int.
Note that we already support multiple taskq's with the same name. This
happens when there are multiple pools. In this case the taskq has a
different tq_instance, which shows up in /proc/spl/taskq-all.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
Signed-off-by: Matthew Ahrens <[email protected]>
Closes #7646
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The blk_queue_stackable() function was replaced in the 4.14 kernel
by queue_is_rq_based(), commit torvalds/linux@5fdee212. This change
resulted in the default elevator being used which can negatively
impact performance.
Rather than adding additional compatibility code to detect the
new interface, unconditionally attempt to set the elevator. Since
we expect this to fail for block devices without an elevator, the
error message has been moved in to zfs_dbgmsg().
Finally, it was observed that elevator_change() was removed
from the 4.12 kernel, commit torvalds/linux@c033269. Update the
comment to clearly specify which kernels are expected to export the
elevator_change() symbol.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7645
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit torvalds/linux@95582b0 changes the inode i_atime, i_mtime,
and i_ctime members from timespec's to timespec64's to make them
2038 safe. As part of this change the current_time() function was
also updated to return the timespec64 type.
Resolve this issue by introducing a new inode_timespec_t type which
is defined to match the timespec type used by the inode. It should
be used when working with inode timestamps to ensure matching types.
The timestruc_t type under Illumos was used in a similar fashion but
was specified to always be a timespec_t. Rather than incorrectly
define this type, all timespec_t types have been replaced by the new
inode_timespec_t type.
Finally, the kernel and user space 'sys/time.h' headers were aligned
with each other. As appropriate for the context, they define several
constants as macros and include static inline implementations of
gethrestime(), gethrestime_sec(), and gethrtime().
Reviewed-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7643
|
|
|
|
|
|
|
|
|
|
|
| |
This patch simply adds an ASSERT that confirms that the last
decrypting reference on a dataset waits until the dataset is
no longer dirty. This should help to debug issues where the
ZIO layer cannot find encryption keys after a dataset has been
disowned.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7637
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds tunables for modifying the maximum memory limit and
maximum instruction limit that can be specified when running a channel
program.
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Sara Hartse <[email protected]>
Signed-off-by: John Gallagher <[email protected]>
External-issue: LX-1085
Closes #7618
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added support for the bops->check_events() interface which was
added in the 2.6.38 kernel to replace bops->media_changed().
Fully implementing this functionality allows the volume resize
code to rely on revalidate_disk(), which is the preferred
mechanism, and removes the need to use check_disk_size_change().
In order for bops->check_events() to lookup the zvol_state_t
stored in the disk->private_data the zvol_state_lock needs to
be held. Since the check events interface may poll, the mutex
has been converted to a rwlock for better concurrency. The
rwlock need only be taken as a writer in the zvol_free() path
when disk->private_data is set to NULL.
The configure checks for the block_device_operations structure
were consolidated in a single kernel-block-device-operations.m4
file.
The ZFS_AC_KERNEL_BDEV_BLOCK_DEVICE_OPERATIONS configure checks
and associated dead code were removed. This interface was added
to the 2.6.28 kernel which predates the oldest supported 2.6.32
kernel and will therefore always be available.
Updated maximum Linux version in META file. The 4.17 kernel
was released on 2018-06-03 and ZoL is compatible with the
finalized kernel.
Reviewed-by: Boris Protopopov <[email protected]>
Reviewed-by: Sara Hartse <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7611
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The zfs_dbuf_evict_key TSD (thread-specific data) is not necessary -
we can instead pass a flag down in a few places to prevent recursive
dbuf eviction. Making this change has 3 benefits:
1. The code semantics are easier to understand.
2. On Linux, performance is improved, because creating/removing
TSD values (by setting to NULL vs non-NULL) is expensive, and
we do it very often.
3. According to Nexenta, the current semantics can cause a
deadlock when concurrently calling dmu_objset_evict_dbufs()
(which is rare today, but they are working on a "parallel
unmount" change that triggers this more easily):
Porting Notes:
* Minor conflict with OpenZFS 9337 which has not yet been ported.
Authored by: Matthew Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://illumos.org/issues/9577
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/645
External-issue: DLPX-58547
Closes #7602
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the case where the pool is loaded without the crypto
keys necessary to playback the intent log, and log device
removal is attempted, a generic busy message is received.
Change the message to inform the user that the datasets
must be mounted.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tom Caputi <[email protected]>
Signed-off-by: Paul Zuchowski <[email protected]>
Closes #7518
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the new aggsum counters the CPU_SEQID macro should be surrounded by
kpreempt_disable() and kpreempt_enable() calls to prevent a Linux
kernel BUG warning. The aggsum_add() function uses the cpuid to
minimize lock contention when selecting a bucket; after selection
the bucket is protected by a mutex and it is safe to reschedule the
process to a different processor at any time.
Reviewed-by: Matthew Thode <[email protected]>
Reviewed-by: Paul Dagnelie <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #7609
Closes #7610
|
|
|
|
|
|
|
|
|
|
|
| |
If sa_build_index() encounters a corrupt buffer, don't panic.
Add info to zfs ring buffer and return EIO. This allows for a cleaner
error recovery path.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Signed-off-by: Nathaniel Clark <[email protected]>
Issue #6500
Closes #7487
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes an issue where l2arc_read_done() would always
write data to b_pabd, even if raw encrypted data was requested.
This only occurred in cases where the L2ARC device had a different
ashift than the main pool.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7586
Closes #7593
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes a small bug found where receive_spill() sometimes
attempted to decrypt spill blocks when doing a raw receive. In
addition, this patch fixes another small issue in arc_buf_fill()'s
error handling where a decryption failure (which could be caused by
the first bug) would attempt to set the arc header's IO_ERROR flag
without holding the header's lock.
Reviewed-by: Matthew Thode <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Signed-off-by: Tom Caputi <[email protected]>
Closes #7564
Closes #7584
Closes #7592
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In pursuit of improving performance on multi-core systems, we should
implement fanned out counters and use them to improve the performance of
some of the arc statistics. These stats are updated extremely frequently,
and can consume a significant amount of CPU time.
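The general idea of a fanned out counter can be sketched as follows. This is a simplified illustration and not the aggsum implementation from this change: the type and function names, the bucket count, and the use of C11 atomics with a caller-supplied bucket hint are all assumptions made for the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define	FANOUT_BUCKETS	8	/* hypothetical bucket count */

/*
 * One logical counter split across several buckets. Each bucket is
 * aligned to 64 bytes so that writers on different CPUs rarely touch
 * the same cache line.
 */
typedef struct fanned_counter {
	struct {
		_Alignas(64) atomic_int_fast64_t value;
	} bucket[FANOUT_BUCKETS];
} fanned_counter_t;

static void
fanned_counter_init(fanned_counter_t *fc)
{
	for (int i = 0; i < FANOUT_BUCKETS; i++)
		atomic_init(&fc->bucket[i].value, 0);
}

/* Add delta to the bucket chosen by hint (for example, a CPU id). */
static void
fanned_counter_add(fanned_counter_t *fc, unsigned hint, int64_t delta)
{
	atomic_fetch_add_explicit(&fc->bucket[hint % FANOUT_BUCKETS].value,
	    delta, memory_order_relaxed);
}

/* Sum all buckets; the result is approximate while writers are active. */
static int64_t
fanned_counter_value(fanned_counter_t *fc)
{
	int64_t sum = 0;

	for (int i = 0; i < FANOUT_BUCKETS; i++)
		sum += atomic_load_explicit(&fc->bucket[i].value,
		    memory_order_relaxed);
	return (sum);
}
```

Updates stay cheap because they touch a single bucket, while reads pay the cost of visiting every bucket. That trade-off suits statistics that are written far more often than they are read.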
Authored by: Paul Dagnelie <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
Reviewed by: Matthew Ahrens <[email protected]>
Approved by: Dan McDonald <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Ported-by: Paul Dagnelie <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/8484
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7028a8b92b7
Issue #3752
Closes #7462
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. Add a proc entry to display the pool's state:
$ cat /proc/spl/kstat/zfs/tank/state
ONLINE
This is done without using the spa config locks, so it will
never hang.
2. Fix 'zpool status' and 'zpool list -o health' output to print
"SUSPENDED" instead of "ONLINE" for suspended pools.
Reviewed-by: Olaf Faaland <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #7331
Closes #7563
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
txg_kick() fails to see that we are quiescing, forcing transactions to
their next stages without letting them accumulate changes
Creating a fragmented pool in a DCenter VM and continuously writing to it with
multiple instances of randwritecomp, we get the following output from txg.d:
0ms 311MB in 4114ms (95% p1) 75MB/s 544MB (76%) 336us 153ms 0ms
0ms 8MB in 51ms ( 0% p1) 163MB/s 474MB (66%) 129us 34ms 0ms
0ms 366MB in 4454ms (93% p1) 82MB/s 572MB (79%) 498us 20ms 0ms
0ms 406MB in 5212ms (95% p1) 77MB/s 591MB (82%) 661us 37ms 0ms
0ms 340MB in 5110ms (94% p1) 66MB/s 622MB (86%) 1048us 41ms 1ms
0ms 3MB in 61ms ( 0% p1) 51MB/s 419MB (58%) 33us 0ms 0ms
0ms 361MB in 3555ms (88% p1) 101MB/s 542MB (75%) 335us 40ms 0ms
0ms 356MB in 4592ms (92% p1) 77MB/s 561MB (78%) 430us 89ms 1ms
0ms 11MB in 129ms (13% p1) 90MB/s 507MB (70%) 222us 15ms 0ms
0ms 281MB in 2520ms (89% p1) 111MB/s 542MB (75%) 334us 42ms 0ms
0ms 383MB in 3666ms (91% p1) 104MB/s 557MB (77%) 411us 133ms 0ms
0ms 404MB in 5757ms (94% p1) 70MB/s 635MB (88%) 1274us 123ms 2ms
4ms 367MB in 4172ms (89% p1) 88MB/s 556MB (77%) 401us 51ms 0ms
0ms 42MB in 470ms (44% p1) 90MB/s 557MB (77%) 412us 43ms 0ms
0ms 261MB in 2273ms (88% p1) 114MB/s 556MB (77%) 407us 27ms 0ms
0ms 394MB in 3646ms (85% p1) 108MB/s 552MB (77%) 393us 304ms 0ms
0ms 275MB in 2416ms (89% p1) 113MB/s 510MB (71%) 200us 53ms 0ms
0ms 9MB in 53ms ( 0% p1) 169MB/s 483MB (67%) 140us 100ms 1ms
The TXGs that are getting synced and don't have lots of changes are pushed by
txg_kick() which basically forces the current open txg to get to the quiesced
state:
    if (tx->tx_syncing_txg == 0 &&
        tx->tx_quiesce_txg_waiting <= tx->tx_open_txg &&
        tx->tx_sync_txg_waiting <= tx->tx_synced_txg &&
        tx->tx_quiesced_txg <= tx->tx_synced_txg) {
            tx->tx_quiesce_txg_waiting = tx->tx_open_txg + 1;
            cv_broadcast(&tx->tx_quiesce_more_cv);
    }
The problem is that the above code doesn't check if we are currently quiescing
anything (only if a quiesce or a sync has been requested, etc.) so the
following scenario can happen:
1] We have an open txg A that had enough dirty data (more than
zfs_dirty_data_sync) and it was pushed to the quiesced state, and opened
a new txg B. No txg is currently being synced.
2] Immediately after the opening of B, txg_kick() was run by some other write
(and because of A's dirty data) and saw that we are not currently syncing
any txg and no one has requested quiescing so it requests one by bumping
tx_quiesce_txg_waiting and broadcasts the quiesce thread.
3] The quiesce thread just passed txg A to be synced and sees that a quiescing
request has been sent to it so it immediately grabs B without letting it
gather enough data, putting it in a quiesced state and opening a new txg C.
In this scenario, txg B is an example of how the entries of interest show up in
the txg.d output.
Ideally we would like txg_kick() to get triggered only when we are sure that
we are not syncing AND not quiescing any txg. This way we can kick an open TXG
to the quiescing state when we are sure that there is nothing going on and we
would benefit from the different states running concurrently.
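The stricter condition can be sketched as a standalone predicate. This is an illustration under assumptions rather than the committed change: the struct is a reduced stand-in for the real txg state, and the tx_quiescing_txg field (nonzero while a txg is being quiesced) is a hypothetical name introduced for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Reduced stand-in for the txg state; names mirror the quoted condition. */
typedef struct tx_sketch {
	uint64_t tx_open_txg;
	uint64_t tx_quiescing_txg;	/* hypothetical: nonzero while quiescing */
	uint64_t tx_quiesced_txg;
	uint64_t tx_syncing_txg;
	uint64_t tx_synced_txg;
	uint64_t tx_quiesce_txg_waiting;
	uint64_t tx_sync_txg_waiting;
} tx_sketch_t;

/*
 * Kick the open txg only when nothing is syncing AND nothing is
 * quiescing, and no quiesce or sync request is already pending.
 */
static bool
should_kick(const tx_sketch_t *tx)
{
	return (tx->tx_syncing_txg == 0 &&
	    tx->tx_quiescing_txg == 0 &&
	    tx->tx_quiesce_txg_waiting <= tx->tx_open_txg &&
	    tx->tx_sync_txg_waiting <= tx->tx_synced_txg &&
	    tx->tx_quiesced_txg <= tx->tx_synced_txg);
}
```

With this extra term, step 2 of the scenario no longer requests a quiesce while txg A is still being quiesced, so txg B gets time to accumulate data.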
Authored by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed by: Andriy Gapon <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://illumos.org/issues/9464
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/1cd7635b
Closes #7587
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to be able to pass various settings during import/open of a
pool, which are not only related to rewind. Instead of adding a new
policy and duplicating a bunch of code, we should just rename
rewind_policy to a more generic term like load_policy.
For instance, we'd like to set spa->spa_import_flags from the nvlist,
rather than from a flags parameter passed to spa_import as in some cases we
want those flags not only for the import case, but also for the open
case. One such flag could be ZFS_IMPORT_MISSING_LOG (as used in zdb)
which would allow zfs to open a pool when logs are missing.
Authored by: Pavel Zakharov <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://illumos.org/issues/9235
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/d2b1e44
Closes #7532
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the null pointer issue shown below, the solution is to initialize the
contents of the object before changing its type, so that concurrent accessors
will see it as non-zapified until it is ready for access via the ZAP.
BAD TRAP: type=e (#pf Page fault) rp=ffffff00ff520440 addr=20 occurred
in module "zfs" due to a NULL pointer dereference
ffffff00ff520320 unix:die+df ()
ffffff00ff520430 unix:trap+dc0 ()
ffffff00ff520440 unix:cmntrap+e6 ()
ffffff00ff520590 zfs:zap_leaf_lookup+46 ()
ffffff00ff520640 zfs:fzap_lookup+a9 ()
ffffff00ff5206e0 zfs:zap_lookup_norm+111 ()
ffffff00ff520730 zfs:zap_contains+42 ()
ffffff00ff520760 zfs:dsl_dataset_has_resume_receive_state+47 ()
ffffff00ff520900 zfs:get_receive_resume_stats+3e ()
ffffff00ff520a90 zfs:dsl_dataset_stats+262 ()
ffffff00ff520ac0 zfs:dmu_objset_stats+2b ()
ffffff00ff520b10 zfs:zfs_ioc_objset_stats_impl+64 ()
ffffff00ff520b60 zfs:zfs_ioc_objset_stats+33 ()
ffffff00ff520bd0 zfs:zfs_ioc_dataset_list_next+140 ()
ffffff00ff520c80 zfs:zfsdev_ioctl+4d7 ()
ffffff00ff520cc0 genunix:cdev_ioctl+39 ()
ffffff00ff520d10 specfs:spec_ioctl+60 ()
ffffff00ff520da0 genunix:fop_ioctl+55 ()
ffffff00ff520ec0 genunix:ioctl+9b ()
ffffff00ff520f10 unix:brand_sys_sysenter+1c9 ()
Porting Notes:
* DMU_OT_BYTESWAP conditional in zap_lockdir_impl() kept.
Authored by: Matthew Ahrens <[email protected]>
Reviewed by: Pavel Zakharov <[email protected]>
Reviewed by: Brad Lewis <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://illumos.org/issues/9329
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/e8e0f97
Closes #7578
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ZAP code was written before we allowed c99 in the Solaris kernel. We
should change it to take advantage of being able to declare variables where
they are first used. This reduces variable scope and means less scrolling
to find the type of variables.
Authored by: Matthew Ahrens <[email protected]>
Reviewed by: Steve Gonczi <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Approved by: Dan McDonald <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://illumos.org/issues/9328
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/76ead05
Closes #7578
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update bdev_capacity to have wholedisk vdevs query the
size of the underlying block device (correcting for the size
of the EFI partition and partition alignment) and therefore detect
expanded space.
Correct vdev_get_stats_ex so that the expandsize is aligned
to metaslab size and new space is only reported if it is large
enough for a new metaslab.
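The alignment rule can be sketched as rounding the raw expandable space down to a whole number of metaslabs. This is an illustrative helper, not the code in vdev_get_stats_ex(): the function name and parameters are hypothetical, and ms_shift is assumed to be log2 of the metaslab size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round raw_space down to a multiple of the
 * metaslab size (1 << ms_shift). Anything smaller than one metaslab
 * rounds down to 0, so no new space is reported for it.
 */
static uint64_t
aligned_expandsize(uint64_t raw_space, unsigned ms_shift)
{
	assert(ms_shift < 64);
	return (raw_space & ~((UINT64_C(1) << ms_shift) - 1));
}
```

For example, with 1 MiB metaslabs (ms_shift of 20), 3 MiB plus 5 bytes of raw space is reported as exactly 3 MiB, and anything below 1 MiB is reported as 0.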
Reviewed by: Don Brady <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed by: George Wilson <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed by: John Wren Kennedy <[email protected]>
Signed-off-by: sara hartse <[email protected]>
External-issue: LX-165
Closes #7546
Issue #7582
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes an assert in vdev_queue_change_io_priority():
VERIFY3(zio->io_priority < ZIO_PRIORITY_NUM_QUEUEABLE) failed (7 < 6)
PANIC at vdev_queue.c:832:vdev_queue_change_io_priority()
Reviewed-by: Tom Caputi <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #7566
Closes #7542
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Minimal changes required to integrate the SPL sources in to the
ZFS repository build infrastructure and packaging.
Build system and packaging:
* Renamed SPL_* autoconf m4 macros to ZFS_*.
* Removed redundant SPL_* autoconf m4 macros.
* Updated the RPM spec files to remove SPL package dependency.
* The zfs package obsoletes the spl package, and the zfs-kmod
package obsoletes the spl-kmod package.
* The zfs-kmod-devel* packages were updated to add compatibility
symlinks under /usr/src/spl-x.y.z until all dependent packages
can be updated. They will be removed in a future release.
* Updated copy-builtin script for in-kernel builds.
* Updated DKMS package to include the spl.ko.
* Updated stale AUTHORS file to include all contributors.
* Updated stale COPYRIGHT and included the SPL as an exception.
* Renamed README.markdown to README.md
* Renamed OPENSOLARIS.LICENSE to LICENSE.
* Renamed DISCLAIMER to NOTICE.
Required code changes:
* Removed redundant HAVE_SPL macro.
* Removed _BOOT from nvpairs since it doesn't apply for Linux.
* Initial header cleanup (removal of empty headers, refactoring).
* Remove SPL repository clone/build from zimport.sh.
* Use of DEFINE_RATELIMIT_STATE and DEFINE_SPINLOCK removed due
to build issues when forcing C99 compilation.
* Replaced legacy ACCESS_ONCE with READ_ONCE.
* Include needed headers for `current` and `EXPORT_SYMBOL`.
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Olaf Faaland <[email protected]>
Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Pavel Zakharov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
TEST_ZIMPORT_SKIP="yes"
Closes #7556
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Merge a minimal version of the zfsonlinux/spl repository in to the
zfsonlinux/zfs repository. Care was taken to prevent file conflicts
when merging and to preserve the spl repository history. The spl
kernel module remains under the GPLv2 license as documented by the
additional THIRDPARTYLICENSE.gplv2 file.
Signed-off-by: Brian Behlendorf <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This commit removes everything from the repository except the core
SPL implementation for Linux. Those files which remain have been
moved to non-conflicting locations to facilitate the merge.
The README.md and associated files have been updated accordingly.
Signed-off-by: Brian Behlendorf <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This patch contains no functional changes. It is solely intended
to resolve cstyle warnings in order to facilitate moving the spl
source code in to the zfs repository.
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed by: George Melikov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #687
|
| |
| |
| |
| |
| |
| |
| |
| | |
vn_init() and vn_fini() had been renamed by 12ff95ff in 2011.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tomohiro Kusumi <[email protected]>
Closes #686
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is only used via ->ks_update of `kstat_t *`.
This isn't exported nor do headers have its prototype.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tomohiro Kusumi <[email protected]>
Closes #686
|
| |
| |
| |
| |
| |
| |
| |
| | |
This patch contains no functional changes. It is solely intended
to resolve cstyle warnings in order to facilitate moving the spl
source code in to the zfs repository.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #681
|
| |
| |
| |
| |
| |
| |
| |
| | |
Add missing helper function cv_timedwait_io(); it should be used
when waiting on IO with a specified timeout.
Reviewed-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #674
|
| |
| |
| |
| |
| |
| |
| |
| | |
Use timer_setup() macro and new timeout function definition.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #670
Closes #671
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The kernel_read & kernel_write functions have always wrapped the
vfs_read & vfs_write functions respectively. However, they could
not be used by vn_rdwr() since the offset wasn't passed as a
pointer. This prevented us from being able to properly update
the file offset.
Linux 4.14 unexported vfs_read & vfs_write but also changed the
signature of kernel_read & kernel_write to provide the needed
functionality. Use these updated functions when available.
Reviewed-by: Pritam Baral <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #656
Closes #667
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On systems with CONFIG_SMP turned off, spin_is_locked always returns
false causing these assertions to fail. Remove them as suggested in
zfsonlinux/zfs#6558.
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: James Cowgill <[email protected]>
Closes #665
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Both vn_rename and vn_remove have been historically problematic
to implement reliably. Rather than fixing them yet again they
are being removed.
Reviewed-by: Arkadiusz Bubala <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #648
Closes #661
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
No semantic changes.
Change
/************\
and
\************/
to
/*
and
*/
Signed-off-by: Olaf Faaland <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Reviewed-by: Tim Chase <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Elling <[email protected]>
Closes #652
Closes #651
|
| |
| |
| |
| |
| |
| | |
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: gaurkuma <[email protected]>
Closes #641
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This was probably accidentally committed in
aeb9baa618beea1458ab3ab22cbc0f39213da6cf
Fix: handle NULL case in spl_kmem_free_track()
Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Fabian Grünbichler <[email protected]>
Closes #644
|
| |
| |
| |
| |
| |
| |
| |
| | |
taskq work item to more than one queue concurrently. Also, please
see discussion in zfsonlinux/zfs#3840.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Boris Protopopov <[email protected]>
Closes #609
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
taskq_seq_show_impl walks the tq_active_list to show the tqent_func and
tqent_arg. However for taskq_dispatch_ent, it's very likely that the
task entry will be freed during the function call, and causes a
use-after-free bug.
To fix this, we duplicate the task entry to an on-stack struct, and
assign it instead to tqt_task. This way, the tq_lock alone will
guarantee its safety.
Reviewed-by: Tim Chase <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes #638
Closes #640
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
gcc-7 seems to use __udivmoddi4 for 64-bit division on 32-bit arch. This
patch implements it so we don't get an undefined reference error.
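For readers unfamiliar with such helpers, a combined 64-bit divide and modulo can be sketched with shift-and-subtract long division. This is not the SPL implementation: the function name sketch_udivmod64() is hypothetical, chosen so that it does not clash with the compiler's own __udivmoddi4 symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: return n / d and, if rem is non-NULL, store
 * n % d in *rem. Processes the dividend one bit at a time, from the
 * most significant bit down.
 */
static uint64_t
sketch_udivmod64(uint64_t n, uint64_t d, uint64_t *rem)
{
	uint64_t q = 0, r = 0;

	assert(d != 0);
	for (int bit = 63; bit >= 0; bit--) {
		/* Remember the bit that the shift is about to push out. */
		uint64_t carry = r >> 63;

		r = (r << 1) | ((n >> bit) & 1);
		if (carry || r >= d) {
			/* Wraps correctly when the carry bit was set. */
			r -= d;
			q |= UINT64_C(1) << bit;
		}
	}
	if (rem != NULL)
		*rem = r;
	return (q);
}
```

The loop uses only shifts, comparisons, and subtraction, so it does not itself depend on a 64-bit division routine being available.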
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: loli10K <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes zfsonlinux/zfs#6417
Closes #636
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In unattended operations it's often more useful to have a node
panic and reboot when it encounters problems as opposed to
sitting there indefinitely waiting for somebody to discover it.
This implements an spl_panic_crash module parameter; set it
to nonzero to cause spl_panic() to call panic().
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Signed-off-by: Oleg Drokin <[email protected]>
Closes #634
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When we load a ZFS pool whose spa_name equals an existing kstat,
we would have to create a duplicate entry, which procfs doesn't like.
For instance a ZFS pool named "zil" would have its kstat "txgs"
(module "zfs/zil") installed under "/proc/spl/kstat/zfs/zil":
unfortunately we already have a kstat named "zil" (module "zfs")
installed in the same procfs location.
Avoid this issue by skipping the duplicate entry creation in procfs.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #628
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit torvalds/linux@ac6424b9
- Renamed struct wait_queue -> struct wait_queue_entry.
Commit torvalds/linux@2055da97
- Renamed wait_queue_head::task_list -> wait_queue_head::head
- Renamed wait_queue_entry::task_list -> wait_queue_entry::entry
Reviewed-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #629
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Historically the SPL cached the system hostid the first time it
was accessed. This was done to speed up subsequent accesses.
But in practice the system hostid is rarely accessed and it's
inconvenient that it doesn't promptly detect /etc/hostid
configuration changes. Therefore, zone_get_hostid() has been
updated to always refresh the system hostid reported.
Reviewed-by: Olaf Faaland <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #626
|
| |
| |
| |
| |
| |
| |
| | |
Exclude Makefile.in in module/ and fix the gitignore in cmd/
Also, ignore *.patch and *.orig files
Signed-off-by: Chunwei Chen <[email protected]>
|