| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using a benchmark which has 32 threads creating 2 million files in the
same directory, on a machine with 16 CPU cores, I observed poor
performance. I noticed that dmu_tx_hold_zap() was using about 30% of
all CPU, and doing dnode_hold() 7 times on the same object (the ZAP
object that is being held).
dmu_tx_hold_zap() keeps a hold on the dnode_t the entire time it is
running, in dmu_tx_hold_t:txh_dnode, so it would be nice to use the
dnode_t that we already have in hand, rather than repeatedly calling
dnode_hold(). To do this, we need to pass the dnode_t down through
all the intermediate calls that dmu_tx_hold_zap() makes, making these
routines take the dnode_t* rather than an objset_t* and a uint64_t
object number. In particular, the following routines will need to have
analogous *_by_dnode() variants created:
dmu_buf_hold_noread()
dmu_buf_hold()
zap_lookup()
zap_lookup_norm()
zap_count_write()
zap_lockdir()
zap_count_write()
This can improve performance on the benchmark described above by 100%,
from 30,000 file creations per second to 60,000. (This improvement is on
top of that provided by working around the object allocation issue. Peak
performance of ~90,000 creations per second was observed with 8 CPUs;
adding CPUs past that decreased performance due to lock contention.) The
CPU used by dmu_tx_hold_zap() was reduced by 88%, from 340 CPU-seconds
to 40 CPU-seconds.
Sponsored by: Intel Corp.
Signed-off-by: Matthew Ahrens <[email protected]>
Signed-off-by: Ned Bass <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/7004
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/109
Closes #4641
Closes #4972
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zap_lockdir() / zap_unlockdir() should take a "void *tag" argument which
tags the hold on the zap. This will help diagnose programming errors
which misuse the hold on the ZAP.
Sponsored by: Intel Corp.
Signed-off-by: Matthew Ahrens <[email protected]>
Signed-off-by: Pavel Zakharov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/7003
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/108
Closes #4972
|
|
|
|
|
|
|
|
|
|
| |
When spa retry load succeeds and spa recovery is requested it may
leak in spa_load_best function. Always free the generated config
when it is not assigned to the spa.
Signed-off-by: cao.xuewen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4940
|
|
|
|
|
|
|
|
|
| |
Just cleanup the new fs created during the test, so the "$found"
should be "true".
Signed-off-by: ChaoyuZhang <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4978
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is another bug in the long line of hole-birth related issues. In
this particular case, it was discovered that a previous hole-birth fix
(illumos bug 6513, commit bc77ba73) did not cover as many cases as we
thought it did. While the issue worked in the case of hole-punching
(writing zeroes to a large part of a file), it did not deal with
truncation, and then writing beyond the new end of the file.
The problem is that dbuf_findbp will return ENOENT if the block it's
trying to find is beyond the end of the file. If that happens, we assume
there is no birth time, and so we lose that information when we write
out new blkptrs. We should teach dbuf_findbp to look for things that are
beyond the current end, but not beyond the absolute end of the file.
Authored by: Paul Dagnelie <[email protected]>
Reviewed by: Matthew Ahrens [email protected]
Reviewed by: George Wilson [email protected]
Ported-by: kernelOfTruth <[email protected]>
Signed-off-by: Boris Protopopov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/7176
OpenZFS-commit: https://github.com/openzfs/openzfs/pull/173/commits/8b9f3ad
Upstream-bugs: DLPX-46009
Porting notes:
- Fix ISO C90 mixed declaration error in dbuf.c ( int nlevels, epbs; ) ;
keep previous position of the initialization
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the man page of dirname: " Both dirname() and basename()
may modify the contents of path, so it may be desirable to pass
a copy when calling one of these functions." And in fact on linux
using dirname actually changes the contents of the passed parameter as
evident from the following failure when running the ctime test:
link(/root/zfs-mount, /root/zfs-mount/link_file)
Fix this by creating a copy of the input parameter and passing that
to dirname, thus not compromising the original parameter, allowing
the creation of hard link to succeed.
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4977
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Under a workload which makes heavy use of `dbuf_hold()`, I noticed that a
considerable amount of time was spent in `dbuf_hold_impl()`, due to its call to
`kmem_zalloc(sizeof (struct dbuf_hold_impl_data) * DBUF_HOLD_IMPL_MAX_DEPTH)`,
which is around 2KiB. This structure is used as a stack, to limit the size of
the C stack as dbuf_hold() calls itself recursively. We make a recursive call
to hold the parent's dbuf when the requested dbuf is not found. The vast
majority of the time, the parent or grandparent indirect dbuf is cached, so the
number of recursive calls is very low. However, we initialize this entire
array for every call to dbuf_hold().
To improve performance, this commit changes `dbuf_hold()` to use `kmem_alloc()`
instead of `kmem_zalloc()`. __dbuf_hold_impl_init is changed to initialize all
members of the struct before they are used. I observed ~5% performance
improvement on a workload which creates many files.
Signed-off-by: Matthew Ahrens <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4974
|
|
|
|
|
|
|
|
|
|
|
|
| |
Erroneous access detected by gcc UndefinedBehaviorSanitizer:
`zdb.c:2424:7: runtime error: index 112 out of bounds for type 'uint64_t [112]'`
Fix: increase histogram size by 1 to accommodate all possible sizes.
Signed-off-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4934
Issue #4883
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Benchmark memory block is increased to 128kiB to reflect real block sizes more
accurately. Measurements include all three stages needed for checksum generation,
i.e. `init()/compute()/fini()`. The inner loop is repeated multiple times to offset
overhead of time function.
- Fastest implementation selects native and byteswap methods independently in
benchmark. To support this new function pointers `init_byteswap()/fini_byteswap()`
are introduced.
- Implementation mutex lock is replaced by atomic variable.
- To save time, benchmark is not executed in userspace. Instead, highest supported
implementation is used for fastest. Default userspace selector is still 'cycle'.
- `fletcher_4_native/byteswap()` methods use incremental methods to finish
calculation if data size is not multiple of vector stride (currently 64B).
- Added `fletcher_4_native_varsize()` special purpose method for use when buffer size
is not known in advance. The method does not enforce 4B alignment on buffer size, and
will ignore last (size % 4) bytes of the data buffer.
- Benchmark `kstat` is changed to match the one of vdev_raidz. It now shows
throughput for all supported implementations (in B/s), native and byteswap,
as well as the code [fastest] is running.
Example of `fletcher_4_bench` running on `Intel(R) Xeon(R) CPU E5-2660 v3 @ 2.60GHz`:
implementation native byteswap
scalar 4768120823 3426105750
sse2 7947841777 4318964249
ssse3 7951922722 6112191941
avx2 13269714358 11043200912
fastest avx2 avx2
Example of `fletcher_4_bench` running on `Intel(R) Xeon Phi(TM) CPU 7210 @ 1.30GHz`:
implementation native byteswap
scalar 1291115967 1031555336
sse2 2539571138 1280970926
ssse3 2537778746 1080016762
avx2 4950749767 1078493449
avx512f 9581379998 4010029046
fastest avx512f avx512f
Signed-off-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4952
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Algorithm runs 8 parallel sums, consuming 8x uint32_t elements per
loop iteration. Size alignment of main fletcher4 methods is adjusted
accordingly. New implementation is called 'avx512f'.
Note: byteswap method can be implemented more efficiently when avx512bw hardware
becomes available. Currently, it is ~ 2x slower than native method.
Table shows result of full (native) fletcher4 calculation for different buffer size:
fletcher4 4KB 16KB 64KB 128KB 256KB 1MB 16MB
--------------------------------------------------------------------
[scalar] 1213 1228 1231 1231 1225 1200 1160
[sse2] 2374 2442 2459 2456 2462 2250 2220
[avx2] 4288 4753 4871 4893 4900 4050 3882
[avx512f] 5975 8445 9196 9221 9262 6307 5620
Signed-off-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #4952
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds compiler and runtime tests (user and kernel) for following
instruction sets: avx512f, avx512cd, avx512er, avx512pf, avx512bw, avx512dq,
avx512vl, avx512ifma, avx512vbmi.
note: Linux support for AVX-512F (Foundation) instruction set started with
linux v3.15
Signed-off-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #4952
|
|
|
|
|
|
|
|
|
|
|
| |
Adds a module option which disables the hole_birth optimization
which has been responsible for several recent bugs, including
issue #4050.
Original-patch: https://gist.github.com/pcd1193182/2c0cd47211f3aee623958b4698836c48
Signed-off-by: Rich Ercolani <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4833
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Import a raidz pool which has a vdev with a bad label, zpool status
shows the right state of the dev, but the wrong state of the pool.
The pool state should be DEGRADED, not ONLINE.
We examine the label in vdev_validate while in spa_load_impl, the bad
label can be detected but doesn't propagate its state to the parent.
There are other chances to propagate state in the following vdev_load
if we failed to load DTL, but our pool is raidz1 which can tolerate a
faulted disk. So we lost the last chance to correct the pool state.
Propagate the leaf vdev's state to parent if its label was corrupted,
as is done elsewhere in vdev_validate.
Signed-off-by: GeLiXin <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #4948
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Authored by: Hans Rosenfeld <[email protected]>
Reviewed by: Dan Fields <[email protected]>
Reviewed by: Josef Sipek <[email protected]>
Reviewed by: Richard Elling <[email protected]>
Reviewed by: George Wilson <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Ported-by: Brian Behlendorf <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/5997
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/1437283
Porting Notes:
In addition to the OpenZFS changes this patch realigns the events
with those found in OpenZFS.
Events which would be logged as sysevents on illumos have been
been mapped to the 'sysevent' class for Linux. In addition, several
subclass names have been changed to match what is used in OpenZFS.
In all cases this means a '.' was changed to an '_' in the subclass.
The scripts provided by ZoL have been updated, however users which
provide scripts for any of the following events will need to rename
them based on the new subclass names.
ereport.fs.zfs.config.sync sysevent.fs.zfs.config_sync
ereport.fs.zfs.zpool.destroy sysevent.fs.zfs.pool_destroy
ereport.fs.zfs.zpool.reguid sysevent.fs.zfs.pool_reguid
ereport.fs.zfs.vdev.remove sysevent.fs.zfs.vdev_remove
ereport.fs.zfs.vdev.clear sysevent.fs.zfs.vdev_clear
ereport.fs.zfs.vdev.check sysevent.fs.zfs.vdev_check
ereport.fs.zfs.vdev.spare sysevent.fs.zfs.vdev_spare
ereport.fs.zfs.vdev.autoexpand sysevent.fs.zfs.vdev_autoexpand
ereport.fs.zfs.resilver.start sysevent.fs.zfs.resilver_start
ereport.fs.zfs.resilver.finish sysevent.fs.zfs.resilver_finish
ereport.fs.zfs.scrub.start sysevent.fs.zfs.scrub_start
ereport.fs.zfs.scrub.finish sysevent.fs.zfs.scrub_finish
ereport.fs.zfs.bootfs.vdev.attach sysevent.fs.zfs.bootfs_vdev_attach
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following comment in zil.h
* WR_COPIED:
* If we know we'll immediately be committing the
* transaction (FSYNC or FDSYNC), then we allocate a larger
* log record here for the data and copy the data in.
The word "the" should be "then".
Signed-off-by: luozhengzheng <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4961
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If there is no explicit note in the .S files, the obj file will mark it
as requiring an executable stack. This is unneeded and causes issues on
hardened systems.
More info:
https://wiki.gentoo.org/wiki/Hardened/GNU_stack_quickstart
Signed-off-by: Jason Zaman <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4947
Closes #4962
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The constify plugin will automatically constify a class of types that contain
only function pointers. The icp structs fail to build if this is enabled with
the following error. The no_const attribute makes the plugin skip those
structs.
module/icp/spi/kcf_spi.c: In function ‘copy_ops_vector_v1’:
module/icp/spi/kcf_spi.c:61:16: error: assignment of read-only location ‘*dst_ops->cou.cou_v1.co_control_ops’
*((dst)->ops) = *((src)->ops);
^
module/icp/spi/kcf_spi.c:74:2: note: in expansion of macro ‘KCF_SPI_COPY_OPS’
KCF_SPI_COPY_OPS(src_ops, dst_ops, co_control_ops);
^
Signed-off-by: Jason Zaman <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4947
Closes #4962
|
|
|
|
|
|
|
|
|
|
|
|
| |
The HAVE_BIO_RW_* #ifdef's must appear before REQ_* #ifdef's
in the bio_is_flush() and bio_is_discard() macros. Linux 2.6.32
era kernels defined both of values and the HAVE_BIO_RW_* must be
used in this case. This resulted in a panic in zconfig test 5.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes #4951
Closes #4959
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nvlist_pack() and nvlist_unpack are implemented recursively, which can
cause the stack to overflow with a deeply nested nvlist; i.e. an nvlist
which contains an nvlist, which contains an nvlist, which...
Unprivileged users can pass an nvlist to the kernel via certain ioctls
on /dev/zfs, which the kernel will unpack without additional permission
checking or validation. Therefore, an unprivileged user can cause the
kernel's stack to overflow and panic.
Ideally, these functions would be implemented non-recursively. As a
quick fix, this patch limits the depth of the recursion and returns an
error when attempting to pack and unpack a deeply-nested nvlist.
Signed-off-by: Adam Leventhal <[email protected]>
Signed-off-by: George Wilson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Ported-by: Prakash Surya <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/7263
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/0511d6d
-
|
|
|
|
|
|
|
|
|
| |
Also print decompress progress to stderr so it wouldn't pollute raw output
with r flag.
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4956
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix bugs due to kernel change in torvalds/linux@4bacc9c9234c ("overlayfs:
Make f_path always point to the overlay and f_inode to the underlay").
This problem crashes system when use zfs as a layer of overlayfs.
Signed-off-by: Chen Haiquan <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4914
Closes #4935
|
|
|
|
|
|
|
|
|
|
|
| |
The indefinite article before nvlist should be "an", not "a".
We have 27 "an nvlist" and 7 "a nvlist" in our comment, they should
stay the same as we are such a strict filesystem.
Signed-off-by: GeLiXin <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4941
|
|
|
|
|
|
|
|
|
|
| |
Non-Linux OpenZFS implementations require additional support to be
used a root pool. This code should simply be removed to avoid
confusion and improve readability.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes #4951
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All users of bio->bi_rw have been replaced with compatibility wrappers.
This allows the kernel specific logic to be abstracted away, and for
each of the supported cases to be documented with the wrapper. The
updated interfaces are as follows:
* void blk_queue_set_write_cache(struct request_queue *, bool, bool)
* boolean_t bio_is_flush(struct bio *)
* boolean_t bio_is_fua(struct bio *)
* boolean_t bio_is_discard(struct bio *)
* boolean_t bio_is_secure_erase(struct bio *)
* VDEV_WRITE_FLUSH_FUA
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes #4951
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fix resolves warnings reported during compiling of user-space
libraries with different gcc optimization levels.
Tested with gcc versions: 4.9.2 (Debian), and 6.1.1 (Fedora).
The patch enables use of following opt levels: O0, O1, O2, O3, Og, Os, Ofast.
List of warnings:
[GCC 4.9.2 -Os]
libzfs_sendrecv.c:3726:26: error: 'clp' may be used uninitialized in this function [-Werror=maybe-uninitialized]
[GCC 4.9.2 -Og]
fs_fletcher.c:323:26: error: 'idx' may be used uninitialized in this function [-Werror=maybe-uninitialized]
dsl_dataset.c:1290:12: error: 'atp' may be used uninitialized in this function [-Werror=maybe-uninitialized]
[GCC 4.9.2 -Ofast]
u8_textprep.c:1310:9: error: 'tc[3ul]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
u8_textprep.c:177:23: error: 'u8t[0ul]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
dsl_dataset.c:2089:37: error: ‘hds’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
dsl_dataset.c:3216:2: error: ‘ds’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
dsl_dataset.c:1591:2: error: ‘ds’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
dsl_dataset.c:3341:2: error: ‘ds’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
vdev_raidz.c:1153:8: error: 'dcount[2]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
vdev_raidz.c:1167:17: error: 'dst[2]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
kernel.c:1005:2: error: ‘resid’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
libzfs_dataset.c:2826:8: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
libzfs_dataset.c:3056:35: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
libzfs_dataset.c:1584:13: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
libzfs_dataset.c:3056:35: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
libzfs_dataset.c:1792:66: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
libzfs_dataset.c:3986:35: error: ‘val’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
[GCC 6.1.1]
Resolved in PR #4907
Signed-off-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4937
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Starting from Linux 4.7, get_acl will set acl cache pointer to temporary
sentinel value before calling i_op->get_acl. Therefore we can't compare
against ACL_NOT_CACHED and return.
Since from Linux 3.14, get_acl already check the cache for us, so we
disable this in zpl_get_acl.
Linux 4.7 also does set_cached_acl for us so we disable it in zpl_get_acl.
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4944
Closes #4946
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zfs_get_name() expects a parameter of type zfs_handle_t *zhp , but
gets an invalid parameter type of zfs_handle_t **zhp actually in
libzfs_dataset_cmp(), which may trigger a coredump if called.
libzfs_dataset_cmp() working normally so far, just because all the
callers only give datasets of type ZFS_TYPE_FILESYSTEM to it, we
compared their mountpoint and return, luckily.
Signed-off-by: GeLiXin <[email protected]>
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4919
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The posix_acl_valid() function has been updated to require a
user namespace. Filesystem callers should normally provide the
user_ns from the super block associcated with the ACL; the
zpl_posix_acl_valid() wrapper has been added for this purpose.
See https://github.com/torvalds/linux/commit/0d4d717f for
complete details.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes #4922
|
|
|
|
|
|
|
|
|
|
| |
Remove ZFS_AC_KERNEL_CURRENT_UMASK and ZFS_AC_KERNEL_POSIX_ACL_CACHING
configure checks, all supported kernel provide this functionality.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Closes #4922
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* When the uid/gid change is handled in zfs_setattr we want to
actually adjust the user passed uid to a KUID and write that to disk.
* In trace points use the i_uid member without doing translation,
since it has already been performed.
* Use kuid in zfs_aclset_common
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4928
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Kernel 4.8 paved the way to enabling mounting a file system inside a
non-init user namespace. To facilitate this a s_user_ns member was
added holding the userns in which the filesystem's instance was
mounted. This enables doing the uid/gid translation relative to
this particular username space and not the default init_user_ns.
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4928
|
|
|
|
|
|
|
|
|
|
|
|
| |
When arc_max is increased, arc_meta_limit will not be updated to 3/4
of the new arc_c_max value. This was done originally to preserve any
existing maximum value. This turned out to be counter intuitive to
users and this fix changes that behavior. If zfs_arc_meta_limit is
non-default, it will be picked up later in the ARC tuning function.
Signed-off-by: Gaurav Kumar <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4893
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of gcc 6.1.1 20160621 (Red Hat 6.1.1-3) an array bounds warnings
is detected in the zdb the dump_object() function. The analysis is
correct but difficult to interpret because this is implemented as a
macro. Rework the ZDB_OT_NAME in to a function and remove the case
detected by gcc which is a side effect of the DMU_OT_IS_VALID() macro.
zdb.c: In function ‘dump_object’:
zdb.c:1931:288: error: array subscript is outside array bounds
[-Werror=array-bounds]
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Gvozden Neskovic <[email protected]>
Closes #4907
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of gcc 6.1.1 20160621 (Red Hat 6.1.1-3) a self-comparison is
detected by gcc in metaslab_alloc(). Resolve the warning by passing
a physical size of 0 to BP_SET_BIRTH() as it done by other callers.
module/zfs/metaslab.c: In function ‘metaslab_alloc’:
module/zfs/metaslab.c:2575:184: error: self-comparison always evaluates
to true [-Werror=tautological-compare]
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Gvozden Neskovic <[email protected]>
Issue #4907
|
|
|
|
|
|
|
|
|
| |
A lot of string replacement target don't have dependency or incorrect
dependency. We setup proper dependency by pattern rules.
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4908
|
|
|
|
|
|
|
|
|
|
| |
Add a `make lint` target which maps to a cppcheck target. As with
the shellcheck target it will only run when cppcheck is installed.
This allows a `make lint` build check to be incrementally added to
the automated testing for distribution which provide cppcheck.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4915
|
|
|
|
|
|
|
|
|
|
|
| |
Fix a possible VDEV statistics array overflow when ZIOs with
ZIO_PRIORITY_NOW complete.
Signed-off-by: Tony Hutter <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #4883
Closes #4917
|
|
|
|
|
|
|
|
|
|
|
| |
Config of a hot spare or l2cache device will leak memory in function
add_config(). At the start of this function, when dealing with a
config which belongs to a hot spare not currently in use or a l2cache
device the config should be freed.
Signed-off-by: liaoyuxiangqin <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4910
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Leaks reported by using AddressSanitizer, GCC 6.1.0
Direct leak of 4097 byte(s) in 1 object(s) allocated from:
#1 0x414f73 in process_options cmd/ztest/ztest.c:721
Direct leak of 5440 byte(s) in 17 object(s) allocated from:
#1 0x41bfd5 in umem_alloc ../../lib/libspl/include/umem.h:88
#2 0x41bfd5 in ztest_zap_parallel cmd/ztest/ztest.c:4659
#3 0x4163a8 in ztest_execute cmd/ztest/ztest.c:5907
Signed-off-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4896
|
|
|
|
|
|
|
|
|
|
|
| |
In zpool_find_import_scan: Reads an uninitialized pointer or
its target Coverity #150966
Found by static analysis with CoverityScan 0.8.5
Signed-off-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4897
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The switch statement in function zfs_standard_error_fmt for the
ENOSPC and EDQUOT cases returns immediately and unlike all other
cases in the switch this does not perform the va_end call.
Perform a break which ends up calling va_end rather than returning
immediately.
Found by static analysis with CoverityScan 0.8.5
Signed-off-by: Colin Ian King <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4900
|
|
|
|
|
|
|
|
|
|
| |
Currently i_blkbits is always set to SPA_MINBLOCKSHIFT every time
zfs_inode_update_impl is called. Since this value never changes
move its assignment to at inode creation time.
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4906
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The newly added icp module uses a hardcoded value of CDDL for the license,
however in local development one might want to change that to something
else in order to facilitate compiling against lock debugging enabled kernel.
All modules of the zfs use the ZFS_META_LICNSE string which is replaced with
the value held in the META file. One can modify the value in the META file
once and then rerun the configure to have all modules' licenses changed.
Change the icp module license string to be ZFS_META_LICENSE so that it
falls under the same paradigm.
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4905
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In zfs_ioc_log_history() function the tsd_set() function is called
with NULL which causes the zfs_allow_log_destroy() to be run. In
this case the passed value will be NULL. This is normally entirely
safe because strfree() maps directly to kfree() which may be passed
a NULL. However, since alternate implementations of strfree() may
not handle this gracefully add a check for NULL.
Observed under an embedded Linux 2.6.32.41 kernel running the
automated testing while running the ZFS Test Suite.
Signed-off-by: caoxuewen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4872
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New REQ_OP_* definitions have been introduced to separate the
WRITE, READ, and DISCARD operations from the flags. This included
changing the encoding of bi_rw. It places REQ_OP_* in high order
bits and other stuff in low order bits. This encoding is done
through the new helper function bio_set_op_attrs. For complete
details refer to:
https://github.com/torvalds/linux/commit/f215082
https://github.com/torvalds/linux/commit/4e1b2d5
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4892
Closes #4899
|
|
|
|
|
|
|
|
|
|
|
|
| |
The REQ_FLUSH flag was renamed REQ_PREFLUSH to avoid confusion with
REQ_OP_FLUSH. See https://github.com/torvalds/linux/commit/28a8f0d3
for complete details.
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #4892
Issue #4899
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The rw argument has been removed from submit_bio/submit_bio_wait.
Callers are now expected to set bio->bi_rw instead of passing it
in. See https://github.com/torvalds/linux/commit/4e49ea4a for
complete details.
Signed-off-by: Tim Chase <[email protected]>
Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #4892
Issue #4899
|
|
|
|
|
|
|
|
|
| |
The memory allocation and locking in `spa_txg_history_*()` can
potentially block txg_hold_open for arbitrarily long periods of time.
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #4333
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here's the problem - on 4K native devices in userland on
Linux using O_DIRECT, buffers must be 4K aligned or I/O
will fail with EINVAL, causing zdb (and others) to coredump.
Since userland probably doesn't need optimized buffer caches,
we just force 4K alignment on everything.
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Gvozden Neskovic <[email protected]>
Closes #4479
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Updated test case history_001_pos.ksh so it can run in tree. The
original test case assumed /usr/sbin/zfs and /usr/sbin/zpool were
the only valid locations for these utilities. The same modification
has already been made too history_common.kshlib.
The only other failing test case was history_010_pos and that was
the result of the ":linux" suffix not being appended when checking
the long output in the test case.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #4882
|