summaryrefslogtreecommitdiffstats
path: root/include/linux
Commit message (Collapse)AuthorAgeFilesLines
* Fix free memory calculation on v3.14+chrisrd2018-02-232-1/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Provide infrastructure to auto-configure to enum and API changes in the global page stats used for our free memory calculations. arc_free_memory has been broken since an API change in Linux v3.14: 2016-07-28 v4.8 599d0c95 mm, vmscan: move LRU lists to node 2016-07-28 v4.8 75ef7184 mm, vmstat: add infrastructure for per-node vmstats These commits moved some of global_page_state() into global_node_page_state(). The API change was particularly egregious as, instead of breaking the old code, it silently did the wrong thing and we continued using global_page_state() where we should have been using global_node_page_state(), thus indexing into the wrong array via NR_SLAB_RECLAIMABLE et al. There have been further API changes along the way: 2017-07-06 v4.13 385386cf mm: vmstat: move slab statistics from zone to node counters 2017-09-06 v4.14 c41f012a mm: rename global_page_state to global_zone_page_state ...and various (incomplete, as it turns out) attempts to accomodate these changes in ZoL: 2017-08-24 2209e409 Linux 4.8+ compatibility fix for vm stats 2017-09-16 787acae0 Linux 3.14 compat: IO acct, global_page_state, etc 2017-09-19 661907e6 Linux 4.14 compat: IO acct, global_page_state, etc The config infrastructure provided here resolves these issues going back to the original API change in v3.14 and is robust against further Linux changes in this area. Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Chris Dunlop <[email protected]> Closes #7170
* Linux 4.16 compat: use correct *_dec_and_test()Tony Hutter2018-02-221-1/+5
| | | | | | | | | | Use refcount_dec_and_test() on 4.16+ kernels, atomic_dec_and_test() on older kernels. https://lwn.net/Articles/714974/ Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes: #7179 Closes: #7211
* Linux 4.11 compat: avoid refcount_t name conflictBrian Behlendorf2018-02-081-0/+6
| | | | | | | | | | | | | | | Related to commit 4859fe796, when directly using the kernel's refcount functions in kernel compatibility code do not map refcount_t to zfs_refcount_t. This leads to a type mismatch. Longer term we should consider renaming refcount_t to zfs_refcount_t in the zfs code base. Reviewed-by: Olaf Faaland <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7148
* Linux 4.16 compat: inode_set_iversion()Brian Behlendorf2018-02-081-0/+14
| | | | | | | | | | | | | | | A new interface was added to manipulate the version field of an inode. Add a inode_set_iversion() wrapper for older kernels and use the new interface when available. The i_version field was dropped from the trace point due to the switch to an atomic64_t i_version type. Reviewed-by: Olaf Faaland <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7148
* Linux 3.14 compat: IO acct, global_page_state, etcGiuseppe Di Natale2017-09-161-4/+14
| | | | | | | | | | | | | | | | | | | | | | generic_start_io_acct/generic_end_io_acct in the master branch of the linux kernel requires that the request_queue be provided. Move the logic from freemem in the spl to arc_free_memory in arc.c. Do this so we can take advantage of global_page_state interface checks in zfs. Upstream kernel replaced struct block_device with struct gendisk in struct bio. Determine if the function bio_set_dev exists during configure and have zfs use that if it exists. bio_set_dev https://github.com/torvalds/linux/commit/74d4699 global_node_page_state https://github.com/torvalds/linux/commit/75ef718 io acct https://github.com/torvalds/linux/commit/d62e26b Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Giuseppe Di Natale <[email protected]> Closes #6635
* Linux 4.13 compat: bio->bi_status and blk_status_tBrian Behlendorf2017-07-231-1/+91
| | | | | | | | | Commit torvalds/linux@4e4cbee9. The bio->bi_error field was replaced with bio->bi_status which is an enum that describes all possible error types. Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6351
* Linux 4.12 compat: fix super_setup_bdi_name() callLOLi2017-05-251-4/+5
| | | | | | | | | | Provide a format parameter to super_setup_bdi_name() so we don't create duplicate names in '/devices/virtual/bdi' sysfs namespace which would prevent us from mounting more than one ZFS filesystem at a time. Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #6147
* Linux 4.12 compat: CURRENT_TIME removedBrian Behlendorf2017-05-101-0/+11
| | | | | | | | Linux 4.9 added current_time() as the preferred interface to get the filesystem time. CURRENT_TIME was retired in Linux 4.12. Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6114
* Enable Linux read-ahead for a single page on ZVOLsRichard Yao2017-05-041-0/+11
| | | | | | | | | | | | | | | | | | | | Linux has read-ahead logic designed to accelerate sequential workloads. ZFS has its own read-ahead logic called zprefetch that operates on both ZVOLs and datasets. Having two prefetchers active at the same time can cause overprefetching, which unnecessarily reduces IOPS performance on CoW filesystems like ZFS. Testing shows that entirely disabling the Linux prefetch results in a significant performance penalty for reads while commensurate benefits are seen in random writes. It appears that read-ahead benefits are inversely proportional to random write benefits, and so a single page of Linux-layer read-ahead appears to offer the middle ground for both workloads. Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Richard Yao <[email protected]> Issue #5902
* Linux 4.12 compat: super_setup_bdi_name()Brian Behlendorf2017-05-021-9/+78
| | | | | | | | | | All filesystems were converted to dynamically allocated BDIs. The destruction of backing_dev_info structures is handled as part of super block destruction. Refactor the code to abstract away the details of creating and destroying a BDI. Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6089
* Reinstate zvol_taskq to fix aio on zvolChunwei Chen2017-04-261-2/+9
| | | | | | | | | | | | | | | | | | | | | | | Commit 37f9dac removed the zvol_taskq for processing zvol requests. This was removed as part of switching to make_request_fn and was motivated by a concern at the time over dispatch latency. However, this also made all bio request synchronous, and caused serious performance issues as the bio submitter would wait for every bio it submitted, effectively making the IO depth 1. This patch reinstate zvol_taskq, and to make sure overlapped I/Os are ordered properly, we take range lock in zvol_request, and pass it along with bio to the I/O functions zvol_{write,discard,read}. In order to facilitate benchmarks a zvol_request_sync module option was added to switch between sync and async request handling. For the moment, the default behavior is synchronous but this is likely to change pending additional testing. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #5824
* Linux 4.11 compat: iops.getattr and friendsOlaf Faaland2017-03-201-0/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In torvalds/linux@a528d35, there are changes to the getattr family of functions, struct kstat, and the interface of inode_operations .getattr. The inode_operations .getattr and simple_getattr() interface changed to: int (*getattr) (const struct path *, struct dentry *, struct kstat *, u32 request_mask, unsigned int query_flags) The request_mask argument indicates which field(s) the caller intends to use. Fields the caller has not specified via request_mask may be set in the returned struct anyway, but their values may be approximate. The query_flags argument indicates whether the filesystem must update the attributes from the backing store. Currently both fields are ignored. It is possible that getattr-related functions within zfs could be optimized based on the request_mask. struct kstat includes new fields: u32 result_mask; /* What fields the user got */ u64 attributes; /* See STATX_ATTR_* flags */ struct timespec btime; /* File creation time */ Fields attribute and btime are cleared; the result_mask reflects this. These appear to be optional based on simple_getattr() and vfs_getattr() within the kernel, which take the same approach. Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Olaf Faaland <[email protected]> Closes #5875
* Fix harmless "BARRIER is deprecated" kernel warning on Centos 6.8Tony Hutter2017-03-081-9/+6
| | | | | | | | | | | | | | A one time warning after module load that "BARRIER is deprecated" was seen on the heavily patched 2.6.32-642.13.1.el6.x86_64 Centos 6.8 kernel. It seems that kernel had both the old BARRIER and the newer FLUSH/FUA interfaces defined. This fixes the warning by prefering the newer FLUSH/FUA interface if it's available. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #5739 Closes #5828
* Fix multi-line error messages in blkdev_compat.hbunder20152017-03-071-9/+4
| | | | | | | | | | Fix multi-line error messages in blkdev_compat.h by changing error-generating multi-line error messages to single line errors. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: bunder2015 <[email protected]> Closes #5860
* codebase style improvements for OpenZFS 6459 portGeorge Melikov2017-01-221-4/+8
|
* 4.10 compat - BIO flag changes and othersTim Chase2016-12-301-9/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | [bio] The req_op enum was changed to req_opf. Update the "Linux 4.8 API" autotools checks to use an int to determine whether the various REQ_OP values are defined. This should work properly on kernels >= 4.8. [bio] bio_set_op_attrs() is now an inline function and can't be detected with #ifdef. Add a configure check to determine whether bio_set_op_attrs() is defined. Move the local definition of it from vdev_disk.c to blkdev_compat.h for consistency with other related compability shims. [bio] The read/write flags and their modifiers, including WRITE_FLUSH, WRITE_FUA and WRITE_FLUSH_FUA have been removed from fs.h. Add the new bio_set_flush() compatibility wrapper to replace VDEV_WRITE_FLUSH_FUA and set the flags appropriately for each supported kernel version. [vfs] The generic_readlink() function has been made static. If .readlink in inode_operations is NULL, generic_readlink() is used. [zol typo] Completely unrelated to 4.10 compat, fix a typo in the check for REQ_OP_SECURE_ERASE so that the proper macro is defined: s/HAVE_REQ_OP_SECURE_DISCARD/HAVE_REQ_OP_SECURE_ERASE/ Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Tim Chase <[email protected]> Closes #5499
* Use cstyle -cpP in `make cstyle` checkBrian Behlendorf2016-12-121-4/+4
| | | | | | | | | | | | | | | | | | | | | | | Enable picky cstyle checks and resolve the new warnings. The vast majority of the changes needed were to handle minor issues with whitespace formatting. This patch contains no functional changes. Non-whitespace changes are as follows: * 8 times ; to { } in for/while loop * fix missing ; in cmd/zed/agents/zfs_diagnosis.c * comment (confim -> confirm) * change endline , to ; in cmd/zpool/zpool_main.c * a number of /* BEGIN CSTYLED */ /* END CSTYLED */ blocks * /* CSTYLED */ markers * change == 0 to ! * ulong to unsigned long in module/zfs/dsl_scan.c * rearrangement of module_param lines in module/zfs/metaslab.c * add { } block around statement after for_each_online_node Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Håkan Johansson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #5465
* Use set_cached_acl and forget_cached_acl when possibleChunwei Chen2016-11-071-6/+6
| | | | | | | | | Originally, these two function are inline, so their usability is tied to posix_acl_release. However, since Linux 3.14, they became EXPORT_SYMBOL, so we can always use them. In this patch, we create an independent test for these two functions so we can use them when possible. Signed-off-by: Chunwei Chen <[email protected]>
* Batch free zpl_posix_acl_releaseChunwei Chen2016-11-071-8/+3
| | | | | | | | | | | | | | | | | | | | | | | Currently every calls to zpl_posix_acl_release will schedule a delayed task, and each delayed task will add a timer. This used to be fine except for possibly bad performance impact. However, in Linux 4.8, a new timer wheel implementation[1] is introduced. In this new implementation, the larger the delay, the less accuracy the timer is. So when we have a flood of timer from zpl_posix_acl_release, they will expire at the same time. Couple with the fact that task_expire will do linear search with lock held. This causes an extreme amount of contention inside interrupt and would actually lockup the system. We fix this by doing batch free to prevent a flood of delayed task. Every call to zpl_posix_acl_release will put the posix_acl to be freed on a lockless list. Every batch window, 1 sec, the zpl_posix_acl_free will fire up and free every posix_acl that passed the grace period on the list. This way, we only have one delayed task every second. [1] https://lwn.net/Articles/646950/ Signed-off-by: Chunwei Chen <[email protected]>
* Fix lookup_bdev() on UbuntuHajo Möller2016-10-261-4/+13
| | | | | | | | | | | | Ubuntu added support for checking inode permissions to lookup_bdev() in kernel commit 193fb6a2c94fab8eb8ce70a5da4d21c7d4023bee (merged in 4.4.0-6.21). Upstream bug: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1636517 This patch adds a test for Ubuntu's variant of lookup_bdev() to configure and calls the function in the correct way. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Hajo Möller <[email protected]> Closes #5336
* Linux 4.9 compat: inode_change_ok() renamed setattr_prepare()Brian Behlendorf2016-10-201-0/+11
| | | | | | | | | | | | In torvalds/linux@31051c8 the inode_change_ok() function was renamed setattr_prepare() and updated to take a dentry ratheri than an inode. Update the code to call the setattr_prepare() and add a wrapper function which call inode_change_ok() for older kernels. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Requires-spl: refs/pull/581/head
* Add parity generation/rebuild using 128-bits NEON for Aarch64Romain Dolbeau2016-10-032-0/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This re-use the framework established for SSE2, SSSE3 and AVX2. However, GCC is using FP registers on Aarch64, so unlike SSE/AVX2 we can't rely on the registers being left alone between ASM statements. So instead, the NEON code uses C variables and GCC extended ASM syntax. Note that since the kernel explicitly disable vector registers, they have to be locally re-enabled explicitly. As we use the variable's number to define the symbolic name, and GCC won't allow duplicate symbolic names, numbers have to be unique. Even when the code is not going to be used (e.g. the case for 4 registers when using the macro with only 2). Only the actually used variables should be declared, otherwise the build will fails in debug mode. This requires the replacement of the XOR(X,X) syntax by a new ZERO(X) macro, which does the same thing but without repeating the argument. And perhaps someday there will be a machine where there is a more efficient way to zero a register than XOR with itself. This affects scalar, SSE2, SSSE3 and AVX2 as they need the new macro. It's possible to write faster implementations (different scheduling, different unrolling, interleaving NEON and scalar, ...) for various cores, but this one has the advantage of fitting in the current state of the code, and thus is likely easier to review/check/merge. The only difference between aarch64-neon and aarch64-neonx2 is that aarch64-neonx2 unroll some functions some more. Reviewed-by: Gvozden Neskovic <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Romain Dolbeau <[email protected]> Closes #4801
* Linux compat: Grsecurity kernelGvozden Neskovic2016-08-222-1/+41
| | | | | | | | | | | API Change: Module parameter set/get methods take const parameter in Grsecurity kernel v4.7.1 Signed-off-by: Gvozden Neskovic <[email protected]> Signed-off-by: Jason Zaman <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4997 Closes #5001
* Add support for AVX-512 family of instruction setsGvozden Neskovic2016-08-161-12/+233
| | | | | | | | | | | | | This patch adds compiler and runtime tests (user and kernel) for following instruction sets: avx512f, avx512cd, avx512er, avx512pf, avx512bw, avx512dq, avx512vl, avx512ifma, avx512vbmi. note: Linux support for AVX-512F (Foundation) instruction set started with linux v3.15 Signed-off-by: Gvozden Neskovic <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4952
* Reorder HAVE_BIO_RW_* checksBrian Behlendorf2016-08-121-4/+12
| | | | | | | | | | | | The HAVE_BIO_RW_* #ifdef's must appear before REQ_* #ifdef's in the bio_is_flush() and bio_is_discard() macros. Linux 2.6.32 era kernels defined both of values and the HAVE_BIO_RW_* must be used in this case. This resulted in a panic in zconfig test 5. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4951 Closes #4959
* Use file_dentry and file_inode wrappersChen Haiquan2016-08-111-0/+12
| | | | | | | | | | | | Fix bugs due to kernel change in torvalds/linux@4bacc9c9234c ("overlayfs: Make f_path always point to the overlay and f_inode to the underlay"). This problem crashes system when use zfs as a layer of overlayfs. Signed-off-by: Chen Haiquan <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4914 Closes #4935
* Linux 4.8 compat: Fix removal of bio->bi_rw memberBrian Behlendorf2016-08-111-56/+109
| | | | | | | | | | | | | | | | | | All users of bio->bi_rw have been replaced with compatibility wrappers. This allows the kernel specific logic to be abstracted away, and for each of the supported cases to be documented with the wrapper. The updated interfaces are as follows: * void blk_queue_set_write_cache(struct request_queue *, bool, bool) * boolean_t bio_is_flush(struct bio *) * boolean_t bio_is_fua(struct bio *) * boolean_t bio_is_discard(struct bio *) * boolean_t bio_is_secure_erase(struct bio *) * VDEV_WRITE_FLUSH_FUA Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4951
* Linux 4.8 compat: posix_acl_valid()Brian Behlendorf2016-08-081-0/+12
| | | | | | | | | | | | | | The posix_acl_valid() function has been updated to require a user namespace. Filesystem callers should normally provide the user_ns from the super block associcated with the ACL; the zpl_posix_acl_valid() wrapper has been added for this purpose. See https://github.com/torvalds/linux/commit/0d4d717f for complete details. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4922
* Retire HAVE_CURRENT_UMASK and HAVE_POSIX_ACL_CACHINGBrian Behlendorf2016-08-081-13/+0
| | | | | | | | | | Remove ZFS_AC_KERNEL_CURRENT_UMASK and ZFS_AC_KERNEL_POSIX_ACL_CACHING configure checks, all supported kernel provide this functionality. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4922
* Linux 4.8 compat: new s_user_ns member of struct super_blockNikolay Borisov2016-08-081-0/+17
| | | | | | | | | | | | | Kernel 4.8 paved the way to enabling mounting a file system inside a non-init user namespace. To facilitate this a s_user_ns member was added holding the userns in which the filesystem's instance was mounted. This enables doing the uid/gid translation relative to this particular username space and not the default init_user_ns. Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4928
* Linux 4.8 compat: REQ_OP and bio_set_op_attrs()Chunwei Chen2016-07-291-5/+20
| | | | | | | | | | | | | | | | | | New REQ_OP_* definitions have been introduced to separate the WRITE, READ, and DISCARD operations from the flags. This included changing the encoding of bi_rw. It places REQ_OP_* in high order bits and other stuff in low order bits. This encoding is done through the new helper function bio_set_op_attrs. For complete details refer to: https://github.com/torvalds/linux/commit/f215082 https://github.com/torvalds/linux/commit/4e1b2d5 Signed-off-by: Tim Chase <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4892 Closes #4899
* Linux 4.8 compat: REQ_PREFLUSHBrian Behlendorf2016-07-291-0/+12
| | | | | | | | | | | | The REQ_FLUSH flag was renamed REQ_PREFLUSH to avoid confusion with REQ_OP_FLUSH. See https://github.com/torvalds/linux/commit/28a8f0d3 for complete details. Signed-off-by: Tim Chase <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4892 Issue #4899
* Check whether the kernel supports i_uid/gid_read/write helpersNikolay Borisov2016-07-251-0/+53
| | | | | | | | | | | | | Since the concept of a kuid and the need to translate from it to ordinary integer type was added in kernel version 3.5 implement necessary plumbing to be able to detect this condition during compile time. If the kernel doesn't support the kuid then just fall back to directly accessing the respective struct inode's members Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4685 Issue #227
* Linux 4.7 compat: handler->set() takes both dentry and inodeBrian Behlendorf2016-06-011-1/+15
| | | | | | | | | Counterpart to fd4c7b7, the same approach was taken to resolve the compatibility issue. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4717 Issue #4665
* Linux 4.7 compat: replace blk_queue_flush with blk_queue_write_cacheChunwei Chen2016-05-201-0/+27
| | | | | | Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4665
* Linux 4.7 compat: handler->get() takes both dentry and inodeChunwei Chen2016-05-201-1/+14
| | | | | | Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4665
* Fix user namespaces uid/gid mappingBrian Behlendorf2016-04-301-4/+4
| | | | | | | | | | | As described in torvalds/linux@5f3a4a2 the &init_user_ns, and not the current user_ns, should be passed to posix_acl_from_xattr() and posix_acl_to_xattr(). Conveniently the init_user_ns is available through the init credential (kcred). Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Massimo Maggi <[email protected]> Closes #4177
* Support for vectorized algorithms on x86Gvozden Neskovic2016-03-212-1/+390
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is initial support for x86 vectorized implementations of ZFS parity and checksum algorithms. For the compilation phase, configure step checks if toolchain supports relevant instruction sets. Each implementation must ensure that the code is not passed to compiler if relevant instruction set is not supported. For this purpose, following new defines are provided if instruction set is supported: - HAVE_SSE, - HAVE_SSE2, - HAVE_SSE3, - HAVE_SSSE3, - HAVE_SSE4_1, - HAVE_SSE4_2, - HAVE_AVX, - HAVE_AVX2. For detecting if an instruction set can be used in runtime, following functions are provided in (include/linux/simd_x86.h): - zfs_sse_available() - zfs_sse2_available() - zfs_sse3_available() - zfs_ssse3_available() - zfs_sse4_1_available() - zfs_sse4_2_available() - zfs_avx_available() - zfs_avx2_available() - zfs_bmi1_available() - zfs_bmi2_available() These function should be called once, on module load, or initialization. They are safe to use from user and kernel space. If an implementation is using more than single instruction set, both compiler and runtime support for all relevant instruction sets should be checked. Kernel fpu methods: - kfpu_begin() - kfpu_end() Use __get_cpuid_max and __cpuid_count from <cpuid.h> Both gcc and clang have support for these. They also handle ebx register in case it is used for PIC code. Signed-off-by: Gvozden Neskovic <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4381
* Linux 4.5 compat: xattr list handlerBrian Behlendorf2016-01-201-27/+93
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The registered xattr .list handler was simplified in the 4.5 kernel to only perform a permission check. Given a dentry for the file it must return a boolean indicating if the name is visible. This differs slightly from the previous APIs which also required the function to copy the name in to the provided list and return its size. That is now all the responsibility of the caller. This should be straight forward change to make to ZoL since we've always required the caller to make the copy. However, this was slightly complicated by the need to support 3 older APIs. Yes, between 2.6.32 and 4.5 there are 4 versions of this interface! Therefore, while the functional change in this patch is small it includes significant cleanup to make the code understandable and maintainable. These changes include: - Improved configure checks for .list, .get, and .set interfaces. - Interfaces checked from newest to oldest. - Strict checking for each possible known interface. - Configure fails when no known interface is available. - HAVE_*_XATTR_LIST renamed HAVE_XATTR_LIST_* for consistency with similar iops and fops configure checks. - POSIX_ACL_XATTR_{DEFAULT|ACCESS} were removed forcing callers to move to their replacements, XATTR_NAME_POSIX_ACL_{DEFAULT|ACCESS}. Compatibility wrapper were added for old kernels. - ZPL_XATTR_LIST_WRAPPER added which behaves the same as the existing ZPL_XATTR_{GET|SET} WRAPPERs. Only the inode is guaranteed to be a valid pointer, passing NULL for the 'list' and 'name' variables is allowed and must be checked for. All .list functions were updated to use the wrapper to aid readability. - zpl_xattr_filldir() updated to use the .list function for its permission check which is consistent with the updated Linux 4.5 interface. If a .list function is registered it should return 0 to indicate a name should be skipped, if there is no registered function the name will be added. - Additional documentation from xattr(7) describing the correct behavior for each namespace was added before the relevant handlers. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Tim Chase <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Issue #4228
* Use uio for zvol_{read,write}Chunwei Chen2015-12-151-0/+2
| | | | | | | | | Since uio now supports bvec, we can convert bio into uio and reuse dmu_{read,write}_uio. This way, we can remove some duplicate code. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4078
* Linux 4.4 compat: xattr operations takes xattr_handlerChunwei Chen2015-12-011-0/+28
| | | | | | | | | | The xattr_hander->{list,get,set} were changed to take a xattr_handler, and handler_flags argument was removed and should be accessed by handler->flags. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4021
* Linux 4.3 compat: bio_end_io_t / BIO_UPTODATELukas Wunner2015-09-251-12/+9
| | | | | | | | | | | | | | | | | | Commit torvalds/linux@4246a0b63bd8f56a1469b12eafeb875b1041a451 ("block: add a bi_error field to struct bio") dropped the error argument from bio_endio in favor of newly introduced bio->bi_error. This also replaces bio->bi_flags value BIO_UPTODATE. bio_endio was a 3 argument function until Linux 2.6.24, which made it a 2 argument function, and now the prototype has changed yet again to a 1 argument function. Support for pre 2.6.24 kernels was already dropped with 37f9dac592bf ("zvol processing should use struct bio") which assumed the 2 argument version in zvol_request(). Remaining code to support the 3 argument version is hereby removed. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Lukas Wunner <[email protected]> Issue #3799
* Reintroduce IO accounting on zvols on Linux 3.19+Richard Yao2015-09-091-0/+5
| | | | | | | | | | | | | | zfsonlinux/zfs@e20cd6f7a8922709b1aa2ecefd783390102d79e0 caused us to lose IO accounting on zvols. When I originally wrote that last year, the symbols we needed to maintain IO accounting were GPL exported, but torvalds/linux@394ffa503bc40e32d7f54a9b817264e81ce131b4 provided suitable symbols for restoring this functionality 4 months later. We can call them to restore the IO accounting on Linux 3.19 and later as well as any older kernels where that patch is backported. Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3741
* Remove blk_rq_bytes()/blk_rq_sectors autotools checksRichard Yao2015-09-041-23/+0
| | | | Signed-off-by: Richard Yao <[email protected]>
* Remove blk_rq_pos() autotools checkRichard Yao2015-09-041-8/+0
| | | | Signed-off-by: Richard Yao <[email protected]>
* Remove blk_fetch_request() autotools checkRichard Yao2015-09-041-14/+0
| | | | Signed-off-by: Richard Yao <[email protected]>
* Remove blk_requeue_request() autotools checkRichard Yao2015-09-041-8/+0
| | | | Signed-off-by: Richard Yao <[email protected]>
* Remove blk_end_request() autotools check.Richard Yao2015-09-041-74/+0
| | | | Signed-off-by: Richard Yao <[email protected]>
* Remove rq_is_sync() autotools checkRichard Yao2015-09-041-8/+0
| | | | Signed-off-by: Richard Yao <[email protected]>
* Remove rq_for_each_segment() autotools checkRichard Yao2015-09-041-42/+0
| | | | Signed-off-by: Richard Yao <[email protected]>