path: root/module
Commit message | Author | Age | Files | Lines
* OpenZFS 8484 - Implement aggregate sum and use for arc counters | Paul Dagnelie | 2018-06-06 | 7 | -85/+483
|
| In pursuit of improving performance on multi-core systems, we should implement fanned out counters and use them to improve the performance of some of the arc statistics. These stats are updated extremely frequently, and can consume a significant amount of CPU time.
|
| Authored by: Paul Dagnelie
| Reviewed by: Pavel Zakharov
| Reviewed by: Matthew Ahrens
| Approved by: Dan McDonald
| Reviewed-by: Brian Behlendorf
| Ported-by: Paul Dagnelie
| OpenZFS-issue: https://www.illumos.org/issues/8484
| OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7028a8b92b7
| Issue #3752
| Closes #7462
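The idea of a fanned out counter can be illustrated with a small user-space sketch. This is not the aggsum implementation from the commit; the names `fanout_counter_t`, `fanout_add` and `fanout_value`, the bucket count, and the 64-byte padding are all assumptions made for illustration. Writers update one of several padded buckets, and readers pay the cost of summing them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define	FANOUT_BUCKETS	8	/* hypothetical; e.g. one bucket per CPU */

/* Each bucket is padded to its own cache line to avoid false sharing. */
typedef struct fanout_bucket {
	_Alignas(64) _Atomic int64_t fb_value;
} fanout_bucket_t;

typedef struct fanout_counter {
	fanout_bucket_t fc_bucket[FANOUT_BUCKETS];
} fanout_counter_t;

static void
fanout_init(fanout_counter_t *fc)
{
	assert(fc != NULL);
	for (int i = 0; i < FANOUT_BUCKETS; i++)
		atomic_init(&fc->fc_bucket[i].fb_value, 0);
}

/* Hot path: touch only the bucket selected by the caller's CPU index. */
static void
fanout_add(fanout_counter_t *fc, unsigned cpu, int64_t delta)
{
	atomic_fetch_add_explicit(&fc->fc_bucket[cpu % FANOUT_BUCKETS].fb_value,
	    delta, memory_order_relaxed);
}

/* Slow path: sum every bucket; the result is only approximate under load. */
static int64_t
fanout_value(fanout_counter_t *fc)
{
	int64_t sum = 0;
	for (int i = 0; i < FANOUT_BUCKETS; i++)
		sum += atomic_load_explicit(&fc->fc_bucket[i].fb_value,
		    memory_order_relaxed);
	return (sum);
}
```

The trade-off is that updates become cheap and mostly contention free, while reads become more expensive, which suits statistics that are written far more often than they are read.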
* Add pool state /proc entry, "SUSPENDED" pools | Tony Hutter | 2018-06-06 | 3 | -2/+104
|
| 1. Add a proc entry to display the pool's state:
|
|        $ cat /proc/spl/kstat/zfs/tank/state
|        ONLINE
|
|    This is done without using the spa config locks, so it will never hang.
|
| 2. Fix 'zpool status' and 'zpool list -o health' output to print "SUSPENDED" instead of "ONLINE" for suspended pools.
|
| Reviewed-by: Olaf Faaland
| Reviewed-by: Brian Behlendorf
| Reviewed by: Richard Elling
| Signed-off-by: Tony Hutter
| Closes #7331
| Closes #7563
* OpenZFS 9464 - txg_kick() fails to see that we are quiescing | Serapheim Dimitropoulos | 2018-06-04 | 2 | -7/+37
|
| txg_kick() fails to see that we are quiescing, forcing transactions to their next stages without letting them accumulate changes.
|
| Creating a fragmented pool in a DCenter VM and continuously writing to it with multiple instances of randwritecomp, we get the following output from txg.d:
|
|     0ms  311MB in 4114ms (95% p1)   75MB/s  544MB (76%)   336us  153ms  0ms
|     0ms    8MB in   51ms ( 0% p1)  163MB/s  474MB (66%)   129us   34ms  0ms
|     0ms  366MB in 4454ms (93% p1)   82MB/s  572MB (79%)   498us   20ms  0ms
|     0ms  406MB in 5212ms (95% p1)   77MB/s  591MB (82%)   661us   37ms  0ms
|     0ms  340MB in 5110ms (94% p1)   66MB/s  622MB (86%)  1048us   41ms  1ms
|     0ms    3MB in   61ms ( 0% p1)   51MB/s  419MB (58%)    33us    0ms  0ms
|     0ms  361MB in 3555ms (88% p1)  101MB/s  542MB (75%)   335us   40ms  0ms
|     0ms  356MB in 4592ms (92% p1)   77MB/s  561MB (78%)   430us   89ms  1ms
|     0ms   11MB in  129ms (13% p1)   90MB/s  507MB (70%)   222us   15ms  0ms
|     0ms  281MB in 2520ms (89% p1)  111MB/s  542MB (75%)   334us   42ms  0ms
|     0ms  383MB in 3666ms (91% p1)  104MB/s  557MB (77%)   411us  133ms  0ms
|     0ms  404MB in 5757ms (94% p1)   70MB/s  635MB (88%)  1274us  123ms  2ms
|     4ms  367MB in 4172ms (89% p1)   88MB/s  556MB (77%)   401us   51ms  0ms
|     0ms   42MB in  470ms (44% p1)   90MB/s  557MB (77%)   412us   43ms  0ms
|     0ms  261MB in 2273ms (88% p1)  114MB/s  556MB (77%)   407us   27ms  0ms
|     0ms  394MB in 3646ms (85% p1)  108MB/s  552MB (77%)   393us  304ms  0ms
|     0ms  275MB in 2416ms (89% p1)  113MB/s  510MB (71%)   200us   53ms  0ms
|     0ms    9MB in   53ms ( 0% p1)  169MB/s  483MB (67%)   140us  100ms  1ms
|
| The TXGs that are getting synced and don't have lots of changes are pushed by txg_kick(), which basically forces the current open txg to get to the quiesced state:
|
|     if (tx->tx_syncing_txg == 0 &&
|         tx->tx_quiesce_txg_waiting <= tx->tx_open_txg &&
|         tx->tx_sync_txg_waiting <= tx->tx_synced_txg &&
|         tx->tx_quiesced_txg <= tx->tx_synced_txg) {
|             tx->tx_quiesce_txg_waiting = tx->tx_open_txg + 1;
|             cv_broadcast(&tx->tx_quiesce_more_cv);
|     }
|
| The problem is that the above code doesn't check whether we are currently quiescing anything (only whether a quiesce or a sync has been requested, etc.), so the following scenario can happen:
|
| 1] We have an open txg A that had enough dirty data (more than zfs_dirty_data_sync) and it was pushed to the quiesced state, and opened a new txg B. No txg is currently being synced.
|
| 2] Immediately after the opening of B, txg_kick() was run by some other write (and because of A's dirty data) and saw that we are not currently syncing any txg and no one has requested quiescing, so it requests one by bumping tx_quiesce_txg_waiting and broadcasts to the quiesce thread.
|
| 3] The quiesce thread just passed txg A to be synced and sees that a quiescing request has been sent to it, so it immediately grabs B without letting it gather enough data, putting it in a quiesced state and opening a new txg C.
|
| In this scenario, txg B is an example of how the entries of interest show up in the txg.d output.
|
| Ideally we would like txg_kick() to get triggered only when we are sure that we are not syncing AND not quiescing any txg. This way we can kick an open TXG to the quiescing state when we are sure that there is nothing going on and we would benefit from the different states running concurrently.
|
| Authored by: Serapheim Dimitropoulos
| Reviewed by: Matt Ahrens
| Reviewed by: Brad Lewis
| Reviewed by: Andriy Gapon
| Approved by: Dan McDonald
| Ported-by: Brian Behlendorf
| OpenZFS-issue: https://illumos.org/issues/9464
| OpenZFS-commit: https://github.com/openzfs/openzfs/commit/1cd7635b
| Closes #7587
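The stricter condition described in the last paragraph of this message can be modelled in user space. The sketch below is not the patch itself: `tx_state_model_t`, `txg_kick_model` and in particular the `tx_quiescing_txg` field (assumed to be 0 whenever no txg is being quiesced) are hypothetical names chosen for illustration, and the `cv_broadcast()` call is replaced by a boolean return value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the txg state; field names mirror the message. */
typedef struct tx_state_model {
	uint64_t tx_open_txg;
	uint64_t tx_quiescing_txg;	/* assumption: 0 when nothing is quiescing */
	uint64_t tx_quiesced_txg;
	uint64_t tx_syncing_txg;	/* 0 when nothing is syncing */
	uint64_t tx_synced_txg;
	uint64_t tx_sync_txg_waiting;
	uint64_t tx_quiesce_txg_waiting;
} tx_state_model_t;

/*
 * Request a quiesce of the open txg only when the pipeline is idle,
 * i.e. nothing is syncing AND nothing is quiescing.
 * Returns true if a quiesce was requested.
 */
static bool
txg_kick_model(tx_state_model_t *tx)
{
	assert(tx != NULL);
	if (tx->tx_syncing_txg == 0 &&
	    tx->tx_quiescing_txg == 0 &&	/* the additional check */
	    tx->tx_quiesce_txg_waiting <= tx->tx_open_txg &&
	    tx->tx_sync_txg_waiting <= tx->tx_synced_txg &&
	    tx->tx_quiesced_txg <= tx->tx_synced_txg) {
		tx->tx_quiesce_txg_waiting = tx->tx_open_txg + 1;
		return (true);	/* the kernel code would cv_broadcast() here */
	}
	return (false);
}
```

With this check, step 2 of the scenario no longer bumps tx_quiesce_txg_waiting while txg A is still being quiesced, so txg B is left open long enough to accumulate changes.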
* OpenZFS 9235 - rename zpool_rewind_policy_t to zpool_load_policy_t | Pavel Zakharov | 2018-06-04 | 2 | -49/+47
|
| We want to be able to pass various settings during import/open of a pool, which are not only related to rewind. Instead of adding a new policy and duplicating a bunch of code, we should just rename rewind_policy to a more generic term like load_policy.
|
| For instance, we'd like to set spa->spa_import_flags from the nvlist, rather than from a flags parameter passed to spa_import, as in some cases we want those flags not only for the import case, but also for the open case. One such flag could be ZFS_IMPORT_MISSING_LOG (as used in zdb), which would allow zfs to open a pool when logs are missing.
|
| Authored by: Pavel Zakharov
| Reviewed by: Matt Ahrens
| Reviewed by: George Wilson
| Approved by: Robert Mustacchi
| Ported-by: Brian Behlendorf
| OpenZFS-issue: https://illumos.org/issues/9235
| OpenZFS-commit: https://github.com/openzfs/openzfs/commit/d2b1e44
| Closes #7532
* OpenZFS 9329 - panic in zap_leaf_lookup() due to concurrent zapification | Matthew Ahrens | 2018-05-31 | 2 | -14/+36
|
| For the null pointer issue shown below, the solution is to initialize the contents of the object before changing its type, so that concurrent accessors will see it as non-zapified until it is ready for access via the ZAP.
|
|     BAD TRAP: type=e (#pf Page fault) rp=ffffff00ff520440 addr=20 occurred in module "zfs" due to a NULL pointer dereference
|
|     ffffff00ff520320 unix:die+df ()
|     ffffff00ff520430 unix:trap+dc0 ()
|     ffffff00ff520440 unix:cmntrap+e6 ()
|     ffffff00ff520590 zfs:zap_leaf_lookup+46 ()
|     ffffff00ff520640 zfs:fzap_lookup+a9 ()
|     ffffff00ff5206e0 zfs:zap_lookup_norm+111 ()
|     ffffff00ff520730 zfs:zap_contains+42 ()
|     ffffff00ff520760 zfs:dsl_dataset_has_resume_receive_state+47 ()
|     ffffff00ff520900 zfs:get_receive_resume_stats+3e ()
|     ffffff00ff520a90 zfs:dsl_dataset_stats+262 ()
|     ffffff00ff520ac0 zfs:dmu_objset_stats+2b ()
|     ffffff00ff520b10 zfs:zfs_ioc_objset_stats_impl+64 ()
|     ffffff00ff520b60 zfs:zfs_ioc_objset_stats+33 ()
|     ffffff00ff520bd0 zfs:zfs_ioc_dataset_list_next+140 ()
|     ffffff00ff520c80 zfs:zfsdev_ioctl+4d7 ()
|     ffffff00ff520cc0 genunix:cdev_ioctl+39 ()
|     ffffff00ff520d10 specfs:spec_ioctl+60 ()
|     ffffff00ff520da0 genunix:fop_ioctl+55 ()
|     ffffff00ff520ec0 genunix:ioctl+9b ()
|     ffffff00ff520f10 unix:brand_sys_sysenter+1c9 ()
|
| Porting Notes:
| * DMU_OT_BYTESWAP conditional in zap_lockdir_impl() kept.
|
| Authored by: Matthew Ahrens
| Reviewed by: Pavel Zakharov
| Reviewed by: Brad Lewis
| Reviewed-by: George Melikov
| Approved by: Dan McDonald
| Ported-by: Brian Behlendorf
| OpenZFS-issue: https://illumos.org/issues/9329
| OpenZFS-commit: https://github.com/openzfs/openzfs/commit/e8e0f97
| Closes #7578
* OpenZFS 9328 - zap code can take advantage of c99 | Matthew Ahrens | 2018-05-31 | 3 | -328/+224
|
| The ZAP code was written before we allowed c99 in the Solaris kernel. We should change it to take advantage of being able to declare variables where they are first used. This reduces variable scope and means less scrolling to find the type of variables.
|
| Authored by: Matthew Ahrens
| Reviewed by: Steve Gonczi
| Reviewed by: George Wilson
| Reviewed-by: George Melikov
| Approved by: Dan McDonald
| Ported-by: Brian Behlendorf
| OpenZFS-issue: https://illumos.org/issues/9328
| OpenZFS-commit: https://github.com/openzfs/openzfs/commit/76ead05
| Closes #7578
* zpool reopen should detect expanded devices | Sara Hartse | 2018-05-31 | 2 | -13/+36
|
| Update bdev_capacity to have wholedisk vdevs query the size of the underlying block device (correcting for the size of the EFI partition and partition alignment) and therefore detect expanded space.
|
| Correct vdev_get_stats_ex so that the expandsize is aligned to metaslab size and new space is only reported if it is large enough for a new metaslab.
|
| Reviewed by: Don Brady
| Reviewed-by: Brian Behlendorf
| Reviewed by: George Wilson
| Reviewed-by: Matthew Ahrens
| Reviewed by: John Wren Kennedy
| Signed-off-by: sara hartse
| External-issue: LX-165
| Closes #7546
| Issue #7582
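The alignment rule in the second paragraph amounts to rounding the extra space down to a whole number of metaslabs. A minimal sketch of that arithmetic follows; `expandsize_model` and its parameters are hypothetical names, and the real vdev_get_stats_ex code may account for additional reserved space that this model ignores.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Report expandable space rounded down to a multiple of the metaslab size
 * (1 << ms_shift). Space smaller than one metaslab is reported as 0.
 */
static uint64_t
expandsize_model(uint64_t max_asize, uint64_t asize, unsigned ms_shift)
{
	assert(ms_shift < 64);
	assert(max_asize >= asize);

	uint64_t ms_size = 1ULL << ms_shift;
	uint64_t extra = max_asize - asize;

	/* ms_size is a power of two, so masking the low bits rounds down. */
	return (extra & ~(ms_size - 1));
}
```

For example, with 512 MiB metaslabs (ms_shift of 29), 300 MiB of new space is reported as 0, while 1400000000 bytes of new space is reported as 1 GiB.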
* Fix zio->io_priority failed (7 < 6) assert | Tony Hutter | 2018-05-29 | 1 | -0/+9
|
| This fixes an assert in vdev_queue_change_io_priority():
|
|     VERIFY3(zio->io_priority < ZIO_PRIORITY_NUM_QUEUEABLE) failed (7 < 6)
|     PANIC at vdev_queue.c:832:vdev_queue_change_io_priority()
|
| Reviewed-by: Tom Caputi
| Reviewed-by: George Melikov
| Reviewed-by: Brian Behlendorf
| Signed-off-by: Tony Hutter
| Closes #7566
| Closes #7542
* Update build system and packaging | Brian Behlendorf | 2018-05-29 | 119 | -309/+193
|
| Minimal changes required to integrate the SPL sources into the ZFS repository build infrastructure and packaging.
|
| Build system and packaging:
| * Renamed SPL_* autoconf m4 macros to ZFS_*.
| * Removed redundant SPL_* autoconf m4 macros.
| * Updated the RPM spec files to remove SPL package dependency.
| * The zfs package obsoletes the spl package, and the zfs-kmod package obsoletes the spl-kmod package.
| * The zfs-kmod-devel* packages were updated to add compatibility symlinks under /usr/src/spl-x.y.z until all dependent packages can be updated. They will be removed in a future release.
| * Updated copy-builtin script for in-kernel builds.
| * Updated DKMS package to include the spl.ko.
| * Updated stale AUTHORS file to include all contributors.
| * Updated stale COPYRIGHT and included the SPL as an exception.
| * Renamed README.markdown to README.md
| * Renamed OPENSOLARIS.LICENSE to LICENSE.
| * Renamed DISCLAIMER to NOTICE.
|
| Required code changes:
| * Removed redundant HAVE_SPL macro.
| * Removed _BOOT from nvpairs since it doesn't apply for Linux.
| * Initial header cleanup (removal of empty headers, refactoring).
| * Remove SPL repository clone/build from zimport.sh.
| * Use of DEFINE_RATELIMIT_STATE and DEFINE_SPINLOCK removed due to build issues when forcing C99 compilation.
| * Replaced legacy ACCESS_ONCE with READ_ONCE.
| * Include needed headers for `current` and `EXPORT_SYMBOL`.
|
| Reviewed-by: Tony Hutter
| Reviewed-by: Olaf Faaland
| Reviewed-by: Matthew Ahrens
| Reviewed-by: Pavel Zakharov
| Signed-off-by: Brian Behlendorf
| TEST_ZIMPORT_SKIP="yes"
| Closes #7556
* Merge branch 'zfsonlinux/merge-spl' | Brian Behlendorf | 2018-05-29 | 21 | -0/+9806
|\
| |
| | Merge a minimal version of the zfsonlinux/spl repository into the zfsonlinux/zfs repository. Care was taken to prevent file conflicts when merging and to preserve the spl repository history.
| |
| | The spl kernel module remains under the GPLv2 license as documented by the additional THIRDPARTYLICENSE.gplv2 file.
| |
| | Signed-off-by: Brian Behlendorf
| * Prepare SPL repo to merge with ZFS repo | Brian Behlendorf | 2018-05-29 | 32 | -8637/+430
| |
| | This commit removes everything from the repository except the core SPL implementation for Linux. Those files which remain have been moved to non-conflicting locations to facilitate the merge.
| |
| | The README.md and associated files have been updated accordingly.
| |
| | Signed-off-by: Brian Behlendorf
| * Fix more cstyle warnings | Brian Behlendorf | 2018-02-24 | 9 | -34/+51
| |
| | This patch contains no functional changes. It is solely intended to resolve cstyle warnings in order to facilitate moving the spl source code into the zfs repository.
| |
| | Reviewed-by: Giuseppe Di Natale
| | Reviewed by: George Melikov
| | Signed-off-by: Brian Behlendorf
| | Closes #687
| * Fix function name typos | Tomohiro Kusumi | 2018-02-21 | 1 | -3/+3
| |
| | vn_init() and vn_fini() had been renamed by 12ff95ff in 2011.
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Tomohiro Kusumi
| | Closes #686
| * Staticize kstat_default_update() | Tomohiro Kusumi | 2018-02-21 | 1 | -1/+1
| |
| | This is only used via ->ks_update of `kstat_t *`. It is not exported, and no header declares its prototype.
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Tomohiro Kusumi
| | Closes #686
| * Fix cstyle warnings | Brian Behlendorf | 2018-02-07 | 17 | -882/+873
| |
| | This patch contains no functional changes. It is solely intended to resolve cstyle warnings in order to facilitate moving the spl source code into the zfs repository.
| |
| | Signed-off-by: Brian Behlendorf
| | Closes #681
| * Add cv_timedwait_io() | Brian Behlendorf | 2018-01-24 | 1 | -8/+50
| |
| | Add the missing helper function cv_timedwait_io(). It should be used when waiting on IO with a specified timeout.
| |
| | Reviewed-by: Tim Chase
| | Signed-off-by: Brian Behlendorf
| | Closes #674
| * Linux 4.15 compat: timer updates | Tony Hutter | 2017-12-21 | 1 | -2/+25
| |
| | Use timer_setup() macro and new timeout function definition.
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Tony Hutter
| | Closes #670
| | Closes #671
| * Linux 4.14 compat: vfs_read & vfs_write | Brian Behlendorf | 2017-11-15 | 1 | -17/+4
| |
| | The kernel_read & kernel_write functions have always wrapped the vfs_read & vfs_write functions respectively. However, they could not be used by vn_rdwr() since the offset wasn't passed as a pointer. This prevented us from being able to properly update the file offset.
| |
| | Linux 4.14 unexported vfs_read & vfs_write but also changed the signature of kernel_read & kernel_write to provide the needed functionality. Use these updated functions when available.
| |
| | Reviewed-by: Pritam Baral
| | Signed-off-by: Brian Behlendorf
| | Closes #656
| | Closes #667
| * Remove all spin_is_locked calls | James Cowgill | 2017-10-30 | 4 | -20/+0
| |
| | On systems with CONFIG_SMP turned off, spin_is_locked always returns false, causing these assertions to fail. Remove them as suggested in zfsonlinux/zfs#6558.
| |
| | Reviewed-by: George Melikov
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: James Cowgill
| | Closes #665
| * Remove vn_rename and vn_remove | Brian Behlendorf | 2017-10-27 | 2 | -313/+0
| |
| | Both vn_rename and vn_remove have been historically problematic to implement reliably. Rather than fixing them yet again, they are being removed.
| |
| | Reviewed-by: Arkadiusz Bubala
| | Signed-off-by: Brian Behlendorf
| | Closes #648
| | Closes #661
| * Make file headers conform to ZFS style standard | Olaf Faaland | 2017-10-09 | 31 | -62/+62
| |
| | No semantic changes.
| |
| | Change /************\ and \************/ to /* and */
| |
| | Signed-off-by: Olaf Faaland
| * Pool io stat shows wlentime instead of rlentime | Richard Elling | 2017-09-25 | 1 | -1/+1
| |
| | Reviewed-by: Tim Chase
| | Reviewed-by: George Melikov
| | Reviewed-by: Giuseppe Di Natale
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Richard Elling
| | Closes #652
| | Closes #651
| * Allow longer SPA names in stats | gaurkuma | 2017-08-11 | 1 | -4/+9
| |
| | Reviewed-by: Brian Behlendorf
| | Reviewed-by: Giuseppe Di Natale
| | Signed-off-by: gaurkuma
| | Closes #641
| * make module/spl/spl-kmem.c non-executable (again) | Fabian-Gruenbichler | 2017-08-10 | 1 | -0/+0
| |
| | This was probably accidentally committed in aeb9baa618beea1458ab3ab22cbc0f39213da6cf
| | Fix: handle NULL case in spl_kmem_free_track()
| |
| | Reviewed-by: George Melikov
| | Reviewed-by: Giuseppe Di Natale
| | Reviewed-by: Brian Behlendorf
| | Reviewed-by: Gvozden Neskovic
| | Signed-off-by: Fabian Grünbichler
| | Closes #644
| * Add assert under lock to detect cases of dispach of a preallocated | Boris Protopopov | 2017-08-08 | 1 | -0/+6
| |
| | taskq work item to more than one queue concurrently. Also, please see discussion in zfsonlinux/zfs#3840.
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Boris Protopopov
| | Closes #609
| * Fix use-after-free in taskq_seq_show_impl | Chunwei Chen | 2017-08-04 | 1 | -26/+29
| |
| | taskq_seq_show_impl walks the tq_active_list to show the tqent_func and tqent_arg. However, for taskq_dispatch_ent it's very likely that the task entry will be freed during the function call, which causes a use-after-free bug.
| |
| | To fix this, we duplicate the task entry to an on-stack struct, and assign it instead to tqt_task. This way, the tq_lock alone will guarantee its safety.
| |
| | Reviewed-by: Tim Chase
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Chunwei Chen
| | Closes #638
| | Closes #640
| * Add __divmoddi4 and __udivmoddi4 for 32-bit arch | Chunwei Chen | 2017-08-03 | 1 | -0/+43
| |
| | gcc-7 seems to use __udivmoddi4 for 64-bit division on 32-bit arch. This patch implements them so we don't get an undefined reference error.
| |
| | Reviewed-by: Brian Behlendorf
| | Reviewed-by: loli10K
| | Signed-off-by: Chunwei Chen
| | Closes zfsonlinux/zfs#6417
| | Closes #636
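For readers unfamiliar with these helpers: they return a 64-bit quotient and also store the remainder through a pointer. The sketch below shows one portable way to compute both using shift-and-subtract long division. It is not the SPL implementation; the `*_sketch` names are hypothetical (chosen to avoid clashing with the compiler's own runtime symbols), and the signed variant relies on the usual two's-complement conversion behaviour of gcc and clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned 64-bit divide: returns n / d and stores n % d in *rem. */
static uint64_t
udivmod64_sketch(uint64_t n, uint64_t d, uint64_t *rem)
{
	uint64_t q = 0, r = 0;

	assert(d != 0);
	/* Bring down one bit of n at a time, most significant first. */
	for (int i = 63; i >= 0; i--) {
		r = (r << 1) | ((n >> i) & 1);
		if (r >= d) {
			r -= d;
			q |= 1ULL << i;
		}
	}
	if (rem != NULL)
		*rem = r;
	return (q);
}

/*
 * Signed variant with C semantics: the quotient truncates toward zero
 * and the remainder takes the sign of the dividend.
 */
static int64_t
divmod64_sketch(int64_t n, int64_t d, int64_t *rem)
{
	/* Negate in unsigned arithmetic so INT64_MIN does not overflow. */
	uint64_t un = n < 0 ? 0 - (uint64_t)n : (uint64_t)n;
	uint64_t ud = d < 0 ? 0 - (uint64_t)d : (uint64_t)d;
	uint64_t ur;
	uint64_t uq = udivmod64_sketch(un, ud, &ur);

	if (rem != NULL)
		*rem = (int64_t)(n < 0 ? 0 - ur : ur);
	return ((int64_t)((n < 0) != (d < 0) ? 0 - uq : uq));
}
```

A bit-at-a-time loop is slow compared with hardware division, but it only needs shifts, compares and subtractions, which is why routines of this shape are a common fallback on 32-bit targets.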
| * Module parameter to enable spl_panic() to panic the kernel | Oleg Drokin | 2017-07-25 | 1 | -0/+14
| |
| | In unattended operations it's often more useful to have a node panic and reboot when it encounters problems, as opposed to sitting there indefinitely waiting for somebody to discover it.
| |
| | This implements an spl_panic_crash module parameter. Set it to nonzero to cause spl_panic() to call panic().
| |
| | Reviewed-by: Brian Behlendorf
| | Reviewed-by: Giuseppe Di Natale
| | Signed-off-by: Oleg Drokin
| | Closes #634
| * Avoid WARN() from procfs on kstat collision | LOLi | 2017-07-24 | 1 | -0/+29
| |
| | When we load a ZFS pool whose spa_name equals the name of some existing kstat, we would have to create a duplicate entry, which procfs doesn't like.
| |
| | For instance, a ZFS pool named "zil" would have its kstat "txgs" (module "zfs/zil") installed under "/proc/spl/kstat/zfs/zil": unfortunately we already have a kstat named "zil" (module "zfs") installed in the same procfs location.
| |
| | Avoid this issue by skipping the duplicate entry creation in procfs.
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: loli10K
| | Closes #628
| * Linux 4.13 compat: wait queues | Brian Behlendorf | 2017-07-23 | 5 | -7/+19
| |
| | Commit torvalds/linux@ac6424b9
| | - Renamed struct wait_queue -> struct wait_queue_entry.
| |
| | Commit torvalds/linux@2055da97
| | - Renamed wait_queue_head::task_list -> wait_queue_head::head
| | - Renamed wait_queue_entry::task_list -> wait_queue_entry::entry
| |
| | Reviewed-by: Chunwei Chen
| | Signed-off-by: Brian Behlendorf
| | Closes #629
| * Don't cache the system hostid | Brian Behlendorf | 2017-07-13 | 2 | -46/+30
| |
| | Historically the SPL cached the system hostid the first time it was accessed. This was done to speed up subsequent accesses. But in practice the system hostid is rarely accessed and it's inconvenient that it doesn't promptly detect /etc/hostid configuration changes.
| |
| | Therefore, zone_get_hostid() has been updated to always refresh the system hostid reported.
| |
| | Reviewed-by: Olaf Faaland
| | Signed-off-by: Brian Behlendorf
| | Closes #626
| * Improve gitignore | Chunwei Chen | 2017-05-25 | 1 | -0/+2
| |
| | Exclude Makefile.in in module/ and fix the gitignore in cmd/
| | Also, ignore *.patch and *.orig files
| |
| | Signed-off-by: Chunwei Chen
| * Fix cv_timedwait timeout | Brian Behlendorf | 2017-05-25 | 1 | -18/+12
| |
| | Check whether the expiration time has already passed before updating cvp->cv_mutex with the provided mutex. This check only depends on local state. Doing it first ensures that cvp->cv_mutex will not be updated in the timeout case or if it's ever called with an expire_time <= now.
| |
| | Reviewed-by: Tim Chase
| | Signed-off-by: Brian Behlendorf
| | Closes #616
| * Linux 4.12 compat: PF_FSTRANS was removed | Chunwei Chen | 2017-05-09 | 1 | -6/+6
| |
| | Change SPL_FSTRANS to optionally contain PF_FSTRANS. Also, add __spl_pf_fstrans_check for the checks specific to PF_FSTRANS.
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Chunwei Chen
| | Closes #614
| * Linux 4.11 compat: remove stub for __put_task_struct | Olaf Faaland | 2017-03-20 | 1 | -16/+0
| |
| | Before kernel 2.6.29 credentials were embedded in task_structs, and zfs had cases where one thread would need to refer to the credential of another thread, forcing it to take a hold on the foreign thread's task_struct to ensure it was not freed.
| |
| | Since 2.6.29, the credential has been moved out of the task_struct into a cred_t.
| |
| | In addition, the mainline kernel originally did not export __put_task_struct() but the RHEL5 kernel did, according to zfsonlinux/spl@e811949a570. As of 2.6.39 the mainline kernel exports it.
| |
| | There is no longer zfs code that takes or releases holds on a task_struct, and so there is no longer any reference to __put_task_struct().
| |
| | This affects the linux 4.11 kernel because the prototype for __put_task_struct() is in a new include file (linux/sched/task.h) and so the config check failed to detect the exported symbol.
| |
| | Removing the unnecessary stub and corresponding config check. This works on kernels since the oldest one currently supported, 2.6.32 as shipped with Centos/RHEL.
| |
| | Reviewed-by: Chunwei Chen
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Olaf Faaland
| | Closes #608
| * Linux 4.11 compat: vfs_getattr() takes 4 args | Olaf Faaland | 2017-03-20 | 1 | -3/+10
| |
| | There are changes to vfs_getattr() in torvalds/linux@a528d35. The new interface is:
| |
| |     int vfs_getattr(const struct path *path, struct kstat *stat,
| |         u32 request_mask, unsigned int query_flags)
| |
| | The request_mask argument indicates which field(s) the caller intends to use. Fields the caller does not specify via request_mask may be set in the returned struct anyway, but their values may be approximate.
| |
| | The query_flags argument indicates whether the filesystem must update the attributes from the backing store.
| |
| | This patch uses the query_flags which result in vfs_getattr behaving the same as it did with the 2-argument version which the kernel provided before Linux 4.11.
| |
| | Members blksize and blocks are now always the same size regardless of arch. They match the size of the equivalent members in vnode_t.
| |
| | The configure checks are modified to ensure that the appropriate vfs_getattr() interface is used.
| |
| | A more complete fix, removing the ZFS dependency on vfs_getattr() entirely, is deferred as it is a much larger project.
| |
| | Reviewed-by: Chunwei Chen
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Olaf Faaland
| | Closes #608
| * Linux 4.11 compat: set_task_state() removed | Olaf Faaland | 2017-02-23 | 1 | -2/+2
| |
| | Replace uses of set_task_state(current, STATE) with set_current_state(STATE).
| |
| | In Linux 4.11, torvalds/linux@642fa44, set_task_state() is removed.
| |
| | All spl uses are of the form set_task_state(current, STATE). set_current_state(STATE) is equivalent and has been available since Linux 2.2.26.
| |
| | Furthermore, set_current_state(STATE) is already used in about 15 locations within spl. This change should have no impact other than removing an unnecessary dependency.
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Olaf Faaland
| | Closes #603
| * Use kernel slab for vn_cache and vn_file_cache | Chunwei Chen | 2017-01-31 | 1 | -2/+2
| |
| | Resolve a false positive in the kmemleak checker by shifting to the kernel slab. The false positive shows up because vn_file_cache uses KMC_KMEM, which allocates directly with __get_free_pages, and those allocations are not automatically tracked by kmemleak.
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Chunwei Chen
| | Closes #599
| * Reimplement rt_mutex_owner to fix build with DEBUG & PREEMPT_RT_FULL | clefru | 2017-01-19 | 1 | -1/+5
| |
| | rt_mutex_owner is internal to kernel/locking/rtmutex_common.h and inaccessible for SPL via the public kernel headers.
| |
| | The way of accessing the owner has been stable since at least 3.13 ([1], [2]), which is masking the lowest bit in the owner pointer in rt_mutex. We do the same.
| |
| | [1] http://lxr.free-electrons.com/source/kernel/locking/rtmutex_common.h?v=3.13#L99
| | [2] http://lxr.free-electrons.com/source/kernel/locking/rtmutex_common.h?v=4.9#L78
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Clemens Fruhwirth
| | Closes #593
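The pointer masking described above can be shown with a small user-space model. The types and names here (`task_model`, `rt_mutex_model`, `rt_mutex_owner_model`, `RT_MUTEX_HAS_WAITERS_MODEL`) are stand-ins invented for this sketch, not the kernel's definitions; the only thing taken from the message is that the lowest bit of the owner pointer is a flag that must be cleared to recover the owning task.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel's task and rt_mutex structures. */
struct task_model {
	int pid;
};

struct rt_mutex_model {
	struct task_model *owner;	/* lowest bit doubles as a flag */
};

/* Hypothetical name for the flag stored in bit 0 of the owner pointer. */
#define	RT_MUTEX_HAS_WAITERS_MODEL	((uintptr_t)1)

/* Recover the owning task by clearing the flag bit. */
static struct task_model *
rt_mutex_owner_model(const struct rt_mutex_model *lock)
{
	assert(lock != NULL);
	return ((struct task_model *)
	    ((uintptr_t)lock->owner & ~RT_MUTEX_HAS_WAITERS_MODEL));
}
```

This works because task structures are aligned to more than one byte, so bit 0 of a valid pointer is always zero and is free to carry a flag.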
| * Remove identical if statements in module/spl/spl-vnode.c | George Melikov | 2017-01-19 | 1 | -3/+0
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: George Melikov
| | Closes #594
| * Add support for recent kmem_cache_create_usercopy | Kevin Tanguy | 2017-01-17 | 1 | -2/+11
| |
| | SLAB_USERCOPY flag was used to indicate PAX not to kill copies from kernel to userland. With recent grsecurity patchset and CONFIG_GRKERNSEC_HIDESYM that enables CONFIG_PAX_USERCOPY zfs would panic.
| |
| | Handle newer API while keeping old one functional.
| |
| | Tested-by: RageLtMan <rageltman@sempervictus>
| | Reviewed-by: spendergrsec
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Kevin Tanguy
| | Closes #595
| * Update struct member intializers to C89 | RageLtMan | 2017-01-13 | 1 | -5/+5
| |
| | When building SPL within the kernel tree, C99 initializers cause build failures and need to be converted to C89 as kernel CFLAGS specify -std=gnu89. This fix was provided by @behlendorf in #595 discussion notes and manually implemented in the current master revision.
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: RageLtMan <rageltman@sempervictus>
| | Closes #597
| * Add support for rw semaphore under PREEMPT_RT_FULL | Clemens Fruhwirth | 2016-12-19 | 2 | -2/+55
| |
| | The main complication from the RT patch set is that the RW semaphore locks change such that read locks on an rwsem can be taken only by a single thread. All other threads are locked out. This single thread can take a read lock multiple times though. The underlying implementation changes to a mutex with an additional read_depth count.
| |
| | The implementation can be best understood by inspecting the RT patch. rwsem_rt.h and rt.c give the best insight into how RT rwsem works. My implementation for rwsem_tryupgrade is basically an inversion of rt_downgrade_write found in rt.c. Please see the comments in the code.
| |
| | Unfortunately, I have to drop SPLAT rwlock test4 completely as this test tries to take multiple locks from different threads, which RT rwsems do not support. Otherwise SPLAT, zconfig.sh, zpios-sanity.sh and zfs-tests.sh pass on my Debian-testing VM with the kernel linux-image-4.8.0-1-rt-amd64.
| |
| | Tested-by: kernelOfTruth
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Clemens Fruhwirth
| | Closes zfsonlinux/zfs#5491
| | Closes #589
| | Closes #308
| * Refactor some splat macro to function | Chunwei Chen | 2016-12-15 | 18 | -221/+233
| |
| | Refactor the code by making splat_test_{init,fini} and splat_subsystem_{init,fini} into functions. They have no reason to be macros, and it would be too bloated to inline every call.
| |
| | Signed-off-by: Chunwei Chen
| * Fix splat memleak | Chunwei Chen | 2016-12-15 | 1 | -0/+1
| |
| | SPLAT_TEST_FINI didn't call kfree, causing a memory leak.
| |
| | Signed-off-by: Chunwei Chen
| * Add system_delay_taskq for long delay | Chunwei Chen | 2016-12-08 | 1 | -0/+14
| |
| | Add a dedicated system_delay_taskq for long delays like spa_deadman and zpl_posix_acl_free. This will allow us to use system_taskq in the manner of dispatching multiple tasks and calling taskq_wait_outstanding.
| |
| | Reviewed by: Brian Behlendorf
| | Signed-off-by: Chunwei Chen
| | Closes #588
| * Limit number of tasks shown in taskq proc | Chunwei Chen | 2016-12-01 | 1 | -6/+13
| |
| | To prevent holding tq_lock for too long.
| |
| | Before zfsonlinux/zfs@8e71ab9, hogging delay tasks and cat /proc/spl/taskq would easily cause a lockup. While that bug has been fixed, it's probably still a good idea to do this just in case task lists grow too large.
| |
| | Reviewed-by: Tim Chase
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Chunwei Chen
| | Closes #586
| * Add TASKQID_INVALID and TASKQID_INITIAL macros | Ubuntu | 2016-11-02 | 4 | -36/+43
| |
| | Add the TASKQID_INVALID and TASKQID_INITIAL macros and update the taskq implementation and test cases to use them. This is solely for the purposes of readability and introduces no functional change.
| |
| | Signed-off-by: Brian Behlendorf
| * Fix vmem_size() | Ubuntu | 2016-11-02 | 2 | -9/+44
| |
| | Add a minimal implementation of vmem_size() which accounts for the virtual memory usage of the SPL's kmem cache. This functionality is only useful on 32-bit systems with a small virtual address space.
| |
| | The following assumptions are made:
| |
| | 1) The major SPL consumer of virtual memory is the kmem cache.
| | 2) Memory allocated with vmem_alloc() is short lived and can be ignored.
| | 3) Allow a 4MB floor as a generous pad given normal consumption.
| | 4) The spl_kmem_cache_sem only contends with cache create/destroy.
| |
| | Signed-off-by: Brian Behlendorf
| * Linux 4.9 compat: group_info changes | Chunwei Chen | 2016-10-20 | 2 | -2/+13
| |
| | In Linux 4.9, torvalds/linux@81243ea, group_info changed from a 2D array accessed via ->blocks to a 1D array accessed via ->gid. We change the spl cred functions accordingly.
| |
| | Reviewed-by: Brian Behlendorf
| | Signed-off-by: Chunwei Chen
| | Closes #581