author | Don Brady <[email protected]> | 2019-06-22 16:51:46 -0700 |
---|---|---|
committer | Brian Behlendorf <[email protected]> | 2019-06-22 16:51:46 -0700 |
commit | 186898bbb580a830c02d994e961d717f7cf5dcca (patch) | |
tree | 3af5af5af4d7bed1bafb671c86f3876f01e0dc57 /module/spl | |
parent | cb9e5b7e84654a8c7dba0f9a0d1227f3c8fa1012 (diff) |
OpenZFS 9425 - channel programs can be interrupted
Problem Statement
=================
ZFS channel programs currently require a timeout so that hung or
long-running scripts return a timeout error instead of wedging ZFS. The
limit can be set as high as 100 million Lua instructions. Even with a limit
in place, a system administrator or support engineer should be able to
cancel a script that is taking too long.
Proposed Solution
=================
Make it possible to abort a channel program by sending an interrupt signal. In
the underlying txg_wait_sync function, switch the cv_wait to a cv_wait_sig to
catch the signal. Once a signal is encountered, the dsl_sync_task function can
install a Lua hook that will get called before the Lua interpreter executes a
new line of code. The dsl_sync_task can resume with a standard txg_wait_sync
call and wait for the txg to complete. Meanwhile, the hook will abort the
script and indicate that the channel program was canceled. The kernel returns
EINTR to indicate that the channel program run was canceled.
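The cancellation flow described above can be illustrated with a small userspace
sketch. It is not the ZFS or Lua code: script_t, script_run(), cancel_hook(),
and install_cancel_handler() are hypothetical names, the "instructions" are
just a counter, and ETIMEDOUT stands in for whatever timeout error the real
code returns. As a simplification, the hook here is consulted on every step,
whereas the commit installs the hook only after a signal has been seen.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical stand-in for interpreter state; not the ZFS or Lua API. */
typedef struct script {
	long instr_limit;	/* instruction budget (the existing timeout) */
	long instr_run;		/* instructions executed so far */
} script_t;

/* Set by the signal handler, polled by the hook. */
static volatile sig_atomic_t cancel_requested = 0;

static void
on_interrupt(int signo)
{
	(void) signo;
	cancel_requested = 1;
}

/* Route SIGINT to the cancel flag (assumed choice of signal). */
void
install_cancel_handler(void)
{
	(void) signal(SIGINT, on_interrupt);
}

/* Hook checked before each simulated instruction; nonzero means abort. */
static int
cancel_hook(void)
{
	return (cancel_requested != 0);
}

/*
 * Execute `work` simulated instructions.
 * Returns 0 on completion, ETIMEDOUT when the instruction budget runs
 * out, and EINTR when a cancel request is observed.
 */
int
script_run(script_t *sp, long work)
{
	assert(sp != NULL);

	while (sp->instr_run < work) {
		if (cancel_hook())
			return (EINTR);
		if (sp->instr_run >= sp->instr_limit)
			return (ETIMEDOUT);
		sp->instr_run++;	/* "execute" one instruction */
	}
	return (0);
}
```

A caller would invoke install_cancel_handler() once and then treat an EINTR
result from script_run() as "canceled by the operator", distinct from the
timeout error produced by the instruction limit.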
Porting notes: Added missing return value from cv_wait_sig()
Authored by: Don Brady <[email protected]>
Reviewed by: Sebastien Roy <[email protected]>
Reviewed by: Serapheim Dimitropoulos <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: Sara Hartse <[email protected]>
Reviewed by: Brian Behlendorf <[email protected]>
Approved by: Robert Mustacchi <[email protected]>
Ported-by: Don Brady <[email protected]>
Signed-off-by: Don Brady <[email protected]>
OpenZFS-issue: https://www.illumos.org/issues/9425
OpenZFS-commit: https://github.com/illumos/illumos-gate/commit/d0cb1fb926
Closes #8904
Diffstat (limited to 'module/spl')
-rw-r--r-- | module/spl/spl-condvar.c | 19 |
1 files changed, 18 insertions, 1 deletions
diff --git a/module/spl/spl-condvar.c b/module/spl/spl-condvar.c
index a7a9d1db9..19c575f77 100644
--- a/module/spl/spl-condvar.c
+++ b/module/spl/spl-condvar.c
@@ -29,6 +29,12 @@
 #include <linux/hrtimer.h>
 #include <linux/compiler_compat.h>
 
+#include <linux/sched.h>
+
+#ifdef HAVE_SCHED_SIGNAL_HEADER
+#include <linux/sched/signal.h>
+#endif
+
 void
 __cv_init(kcondvar_t *cvp, char *name, kcv_type_t type, void *arg)
 {
@@ -144,10 +150,21 @@ __cv_wait_io(kcondvar_t *cvp, kmutex_t *mp)
 }
 EXPORT_SYMBOL(__cv_wait_io);
 
-void
+int
+__cv_wait_io_sig(kcondvar_t *cvp, kmutex_t *mp)
+{
+	cv_wait_common(cvp, mp, TASK_INTERRUPTIBLE, 1);
+
+	return (signal_pending(current) ? 0 : 1);
+}
+EXPORT_SYMBOL(__cv_wait_io_sig);
+
+int
 __cv_wait_sig(kcondvar_t *cvp, kmutex_t *mp)
 {
 	cv_wait_common(cvp, mp, TASK_INTERRUPTIBLE, 0);
+
+	return (signal_pending(current) ? 0 : 1);
 }
 EXPORT_SYMBOL(__cv_wait_sig);
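The return convention in this diff is easy to misread: the _sig variants
return 0 when a signal is pending and 1 otherwise. The userspace sketch below
shows how a caller might act on that value. fake_cv_t, fake_cv_wait_sig(), and
wait_interruptible() are hypothetical names invented for illustration; the
stub does not block and only simulates wakeups with a counter, so it says
nothing about how the real callers in ZFS are written.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical simulated condition variable; not the SPL kcondvar_t. */
typedef struct fake_cv {
	int wakeups_until_done;	/* wakeups needed before the condition holds */
	int signal_at;		/* wakeup number that carries a signal, or -1 */
	int wakeups;		/* wakeups delivered so far */
} fake_cv_t;

/*
 * Mirrors the convention in the diff: returns 0 if a signal is pending
 * after the wakeup, 1 otherwise. This stub never actually sleeps.
 */
static int
fake_cv_wait_sig(fake_cv_t *cv)
{
	cv->wakeups++;
	return (cv->wakeups == cv->signal_at ? 0 : 1);
}

/*
 * Wait until the simulated condition holds.
 * Returns 0 when the condition is met and EINTR when a signal cut the
 * wait short.
 */
int
wait_interruptible(fake_cv_t *cv)
{
	assert(cv != NULL);

	while (cv->wakeups < cv->wakeups_until_done) {
		if (fake_cv_wait_sig(cv) == 0)
			return (EINTR);	/* signal pending: stop waiting */
	}
	return (0);
}
```

Testing the result against 0 rather than for a nonzero value is the point of
the sketch: a caller that treats the value like a conventional error code
would invert the meaning.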