author loli10K <[email protected]> 2019-09-14 03:09:59 +0200
committer Brian Behlendorf <[email protected]> 2019-09-13 18:09:59 -0700
commit 2a0d41889e1c7c430e708cea76e70b11e0e2b0aa
tree 24e3051afaf491e49d3256006854e72168f2731a /module
parent e60e158eff920825311c1e18b3631876eaaacb54
Scrubbing root pools may deadlock on kernels without elevator_change() (#9321)
Originally the zfs_vdev_elevator module option was added as a
convenience so the requested elevator would be automatically set on
the underlying block devices. At the time this was simple because the
kernel provided an API function which did exactly this.

This API was then removed in the Linux 4.12 kernel, which prompted us
to add compatibility code to set the elevator via a usermodehelper.

Unfortunately, changing the elevator via usermodehelper requires
reading some userland binaries, most notably modprobe(8) or sh(1),
from a zfs dataset on systems with root-on-zfs. This can deadlock the
system if used during the following call path, because if the data is
not already cached in the ARC it may need to be read directly from
disk while the spa config lock is held as a writer:

  zfs_ioc_pool_scan()
    -> spa_scan()
      -> spa_scan()
        -> vdev_reopen()
          -> vdev_elevator_switch()
            -> call_usermodehelper()

While the usermodehelper waits for sh(1), modprobe(8) is blocked in
the ZIO pipeline trying to read from disk:

  INFO: task modprobe:2650 blocked for more than 10 seconds.
        Tainted: P OE 5.2.14
  modprobe D 0 2650 206 0x00000000
  Call Trace:
   ? __schedule+0x244/0x5f0
   schedule+0x2f/0xa0
   cv_wait_common+0x156/0x290 [spl]
   ? do_wait_intr_irq+0xb0/0xb0
   spa_config_enter+0x13b/0x1e0 [zfs]
   zio_vdev_io_start+0x51d/0x590 [zfs]
   ? tsd_get_by_thread+0x3b/0x80 [spl]
   zio_nowait+0x142/0x2f0 [zfs]
   arc_read+0xb2d/0x19d0 [zfs]
   ...
   zpl_iter_read+0xfa/0x170 [zfs]
   new_sync_read+0x124/0x1b0
   vfs_read+0x91/0x140
   ksys_read+0x59/0xd0
   do_syscall_64+0x4f/0x130
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This commit changes how we use the usermodehelper functionality from
synchronous (UMH_WAIT_PROC) to asynchronous (UMH_NO_WAIT), which
prevents scrubs, and other vdev_elevator_switch() consumers, from
triggering the aforementioned issue.

Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: loli10K <[email protected]>
Issue #8664
Closes #9321
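
The difference between the two wait modes can be illustrated with a
small user-space analogy. This is only a sketch: run_helper() and the
helper_wait enum are hypothetical names made up for the example, and
fork()/waitpid() merely stand in for the kernel's usermodehelper. It is
not the code touched by this commit.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for UMH_NO_WAIT / UMH_WAIT_PROC (not kernel API). */
enum helper_wait { HELPER_NO_WAIT, HELPER_WAIT_PROC };

/*
 * Run "/bin/sh -c cmd" in a child process.
 *
 * HELPER_WAIT_PROC: block until the shell exits and return its exit
 * status. If the child stalls (for example on I/O that needs a lock the
 * caller holds), the caller stalls with it.
 *
 * HELPER_NO_WAIT: return 0 as soon as the child has been started. The
 * caller can no longer be blocked by the child, but it also never
 * learns whether the command succeeded.
 */
static int
run_helper(const char *cmd, enum helper_wait wait)
{
	pid_t pid = fork();

	if (pid < 0)
		return (-1);

	if (pid == 0) {
		/* Child: replace ourselves with the shell. */
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);	/* only reached if execl() failed */
	}

	if (wait == HELPER_NO_WAIT) {
		/* Fire and forget; a long-lived program would reap later. */
		return (0);
	}

	int status;
	if (waitpid(pid, &status, 0) < 0)
		return (-1);

	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}
```

With this sketch, run_helper("exit 3", HELPER_WAIT_PROC) returns the
shell's exit status, while the HELPER_NO_WAIT variant returns 0 without
waiting. The same trade-off applies to the change above: after the
switch to UMH_NO_WAIT the value stored in error can only reflect whether
the helper was started, not whether the scheduler was actually set.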
Diffstat (limited to 'module')
-rw-r--r-- module/os/linux/zfs/vdev_disk.c | 2
1 files changed, 1 insertions, 1 deletions
diff --git a/module/os/linux/zfs/vdev_disk.c b/module/os/linux/zfs/vdev_disk.c
index 21f9ae454..d223ef3b3 100644
--- a/module/os/linux/zfs/vdev_disk.c
+++ b/module/os/linux/zfs/vdev_disk.c
@@ -219,7 +219,7 @@ vdev_elevator_switch(vdev_t *v, char *elevator)
 	char *envp[] = { NULL };
 
 	argv[2] = kmem_asprintf(SET_SCHEDULER_CMD, device, elevator);
-	error = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
+	error = call_usermodehelper(argv[0], argv, envp, UMH_NO_WAIT);
 	strfree(argv[2]);
 #endif /* HAVE_ELEVATOR_CHANGE */
 	if (error) {