Fix mmap(2)/write(2)/read(2) deadlock

When modifing overlapping regions of a file using mmap(2) and write(2)/read(2) it is possible to deadlock due to a lock inversion. The zfs_write() and zfs_read() hooks first take the zfs range lock and then lock the individual pages. Conversely, when using mmap'ed I/O the zpl_writepage() hook is called with the individual page locks already taken and then zfs_putpage() takes the zfs range lock. The most straight forward fix is to simply not take the zfs range lock in the mmap(2) case. The individual pages will still be locked thus serializing access. Updating the same region of a file with write(2) and mmap(2) has always been a dodgy thing to do. This change at a minimum ensures we don't deadlock and is consistent with the existing Linux semantics enforced by the VFS. This isn't an issue under Solaris because the only range locking performed will be with the zfs range locks. It's up to each filesystem to perform its own file locking. Under Linux the VFS provides many of these services. It may be possible/desirable at a latter date to entirely dump the existing zfs range locking and rely on the Linux VFS page locks. However, for now its safest to perform both layers of locking until zfs is more tightly integrated with the page cache. Signed-off-by: Brian Behlendorf <[email protected]> Issue #302
author: Brian Behlendorf <[email protected]> 2011-07-18 20:06:32 -0700
committer: Brian Behlendorf <[email protected]> 2011-07-19 11:55:42 -0700
commit: a140dc5469399eaf2347ee7db503275f38f5dc12 (patch)
tree: 5fd6f14ff9684e2c876fe776a40abf4c76c298a7 /module
parent: 7f9d9946f88a37954ac208272bee0d4b3988da7f (diff)
1 files changed, 0 insertions, 5 deletions
diff --git a/module/zfs/zfs_vnops.c b/module/zfs/zfs_vnops.c
index dd9f4d9e5..262c1ed64 100644
--- a/module/zfs/zfs_vnops.c
+++ b/module/zfs/zfs_vnops.c
@@ -3838,7 +3838,6 @@ zfs_putpage(struct page *page, struct writeback_control *wbc, void *data)
 	struct inode         *ip      = mapping->host;
 	znode_t              *zp      = ITOZ(ip);
 	zfs_sb_t             *zsb     = ITOZSB(ip);
-	rl_t		     *rl;
 	u_offset_t	     io_off;
 	size_t		     io_len;
 	size_t		     len;
@@ -3850,11 +3849,8 @@ zfs_putpage(struct page *page, struct writeback_control *wbc, void *data)
 	ZFS_ENTER(zsb);
 	ZFS_VERIFY_ZP(zp);
 
-	rl = zfs_range_lock(zp, io_off, io_len, RL_WRITER);
-
 	if (io_off > zp->z_size) {
 		/* past end of file */
-		zfs_range_unlock(rl);
 		ZFS_EXIT(zsb);
 		return (0);
 	}
@@ -3862,7 +3858,6 @@ zfs_putpage(struct page *page, struct writeback_control *wbc, void *data)
 	len = MIN(io_len, P2ROUNDUP(zp->z_size, PAGESIZE) - io_off);
 
 	error = zfs_putapage(ip, page, io_off, len);
-	zfs_range_unlock(rl);
 
 	if (zsb->z_os->os_sync == ZFS_SYNC_ALWAYS)
 		zil_commit(zsb->z_log, zp->z_id);
author	Brian Behlendorf <[email protected]>	2011-07-18 20:06:32 -0700
committer	Brian Behlendorf <[email protected]>	2011-07-19 11:55:42 -0700
commit	a140dc5469399eaf2347ee7db503275f38f5dc12 (patch)
tree	5fd6f14ff9684e2c876fe776a40abf4c76c298a7 /module
parent	7f9d9946f88a37954ac208272bee0d4b3988da7f (diff)