| author | Brian Behlendorf <[email protected]> | 2015-03-17 15:07:47 -0700 |
|---|---|---|
| committer | Brian Behlendorf <[email protected]> | 2015-03-20 10:35:20 -0700 |
| commit | 2cbb06b561f500732de2214eb590149d0c4f3cf5 (patch) | |
| tree | 3835d7c748f615abfa42dd26eb0fa1c2c3aadeb5 /module/zfs/zfs_vfsops.c | |
| parent | 596a8935a140d3238b46d9858de7a727524c2b51 (diff) | |
Restructure per-filesystem reclaim
Originally, when the ARC prune callback was introduced, the idea was
to register a single callback for the ZPL. The ARC could invoke this
callback if it needed the ZPL to drop dentries, inodes, or other
cache objects which might be pinning buffers in the ARC. The ZPL
would then iterate over all ZFS super blocks and perform the reclaim.
For the most part this design has worked well, but due to limitations
in 2.6.35 and earlier kernels there were some problems. This patch
is designed to address those issues.
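For context, a minimal sketch of that original design follows. The
zpl_prune_sbs(), zfs_sb_prune(), and zpl_fs_type names are taken from
the patch below; the bodies are simplified, and the zpl_prune_sb_cb()
helper is hypothetical, shown only to illustrate the superblock
iteration.

```c
#include <linux/fs.h>	/* struct super_block, iterate_supers_type() */

/* Hypothetical per-superblock helper passed to iterate_supers_type(). */
static void
zpl_prune_sb_cb(struct super_block *sb, void *arg)
{
	int objects = 0;

	/* Ask this superblock's shrinker to drop unused dentries/inodes. */
	(void) zfs_sb_prune(sb, *(unsigned long *)arg, &objects);
}

/* The single callback registered with the ARC for the whole ZPL. */
static void
zpl_prune_sbs(int64_t nr_to_scan, void *private)
{
	unsigned long nr = (unsigned long)nr_to_scan;

	/* Depends on iterate_supers_type(), absent from 2.6.35 and older. */
	iterate_supers_type(&zpl_fs_type, zpl_prune_sb_cb, &nr);
}
```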
1) iterate_supers_type() is not provided by all kernels, which makes
it impossible to safely iterate over all zpl_fs_type filesystems in
a single callback. The most straightforward and portable way to
resolve this is to register a callback per filesystem during mount.
The arc_*_prune_callback() functions have always supported multiple
callbacks, so this is functionally a very small change.
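Under the new scheme the callback receives the superblock it was
registered with, so no iteration is needed. A condensed sketch is
below; the zpl_prune_sb() name comes from the zfs_domount() hunk in
the diff, but its body lives in zpl_super.c (not shown on this page),
so the wrapper here is an assumption based on zfs_sb_prune()'s
signature.

```c
/* Per-filesystem prune callback: arg is the super_block captured at
 * arc_add_prune_callback() time in zfs_domount(). */
static void
zpl_prune_sb(int64_t nr_to_scan, void *arg)
{
	struct super_block *sb = (struct super_block *)arg;
	int objects = 0;

	(void) zfs_sb_prune(sb, nr_to_scan, &objects);
}
```

Registration in zfs_domount() and removal in zfs_umount() then
bracket the filesystem's lifetime; both hunks appear in the diff
below.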
2) Commit 050d22b removed the non-portable shrink_dcache_memory()
and shrink_icache_memory() functions without replacing them with
equivalent functionality. This meant that for Linux 3.1 and older
kernels the ARC had no mechanism to drop dentries and inodes from
the caches when needed. This patch adds that missing functionality
by calling shrink_dcache_parent() to release dentries which may be
pinning inodes. This results in all unused cache entries being
dropped, which is a bit heavy-handed, but it is the only interface
available on old kernels.
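The corresponding hunk appears in the diff below; pulled out on its
own, the fallback branch of zfs_sb_prune() reduces to the following
(comments condensed):

```c
#if !defined(HAVE_SHRINK) && !defined(HAVE_SPLIT_SHRINKER_CALLBACK)
	/* No per-filesystem shrinker before Linux 3.1: free every unused
	 * dentry under sb->s_root, which also unpins the inodes they hold. */
	*objects = 0;
	shrink_dcache_parent(sb->s_root);
#endif
```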
3) A zpl_drop_inode() callback is registered for kernels older than
2.6.35 which do not support the .evict_inode callback. This ensures
that when the last reference on an inode is dropped it is immediately
removed from the cache. If this isn't done, inodes can end up on
the global unused LRU with no mechanism available to ZFS to drop
them. Since the ARC buffers are not dropped, the hottest inodes can
still be recreated without performing disk IO.
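The zpl_drop_inode() hunk lands in zpl_super.c, which is outside this
page's diffstat, so only a hedged sketch can be shown here. The
HAVE_EVICT_INODE guard mirrors ZFS's configure checks; the body is an
assumption based on the description above (before the 2.6.36
evict_inode rework, ->drop_inode returned void and had to perform the
drop itself):

```c
#if !defined(HAVE_EVICT_INODE)
/*
 * Sketch only: evict the inode as soon as its last reference is put,
 * instead of parking it on the kernel's global unused-inode LRU where
 * ZFS has no way to reclaim it.
 */
static void
zpl_drop_inode(struct inode *ip)
{
	generic_delete_inode(ip);
}
#endif
```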
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Pavel Snajdr <[email protected]>
Issue #3160
Diffstat (limited to 'module/zfs/zfs_vfsops.c')
-rw-r--r-- | module/zfs/zfs_vfsops.c | 37 |
1 file changed, 31 insertions(+), 6 deletions(-)
```diff
diff --git a/module/zfs/zfs_vfsops.c b/module/zfs/zfs_vfsops.c
index 4df324a68..e98f4bf6a 100644
--- a/module/zfs/zfs_vfsops.c
+++ b/module/zfs/zfs_vfsops.c
@@ -1068,29 +1068,52 @@ zfs_root(zfs_sb_t *zsb, struct inode **ipp)
 }
 EXPORT_SYMBOL(zfs_root);
 
-#if defined(HAVE_SHRINK) || defined(HAVE_SPLIT_SHRINKER_CALLBACK)
+/*
+ * The ARC has requested that the filesystem drop entries from the dentry
+ * and inode caches.  This can occur when the ARC needs to free meta data
+ * blocks but can't because they are all pinned by entries in these caches.
+ */
 int
 zfs_sb_prune(struct super_block *sb, unsigned long nr_to_scan, int *objects)
 {
 	zfs_sb_t *zsb = sb->s_fs_info;
+	int error = 0;
+#if defined(HAVE_SHRINK) || defined(HAVE_SPLIT_SHRINKER_CALLBACK)
 	struct shrinker *shrinker = &sb->s_shrink;
 	struct shrink_control sc = {
 		.nr_to_scan = nr_to_scan,
 		.gfp_mask = GFP_KERNEL,
 	};
+#endif
 
 	ZFS_ENTER(zsb);
-#ifdef HAVE_SPLIT_SHRINKER_CALLBACK
+
+#if defined(HAVE_SPLIT_SHRINKER_CALLBACK)
 	*objects = (*shrinker->scan_objects)(shrinker, &sc);
-#else
+#elif defined(HAVE_SHRINK)
 	*objects = (*shrinker->shrink)(shrinker, &sc);
+#else
+	/*
+	 * Linux kernels older than 3.1 do not support a per-filesystem
+	 * shrinker.  Therefore, we must fall back to the only available
+	 * interface which is to discard all unused dentries and inodes.
+	 * This behavior clearly isn't ideal but it's required so the ARC
+	 * may free memory.  The performance impact is mitigated by the
+	 * fact that the frequently accessed dentry and inode buffers will
+	 * still be in the ARC making them relatively cheap to recreate.
+	 */
+	*objects = 0;
+	shrink_dcache_parent(sb->s_root);
 #endif
 	ZFS_EXIT(zsb);
 
-	return (0);
+	dprintf_ds(zsb->z_os->os_dsl_dataset,
+	    "pruning, nr_to_scan=%lu objects=%d error=%d\n",
+	    nr_to_scan, *objects, error);
+
+	return (error);
 }
 EXPORT_SYMBOL(zfs_sb_prune);
-#endif /* defined(HAVE_SHRINK) || defined(HAVE_SPLIT_SHRINKER_CALLBACK) */
 
 /*
  * Teardown the zfs_sb_t.
@@ -1286,6 +1309,8 @@ zfs_domount(struct super_block *sb, void *data, int silent)
 
 	if (!zsb->z_issnap)
 		zfsctl_create(zsb);
+
+	zsb->z_arc_prune = arc_add_prune_callback(zpl_prune_sb, sb);
 out:
 	if (error) {
 		dmu_objset_disown(zsb->z_os, zsb);
@@ -1324,6 +1349,7 @@ zfs_umount(struct super_block *sb)
 	zfs_sb_t *zsb = sb->s_fs_info;
 	objset_t *os;
 
+	arc_remove_prune_callback(zsb->z_arc_prune);
 	VERIFY(zfs_sb_teardown(zsb, B_TRUE) == 0);
 	os = zsb->z_os;
 	bdi_destroy(sb->s_bdi);
@@ -1682,7 +1708,6 @@ zfs_init(void)
 	zfs_znode_init();
 	dmu_objset_register_type(DMU_OST_ZFS, zfs_space_delta_cb);
 	register_filesystem(&zpl_fs_type);
-	(void) arc_add_prune_callback(zpl_prune_sbs, NULL);
 }
 
 void
```