diff options
author | Brian Behlendorf <[email protected]> | 2018-07-23 15:40:15 -0700 |
---|---|---|
committer | GitHub <[email protected]> | 2018-07-23 15:40:15 -0700 |
commit | d441e85dd754ecc15659322b4d36796cbd3838de (patch) | |
tree | 3b5adc51a6bda08c513edd382769cade243bb0ca /include | |
parent | 2e5dc449c1a65e0b0bf730fd69c9b5804bd57ee8 (diff) |
Add support for autoexpand property
While the autoexpand property may seem like a small feature it
depends on a significant amount of system infrastructure. Enough
of that infrastructure is now in place that with a few modifications
for Linux it can be supported.
Auto-expand works as follows; when a block device is modified
(re-sized, closed after being open r/w, etc) a change uevent is
generated for udev. The ZED, which is monitoring udev events,
passes the change event along to zfs_deliver_dle() if the disk
or partition contains a zfs_member as identified by blkid.
From here the device is matched against all imported pool vdevs
using the vdev_guid which was read from the label by blkid. If
a match is found the ZED reopens the pool vdev. This re-opening
is important because it allows the vdev to be briefly closed so
the disk partition table can be re-read. Otherwise, it wouldn't
be possible to report the maximum possible expansion size.
Finally, if the property autoexpand=on a vdev expansion will be
attempted. After performing some sanity checks on the disk to
verify that it is safe to expand, the primary partition (-part1)
will be expanded and the partition table updated. The partition
is then re-opened (again) to detect the updated size which allows
the new capacity to be used.
In order to make all of the above possible the following changes
were required:
* Updated the zpool_expand_001_pos and zpool_expand_003_pos tests.
These tests now create a pool which is layered on a loopback,
scsi_debug, and file vdev. This allows for testing of non-
partitioned block device (loopback), a partition block device
(scsi_debug), and a file which does not receive udev change
events. This provided for better test coverage, and by removing
the layering on ZFS volumes there issues surrounding layering
one pool on another are avoided.
* zpool_find_vdev_by_physpath() updated to accept a vdev guid.
This allows for matching by guid rather than path which is a
more reliable way for the ZED to reference a vdev.
* Fixed zfs_zevent_wait() signal handling which could result
in the ZED spinning when a signal was not handled.
* Removed vdev_disk_rrpart() functionality which can be abandoned
in favor of kernel provided blkdev_reread_part() function.
* Added a rwlock which is held as a writer while a disk is being
reopened. This is important to prevent errors from occurring
for any configuration related IOs which bypass the SCL_ZIO lock.
The zpool_reopen_007_pos.ksh test case was added to verify IO
error are never observed when reopening. This is not expected
to impact IO performance.
Additional fixes which aren't critical but were discovered and
resolved in the course of developing this functionality.
* Added PHYS_PATH="/dev/zvol/dataset" to the vdev configuration for
ZFS volumes. This is as good as a unique physical path, while the
volumes are not used in the test cases anymore for other reasons
this improvement was included.
Reviewed by: Richard Elling <[email protected]>
Signed-off-by: Sara Hartse <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #120
Closes #2437
Closes #5771
Closes #7366
Closes #7582
Closes #7629
Diffstat (limited to 'include')
-rw-r--r-- | include/linux/blkdev_compat.h | 14 | ||||
-rw-r--r-- | include/sys/vdev_disk.h | 1 |
2 files changed, 15 insertions, 0 deletions
diff --git a/include/linux/blkdev_compat.h b/include/linux/blkdev_compat.h index 88b0e48cd..274552d5d 100644 --- a/include/linux/blkdev_compat.h +++ b/include/linux/blkdev_compat.h @@ -365,6 +365,20 @@ bio_set_bi_error(struct bio *bio, int error) #endif /* HAVE_BLKDEV_GET_BY_PATH | HAVE_OPEN_BDEV_EXCLUSIVE */ /* + * 4.1 - x.y.z API, + * 3.10.0 CentOS 7.x API, + * blkdev_reread_part() + * + * For older kernels trigger a re-reading of the partition table by calling + * check_disk_change() which calls flush_disk() to invalidate the device. + */ +#ifdef HAVE_BLKDEV_REREAD_PART +#define vdev_bdev_reread_part(bdev) blkdev_reread_part(bdev) +#else +#define vdev_bdev_reread_part(bdev) check_disk_change(bdev) +#endif /* HAVE_BLKDEV_REREAD_PART */ + +/* * 2.6.22 API change * The function invalidate_bdev() lost it's second argument because * it was unused. diff --git a/include/sys/vdev_disk.h b/include/sys/vdev_disk.h index b8a32b316..908f5f326 100644 --- a/include/sys/vdev_disk.h +++ b/include/sys/vdev_disk.h @@ -47,6 +47,7 @@ typedef struct vdev_disk { ddi_devid_t vd_devid; char *vd_minor; struct block_device *vd_bdev; + krwlock_t vd_lock; } vdev_disk_t; #endif /* _KERNEL */ |