author     Alexander Motin <[email protected]>  2023-06-27 12:09:48 -0400
committer  GitHub <[email protected]>          2023-06-27 09:09:48 -0700
commit     8469b5aac0cee4f0e8b13018c3e83129554a6945 (patch)
tree       27713c3476097b5b611bab7c7305394dc1088eb7 /man
parent     35a6247c5fe788aa77e0b3c7e8010fedb9e60eb5 (diff)
Another set of vdev queue optimizations.
Switch the FIFO queues (SYNC/TRIM) and the active queue of the vdev queue
from time-sorted AVL trees to simple lists. AVL trees are too expensive
for such a simple task. To change an I/O's priority without searching
through the trees, add an io_queue_state field to struct zio.
To avoid checking the number of queued I/Os for each priority, add a
vq_cqueued bitmap to struct vdev_queue and update it when adding or
removing I/Os. Make vq_cactive a separate array instead of a struct
vdev_queue_class member. Together these changes avoid many cache misses
when looking for work in vdev_queue_class_to_issue(); see the sketch below.
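As a minimal sketch of the bitmap technique (simplified stand-in types
and a hypothetical max_active array, not the actual OpenZFS structures
or tunables), the issue path tests one cached word and one small array
instead of touching a per-class tree for every priority:

#include <stdint.h>

#define ZIO_PRIORITY_NUM_QUEUEABLE 8   /* stand-in count of queueable classes */

typedef struct vdev_queue {
    uint32_t vq_cqueued;               /* bit p set => class p has queued I/Os */
    uint32_t vq_cactive[ZIO_PRIORITY_NUM_QUEUEABLE]; /* active I/Os per class */
} vdev_queue_t;

/* Maintained on every enqueue/dequeue so the scan below never walks a tree. */
static void
queue_mark(vdev_queue_t *vq, uint32_t p, int nonempty)
{
    if (nonempty)
        vq->vq_cqueued |= 1U << p;
    else
        vq->vq_cqueued &= ~(1U << p);
}

/* One cached word answers "which classes have work?"; the per-class
 * active counts live in one flat array instead of scattered structs. */
static int
class_to_issue(const vdev_queue_t *vq, const uint32_t *max_active)
{
    for (uint32_t p = 0; p < ZIO_PRIORITY_NUM_QUEUEABLE; p++) {
        if ((vq->vq_cqueued & (1U << p)) &&
            vq->vq_cactive[p] < max_active[p])
            return ((int)p);       /* this class has queued work and room */
    }
    return (-1);                   /* nothing eligible to issue */
}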
Introduce a deadline of ~0.5 s for the LBA-sorted queues. Before this
change I saw some I/Os waiting in a queue for up to 8 seconds, and
possibly longer, due to starvation; with it I no longer see that. It
required making the comparison function slightly more complicated, but
since it touches the same cache lines the added cost is minimal. For
sequential I/Os the new code in vdev_queue_io_to_issue() usually takes
the simpler avl_first() path, falling back to avl_find() and
avl_nearest() only when needed.
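As a rough sketch of a deadline-aware comparator (stand-in types and a
hypothetical DEADLINE_SHIFT constant, not the real code): quantizing the
queue timestamp into ~0.5-second buckets makes older buckets sort first,
while I/Os within a bucket keep plain LBA order:

#include <stdint.h>

typedef struct zio {
    uint64_t io_timestamp;   /* when the I/O was queued, in nanoseconds */
    uint64_t io_offset;      /* LBA, the normal sort key */
} zio_t;

#define DEADLINE_SHIFT 29    /* 2^29 ns ~= 0.54 s buckets */

static int
offset_deadline_compare(const void *a, const void *b)
{
    const zio_t *z1 = a, *z2 = b;
    uint64_t d1 = z1->io_timestamp >> DEADLINE_SHIFT;
    uint64_t d2 = z2->io_timestamp >> DEADLINE_SHIFT;

    /* Older deadline bucket sorts first, bounding starvation. */
    if (d1 < d2)
        return (-1);
    if (d1 > d2)
        return (1);
    /* Same bucket: fall back to the usual LBA ordering. */
    if (z1->io_offset < z2->io_offset)
        return (-1);
    if (z1->io_offset > z2->io_offset)
        return (1);
    return (0);
}

Since sequential I/Os queued close together land in the same bucket,
they stay offset-sorted, which is consistent with the avl_first() fast
path described above.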
Arrange members in struct zio so that searching through the vdev queues
accesses only one cache line. While there, remove io_alloc_node, reusing
io_queue_node instead; the two are never used at the same time.
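A minimal sketch of the layout idea (the field names come from this
commit, but the surrounding struct and node types are invented for
illustration): the queue-related members sit together, and one union
serves the tree-based queues, the list-based queues, and the role
io_alloc_node used to play:

#include <stdint.h>

typedef struct avl_node { struct avl_node *child[2], *parent; } avl_node_t;
typedef struct list_node { struct list_node *next, *prev; } list_node_t;

typedef struct zio {
    /* Hot fields grouped so vdev queue searches touch one cache line. */
    uint64_t io_offset;
    uint64_t io_timestamp;
    uint8_t  io_queue_state;   /* which queue (if any) holds this zio */
    union {
        avl_node_t  q_a;       /* LBA-sorted AVL queues */
        list_node_t q_l;       /* FIFO list queues (SYNC/TRIM) */
    } io_queue_node;           /* also reused where io_alloc_node was */
} zio_t;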
Remove the zfs_vdev_aggregate_trim parameter. It has been disabled since
it was implemented 4 years ago, yet time was still wasted maintaining
the offset-sorted tree of TRIM requests. Just remove the tree.
Remove locking from txg_all_lists_empty(). It is racy by design, and the
two pairs of lock/unlock operations took noticeable time under the vdev
queue lock.
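A hedged sketch of such a lockless check (stand-in types; the real
txg_list layout differs): since the callers already tolerate a stale
answer, the list heads can simply be read without taking the per-list
locks:

#include <stdbool.h>
#include <stddef.h>

typedef struct txg_node { struct txg_node *tn_next; } txg_node_t;
typedef struct txg_list { txg_node_t *tl_head[4]; } txg_list_t; /* 4 TXGs in flight */

static bool
txg_all_lists_empty(const txg_list_t *tl)
{
    /* No lock/unlock pairs: the check is racy by design, so a stale
     * head pointer is acceptable here. */
    for (int i = 0; i < 4; i++) {
        if (tl->tl_head[i] != NULL)
            return (false);
    }
    return (true);
}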
With these changes, in my tests with volblocksize=4KB I measured vdev
queue lock spin time reduced by 50% on reads and by 75% on writes.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #14925
Diffstat (limited to 'man')
-rw-r--r--  man/man4/zfs.4  6
1 file changed, 0 insertions, 6 deletions
diff --git a/man/man4/zfs.4 b/man/man4/zfs.4
index 5fbd9d7db..04bbbc5fd 100644
--- a/man/man4/zfs.4
+++ b/man/man4/zfs.4
@@ -2016,12 +2016,6 @@ Historical statistics for this many latest TXGs will be available in
 Flush dirty data to disk at least every this many seconds (maximum TXG duration).
 .
-.It Sy zfs_vdev_aggregate_trim Ns = Ns Sy 0 Ns | Ns 1 Pq uint
-Allow TRIM I/O operations to be aggregated.
-This is normally not helpful because the extents to be trimmed
-will have been already been aggregated by the metaslab.
-This option is provided for debugging and performance analysis.
-.
 .It Sy zfs_vdev_aggregation_limit Ns = Ns Sy 1048576 Ns B Po 1 MiB Pc Pq uint
 Max vdev I/O aggregation size.
 .