diff options
author | ednadolski-ix <[email protected]> | 2023-11-06 11:38:42 -0700 |
---|---|---|
committer | GitHub <[email protected]> | 2023-11-06 10:38:42 -0800 |
commit | 3bd4df3841529316e5145590cc67076467b6abb7 (patch) | |
tree | 816fe92ee22d00a2e82e9b931b3a524c9740a707 /module/os/linux/spl | |
parent | 052777406601a4049c28e25fe0b4df57160b5a58 (diff) |
Improve ZFS objset sync parallelism
As part of transaction group commit, dsl_pool_sync() sequentially calls
dsl_dataset_sync() for each dirty dataset, which subsequently calls
dmu_objset_sync(). dmu_objset_sync() in turn uses up to 75% of CPU
cores to run sync_dnodes_task() in taskq threads to sync the dirty
dnodes (files).
There are two problems:
1. Each ZVOL in a pool is a separate dataset/objset having a single
dnode. This means the objsets are synchronized serially, which
leads to a bottleneck of ~330K blocks written per second per pool.
2. In the case of multiple dirty dnodes/files on a dataset/objset on a
big system they will be sync'd in parallel taskq threads. However,
it is inefficient to to use 75% of CPU cores of a big system to do
that, because of (a) bottlenecks on a single write issue taskq, and
(b) allocation throttling. In addition, if not for the allocation
throttling sorting write requests by bookmarks (logical address),
writes for different files may reach space allocators interleaved,
leading to unwanted fragmentation.
The solution to both problems is to always sync no more and (if
possible) no fewer dnodes at the same time than there are allocators
the pool.
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Signed-off-by: Edmund Nadolski <[email protected]>
Closes #15197
Diffstat (limited to 'module/os/linux/spl')
-rw-r--r-- | module/os/linux/spl/spl-taskq.c | 36 |
1 files changed, 36 insertions, 0 deletions
diff --git a/module/os/linux/spl/spl-taskq.c b/module/os/linux/spl/spl-taskq.c index d18f935b1..79a1a8e5a 100644 --- a/module/os/linux/spl/spl-taskq.c +++ b/module/os/linux/spl/spl-taskq.c @@ -1262,6 +1262,42 @@ taskq_destroy(taskq_t *tq) } EXPORT_SYMBOL(taskq_destroy); +/* + * Create a taskq with a specified number of pool threads. Allocate + * and return an array of nthreads kthread_t pointers, one for each + * thread in the pool. The array is not ordered and must be freed + * by the caller. + */ +taskq_t * +taskq_create_synced(const char *name, int nthreads, pri_t pri, + int minalloc, int maxalloc, uint_t flags, kthread_t ***ktpp) +{ + taskq_t *tq; + taskq_thread_t *tqt; + int i = 0; + kthread_t **kthreads = kmem_zalloc(sizeof (*kthreads) * nthreads, + KM_SLEEP); + + flags &= ~(TASKQ_DYNAMIC | TASKQ_THREADS_CPU_PCT | TASKQ_DC_BATCH); + + /* taskq_create spawns all the threads before returning */ + tq = taskq_create(name, nthreads, minclsyspri, nthreads, INT_MAX, + flags | TASKQ_PREPOPULATE); + VERIFY(tq != NULL); + VERIFY(tq->tq_nthreads == nthreads); + + list_for_each_entry(tqt, &tq->tq_thread_list, tqt_thread_list) { + kthreads[i] = tqt->tqt_thread; + i++; + } + + ASSERT3S(i, ==, nthreads); + *ktpp = kthreads; + + return (tq); +} +EXPORT_SYMBOL(taskq_create_synced); + static unsigned int spl_taskq_kick = 0; /* |