author | Shengqi Chen <[email protected]> | 2024-09-07 21:55:03 +0800 |
---|---|---|
committer | Brian Behlendorf <[email protected]> | 2024-09-19 15:18:59 -0700 |
commit | 0ae4460c61e2dc16203029344b083423b21eb755 (patch) | |
tree | 5064c973dc428759dc3cb1817728ca61db487a11 /module/zcommon | |
parent | 1c35206124a31929102916e308dc68908a14f515 (diff) |
zcommon: add specialized versions of cityhash4
Specializing cityhash4 on 32-bit architectures can reduce the size
of stack frames as well as instruction count. This is a tiny but
useful optimization, since some callers invoke it frequently.
When specialized into 1/2/3/4-argument versions, the stack usage
(in bytes) on some 32-bit architectures is as follows:
- x86: 32, 32, 32, 40
- arm-v7a: 20, 20, 28, 36
- riscv: 0, 0, 0, 16
- power: 16, 16, 16, 32
- mipsel: 8, 8, 8, 24
Each actual argument (even when 0 is passed) contributes evenly
to the number of multiplication instructions generated:
- x86: 9, 12, 15, 18
- arm-v7a: 6, 8, 10, 12
- riscv / power: 12, 18, 20, 24
- mipsel: 9, 12, 15, 19
On 64-bit architectures the trends are similar, but both stack
sizes and instruction counts are significantly smaller and thus negligible.
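The multiplication counts above can be illustrated with a hand-folded one-argument version. This is a minimal sketch, not the committed code: the values of `HASH_K1` and `HASH_K2`, the bodies of `rotate` and `cityhash_helper`, and the `b`/`c` lines of `cityhash_impl` are assumptions, since they lie outside the hunks of this diff. `cityhash1_folded` and `check_folding` are hypothetical names used only here. Under those assumptions, the second word is the only one that is not multiplied, so a one-argument hash keeps just the three multiplications inside the helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants and helper bodies; not shown in this diff. */
#define	HASH_K1	0xb492b66fbe98f273ULL
#define	HASH_K2	0x9ae16a3b2f90404fULL

static inline uint64_t
rotate(uint64_t s, int l)
{
	return ((s >> l) | (s << (64 - l)));
}

static inline uint64_t
cityhash_helper(uint64_t u, uint64_t v, uint64_t mul)
{
	uint64_t a = (u ^ v) * mul;	/* helper multiplication 1 */
	a ^= (a >> 47);
	uint64_t b = (v ^ a) * mul;	/* helper multiplication 2 */
	b ^= (b >> 47);
	b *= mul;			/* helper multiplication 3 */
	return (b);
}

/* Generic 4-word mix: up to three extra multiplications for w1, w3, w4. */
static inline uint64_t
cityhash_impl(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
{
	uint64_t mul = HASH_K2 + 64;
	uint64_t a = w1 * HASH_K1;	/* folds to 0 when w1 is a constant 0 */
	uint64_t b = w2;		/* assumed: w2 is used unmultiplied */
	uint64_t c = w4 * mul;		/* assumed: folds to 0 when w4 is 0 */
	uint64_t d = w3 * HASH_K2;	/* folds to 0 when w3 is a constant 0 */
	return (cityhash_helper(rotate(a + b, 43) + rotate(c, 30) + d,
	    a + rotate(b + HASH_K2, 18) + c, mul));
}

/* The wrapper as written in the diff. */
uint64_t
cityhash1(uint64_t w)
{
	return (cityhash_impl(0, w, 0, 0));
}

/* Hypothetical: what constant folding can reduce cityhash1() to. */
uint64_t
cityhash1_folded(uint64_t w)
{
	uint64_t mul = HASH_K2 + 64;
	return (cityhash_helper(rotate(w, 43), rotate(w + HASH_K2, 18), mul));
}

/* Hypothetical check that the folded form matches the wrapper. */
void
check_folding(uint64_t w)
{
	assert(cityhash1_folded(w) == cityhash1(w));
}
```

Read this way, the x86 figures (9, 12, 15, 18) are consistent with about three 32-bit multiply instructions per 64-bit multiplication: three 64-bit multiplications in the helper, plus one more for each additional argument. This is an interpretation of the numbers, not something the commit states.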
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by: Shengqi Chen <[email protected]>
Closes #16131
Closes #16483
Diffstat (limited to 'module/zcommon')
-rw-r--r-- | module/zcommon/cityhash.c | 33 |
1 files changed, 31 insertions, 2 deletions
diff --git a/module/zcommon/cityhash.c b/module/zcommon/cityhash.c
index 413a96df2..c758ec47d 100644
--- a/module/zcommon/cityhash.c
+++ b/module/zcommon/cityhash.c
@@ -49,8 +49,8 @@ cityhash_helper(uint64_t u, uint64_t v, uint64_t mul)
 	return (b);
 }
 
-uint64_t
-cityhash4(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
+static inline uint64_t
+cityhash_impl(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
 {
 	uint64_t mul = HASH_K2 + 64;
 	uint64_t a = w1 * HASH_K1;
@@ -59,9 +59,38 @@ cityhash4(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
 	uint64_t d = w3 * HASH_K2;
 	return (cityhash_helper(rotate(a + b, 43) + rotate(c, 30) + d,
 	    a + rotate(b + HASH_K2, 18) + c, mul));
+}
 
+/*
+ * Passing w as the 2nd argument could save one 64-bit multiplication.
+ */
+uint64_t
+cityhash1(uint64_t w)
+{
+	return (cityhash_impl(0, w, 0, 0));
+}
+
+uint64_t
+cityhash2(uint64_t w1, uint64_t w2)
+{
+	return (cityhash_impl(w1, w2, 0, 0));
+}
+
+uint64_t
+cityhash3(uint64_t w1, uint64_t w2, uint64_t w3)
+{
+	return (cityhash_impl(w1, w2, w3, 0));
+}
+
+uint64_t
+cityhash4(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
+{
+	return (cityhash_impl(w1, w2, w3, w4));
 }
 
 #if defined(_KERNEL)
+EXPORT_SYMBOL(cityhash1);
+EXPORT_SYMBOL(cityhash2);
+EXPORT_SYMBOL(cityhash3);
 EXPORT_SYMBOL(cityhash4);
 #endif
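From a caller's point of view, the narrower entry points are drop-in replacements for zero-padded `cityhash4` calls. The sketch below is a userspace approximation under stated assumptions: the constant values, `rotate`, `cityhash_helper`, and the `b`/`c` lines are not part of this diff and are filled in as assumptions, and `bucket_index` is a hypothetical caller, not a function from the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed constants and helper bodies; not shown in this diff. */
#define	HASH_K1	0xb492b66fbe98f273ULL
#define	HASH_K2	0x9ae16a3b2f90404fULL

static inline uint64_t
rotate(uint64_t s, int l)
{
	return ((s >> l) | (s << (64 - l)));
}

static inline uint64_t
cityhash_helper(uint64_t u, uint64_t v, uint64_t mul)
{
	uint64_t a = (u ^ v) * mul;
	a ^= (a >> 47);
	uint64_t b = (v ^ a) * mul;
	b ^= (b >> 47);
	b *= mul;
	return (b);
}

static inline uint64_t
cityhash_impl(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
{
	uint64_t mul = HASH_K2 + 64;
	uint64_t a = w1 * HASH_K1;
	uint64_t b = w2;		/* assumed */
	uint64_t c = w4 * mul;		/* assumed */
	uint64_t d = w3 * HASH_K2;
	return (cityhash_helper(rotate(a + b, 43) + rotate(c, 30) + d,
	    a + rotate(b + HASH_K2, 18) + c, mul));
}

/* Wrappers as in the diff; the zero arguments are compile-time constants. */
uint64_t
cityhash2(uint64_t w1, uint64_t w2)
{
	return (cityhash_impl(w1, w2, 0, 0));
}

uint64_t
cityhash4(uint64_t w1, uint64_t w2, uint64_t w3, uint64_t w4)
{
	return (cityhash_impl(w1, w2, w3, w4));
}

/*
 * Hypothetical caller: picks a bucket in a power-of-two sized table from
 * two 64-bit keys. Before this change it would have had to call
 * cityhash4(key1, key2, 0, 0).
 */
uint64_t
bucket_index(uint64_t key1, uint64_t key2, uint64_t mask)
{
	return (cityhash2(key1, key2) & mask);
}
```

Because the wrappers only fix trailing arguments to zero, switching a caller from `cityhash4(a, b, 0, 0)` to `cityhash2(a, b)` leaves the hash value unchanged. Only the generated code gets smaller.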