author | Gvozden Neskovic <[email protected]> | 2016-07-06 13:42:04 +0200 |
---|---|---|
committer | Brian Behlendorf <[email protected]> | 2016-08-16 14:11:14 -0700 |
commit | 70b258fc962fd40673b9a47574cb83d8438e7d94 (patch) | |
tree | 6e45c08b144622dc78f1106681ce5566c77b588d /man/man5 | |
parent | 32ffaa3de58981814342fe6d3556c03d41d121f8 (diff) |
Fletcher4 implementation using avx512f instruction set
The algorithm runs 8 parallel sums, consuming eight uint32_t elements per
loop iteration. Size alignment of the main fletcher4 methods is adjusted
accordingly. The new implementation is called 'avx512f'.
Note: the byteswap method can be implemented more efficiently once avx512bw
hardware becomes available. Currently, it is about 2x slower than the native method.
The table shows results of the full (native) fletcher4 calculation for different buffer sizes:
fletcher4       4KB    16KB   64KB   128KB  256KB  1MB    16MB
--------------------------------------------------------------------
[scalar]        1213   1228   1231   1231   1225   1200   1160
[sse2]          2374   2442   2459   2456   2462   2250   2220
[avx2]          4288   4753   4871   4893   4900   4050   3882
[avx512f]       5975   8445   9196   9221   9262   6307   5620
Signed-off-by: Gvozden Neskovic <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #4952
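The 8-lane scheme described in the commit message can be illustrated in plain C. This is a minimal sketch, not the code from this commit: the real implementation uses AVX-512F vector instructions, while the version below emulates the eight lanes with an inner loop. All names (`f4_sums_t`, `f4_scalar`, `f4_lanes`, `f4_selfcheck`) are hypothetical, and the recombination weights are derived here from the scalar recurrence rather than copied from the ZFS sources. The sketch assumes the element count is a multiple of 8, which corresponds to the adjusted size alignment mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

#define	LANES	8	/* 8 x uint32_t = 32 bytes consumed per iteration */

/* Hypothetical container for the four Fletcher-4 running sums. */
typedef struct { uint64_t a, b, c, d; } f4_sums_t;

/* Reference: one element per iteration, four cascaded sums mod 2^64. */
static f4_sums_t
f4_scalar(const uint32_t *w, size_t n)
{
	f4_sums_t s = { 0, 0, 0, 0 };

	for (size_t i = 0; i < n; i++) {
		s.a += w[i];
		s.b += s.a;
		s.c += s.b;
		s.d += s.c;
	}
	return (s);
}

/* Lane-parallel variant; n must be a multiple of LANES. */
static f4_sums_t
f4_lanes(const uint32_t *w, size_t n)
{
	uint64_t a[LANES] = { 0 }, b[LANES] = { 0 };
	uint64_t c[LANES] = { 0 }, d[LANES] = { 0 };
	f4_sums_t s = { 0, 0, 0, 0 };
	const int64_t L = LANES;
	const uint64_t UL = LANES;

	/* Lane j sees elements j, j + 8, j + 16, ... */
	for (size_t i = 0; i < n; i += LANES) {
		/* A vectorized version performs this inner loop as one step. */
		for (int j = 0; j < LANES; j++) {
			a[j] += w[i + j];
			b[j] += a[j];
			c[j] += b[j];
			d[j] += c[j];
		}
	}

	/* Fold the per-lane sums back into the scalar result. */
	for (int64_t j = 0; j < L; j++) {
		/* Small exact integers, computed signed, then widened. */
		uint64_t ba = (uint64_t)j;
		uint64_t cb = (uint64_t)(L * (L + 2 * j - 1) / 2);
		uint64_t ca = (uint64_t)(j * (j - 1) / 2);
		uint64_t dc = (uint64_t)(L * L * (L + j - 1));
		uint64_t db = (uint64_t)(L * (L * L + 3 * L * (j - 1) +
		    3 * j * j - 6 * j + 2) / 6);
		uint64_t da = (uint64_t)(j * (j - 1) * (j - 2) / 6);

		s.a += a[j];
		s.b += UL * b[j] - ba * a[j];
		s.c += UL * UL * c[j] - cb * b[j] + ca * a[j];
		s.d += UL * UL * UL * d[j] - dc * c[j] + db * b[j] -
		    da * a[j];
	}
	return (s);
}

/* Returns 1 when both variants agree on a pseudo-random buffer. */
static int
f4_selfcheck(void)
{
	uint32_t w[64 * LANES];
	uint32_t x = 12345u;
	size_t n = sizeof (w) / sizeof (w[0]);

	for (size_t i = 0; i < n; i++) {
		x = x * 1664525u + 1013904223u;	/* simple LCG filler */
		w[i] = x;
	}
	f4_sums_t s = f4_scalar(w, n), v = f4_lanes(w, n);
	return (s.a == v.a && s.b == v.b && s.c == v.c && s.d == v.d);
}
```

Because every sum is kept modulo 2^64, the unsigned wrap-around in the recombination step matches the wrap-around of the scalar loop, so the two variants can be compared directly, as `f4_selfcheck()` does.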
Diffstat (limited to 'man/man5')
-rw-r--r-- | man/man5/zfs-module-parameters.5 | 16 |
1 files changed, 8 insertions, 8 deletions
diff --git a/man/man5/zfs-module-parameters.5 b/man/man5/zfs-module-parameters.5
index 3e62a4436..b4ad3700f 100644
--- a/man/man5/zfs-module-parameters.5
+++ b/man/man5/zfs-module-parameters.5
@@ -883,14 +883,14 @@ Default value: \fB67,108,864\fR.
 Select a fletcher 4 implementation.
 .sp
 Supported selectors are: \fBfastest\fR, \fBscalar\fR, \fBsse2\fR, \fBssse3\fR,
-and \fBavx2\fR. All of the selectors except \fBfastest\fR and \fBscalar\fR
-require instruction set extensions to be available and will only appear if ZFS
-detects that they are present at runtime. If multiple implementations of
-fletcher 4 are available, the \fBfastest\fR will be chosen using a micro
-benchmark. Selecting \fBscalar\fR results in the original CPU based calculation
-being used. Selecting any option other than \fBfastest\fR and \fBscalar\fR
-results in vector instructions from the respective CPU instruction set being
-used.
+\fBavx2\fR, and \fBavx512f\fR.
+All of the selectors except \fBfastest\fR and \fBscalar\fR require instruction
+set extensions to be available and will only appear if ZFS detects that they are
+present at runtime. If multiple implementations of fletcher 4 are available,
+the \fBfastest\fR will be chosen using a micro benchmark. Selecting \fBscalar\fR
+results in the original, CPU based calculation, being used. Selecting any option
+other than \fBfastest\fR and \fBscalar\fR results in vector instructions from
+the respective CPU instruction set being used.
 .sp
 Default value: \fBfastest\fR.
 .RE