aboutsummaryrefslogtreecommitdiffstats
path: root/config/toolchain-simd.m4
Commit message (Collapse)AuthorAgeFilesLines
* Linux 5.16 compat: don't use XSTATE_XSAVE to save FPU stateAttila Fülöp2022-02-091-0/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | Linux 5.16 moved XSTATE_XSAVE and XSTATE_XRESTORE out of our reach, so add our own XSAVE{,OPT,S} code and use it for Linux 5.16. Please note that this differs from previous behavior in that it won't handle exceptions created by XSAVE an XRSTOR. This is sensible for three reasons. - Exceptions during XSAVE and XRSTOR can only occur if the feature is not supported or enabled or the memory operand isn't aligned on a 64 byte boundary. If this happens something else went terribly wrong, and it may be better to stop execution. - Previously we just printed a warning and didn't handle the fault, this is arguable for the above reason. - All other *SAVE instruction also don't handle exceptions, so this at least aligns behavior. Finally add a test to catch such a regression in the future. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #13042 Closes #13059
* Add FreeBSD support to OpenZFSMatthew Macy2020-04-141-1/+1
| | | | | | | | | | | | | | | | | | Add the FreeBSD platform code to the OpenZFS repository. As of this commit the source can be compiled and tested on FreeBSD 11 and 12. Subsequent commits are now required to compile on FreeBSD and Linux. Additionally, they must pass the ZFS Test Suite on FreeBSD which is being run by the CI. As of this commit 1230 tests pass on FreeBSD and there are no unexpected failures. Reviewed-by: Sean Eric Fagan <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Co-authored-by: Ryan Moeller <[email protected]> Signed-off-by: Matt Macy <[email protected]> Signed-off-by: Ryan Moeller <[email protected]> Closes #898 Closes #8987
* ICP: Improve AES-GCM performanceAttila Fülöp2020-02-101-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently SIMD accelerated AES-GCM performance is limited by two factors: a. The need to disable preemption and interrupts and save the FPU state before using it and to do the reverse when done. Due to the way the code is organized (see (b) below) we have to pay this price twice for each 16 byte GCM block processed. b. Most processing is done in C, operating on single GCM blocks. The use of SIMD instructions is limited to the AES encryption of the counter block (AES-NI) and the Galois multiplication (PCLMULQDQ). This leads to the FPU not being fully utilized for crypto operations. To solve (a) we do crypto processing in larger chunks while owning the FPU. An `icp_gcm_avx_chunk_size` module parameter was introduced to make this chunk size tweakable. It defaults to 32 KiB. This step alone roughly doubles performance. (b) is tackled by porting and using the highly optimized openssl AES-GCM assembler routines, which do all the processing (CTR, AES, GMULT) in a single routine. Both steps together result in up to 32x reduction of the time spend in the en/decryption routines, leading up to approximately 12x throughput increase for large (128 KiB) blocks. Lastly, this commit changes the default encryption algorithm from AES-CCM to AES-GCM when setting the `encryption=on` property. Reviewed-By: Brian Behlendorf <[email protected]> Reviewed-By: Jason King <[email protected]> Reviewed-By: Tom Caputi <[email protected]> Reviewed-By: Richard Laager <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #9749
* Add support for selecting encryption backendNathan Lewis2018-08-021-0/+42
| | | | | | | | | | | | | | | | | | - Add two new module parameters to icp (icp_aes_impl, icp_gcm_impl) that control the crypto implementation. At the moment there is a choice between generic and aesni (on platforms that support it). - This enables support for AES-NI and PCLMULQDQ-NI on AMD Family 15h (bulldozer) and newer CPUs (zen). - Modify aes_key_t to track what implementation it was generated with as key schedules generated with various implementations are not necessarily interchangable. Reviewed by: Gvozden Neskovic <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tom Caputi <[email protected]> Reviewed-by: Richard Laager <[email protected]> Signed-off-by: Nathaniel R. Lewis <[email protected]> Closes #7102 Closes #7103
* Add support for AVX-512 family of instruction setsGvozden Neskovic2016-08-161-0/+189
| | | | | | | | | | | | | This patch adds compiler and runtime tests (user and kernel) for following instruction sets: avx512f, avx512cd, avx512er, avx512pf, avx512bw, avx512dq, avx512vl, avx512ifma, avx512vbmi. note: Linux support for AVX-512F (Foundation) instruction set started with linux v3.15 Signed-off-by: Gvozden Neskovic <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4952
* Support for vectorized algorithms on x86Gvozden Neskovic2016-03-211-0/+172
This is initial support for x86 vectorized implementations of ZFS parity and checksum algorithms. For the compilation phase, configure step checks if toolchain supports relevant instruction sets. Each implementation must ensure that the code is not passed to compiler if relevant instruction set is not supported. For this purpose, following new defines are provided if instruction set is supported: - HAVE_SSE, - HAVE_SSE2, - HAVE_SSE3, - HAVE_SSSE3, - HAVE_SSE4_1, - HAVE_SSE4_2, - HAVE_AVX, - HAVE_AVX2. For detecting if an instruction set can be used in runtime, following functions are provided in (include/linux/simd_x86.h): - zfs_sse_available() - zfs_sse2_available() - zfs_sse3_available() - zfs_ssse3_available() - zfs_sse4_1_available() - zfs_sse4_2_available() - zfs_avx_available() - zfs_avx2_available() - zfs_bmi1_available() - zfs_bmi2_available() These function should be called once, on module load, or initialization. They are safe to use from user and kernel space. If an implementation is using more than single instruction set, both compiler and runtime support for all relevant instruction sets should be checked. Kernel fpu methods: - kfpu_begin() - kfpu_end() Use __get_cpuid_max and __cpuid_count from <cpuid.h> Both gcc and clang have support for these. They also handle ebx register in case it is used for PIC code. Signed-off-by: Gvozden Neskovic <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4381