aboutsummaryrefslogtreecommitdiffstats
path: root/lib/libspl/include/sys
Commit message (Collapse)AuthorAgeFilesLines
* Prefix zfs internal endian checks with _ZFSMatthew Macy2020-07-281-16/+34
| | | | | | | | | | | FreeBSD defines _BIG_ENDIAN BIG_ENDIAN _LITTLE_ENDIAN LITTLE_ENDIAN on every architecture. Trying to do cross builds whilst hiding this from ZFS has proven extremely cumbersome. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #10621
* FreeBSD: Fixes required to build ZFS on PowerPCMatthew Macy2020-07-251-2/+2
| | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #10622
* Drop unnecessary srcdir pathsArvind Sankar2020-06-242-44/+44
| | | | | | | | | | There's no need to specify the srcdir explicitly in _HEADERS and EXTRA_DIST. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Arvind Sankar <[email protected]> Closes #10493
* Add convenience wrappers for common uio usageJorgen Lundman2020-06-142-0/+153
| | | | | | | | | The macOS uio struct is opaque and the API must be used, this makes the smallest changes to the code for all platforms. Reviewed-by: Matt Macy <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Jorgen Lundman <[email protected]> Closes #10412
* Remove unnecessary references to slaveryMatthew Ahrens2020-06-101-1/+0
| | | | | | | | | | | | | | | | | | | | | | The horrible effects of human slavery continue to impact society. The casual use of the term "slave" in computer software is an unnecessary reference to a painful human experience. This commit removes all possible references to the term "slave". Implementation notes: The zpool.d/slaves script is renamed to dm-deps, which uses the same terminology as `dmsetup deps`. References to the `/sys/class/block/$dev/slaves` directory remain. This directory name is determined by the Linux kernel. Although `dmsetup deps` provides the same information, it unfortunately requires elevated privileges, whereas the `/sys/...` directory is world-readable. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #10435
* Correctly handle the x32 ABIнаб2020-05-281-1/+5
| | | | | | | | | | __x86_64__ && _ILP32 => don't forcibly define _LP64 Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Signed-off-by: Ahelenia Ziemiańska <[email protected]> Closes #10357 Closes #844
* change libspl list member names to match kernelMatthew Ahrens2020-04-231-2/+2
| | | | | | | | | | This aids in debugging, so that we can use the same infrastructure to walk zfs's list_t in the kernel module and in the userland libraries (e.g. when debugging ztest). Reviewed-by: Serapheim Dimitropoulos <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matthew Ahrens <[email protected]> Closes #10236
* Linux 5.6 compat: time_tBrian Behlendorf2020-02-271-1/+1
| | | | | | | | | | | | | | | | | | | As part of the Linux kernel's y2038 changes the time_t type has been fully retired. Callers are now required to use the time64_t type. Rather than move to the new type, I've removed the few remaining places where a time_t is used in the kernel code. They've been replaced with a uint64_t which is already how ZFS internally handled these values. Going forward we should work towards updating the remaining user space time_t consumers to the 64-bit interfaces. Reviewed-by: Matthew Macy <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #10052 Closes #10064
* ICP: Improve AES-GCM performanceAttila Fülöp2020-02-101-1/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently SIMD accelerated AES-GCM performance is limited by two factors: a. The need to disable preemption and interrupts and save the FPU state before using it and to do the reverse when done. Due to the way the code is organized (see (b) below) we have to pay this price twice for each 16 byte GCM block processed. b. Most processing is done in C, operating on single GCM blocks. The use of SIMD instructions is limited to the AES encryption of the counter block (AES-NI) and the Galois multiplication (PCLMULQDQ). This leads to the FPU not being fully utilized for crypto operations. To solve (a) we do crypto processing in larger chunks while owning the FPU. An `icp_gcm_avx_chunk_size` module parameter was introduced to make this chunk size tweakable. It defaults to 32 KiB. This step alone roughly doubles performance. (b) is tackled by porting and using the highly optimized openssl AES-GCM assembler routines, which do all the processing (CTR, AES, GMULT) in a single routine. Both steps together result in up to 32x reduction of the time spend in the en/decryption routines, leading up to approximately 12x throughput increase for large (128 KiB) blocks. Lastly, this commit changes the default encryption algorithm from AES-CCM to AES-GCM when setting the `encryption=on` property. Reviewed-By: Brian Behlendorf <[email protected]> Reviewed-By: Jason King <[email protected]> Reviewed-By: Tom Caputi <[email protected]> Reviewed-By: Richard Laager <[email protected]> Signed-off-by: Attila Fülöp <[email protected]> Closes #9749
* Add AltiVec RAID-ZRomain Dolbeau2020-01-231-0/+40
| | | | | | | | | | | | | Implements the RAID-Z function using AltiVec SIMD. This is basically the NEON code translated to AltiVec. Note that the 'fletcher' algorithm requires 64-bits operations, and the initial implementations of AltiVec (PPC74xx a.k.a. G4, PPC970 a.k.a. G5) only has up to 32-bits operations, so no 'fletcher'. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Romain Dolbeau <[email protected]> Closes #9539
* Replace ASSERTV macro with compiler annotationMatthew Macy2019-12-051-0/+4
| | | | | | | | | | | Remove the ASSERTV macro and handle suppressing unused compiler warnings for variables only in ASSERTs using the __attribute__((unused)) compiler annotation. The annotation is understood by both gcc and clang. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9671
* Refactor zfs_context.h to build on FreeBSDMatthew Macy2019-12-042-31/+0
| | | | | | | | | | | | - on Linux move Linux specific headers to zfs_context_os.h - on FreeBSD move FreeBSD specific definitions to zfs_context_os.h - remove duplicate tsd_ definitions - remove unused AT_TYPE Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Don Brady <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9668
* Preliminary support for RV64GRomain Dolbeau2019-11-061-1/+20
| | | | | | | This adds basic support for RISC-V, specifically RV64G. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Romain Dolbeau <[email protected]> Closes #9540
* Enable use of DTRACE_PROBE* macros in "spl" modulePrakash Surya2019-11-013-1/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | This change modifies some of the infrastructure for enabling the use of the DTRACE_PROBE* macros, such that we can use tehm in the "spl" module. Currently, when the DTRACE_PROBE* macros are used, they get expanded to create new functions, and these dynamically generated functions become part of the "zfs" module. Since the "spl" module does not depend on the "zfs" module, the use of DTRACE_PROBE* in the "spl" module would result in undefined symbols being used in the "spl" module. Specifically, DTRACE_PROBE* would turn into a function call, and the function being called would be a symbol only contained in the "zfs" module; which results in a linker and/or runtime error. Thus, this change adds the necessary logic to the "spl" module, to mirror the tracing functionality available to the "zfs" module. After this change, we'll have a "trace_zfs.h" header file which defines the probes available only to the "zfs" module, and a "trace_spl.h" header file which defines the probes available only to the "spl" module. Reviewed by: Brad Lewis <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Prakash Surya <[email protected]> Closes #9525
* Move sha2.h to platform codeMatthew Macy2019-10-312-0/+152
| | | | | | | | | FreeBSD has its own sha routines that the port uses. Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9530
* Remove unneeded header from libzpool/kernel.cMatthew Macy2019-10-261-35/+0
| | | | | | | | | The sys/signal.h header doesn't exist on FreeBSD, nor is it needed on Linux. Reviewed-by: Igor Kozhukhov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9510
* Fix header guard typoMatthew Macy2019-10-261-1/+1
| | | | | | | Fix header guard typo accidentally introduced by #9497. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9514
* Move platform dependent errno aliasesMatthew Macy2019-10-252-36/+0
| | | | | | | | | EBADE, EBADR, and ENOANO do not exist on FreeBSD The libspl errno.h is similarly platform dependent. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9498
* Remove sdt.hMatthew Macy2019-10-251-0/+23
| | | | | | | | | It's mostly a noop on ZoL and it conflicts with platforms that support dtrace. Remove this header to resolve the conflict. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9497
* Linux 4.14, 4.19, 5.0+ compat: SIMD save/restoreBrian Behlendorf2019-10-241-1/+2
| | | | | | | | | | | | | | | | | | | | Contrary to initial testing we cannot rely on these kernels to invalidate the per-cpu FPU state and restore the FPU registers. Nor can we guarantee that the kernel won't modify the FPU state which we saved in the task struck. Therefore, the kfpu_begin() and kfpu_end() functions have been updated to save and restore the FPU state using our own dedicated per-cpu FPU state variables. This has the additional advantage of allowing us to use the FPU again in user threads. So we remove the code which was added to use task queues to ensure some functions ran in kernel threads. Reviewed-by: Fabian Grünbichler <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #9346 Closes #9403
* OpenZFS restructuring - libsplMatthew Macy2019-10-029-795/+0
| | | | | | | | | Factor Linux specific pieces out of libspl. Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Sean Eric Fagan <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9336
* Enable compiler to typecheck loggingMatthew Macy2019-09-121-0/+4
| | | | | | | | | | Annotate spa logging declarations with printflike Workaround gcc bug (non disable-able warning) by replacing "" with " " Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9316
* OpenZFS restructuring - move linux tracing code to platform directoriesMatthew Macy2019-09-112-0/+2
| | | | | | | | | | | Move Linux specific tracing headers and source to platform directories and update the build system. Reviewed-by: Allan Jude <[email protected]> Reviewed-by: Ryan Moeller <[email protected]> Reviewed by: Brad Lewis <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Matt Macy <[email protected]> Closes #9290
* OpenZFS restructuring - move platform specific headersMatthew Macy2019-09-052-0/+449
| | | | | | | | | | | | | | | | | Move platform specific Linux headers under include/os/linux/. Update the build system accordingly to detect the platform. This lays some of the initial groundwork to supporting building for other platforms. As part of this change it was necessary to create both a user and kernel space sys/simd.h header which can be included in either context. No functional change, the source has been refactored and the relevant #include's updated. Reviewed-by: Jorgen Lundman <[email protected]> Reviewed-by: Igor Kozhukhov <[email protected]> Signed-off-by: Matthew Macy <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #9198
* Fix typos in lib/Andrea Gelmini2019-09-024-4/+4
| | | | | | | Reviewed-by: Ryan Moeller <[email protected]> Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Andrea Gelmini <[email protected]> Closes #9237
* ZFS Reads may result in unneccesary calls to zil_commitGeorge Wilson2019-03-221-1/+0
| | | | | | | | | | | | | | | ZFS supports O_RSYNC for read operations and when specified will ensure the same level of data integrity that O_DSYNC and O_SYNC provides for writes. O_RSYNC by itself has no effect so it must be combined with either O_DSYNC or O_SYNC. However, many platforms don't support O_RSYNC and have mapped O_SYNC to mean O_RSYNC within ZFS. This is incorrect and causes unnecessary calls to zil_commit. Only platforms which support O_RSYNC should implement the zil_commit functionality in the read code path. Reviewed-by: Matt Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Wilson <[email protected]> Closes #8523
* Fix kernel unaligned access on sparc64Brian Behlendorf2018-07-111-0/+7
| | | | | | | | | | | | | | Update the SA_COPY_DATA macro to check if architecture supports efficient unaligned memory accesses at compile time. Otherwise fallback to using the sa_copy_data() function. The kernel provided CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is used to determine availability in kernel space. In user space the x86_64, x86, powerpc, and sometimes arm architectures will define the HAVE_EFFICIENT_UNALIGNED_ACCESS macro. Signed-off-by: Brian Behlendorf <[email protected]> Closes #7642 Closes #7684
* Linux 4.18 compat: inode timespec -> timespec64Brian Behlendorf2018-06-191-8/+29
| | | | | | | | | | | | | | | | | | | | | | | | Commit torvalds/linux@95582b0 changes the inode i_atime, i_mtime, and i_ctime members form timespec's to timespec64's to make them 2038 safe. As part of this change the current_time() function was also updated to return the timespec64 type. Resolve this issue by introducing a new inode_timespec_t type which is defined to match the timespec type used by the inode. It should be used when working with inode timestamps to ensure matching types. The timestruc_t type under Illumos was used in a similar fashion but was specified to always be a timespec_t. Rather than incorrectly define this type all timespec_t types have been replaced by the new inode_timespec_t type. Finally, the kernel and user space 'sys/time.h' headers were aligned with each other. They define as appropriate for the context several constants as macros and include static inline implementation of gethrestime(), gethrestime_sec(), and gethrtime(). Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #7643
* Add pool state /proc entry, "SUSPENDED" poolsTony Hutter2018-06-061-0/+2
| | | | | | | | | | | | | | | | | | | 1. Add a proc entry to display the pool's state: $ cat /proc/spl/kstat/zfs/tank/state ONLINE This is done without using the spa config locks, so it will never hang. 2. Fix 'zpool status' and 'zpool list -o health' output to print "SUSPENDED" instead of "ONLINE" for suspended pools. Reviewed-by: Olaf Faaland <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed by: Richard Elling <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #7331 Closes #7563
* Update build system and packagingBrian Behlendorf2018-05-297-96/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Minimal changes required to integrate the SPL sources in to the ZFS repository build infrastructure and packaging. Build system and packaging: * Renamed SPL_* autoconf m4 macros to ZFS_*. * Removed redundant SPL_* autoconf m4 macros. * Updated the RPM spec files to remove SPL package dependency. * The zfs package obsoletes the spl package, and the zfs-kmod package obsoletes the spl-kmod package. * The zfs-kmod-devel* packages were updated to add compatibility symlinks under /usr/src/spl-x.y.z until all dependent packages can be updated. They will be removed in a future release. * Updated copy-builtin script for in-kernel builds. * Updated DKMS package to include the spl.ko. * Updated stale AUTHORS file to include all contributors. * Updated stale COPYRIGHT and included the SPL as an exception. * Renamed README.markdown to README.md * Renamed OPENSOLARIS.LICENSE to LICENSE. * Renamed DISCLAIMER to NOTICE. Required code changes: * Removed redundant HAVE_SPL macro. * Removed _BOOT from nvpairs since it doesn't apply for Linux. * Initial header cleanup (removal of empty headers, refactoring). * Remove SPL repository clone/build from zimport.sh. * Use of DEFINE_RATELIMIT_STATE and DEFINE_SPINLOCK removed due to build issues when forcing C99 compilation. * Replaced legacy ACCESS_ONCE with READ_ONCE. * Include needed headers for `current` and `EXPORT_SYMBOL`. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Olaf Faaland <[email protected]> Reviewed-by: Matthew Ahrens <[email protected]> Reviewed-by: Pavel Zakharov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> TEST_ZIMPORT_SKIP="yes" Closes #7556
* Remove lib/libspl/include/sys/frame.hBrian Behlendorf2017-12-172-132/+0
| | | | | | | | | | | The functionality provided by this header is not required by any of the ZFS user space code. Minimal functionality was provided in commit c28a677 which added include/sys/frame.h. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6960 Closes #6972
* OpenZFS 8585 - improve batching done in zil_commit()Prakash Surya2017-12-051-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Authored by: Prakash Surya <[email protected]> Reviewed by: Brad Lewis <[email protected]> Reviewed by: Matt Ahrens <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Approved by: Dan McDonald <[email protected]> Ported-by: Prakash Surya <[email protected]> Problem ======= The current implementation of zil_commit() can introduce significant latency, beyond what is inherent due to the latency of the underlying storage. The additional latency comes from two main problems: 1. When there's outstanding ZIL blocks being written (i.e. there's already a "writer thread" in progress), then any new calls to zil_commit() will block waiting for the currently oustanding ZIL blocks to complete. The blocks written for each "writer thread" is coined a "batch", and there can only ever be a single "batch" being written at a time. When a batch is being written, any new ZIL transactions will have to wait for the next batch to be written, which won't occur until the current batch finishes. As a result, the underlying storage may not be used as efficiently as possible. While "new" threads enter zil_commit() and are blocked waiting for the next batch, it's possible that the underlying storage isn't fully utilized by the current batch of ZIL blocks. In that case, it'd be better to allow these new threads to generate (and issue) a new ZIL block, such that it could be serviced by the underlying storage concurrently with the other ZIL blocks that are being serviced. 2. Any call to zil_commit() must wait for all ZIL blocks in its "batch" to complete, prior to zil_commit() returning. The size of any given batch is proportional to the number of ZIL transaction in the queue at the time that the batch starts processing the queue; which doesn't occur until the previous batch completes. Thus, if there's a lot of transactions in the queue, the batch could be composed of many ZIL blocks, and each call to zil_commit() will have to wait for all of these writes to complete (even if the thread calling zil_commit() only cared about one of the transactions in the batch). To further complicate the situation, these two issues result in the following side effect: 3. If a given batch takes longer to complete than normal, this results in larger batch sizes, which then take longer to complete and further drive up the latency of zil_commit(). This can occur for a number of reasons, including (but not limited to): transient changes in the workload, and storage latency irregularites. Solution ======== The solution attempted by this change has the following goals: 1. no on-disk changes; maintain current on-disk format. 2. modify the "batch size" to be equal to the "ZIL block size". 3. allow new batches to be generated and issued to disk, while there's already batches being serviced by the disk. 4. allow zil_commit() to wait for as few ZIL blocks as possible. 5. use as few ZIL blocks as possible, for the same amount of ZIL transactions, without introducing significant latency to any individual ZIL transaction. i.e. use fewer, but larger, ZIL blocks. In theory, with these goals met, the new allgorithm will allow the following improvements: 1. new ZIL blocks can be generated and issued, while there's already oustanding ZIL blocks being serviced by the storage. 2. the latency of zil_commit() should be proportional to the underlying storage latency, rather than the incoming synchronous workload. Porting Notes ============= Due to the changes made in commit 119a394ab0, the lifetime of an itx structure differs than in OpenZFS. Specifically, the itx structure is kept around until the data associated with the itx is considered to be safe on disk; this is so that the itx's callback can be called after the data is committed to stable storage. Since OpenZFS doesn't have this itx callback mechanism, it's able to destroy the itx structure immediately after the itx is committed to an lwb (before the lwb is written to disk). To support this difference, and to ensure the itx's callbacks can still be called after the itx's data is on disk, a few changes had to be made: * A list of itxs was added to the lwb structure. This list contains all of the itxs that have been committed to the lwb, such that the callbacks for these itxs can be called from zil_lwb_flush_vdevs_done(), after the data for the itxs is committed to disk. * A list of itxs was added on the stack of the zil_process_commit_list() function; the "nolwb_itxs" list. In some circumstances, an itx may not be committed to an lwb (e.g. if allocating the "next" ZIL block on disk fails), so this list is used to keep track of which itxs fall into this state, such that their callbacks can be called after the ZIL's writer pipeline is "stalled". * The logic to actually call the itx's callback was moved into the zil_itx_destroy() function. Since all consumers of zil_itx_destroy() were effectively performing the same logic (i.e. if callback is non-null, call the callback), it seemed like useful code cleanup to consolidate this logic into a single function. Additionally, the existing Linux tracepoint infrastructure dealing with the ZIL's probes and structures had to be updated to reflect these code changes. Specifically: * The "zil__cw1" and "zil__cw2" probes were removed, so they had to be removed from "trace_zil.h" as well. * Some of the zilog structure's fields were removed, which affected the tracepoint definitions of the structure. * New tracepoints had to be added for the following 3 new probes: * zil__process__commit__itx * zil__process__normal__itx * zil__commit__io__error OpenZFS-issue: https://www.illumos.org/issues/8585 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/5d95a3a Closes #6566
* Native Encryption for ZFS on LinuxTom Caputi2017-08-141-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change incorporates three major pieces: The first change is a keystore that manages wrapping and encryption keys for encrypted datasets. These commands mostly involve manipulating the new DSL Crypto Key ZAP Objects that live in the MOS. Each encrypted dataset has its own DSL Crypto Key that is protected with a user's key. This level of indirection allows users to change their keys without re-encrypting their entire datasets. The change implements the new subcommands "zfs load-key", "zfs unload-key" and "zfs change-key" which allow the user to manage their encryption keys and settings. In addition, several new flags and properties have been added to allow dataset creation and to make mounting and unmounting more convenient. The second piece of this patch provides the ability to encrypt, decyrpt, and authenticate protected datasets. Each object set maintains a Merkel tree of Message Authentication Codes that protect the lower layers, similarly to how checksums are maintained. This part impacts the zio layer, which handles the actual encryption and generation of MACs, as well as the ARC and DMU, which need to be able to handle encrypted buffers and protected data. The last addition is the ability to do raw, encrypted sends and receives. The idea here is to send raw encrypted and compressed data and receive it exactly as is on a backup system. This means that the dataset on the receiving system is protected using the same user key that is in use on the sending side. By doing so, datasets can be efficiently backed up to an untrusted system without fear of data being compromised. Reviewed by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #494 Closes #5769
* Allow longer SPA names in statsgaurkuma2017-08-111-1/+1
| | | | | | | | | | | | The pool name can be 256 chars long. Today, in /proc/spl/kstat/zfs/ the name is limited to < 32 characters. This change is to allows bigger pool names. Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: loli10K <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: gaurkuma <[email protected]> Closes #6481
* Add libtpool (thread pools)Brian Behlendorf2017-08-091-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | OpenZFS provides a library called tpool which implements thread pools for user space applications. Porting this library means the zpool utility no longer needs to borrow the kernel mutex and taskq interfaces from libzpool. This code was updated to use the tpool library which behaves in a very similar fashion. Porting libtpool was relatively straight forward and minimal modifications were needed. The core changes were: * Fully convert the library to use pthreads. * Updated signal handling. * lmalloc/lfree converted to calloc/free * Implemented portable pthread_attr_clone() function. Finally, update the build system such that libzpool.so is no longer linked in to zfs(8), zpool(8), etc. All that is required is libzfs to which the zcommon soures were added (which is the way it always should have been). Removing the libzpool dependency resulted in several build issues which needed to be resolved. * Moved zfeature support to module/zcommon/zfeature_common.c * Moved ratelimiting to to module/zfs/zfs_ratelimit.c * Moved get_system_hostid() to lib/libspl/gethostid.c * Removed use of cmn_err() in zcommon source * Removed dprintf_setup() call from zpool_main.c and zfs_main.c * Removed highbit() and lowbit() * Removed unnecessary library dependencies from Makefiles * Removed fletcher-4 kstat in user space * Added sha2 support explicitly to libzfs * Added highbit64() and lowbit64() to zpool_util.c Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6442
* Fix header inclusions for standards conformanceRichard Yao2017-04-124-0/+114
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | musl's sys/errno.h is literally: /#warning redirecting incorrect #include <sys/errno.h> to <errno.h> /#include <errno.h> It does the same for sys/{poll,signal}.h. This is rather noisy when building ZoL against musl. musl is also correct in pointing out that the correct headers are outside of sys/ according to the single unix specification: http://pubs.opengroup.org/onlinepubs/7908799/xsh/errno.h.html http://pubs.opengroup.org/onlinepubs/7908799/xsh/poll.h.html http://pubs.opengroup.org/onlinepubs/7908799/xsh/signal.h.html Lets implement our own sys/* versions of these headers to redirect to the proper userland ones when building in userspace. That will silence the warning. There are also some instances where we include incorrectly from sys/ or from outside of sys/ in userspace only code. In these instances, lets just fix the includes directly. Reviewed-by: George Melikov <[email protected]> Reviewed-by: loli10K <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #5993
* glibc 2.5 compat: use correct header for makedev() et al.Olaf Faaland2017-03-312-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In glibc 2.5, makedev(), major(), and minor() are defined in sys/sysmacros.h. They are also defined in types.h for backward compatability, but using these definitions triggers a compile warning. This breaks the ZFS build, as it builds with -Werror. autoconf email threads indicate these macros may be defined in sys/mkdev.h in some cases. This commit adds configure checks to detect where makedev() is defined: sys/sysmacros.h sys/mkdev.h It assumes major() and minor() are defined in the same place. The libspl types.h then includes sys/sysmacros.h (preferred) or sys/mkdev.h (2nd choice) if one of those defines makedev(). This is done before including the system types.h. An alternative would be to remove uses of major, minor, and makedev, instead comparing the st_dev returned from stat64. These configure checks would then be unnecessary. This change revealed that __NORETURN was being defined unnecessarily in libspl/include/sys/sysmacros.h. That definition is removed. The files in which __NORETURN are used all include types.h, and so all will get the definition provided by feature_tests.h Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Olaf Faaland <[email protected]> Closes #5945
* libspl: Fix incorrect use of platform defines on sparc64John Paul Adrian Glaubitz2017-03-221-2/+2
| | | | | | | | | | | | | | | | libspl tries to detect sparc64 by checking whether __sparc64__ is defined. Unfortunately, this assumption is not correct as sparc64 does not define __sparc64__ but it defines __sparc__ and __arch64__ instead. This leads to sparc64 being detected as 32-Bit sparc and the build fails because both _ILP32 and _LP64 are defined in this case. To fix the problem, remove the checks for __sparc64__ and just check __arch64__ if a sparc host was previously detected with __sparc__. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: John Paul Adrian Glaubitz <[email protected]> Closes #5913
* Fix powerpc buildBrian Behlendorf2017-03-061-4/+0
| | | | | | | | | | | | | | | | | | | Unlike other architectures which sanitize the LDFLAGS from the environment in arch/<arch>/Makefile. The powerpc Makefile allows LDFLAGS to be passed through resulting in the following build failure. /usr/bin/ld: unrecognized option '-Wl,-z,relro' LDFLAGS is set in /usr/lib/rpm/redhat/macros by default. Clear the environment variable when building kmods for powerpc. Additionally, now that ppc64le exists it's not longer safe to assume a powerpc system is big endian. Rely on the endianness provided by the compiler. Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #5856
* codebase style improvements for OpenZFS 6459 portGeorge Melikov2017-01-221-4/+8
|
* Fix spellingka72017-01-032-2/+2
| | | | | | | | | Reviewed-by: Brian Behlendorf <[email protected] Reviewed-by: Giuseppe Di Natale <[email protected]>> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Haakan T Johansson <[email protected]> Closes #5547 Closes #5543
* ABD optimized page allocation codeChunwei Chen2016-11-291-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | * Convert ABD to use the Linux Kernel scatterlist implementation instead of the hand rolled one from illumos. * Scatter ABDs are preferentially populated with higher order compound pages from a single zone. Allocation size is progressively decreased until it can be satisfied without performing reclaim or compaction. * An alternate page allocator is provided for kernels older than 3.6 and for CONFIG_HIGHMEM systems. This allocator is designed as a fallback for maximum compatibility. * Extended abdstats to provide visibility in the the allocator. * Add cached value for PAGESIZE in userspace. Contributions-by: Chunwei Chen <[email protected]> Gvozden Neskovic <[email protected]> Jinshan Xiong <[email protected]> Isaac Huang <[email protected]> David Quigley <[email protected]> Brian Behlendorf <[email protected]>
* fix: Shift exponent too largeGvozden Neskovic2016-09-291-0/+3
| | | | | | | | | | | | Undefined operation is reported by running ztest (or zloop) compiled with GCC UndefinedBehaviorSanitizer. Error only happens on top level of dnode indirection with large enough offset values. Logically, left shift operation would work, but bit shift semantics in C, and limitation of uint64_t, do not produce desired result. Issue #5059, #4883 Signed-off-by: Gvozden Neskovic <[email protected]>
* Change /etc/mtab to /proc/self/mountsslashdd2016-09-201-1/+1
| | | | | | | | | | | Fix misleading error message: "The /dev/zfs device is missing and must be created.", if /etc/mtab is missing. Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Eric Desrochers <[email protected]> Closes #4680 Closes #5029
* OpenZFS 5997 - FRU field not set during pool creation and never updatedHans Rosenfeld2016-08-124-271/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Authored by: Hans Rosenfeld <[email protected]> Reviewed by: Dan Fields <[email protected]> Reviewed by: Josef Sipek <[email protected]> Reviewed by: Richard Elling <[email protected]> Reviewed by: George Wilson <[email protected]> Approved by: Robert Mustacchi <[email protected]> Signed-off-by: Don Brady <[email protected]> Ported-by: Brian Behlendorf <[email protected]> OpenZFS-issue: https://www.illumos.org/issues/5997 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/1437283 Porting Notes: In addition to the OpenZFS changes this patch realigns the events with those found in OpenZFS. Events which would be logged as sysevents on illumos have been been mapped to the 'sysevent' class for Linux. In addition, several subclass names have been changed to match what is used in OpenZFS. In all cases this means a '.' was changed to an '_' in the subclass. The scripts provided by ZoL have been updated, however users which provide scripts for any of the following events will need to rename them based on the new subclass names. ereport.fs.zfs.config.sync sysevent.fs.zfs.config_sync ereport.fs.zfs.zpool.destroy sysevent.fs.zfs.pool_destroy ereport.fs.zfs.zpool.reguid sysevent.fs.zfs.pool_reguid ereport.fs.zfs.vdev.remove sysevent.fs.zfs.vdev_remove ereport.fs.zfs.vdev.clear sysevent.fs.zfs.vdev_clear ereport.fs.zfs.vdev.check sysevent.fs.zfs.vdev_check ereport.fs.zfs.vdev.spare sysevent.fs.zfs.vdev_spare ereport.fs.zfs.vdev.autoexpand sysevent.fs.zfs.vdev_autoexpand ereport.fs.zfs.resilver.start sysevent.fs.zfs.resilver_start ereport.fs.zfs.resilver.finish sysevent.fs.zfs.resilver_finish ereport.fs.zfs.scrub.start sysevent.fs.zfs.scrub_start ereport.fs.zfs.scrub.finish sysevent.fs.zfs.scrub_finish ereport.fs.zfs.bootfs.vdev.attach sysevent.fs.zfs.bootfs_vdev_attach
* Illumos Crypto Port module added to enable native encryption in zfsTom Caputi2016-07-203-1/+24
| | | | | | | | | | | | | | | | | | | | A port of the Illumos Crypto Framework to a Linux kernel module (found in module/icp). This is needed to do the actual encryption work. We cannot use the Linux kernel's built in crypto api because it is only exported to GPL-licensed modules. Having the ICP also means the crypto code can run on any of the other kernels under OpenZFS. I ended up porting over most of the internals of the framework, which means that porting over other API calls (if we need them) should be fairly easy. Specifically, I have ported over the API functions related to encryption, digests, macs, and crypto templates. The ICP is able to use assembly-accelerated encryption on amd64 machines and AES-NI instructions on Intel chips that support it. There are place-holder directories for similar assembly optimizations for other architectures (although they have not been written). Signed-off-by: Tom Caputi <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4329
* Add isa_defs for MIPSYunQiang Su2016-05-311-1/+22
| | | | | | | | | GCC for MIPS only defines _LP64 when 64bit, while no _ILP32 defined when 32bit. Signed-off-by: YunQiang Su <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4712
* Add -lhHpw options to "zpool iostat" for avg latency, histograms, & queuesTony Hutter2016-05-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the zfs module to collect statistics on average latencies, queue sizes, and keep an internal histogram of all IO latencies. Along with this, update "zpool iostat" with some new options to print out the stats: -l: Include average IO latencies stats: total_wait disk_wait syncq_wait asyncq_wait scrub read write read write read write read write wait ----- ----- ----- ----- ----- ----- ----- ----- ----- - 41ms - 2ms - 46ms - 4ms - - 5ms - 1ms - 1us - 4ms - - 5ms - 1ms - 1us - 4ms - - - - - - - - - - - 49ms - 2ms - 47ms - - - - - - - - - - - - - 2ms - 1ms - - - 1ms - ----- ----- ----- ----- ----- ----- ----- ----- ----- 1ms 1ms 1ms 413us 16us 25us - 5ms - 1ms 1ms 1ms 413us 16us 25us - 5ms - 2ms 1ms 2ms 412us 26us 25us - 5ms - - 1ms - 413us - 25us - 5ms - - 1ms - 460us - 29us - 5ms - 196us 1ms 196us 370us 7us 23us - 5ms - ----- ----- ----- ----- ----- ----- ----- ----- ----- -w: Print out latency histograms: sdb total disk sync_queue async_queue latency read write read write read write read write scrub ------- ------ ------ ------ ------ ------ ------ ------ ------ ------ 1ns 0 0 0 0 0 0 0 0 0 ... 33us 0 0 0 0 0 0 0 0 0 66us 0 0 107 2486 2 788 12 12 0 131us 2 797 359 4499 10 558 184 184 6 262us 22 801 264 1563 10 286 287 287 24 524us 87 575 71 52086 15 1063 136 136 92 1ms 152 1190 5 41292 4 1693 252 252 141 2ms 245 2018 0 50007 0 2322 371 371 220 4ms 189 7455 22 162957 0 3912 6726 6726 199 8ms 108 9461 0 102320 0 5775 2526 2526 86 17ms 23 11287 0 37142 0 8043 1813 1813 19 34ms 0 14725 0 24015 0 11732 3071 3071 0 67ms 0 23597 0 7914 0 18113 5025 5025 0 134ms 0 33798 0 254 0 25755 7326 7326 0 268ms 0 51780 0 12 0 41593 10002 10002 0 537ms 0 77808 0 0 0 64255 13120 13120 0 1s 0 105281 0 0 0 83805 20841 20841 0 2s 0 88248 0 0 0 73772 14006 14006 0 4s 0 47266 0 0 0 29783 17176 17176 0 9s 0 10460 0 0 0 4130 6295 6295 0 17s 0 0 0 0 0 0 0 0 0 34s 0 0 0 0 0 0 0 0 0 69s 0 0 0 0 0 0 0 0 0 137s 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------- -h: Help -H: Scripted mode. Do not display headers, and separate fields by a single tab instead of arbitrary space. -q: Include current number of entries in sync & async read/write queues, and scrub queue: syncq_read syncq_write asyncq_read asyncq_write scrubq_read pend activ pend activ pend activ pend activ pend activ ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- 0 0 0 0 78 29 0 0 0 0 0 0 0 0 78 29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - - - - - - - - - - 0 0 0 0 0 0 0 0 0 0 - - - - - - - - - - 0 0 0 0 0 0 0 0 0 0 ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- 0 0 227 394 0 19 0 0 0 0 0 0 227 394 0 19 0 0 0 0 0 0 108 98 0 19 0 0 0 0 0 0 19 98 0 0 0 0 0 0 0 0 78 98 0 0 0 0 0 0 0 0 19 88 0 0 0 0 0 0 ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -p: Display numbers in parseable (exact) values. Also, update iostat syntax to allow the user to specify specific vdevs to show statistics for. The three options for choosing pools/vdevs are: Display a list of pools: zpool iostat ... [pool ...] Display a list of vdevs from a specific pool: zpool iostat ... [pool vdev ...] Display a list of vdevs from any pools: zpool iostat ... [vdev ...] Lastly, allow zpool command "interval" value to be floating point: zpool iostat -v 0.5 Signed-off-by: Tony Hutter <[email protected] Signed-off-by: Brian Behlendorf <[email protected]> Closes #4433
* OpenZFS 6672 - arc_reclaim_thread() should use gethrtime()David Quigley2016-05-061-0/+9
| | | | | | | | | | | | | | | 6672 arc_reclaim_thread() should use gethrtime() instead of ddi_get_lbolt() Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Prakash Surya <[email protected]> Reviewed by: Josef 'Jeff' Sipek <[email protected]> Reviewed by: Robert Mustacchi <[email protected]> Approved by: Dan McDonald <[email protected]> Ported-by: David Quigley <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> OpenZFS-issue: https://www.illumos.org/issues/6672 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/571be5c Closes #4600
* Move hrtime_t timestruc_t and timespec_tCarlo Landmeter2016-03-292-4/+5
| | | | | | | | | | | hrtime_t timestruc_t and timespec_t should have originally been included in sys/time.h so lets move them. longlong_t is not defined by any standard so change it to long long Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4459