aboutsummaryrefslogtreecommitdiffstats
path: root/lib/libspl
Commit message (Collapse)AuthorAgeFilesLines
* Remove lib/libspl/include/sys/frame.hBrian Behlendorf2017-12-172-132/+0
| | | | | | | | | | | The functionality provided by this header is not required by any of the ZFS user space code. Minimal functionality was provided in commit c28a677 which added include/sys/frame.h. Reviewed-by: George Melikov <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6960 Closes #6972
* OpenZFS 8585 - improve batching done in zil_commit()Prakash Surya2017-12-052-3/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Authored by: Prakash Surya <[email protected]> Reviewed by: Brad Lewis <[email protected]> Reviewed by: Matt Ahrens <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Approved by: Dan McDonald <[email protected]> Ported-by: Prakash Surya <[email protected]> Problem ======= The current implementation of zil_commit() can introduce significant latency, beyond what is inherent due to the latency of the underlying storage. The additional latency comes from two main problems: 1. When there's outstanding ZIL blocks being written (i.e. there's already a "writer thread" in progress), then any new calls to zil_commit() will block waiting for the currently oustanding ZIL blocks to complete. The blocks written for each "writer thread" is coined a "batch", and there can only ever be a single "batch" being written at a time. When a batch is being written, any new ZIL transactions will have to wait for the next batch to be written, which won't occur until the current batch finishes. As a result, the underlying storage may not be used as efficiently as possible. While "new" threads enter zil_commit() and are blocked waiting for the next batch, it's possible that the underlying storage isn't fully utilized by the current batch of ZIL blocks. In that case, it'd be better to allow these new threads to generate (and issue) a new ZIL block, such that it could be serviced by the underlying storage concurrently with the other ZIL blocks that are being serviced. 2. Any call to zil_commit() must wait for all ZIL blocks in its "batch" to complete, prior to zil_commit() returning. The size of any given batch is proportional to the number of ZIL transaction in the queue at the time that the batch starts processing the queue; which doesn't occur until the previous batch completes. Thus, if there's a lot of transactions in the queue, the batch could be composed of many ZIL blocks, and each call to zil_commit() will have to wait for all of these writes to complete (even if the thread calling zil_commit() only cared about one of the transactions in the batch). To further complicate the situation, these two issues result in the following side effect: 3. If a given batch takes longer to complete than normal, this results in larger batch sizes, which then take longer to complete and further drive up the latency of zil_commit(). This can occur for a number of reasons, including (but not limited to): transient changes in the workload, and storage latency irregularites. Solution ======== The solution attempted by this change has the following goals: 1. no on-disk changes; maintain current on-disk format. 2. modify the "batch size" to be equal to the "ZIL block size". 3. allow new batches to be generated and issued to disk, while there's already batches being serviced by the disk. 4. allow zil_commit() to wait for as few ZIL blocks as possible. 5. use as few ZIL blocks as possible, for the same amount of ZIL transactions, without introducing significant latency to any individual ZIL transaction. i.e. use fewer, but larger, ZIL blocks. In theory, with these goals met, the new allgorithm will allow the following improvements: 1. new ZIL blocks can be generated and issued, while there's already oustanding ZIL blocks being serviced by the storage. 2. the latency of zil_commit() should be proportional to the underlying storage latency, rather than the incoming synchronous workload. Porting Notes ============= Due to the changes made in commit 119a394ab0, the lifetime of an itx structure differs than in OpenZFS. Specifically, the itx structure is kept around until the data associated with the itx is considered to be safe on disk; this is so that the itx's callback can be called after the data is committed to stable storage. Since OpenZFS doesn't have this itx callback mechanism, it's able to destroy the itx structure immediately after the itx is committed to an lwb (before the lwb is written to disk). To support this difference, and to ensure the itx's callbacks can still be called after the itx's data is on disk, a few changes had to be made: * A list of itxs was added to the lwb structure. This list contains all of the itxs that have been committed to the lwb, such that the callbacks for these itxs can be called from zil_lwb_flush_vdevs_done(), after the data for the itxs is committed to disk. * A list of itxs was added on the stack of the zil_process_commit_list() function; the "nolwb_itxs" list. In some circumstances, an itx may not be committed to an lwb (e.g. if allocating the "next" ZIL block on disk fails), so this list is used to keep track of which itxs fall into this state, such that their callbacks can be called after the ZIL's writer pipeline is "stalled". * The logic to actually call the itx's callback was moved into the zil_itx_destroy() function. Since all consumers of zil_itx_destroy() were effectively performing the same logic (i.e. if callback is non-null, call the callback), it seemed like useful code cleanup to consolidate this logic into a single function. Additionally, the existing Linux tracepoint infrastructure dealing with the ZIL's probes and structures had to be updated to reflect these code changes. Specifically: * The "zil__cw1" and "zil__cw2" probes were removed, so they had to be removed from "trace_zil.h" as well. * Some of the zilog structure's fields were removed, which affected the tracepoint definitions of the structure. * New tracepoints had to be added for the following 3 new probes: * zil__process__commit__itx * zil__process__normal__itx * zil__commit__io__error OpenZFS-issue: https://www.illumos.org/issues/8585 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/5d95a3a Closes #6566
* Remove useless DEFAULT_INCLUDES in AM_CCASFLAGSKireinaHoro2017-10-121-1/+0
| | | | | | | CPPASCOMPILE and LTCPPASCOMPILE all include DEFAULT_INCLUDES, hence it's unnecessary to add the includes again. Signed-off-by: Pengcheng Xu <[email protected]>
* Fix libspl assembler flags to respect cpu typeKireinaHoro2017-10-121-1/+2
| | | | | | | | It's important to respect the user's CFLAGS as mismatched -mcpu will directly result in the assembler not able to produce correct code. Fixes #6733. Signed-off-by: Pengcheng Xu <[email protected]>
* Native Encryption for ZFS on LinuxTom Caputi2017-08-141-0/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change incorporates three major pieces: The first change is a keystore that manages wrapping and encryption keys for encrypted datasets. These commands mostly involve manipulating the new DSL Crypto Key ZAP Objects that live in the MOS. Each encrypted dataset has its own DSL Crypto Key that is protected with a user's key. This level of indirection allows users to change their keys without re-encrypting their entire datasets. The change implements the new subcommands "zfs load-key", "zfs unload-key" and "zfs change-key" which allow the user to manage their encryption keys and settings. In addition, several new flags and properties have been added to allow dataset creation and to make mounting and unmounting more convenient. The second piece of this patch provides the ability to encrypt, decyrpt, and authenticate protected datasets. Each object set maintains a Merkel tree of Message Authentication Codes that protect the lower layers, similarly to how checksums are maintained. This part impacts the zio layer, which handles the actual encryption and generation of MACs, as well as the ARC and DMU, which need to be able to handle encrypted buffers and protected data. The last addition is the ability to do raw, encrypted sends and receives. The idea here is to send raw encrypted and compressed data and receive it exactly as is on a backup system. This means that the dataset on the receiving system is protected using the same user key that is in use on the sending side. By doing so, datasets can be efficiently backed up to an untrusted system without fear of data being compromised. Reviewed by: Matthew Ahrens <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Jorgen Lundman <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #494 Closes #5769
* Allow longer SPA names in statsgaurkuma2017-08-111-1/+1
| | | | | | | | | | | | The pool name can be 256 chars long. Today, in /proc/spl/kstat/zfs/ the name is limited to < 32 characters. This change is to allows bigger pool names. Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: loli10K <[email protected]> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: gaurkuma <[email protected]> Closes #6481
* Add libtpool (thread pools)Brian Behlendorf2017-08-093-1/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | OpenZFS provides a library called tpool which implements thread pools for user space applications. Porting this library means the zpool utility no longer needs to borrow the kernel mutex and taskq interfaces from libzpool. This code was updated to use the tpool library which behaves in a very similar fashion. Porting libtpool was relatively straight forward and minimal modifications were needed. The core changes were: * Fully convert the library to use pthreads. * Updated signal handling. * lmalloc/lfree converted to calloc/free * Implemented portable pthread_attr_clone() function. Finally, update the build system such that libzpool.so is no longer linked in to zfs(8), zpool(8), etc. All that is required is libzfs to which the zcommon soures were added (which is the way it always should have been). Removing the libzpool dependency resulted in several build issues which needed to be resolved. * Moved zfeature support to module/zcommon/zfeature_common.c * Moved ratelimiting to to module/zfs/zfs_ratelimit.c * Moved get_system_hostid() to lib/libspl/gethostid.c * Removed use of cmn_err() in zcommon source * Removed dprintf_setup() call from zpool_main.c and zfs_main.c * Removed highbit() and lowbit() * Removed unnecessary library dependencies from Makefiles * Removed fletcher-4 kstat in user space * Added sha2 support explicitly to libzfs * Added highbit64() and lowbit64() to zpool_util.c Reviewed-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #6442
* Fixed VERIFY3_IMPL() bug from 682ce104Tom Caputi2017-07-121-1/+2
| | | | | | | | | | | When VERIFY3_IMPL() was adjusted in 682ce104, the values of the operands were omitted from the variadic arguments list. This patch simply corrects this. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Tom Caputi <[email protected]> Closes #6343
* Musl libc fixesalaviss2017-07-051-0/+1
| | | | | | | | | Musl libc's <stdio.h> doesn't include <stdarg.h>, which cause `va_start` and `va_end` end up being undefined symbols. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Leorize <[email protected]> Closes #6310
* GCC 7.1 fixesTony Hutter2017-06-281-7/+17
| | | | | | | | | | | GCC 7.1 with will warn when we're not checking the snprintf() return code in cases where the buffer could be truncated. This patch either checks the snprintf return code (where applicable), or simply disables the warnings (ztest.c). Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #6253
* Compatibilty with glibc-2.23Justin Lecher2017-05-161-0/+1
| | | | | | | | | | In glibc-2.23 <sys/sysmacros.h> isn't automatically included in <sys/types.h> [1], so we need ot explicitely include it. https://sourceware.org/ml/libc-alpha/2015-11/msg00253.html Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Justin Lecher <[email protected]> Closes #6132
* Fix header inclusions for standards conformanceRichard Yao2017-04-124-0/+114
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | musl's sys/errno.h is literally: /#warning redirecting incorrect #include <sys/errno.h> to <errno.h> /#include <errno.h> It does the same for sys/{poll,signal}.h. This is rather noisy when building ZoL against musl. musl is also correct in pointing out that the correct headers are outside of sys/ according to the single unix specification: http://pubs.opengroup.org/onlinepubs/7908799/xsh/errno.h.html http://pubs.opengroup.org/onlinepubs/7908799/xsh/poll.h.html http://pubs.opengroup.org/onlinepubs/7908799/xsh/signal.h.html Lets implement our own sys/* versions of these headers to redirect to the proper userland ones when building in userspace. That will silence the warning. There are also some instances where we include incorrectly from sys/ or from outside of sys/ in userspace only code. In these instances, lets just fix the includes directly. Reviewed-by: George Melikov <[email protected]> Reviewed-by: loli10K <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #5993
* Fix `zpool iostat -T d 1` on muslRichard Yao2017-04-121-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | When building on Gentoo against musl, GCC complains: timestamp.c: In function ‘print_timestamp’: timestamp.c:32:19: warning: passing argument 1 of ‘nl_langinfo’ makes integer from pointer without a cast #define _DATE_FMT "%+" ^ timestamp.c:47:21: note: in expansion of macro ‘_DATE_FMT’ fmt = nl_langinfo(_DATE_FMT); ^ The error was wrapped to meet comment style requirements. This code is used by `zpool iostat -T d 1` to print a date and upon testing it, I see no date printed. Lets use D_T_FMT so that something gets printed and if D_T_FMT is not avaliable, then we can fall back to "%+". Reviewed-by: George Melikov <[email protected]> Reviewed-by: loli10K <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: George Melikov <[email protected]> Signed-off-by: Richard Yao <[email protected]> Closes #5993
* glibc 2.5 compat: use correct header for makedev() et al.Olaf Faaland2017-03-312-1/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In glibc 2.5, makedev(), major(), and minor() are defined in sys/sysmacros.h. They are also defined in types.h for backward compatability, but using these definitions triggers a compile warning. This breaks the ZFS build, as it builds with -Werror. autoconf email threads indicate these macros may be defined in sys/mkdev.h in some cases. This commit adds configure checks to detect where makedev() is defined: sys/sysmacros.h sys/mkdev.h It assumes major() and minor() are defined in the same place. The libspl types.h then includes sys/sysmacros.h (preferred) or sys/mkdev.h (2nd choice) if one of those defines makedev(). This is done before including the system types.h. An alternative would be to remove uses of major, minor, and makedev, instead comparing the st_dev returned from stat64. These configure checks would then be unnecessary. This change revealed that __NORETURN was being defined unnecessarily in libspl/include/sys/sysmacros.h. That definition is removed. The files in which __NORETURN are used all include types.h, and so all will get the definition provided by feature_tests.h Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Olaf Faaland <[email protected]> Closes #5945
* libspl: Fix incorrect use of platform defines on sparc64John Paul Adrian Glaubitz2017-03-221-2/+2
| | | | | | | | | | | | | | | | libspl tries to detect sparc64 by checking whether __sparc64__ is defined. Unfortunately, this assumption is not correct as sparc64 does not define __sparc64__ but it defines __sparc__ and __arch64__ instead. This leads to sparc64 being detected as 32-Bit sparc and the build fails because both _ILP32 and _LP64 are defined in this case. To fix the problem, remove the checks for __sparc64__ and just check __arch64__ if a sparc host was previously detected with __sparc__. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: John Paul Adrian Glaubitz <[email protected]> Closes #5913
* Fix powerpc buildBrian Behlendorf2017-03-061-4/+0
| | | | | | | | | | | | | | | | | | | Unlike other architectures which sanitize the LDFLAGS from the environment in arch/<arch>/Makefile. The powerpc Makefile allows LDFLAGS to be passed through resulting in the following build failure. /usr/bin/ld: unrecognized option '-Wl,-z,relro' LDFLAGS is set in /usr/lib/rpm/redhat/macros by default. Clear the environment variable when building kmods for powerpc. Additionally, now that ppc64le exists it's not longer safe to assume a powerpc system is big endian. Rely on the endianness provided by the compiler. Reviewed-by: Giuseppe Di Natale <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #5856
* Fix atomic_sub_64() i386 assembly implementationBrian Behlendorf2017-02-011-1/+1
| | | | | | | | | | | | | The atomic_sub_64() should use sbbl instead of adcl. In user space these atomics are used for statistics tracking and aren't critical which explain how this was overlooked. The kernel space implementation of these atomics are layered on the architecture specific implementations provided by the kernel. Reviewed by: Stefan Ring <[email protected]> Reviewed-by: Gvozden Neskovic <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #5671 Closes #5717
* codebase style improvements for OpenZFS 6459 portGeorge Melikov2017-01-222-5/+10
|
* Fix spellingka72017-01-033-3/+3
| | | | | | | | | Reviewed-by: Brian Behlendorf <[email protected] Reviewed-by: Giuseppe Di Natale <[email protected]>> Reviewed-by: George Melikov <[email protected]> Reviewed-by: Haakan T Johansson <[email protected]> Closes #5547 Closes #5543
* Use cstyle -cpP in `make cstyle` checkBrian Behlendorf2016-12-123-5/+6
| | | | | | | | | | | | | | | | | | | | | | | Enable picky cstyle checks and resolve the new warnings. The vast majority of the changes needed were to handle minor issues with whitespace formatting. This patch contains no functional changes. Non-whitespace changes are as follows: * 8 times ; to { } in for/while loop * fix missing ; in cmd/zed/agents/zfs_diagnosis.c * comment (confim -> confirm) * change endline , to ; in cmd/zpool/zpool_main.c * a number of /* BEGIN CSTYLED */ /* END CSTYLED */ blocks * /* CSTYLED */ markers * change == 0 to ! * ulong to unsigned long in module/zfs/dsl_scan.c * rearrangement of module_param lines in module/zfs/metaslab.c * add { } block around statement after for_each_online_node Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Håkan Johansson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #5465
* ABD optimized page allocation codeChunwei Chen2016-11-293-2/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | * Convert ABD to use the Linux Kernel scatterlist implementation instead of the hand rolled one from illumos. * Scatter ABDs are preferentially populated with higher order compound pages from a single zone. Allocation size is progressively decreased until it can be satisfied without performing reclaim or compaction. * An alternate page allocator is provided for kernels older than 3.6 and for CONFIG_HIGHMEM systems. This allocator is designed as a fallback for maximum compatibility. * Extended abdstats to provide visibility in the the allocator. * Add cached value for PAGESIZE in userspace. Contributions-by: Chunwei Chen <[email protected]> Gvozden Neskovic <[email protected]> Jinshan Xiong <[email protected]> Isaac Huang <[email protected]> David Quigley <[email protected]> Brian Behlendorf <[email protected]>
* Fix coverity defects: CID 150921, 150927luozhengzheng2016-10-141-2/+2
| | | | | | | | CID 150921: Unchecked return value (CHECKED_RETURN) CID 150927 : Unchecked return value (CHECKED_RETURN) Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: luozhengzheng <[email protected]> Closes #5267
* OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-RTony Hutter2016-10-031-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reviewed by: George Wilson <[email protected]> Reviewed by: Prakash Surya <[email protected]> Reviewed by: Saso Kiselkov <[email protected]> Reviewed by: Richard Lowe <[email protected]> Approved by: Garrett D'Amore <[email protected]> Ported by: Tony Hutter <[email protected]> OpenZFS-issue: https://www.illumos.org/issues/4185 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/45818ee Porting Notes: This code is ported on top of the Illumos Crypto Framework code: https://github.com/zfsonlinux/zfs/pull/4329/commits/b5e030c8dbb9cd393d313571dee4756fbba8c22d The list of porting changes includes: - Copied module/icp/include/sha2/sha2.h directly from illumos - Removed from module/icp/algs/sha2/sha2.c: #pragma inline(SHA256Init, SHA384Init, SHA512Init) - Added 'ctx' to lib/libzfs/libzfs_sendrecv.c:zio_checksum_SHA256() since it now takes in an extra parameter. - Added CTASSERT() to assert.h from for module/zfs/edonr_zfs.c - Added skein & edonr to libicp/Makefile.am - Added sha512.S. It was generated from sha512-x86_64.pl in Illumos. - Updated ztest.c with new fletcher_4_*() args; used NULL for new CTX argument. - In icp/algs/edonr/edonr_byteorder.h, Removed the #if defined(__linux) section to not #include the non-existant endian.h. - In skein_test.c, renane NULL to 0 in "no test vector" array entries to get around a compiler warning. - Fixup test files: - Rename <sys/varargs.h> -> <varargs.h>, <strings.h> -> <string.h>, - Remove <note.h> and define NOTE() as NOP. - Define u_longlong_t - Rename "#!/usr/bin/ksh" -> "#!/bin/ksh -p" - Rename NULL to 0 in "no test vector" array entries to get around a compiler warning. - Remove "for isa in $($ISAINFO); do" stuff - Add/update Makefiles - Add some userspace headers like stdio.h/stdlib.h in places of sys/types.h. - EXPORT_SYMBOL *_Init/*_Update/*_Final... routines in ICP modules. - Update scripts/zfs2zol-patch.sed - include <sys/sha2.h> in sha2_impl.h - Add sha2.h to include/sys/Makefile.am - Add skein and edonr dirs to icp Makefile - Add new checksums to zpool_get.cfg - Move checksum switch block from zfs_secpolicy_setprop() to zfs_check_settable() - Fix -Wuninitialized error in edonr_byteorder.h on PPC - Fix stack frame size errors on ARM32 - Don't unroll loops in Skein on 32-bit to save stack space - Add memory barriers in sha2.c on 32-bit to save stack space - Add filetest_001_pos.ksh checksum sanity test - Add option to write psudorandom data in file_write utility
* fix: Shift exponent too largeGvozden Neskovic2016-09-291-0/+3
| | | | | | | | | | | | Undefined operation is reported by running ztest (or zloop) compiled with GCC UndefinedBehaviorSanitizer. Error only happens on top level of dnode indirection with large enough offset values. Logically, left shift operation would work, but bit shift semantics in C, and limitation of uint64_t, do not produce desired result. Issue #5059, #4883 Signed-off-by: Gvozden Neskovic <[email protected]>
* Change /etc/mtab to /proc/self/mountsslashdd2016-09-201-1/+1
| | | | | | | | | | | Fix misleading error message: "The /dev/zfs device is missing and must be created.", if /etc/mtab is missing. Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Eric Desrochers <[email protected]> Closes #4680 Closes #5029
* OpenZFS 5997 - FRU field not set during pool creation and never updatedHans Rosenfeld2016-08-124-271/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Authored by: Hans Rosenfeld <[email protected]> Reviewed by: Dan Fields <[email protected]> Reviewed by: Josef Sipek <[email protected]> Reviewed by: Richard Elling <[email protected]> Reviewed by: George Wilson <[email protected]> Approved by: Robert Mustacchi <[email protected]> Signed-off-by: Don Brady <[email protected]> Ported-by: Brian Behlendorf <[email protected]> OpenZFS-issue: https://www.illumos.org/issues/5997 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/1437283 Porting Notes: In addition to the OpenZFS changes this patch realigns the events with those found in OpenZFS. Events which would be logged as sysevents on illumos have been been mapped to the 'sysevent' class for Linux. In addition, several subclass names have been changed to match what is used in OpenZFS. In all cases this means a '.' was changed to an '_' in the subclass. The scripts provided by ZoL have been updated, however users which provide scripts for any of the following events will need to rename them based on the new subclass names. ereport.fs.zfs.config.sync sysevent.fs.zfs.config_sync ereport.fs.zfs.zpool.destroy sysevent.fs.zfs.pool_destroy ereport.fs.zfs.zpool.reguid sysevent.fs.zfs.pool_reguid ereport.fs.zfs.vdev.remove sysevent.fs.zfs.vdev_remove ereport.fs.zfs.vdev.clear sysevent.fs.zfs.vdev_clear ereport.fs.zfs.vdev.check sysevent.fs.zfs.vdev_check ereport.fs.zfs.vdev.spare sysevent.fs.zfs.vdev_spare ereport.fs.zfs.vdev.autoexpand sysevent.fs.zfs.vdev_autoexpand ereport.fs.zfs.resilver.start sysevent.fs.zfs.resilver_start ereport.fs.zfs.resilver.finish sysevent.fs.zfs.resilver_finish ereport.fs.zfs.scrub.start sysevent.fs.zfs.scrub_start ereport.fs.zfs.scrub.finish sysevent.fs.zfs.scrub_finish ereport.fs.zfs.bootfs.vdev.attach sysevent.fs.zfs.bootfs_vdev_attach
* Illumos Crypto Port module added to enable native encryption in zfsTom Caputi2016-07-203-1/+24
| | | | | | | | | | | | | | | | | | | | A port of the Illumos Crypto Framework to a Linux kernel module (found in module/icp). This is needed to do the actual encryption work. We cannot use the Linux kernel's built in crypto api because it is only exported to GPL-licensed modules. Having the ICP also means the crypto code can run on any of the other kernels under OpenZFS. I ended up porting over most of the internals of the framework, which means that porting over other API calls (if we need them) should be fairly easy. Specifically, I have ported over the API functions related to encryption, digests, macs, and crypto templates. The ICP is able to use assembly-accelerated encryption on amd64 machines and AES-NI instructions on Intel chips that support it. There are place-holder directories for similar assembly optimizations for other architectures (although they have not been written). Signed-off-by: Tom Caputi <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4329
* Add `zfs allow` and `zfs unallow` supportBrian Behlendorf2016-06-072-47/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ZFS allows for specific permissions to be delegated to normal users with the `zfs allow` and `zfs unallow` commands. In addition, non- privileged users should be able to run all of the following commands: * zpool [list | iostat | status | get] * zfs [list | get] Historically this functionality was not available on Linux. In order to add it the secpolicy_* functions needed to be implemented and mapped to the equivalent Linux capability. Only then could the permissions on the `/dev/zfs` be relaxed and the internal ZFS permission checks used. Even with this change some limitations remain. Under Linux only the root user is allowed to modify the namespace (unless it's a private namespace). This means the mount, mountpoint, canmount, unmount, and remount delegations cannot be supported with the existing code. It may be possible to add this functionality in the future. This functionality was validated with the cli_user and delegation test cases from the ZFS Test Suite. These tests exhaustively verify each of the supported permissions which can be delegated and ensures only an authorized user can perform it. Two minor bug fixes were required for test-running.py. First, the Timer() object cannot be safely created in a `try:` block when there is an unconditional `finally` block which references it. Second, when running as a normal user also check for scripts using the both the .ksh and .sh suffixes. Finally, existing users who are simulating delegations by setting group permissions on the /dev/zfs device should revert that customization when updating to a version with this change. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #362 Closes #434 Closes #4100 Closes #4394 Closes #4410 Closes #4487
* Add isa_defs for MIPSYunQiang Su2016-05-311-1/+22
| | | | | | | | | GCC for MIPS only defines _LP64 when 64bit, while no _ILP32 defined when 32bit. Signed-off-by: YunQiang Su <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4712
* Add -lhHpw options to "zpool iostat" for avg latency, histograms, & queuesTony Hutter2016-05-121-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update the zfs module to collect statistics on average latencies, queue sizes, and keep an internal histogram of all IO latencies. Along with this, update "zpool iostat" with some new options to print out the stats: -l: Include average IO latencies stats: total_wait disk_wait syncq_wait asyncq_wait scrub read write read write read write read write wait ----- ----- ----- ----- ----- ----- ----- ----- ----- - 41ms - 2ms - 46ms - 4ms - - 5ms - 1ms - 1us - 4ms - - 5ms - 1ms - 1us - 4ms - - - - - - - - - - - 49ms - 2ms - 47ms - - - - - - - - - - - - - 2ms - 1ms - - - 1ms - ----- ----- ----- ----- ----- ----- ----- ----- ----- 1ms 1ms 1ms 413us 16us 25us - 5ms - 1ms 1ms 1ms 413us 16us 25us - 5ms - 2ms 1ms 2ms 412us 26us 25us - 5ms - - 1ms - 413us - 25us - 5ms - - 1ms - 460us - 29us - 5ms - 196us 1ms 196us 370us 7us 23us - 5ms - ----- ----- ----- ----- ----- ----- ----- ----- ----- -w: Print out latency histograms: sdb total disk sync_queue async_queue latency read write read write read write read write scrub ------- ------ ------ ------ ------ ------ ------ ------ ------ ------ 1ns 0 0 0 0 0 0 0 0 0 ... 33us 0 0 0 0 0 0 0 0 0 66us 0 0 107 2486 2 788 12 12 0 131us 2 797 359 4499 10 558 184 184 6 262us 22 801 264 1563 10 286 287 287 24 524us 87 575 71 52086 15 1063 136 136 92 1ms 152 1190 5 41292 4 1693 252 252 141 2ms 245 2018 0 50007 0 2322 371 371 220 4ms 189 7455 22 162957 0 3912 6726 6726 199 8ms 108 9461 0 102320 0 5775 2526 2526 86 17ms 23 11287 0 37142 0 8043 1813 1813 19 34ms 0 14725 0 24015 0 11732 3071 3071 0 67ms 0 23597 0 7914 0 18113 5025 5025 0 134ms 0 33798 0 254 0 25755 7326 7326 0 268ms 0 51780 0 12 0 41593 10002 10002 0 537ms 0 77808 0 0 0 64255 13120 13120 0 1s 0 105281 0 0 0 83805 20841 20841 0 2s 0 88248 0 0 0 73772 14006 14006 0 4s 0 47266 0 0 0 29783 17176 17176 0 9s 0 10460 0 0 0 4130 6295 6295 0 17s 0 0 0 0 0 0 0 0 0 34s 0 0 0 0 0 0 0 0 0 69s 0 0 0 0 0 0 0 0 0 137s 0 0 0 0 0 0 0 0 0 ------------------------------------------------------------------------------- -h: Help -H: Scripted mode. Do not display headers, and separate fields by a single tab instead of arbitrary space. -q: Include current number of entries in sync & async read/write queues, and scrub queue: syncq_read syncq_write asyncq_read asyncq_write scrubq_read pend activ pend activ pend activ pend activ pend activ ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- 0 0 0 0 78 29 0 0 0 0 0 0 0 0 78 29 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - - - - - - - - - - 0 0 0 0 0 0 0 0 0 0 - - - - - - - - - - 0 0 0 0 0 0 0 0 0 0 ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- 0 0 227 394 0 19 0 0 0 0 0 0 227 394 0 19 0 0 0 0 0 0 108 98 0 19 0 0 0 0 0 0 19 98 0 0 0 0 0 0 0 0 78 98 0 0 0 0 0 0 0 0 19 88 0 0 0 0 0 0 ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -p: Display numbers in parseable (exact) values. Also, update iostat syntax to allow the user to specify specific vdevs to show statistics for. The three options for choosing pools/vdevs are: Display a list of pools: zpool iostat ... [pool ...] Display a list of vdevs from a specific pool: zpool iostat ... [pool vdev ...] Display a list of vdevs from any pools: zpool iostat ... [vdev ...] Lastly, allow zpool command "interval" value to be floating point: zpool iostat -v 0.5 Signed-off-by: Tony Hutter <[email protected] Signed-off-by: Brian Behlendorf <[email protected]> Closes #4433
* OpenZFS 6672 - arc_reclaim_thread() should use gethrtime()David Quigley2016-05-061-0/+9
| | | | | | | | | | | | | | | 6672 arc_reclaim_thread() should use gethrtime() instead of ddi_get_lbolt() Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Prakash Surya <[email protected]> Reviewed by: Josef 'Jeff' Sipek <[email protected]> Reviewed by: Robert Mustacchi <[email protected]> Approved by: Dan McDonald <[email protected]> Ported-by: David Quigley <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> OpenZFS-issue: https://www.illumos.org/issues/6672 OpenZFS-commit: https://github.com/openzfs/openzfs/commit/571be5c Closes #4600
* Add support for libtirpcBrian Behlendorf2016-04-285-129/+26
| | | | | | | | | | | | | | | | | | | | | While OpenSolaris libc and glibc both include XDR support, the musl libc does not in favor of depending on the BSD-licensed libtirpc library. Adding support is a simple matter of detecting the library, including the headers and linking against it. By default libtirpc will be checked for and if available used. Otherwise, configure will fall back to using the xdr implementation provided by libc if available. The options --with-tirpc/--without-tirpc can be used to disable this checking. In addition, the xdr_control() function has been simplied to only handle ZFSs specific use case. Original-patch-by: stf <[email protected]> Original-patch-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Signed-off-by: Carlo Landmeter <[email protected]> Closes #2254 Closes #4559
* Add support for devid and phys_path keys in vdev disk labelsDon Brady2016-03-311-77/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | This is foundational work for ZED. Updates a leaf vdev's persistent device strings on Linux platform * only applies for a dedicated leaf vdev (aka whole disk) * updated during pool create|add|attach|import * used for matching device matching during auto-{online,expand,replace} * stored in a leaf disk config label (i.e. alongside 'path' NVP) * can opt-out using env var ZFS_VDEV_DEVID_OPT_OUT=YES Some examples: path: '/dev/sdb1' devid: 'scsi-350000394a8ca4fbc-part1' phys_path: 'pci-0000:04:00.0-sas-0x50000394a8ca4fbf-lun-0' path: '/dev/mapper/mpatha' devid: 'dm-uuid-mpath-35000c5006304de3f' Signed-off-by: Don Brady <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #2856 Closes #3978 Closes #4416
* Remove complicated libspl assert wrappersBrian Behlendorf2016-03-301-46/+36
| | | | | | | | | | Effectively provide our own version of assert()/verify() for use in user space. This minimizes our dependencies and aligns the user space assertion handling with what's used in the kernel. Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4449
* Move hrtime_t timestruc_t and timespec_tCarlo Landmeter2016-03-292-4/+5
| | | | | | | | | | | hrtime_t timestruc_t and timespec_t should have originally been included in sys/time.h so lets move them. longlong_t is not defined by any standard so change it to long long Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4459
* Set _DATE_FMT to '%+' if not defined in libspl/timestamp.cCarlo Landmeter2016-03-291-0/+4
| | | | | | Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4458
* Include sys/types.h in devid.hCarlo Landmeter2016-03-291-0/+1
| | | | | | | | This is needed for musl libc Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4454
* Add support for s390[x].Dimitri John Ledkov2016-03-171-1/+16
| | | | | | | Signed-off-by: Dimitri John Ledkov <[email protected]> Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4425
* Fix aarch64 compilationGordan Bobic2016-03-151-1/+2
| | | | | | | | | | sys/param.h depends on types defined in sys/types.h (hrtime_t & timestruc_t). Signed-off-by: Gordan Bobic <[email protected]> Signed-off-by: Christopher J. Morrone <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4420
* Linux 4.5 compat: pfn_t typedefBrian Behlendorf2016-01-201-1/+0
| | | | | | | | | | | | The pfn_t typedef was inherited from Illumos but never directly used by any libspl consumers. This doesn't cause any issues in user space but for consistency with the kernel build it has been removed. See torvalds/linux/commit/34c0fd54. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Tim Chase <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Issue #4228
* Either _ILP32 or _LP64 must be definedBrian Behlendorf2015-12-101-15/+27
| | | | | | | | | | | For some arm, powerpc, and sparc platforms it was possible that neither _ILP32 of _LP64 would be defined. Update the isa_defs.h header to explicitly set these macros and generate a compile error in the case neither are defined. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4048
* sysmacros: Make P2ROUNDUP not trigger int overflowJason Zaman2015-11-161-4/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | The original P2ROUNDUP and P2ROUNDUP_TYPED macros contain -x which triggers PaX's integer overflow detection for unsigned integers. Replace the macros with an equivalent version that does not trigger the overflow. Axioms: A. (-(x)) === (~((x) - 1)) === (~(x) + 1) under two's complement. B. ~(x & y) === ((~(x)) | (~(y))) under De Morgan's law. C. ~(~x) === x under the law of excluded middle. Proof: 0. (-(-(x) & -(align))) original 1. (~(-(x) & -(align)) + 1) by A 2. (((~(-(x))) | (~(-(align)))) + 1) by B 3. (((~(~((x) - 1))) | (~(~((align) - 1)))) + 1) by A 4. (((((x) - 1)) | (((align) - 1))) + 1) by C Q.E.D. Signed-off-by: Jason Zaman <[email protected]> Reviewed-by: Chris Dunlop <[email protected]> Reviewed-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3949
* Add temporary mount optionsBrian Behlendorf2015-09-034-105/+1
| | | | | | | | | | | | | Add the required kernel side infrastructure to parse arbitrary mount options. This enables us to support temporary mount options in largely the same way it is handled on other platforms. See the 'Temporary Mount Point Properties' section of zfs(8) for complete details. Signed-off-by: Brian Behlendorf <[email protected]> Closes #985 Closes #3351
* Support parallel build trees (VPATH builds)Turbo Fredriksson2015-07-174-24/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Build products from an out of tree build should be written relative to the build directory. Sources should be referred to by their locations in the source directory. This is accomplished by adding the 'src' and 'obj' variables for the module Makefile.am, using relative paths to reference source files, and by setting VPATH when source files are not co-located with the Makefile. This enables the following: $ mkdir build $ cd build $ ../configure \ --with-spl=$HOME/src/git/spl/ \ --with-spl-obj=$HOME/src/git/spl/build $ make -s This change also has the advantage of resolving the following warning which is generated by modern versions of automake. Makefile.am:00: warning: source file 'xxx' is in a subdirectory, Makefile.am:00: but option 'subdir-objects' is disabled Signed-off-by: Turbo Fredriksson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #1082
* Illumos 5163 - arc should reap range_seg_cacheGeorge Wilson2015-06-251-0/+5
| | | | | | | | | | | | | | | | | | | | 5163 arc should reap range_seg_cache Reviewed by: Christopher Siden <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Richard Elling <[email protected]> Reviewed by: Saso Kiselkov <[email protected]> Approved by: Dan McDonald <[email protected]> References: https://www.illumos.org/issues/5163 https://github.com/illumos/illumos-gate/commit/83803b5 Porting Notes: Added umem_cache_reap_now() wrapped to suppress unused variable warning for user space build in arc_kmem_reap_now(). Ported-by: Brian Behlendorf <[email protected]>
* Add IMPLY() and EQUIV() macrosBrian Behlendorf2015-06-241-1/+17
| | | | | | | | | Added for upstream compatibility, they are of the form: * IMPLY(a, b) - if (a) then (b) * EQUIV(a, b) - if (a) then (b) *AND* if (b) then (a) Signed-off-by: Brian Behlendorf <[email protected]>
* Fix type mismatch on 32-bit systemsBrian Behlendorf2015-05-111-2/+0
| | | | | | | | | | | | | | The umem_alloc_aligned() function should not assume that a 'void *' type is 64-bit. It will not be on 32-bit platforms. Rather than complicating the ASSERT to handle this it is simply removed. Additionally, the '%lu' format specifier should not be assumed to imply a 64-bit value. Fix this by using the 'llu' format specifier which will always be atleast 64-bit and explicitly casing the variable to an u_longlong_t. This issue is handled the same way in many other parts of the code. Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 3897 - zfs filesystem and snapshot limitsJerry Jelinek2015-04-282-0/+27
| | | | | | | | | | | | | | | | | | 3897 zfs filesystem and snapshot limits Author: Jerry Jelinek <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Approved by: Christopher Siden <[email protected]> References: https://www.illumos.org/issues/3897 https://github.com/illumos/illumos-gate/commit/a2afb61 Porting Notes: dsl_dataset_snapshot_check(): reduce stack usage using kmem_alloc(). Ported-by: Chris Dunlop <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
* Swap DTRACE_PROBE* with Linux tracepointsPrakash Surya2014-11-172-37/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch leverages Linux tracepoints from within the ZFS on Linux code base. It also refactors the debug code to bring it back in sync with Illumos. The information exported via tracepoints can be used for a variety of reasons (e.g. debugging, tuning, general exploration/understanding, etc). It is advantageous to use Linux tracepoints as the mechanism to export this kind of information (as opposed to something else) for a number of reasons: * A number of external tools can make use of our tracepoints "automatically" (e.g. perf, systemtap) * Tracepoints are designed to be extremely cheap when disabled * It's one of the "accepted" ways to export this kind of information; many other kernel subsystems use tracepoints too. Unfortunately, though, there are a few caveats as well: * Linux tracepoints appear to only be available to GPL licensed modules due to the way certain kernel functions are exported. Thus, to actually make use of the tracepoints introduced by this patch, one might have to patch and re-compile the kernel; exporting the necessary functions to non-GPL modules. * Prior to upstream kernel version v3.14-rc6-30-g66cc69e, Linux tracepoints are not available for unsigned kernel modules (tracepoints will get disabled due to the module's 'F' taint). Thus, one either has to sign the zfs kernel module prior to loading it, or use a kernel versioned v3.14-rc6-30-g66cc69e or newer. Assuming the above two requirements are satisfied, lets look at an example of how this patch can be used and what information it exposes (all commands run as 'root'): # list all zfs tracepoints available $ ls /sys/kernel/debug/tracing/events/zfs enable filter zfs_arc__delete zfs_arc__evict zfs_arc__hit zfs_arc__miss zfs_l2arc__evict zfs_l2arc__hit zfs_l2arc__iodone zfs_l2arc__miss zfs_l2arc__read zfs_l2arc__write zfs_new_state__mfu zfs_new_state__mru # enable all zfs tracepoints, clear the tracepoint ring buffer $ echo 1 > /sys/kernel/debug/tracing/events/zfs/enable $ echo 0 > /sys/kernel/debug/tracing/trace # import zpool called 'tank', inspect tracepoint data (each line was # truncated, they're too long for a commit message otherwise) $ zpool import tank $ cat /sys/kernel/debug/tracing/trace | head -n35 # tracer: nop # # entries-in-buffer/entries-written: 1219/1219 #P:8 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | lt-zpool-30132 [003] .... 91344.200050: zfs_arc__miss: hdr... z_rd_int/0-30156 [003] .... 91344.200611: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.201173: zfs_arc__miss: hdr... z_rd_int/1-30157 [003] .... 91344.201756: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.201795: zfs_arc__miss: hdr... z_rd_int/2-30158 [003] .... 91344.202099: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.202126: zfs_arc__hit: hdr ... lt-zpool-30132 [003] .... 91344.202130: zfs_arc__hit: hdr ... lt-zpool-30132 [003] .... 91344.202134: zfs_arc__hit: hdr ... lt-zpool-30132 [003] .... 91344.202146: zfs_arc__miss: hdr... z_rd_int/3-30159 [003] .... 91344.202457: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.202484: zfs_arc__miss: hdr... z_rd_int/4-30160 [003] .... 91344.202866: zfs_new_state__mru... lt-zpool-30132 [003] .... 91344.202891: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.203034: zfs_arc__miss: hdr... z_rd_iss/1-30149 [001] .... 91344.203749: zfs_new_state__mru... lt-zpool-30132 [001] .... 91344.203789: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.203878: zfs_arc__miss: hdr... z_rd_iss/3-30151 [001] .... 91344.204315: zfs_new_state__mru... lt-zpool-30132 [001] .... 91344.204332: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204337: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204352: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204356: zfs_arc__hit: hdr ... lt-zpool-30132 [001] .... 91344.204360: zfs_arc__hit: hdr ... To highlight the kind of detailed information that is being exported using this infrastructure, I've taken the first tracepoint line from the output above and reformatted it such that it fits in 80 columns: lt-zpool-30132 [003] .... 91344.200050: zfs_arc__miss: hdr { dva 0x1:0x40082 birth 15491 cksum0 0x163edbff3a flags 0x640 datacnt 1 type 1 size 2048 spa 3133524293419867460 state_type 0 access 0 mru_hits 0 mru_ghost_hits 0 mfu_hits 0 mfu_ghost_hits 0 l2_hits 0 refcount 1 } bp { dva0 0x1:0x40082 dva1 0x1:0x3000e5 dva2 0x1:0x5a006e cksum 0x163edbff3a:0x75af30b3dd6:0x1499263ff5f2b:0x288bd118815e00 lsize 2048 } zb { objset 0 object 0 level -1 blkid 0 } For the specific tracepoint shown here, 'zfs_arc__miss', data is exported detailing the arc_buf_hdr_t (hdr), blkptr_t (bp), and zbookmark_t (zb) that caused the ARC miss (down to the exact DVA!). This kind of precise and detailed information can be extremely valuable when trying to answer certain kinds of questions. For anybody unfamiliar but looking to build on this, I found the XFS source code along with the following three web links to be extremely helpful: * http://lwn.net/Articles/379903/ * http://lwn.net/Articles/381064/ * http://lwn.net/Articles/383362/ I should also node the more "boring" aspects of this patch: * The ZFS_LINUX_COMPILE_IFELSE autoconf macro was modified to support a sixth paramter. This parameter is used to populate the contents of the new conftest.h file. If no sixth parameter is provided, conftest.h will be empty. * The ZFS_LINUX_TRY_COMPILE_HEADER autoconf macro was introduced. This macro is nearly identical to the ZFS_LINUX_TRY_COMPILE macro, except it has support for a fifth option that is then passed as the sixth parameter to ZFS_LINUX_COMPILE_IFELSE. These autoconf changes were needed to test the availability of the Linux tracepoint macros. Due to the odd nature of the Linux tracepoint macro API, a separate ".h" must be created (the path and filename is used internally by the kernel's define_trace.h file). * The HAVE_DECLARE_EVENT_CLASS autoconf macro was introduced. This is to determine if we can safely enable the Linux tracepoint functionality. We need to selectively disable the tracepoint code due to the kernel exporting certain functions as GPL only. Without this check, the build process will fail at link time. In addition, the SET_ERROR macro was modified into a tracepoint as well. To do this, the 'sdt.h' file was moved into the 'include/sys' directory and now contains a userspace portion and a kernel space portion. The dprintf and zfs_dbgmsg* interfaces are now implemented as tracepoint as well. Signed-off-by: Prakash Surya <[email protected]> Signed-off-by: Ned Bass <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
* Update utsname supportBrian Behlendorf2014-10-172-35/+0
| | | | | | | | | | | | | | | | | | Modify the code to use the utsname() kernel function rather than a global variable. This results is cleaner more portable code because utsname() is already provided by the kernel and can be easily emulated in user space via uname(2). This means that it will behave consistently in both contexts. This is also has the benefit that it allows the removal of a few _KERNEL pre-processor conditions. And it also is a pre-requisite for a proper FUSE port because we need to provide a valid utsname. Finally, it allows us to remove this functionality from the SPL and all the related compatibility code. Signed-off-by: Brian Behlendorf <[email protected]> Issue #2757