aboutsummaryrefslogtreecommitdiffstats
path: root/man
Commit message (Collapse)AuthorAgeFilesLines
* Illumos 5987 - zfs prefetch code needs workMatthew Ahrens2016-01-121-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 5987 zfs prefetch code needs work Reviewed by: Adam Leventhal <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Paul Dagnelie <[email protected]> Approved by: Gordon Ross <[email protected]> References: https://www.illumos.org/issues/5987 zfs prefetch code needs work illumos/illumos-gate@cf6106c 5987 zfs prefetch code needs work Porting notes: - [module/zfs/dbuf.c] - 5f6d0b6 Handle block pointers with a corrupt logical size - [module/zfs/dmu_zfetch.c] - c65aa5b Fix gcc missing parenthesis warnings - 428870f Update core ZFS code from build 121 to build 141. - 79c76d5 Change KM_PUSHPAGE -> KM_SLEEP - b8d06fc Switch KM_SLEEP to KM_PUSHPAGE - Account for ISO C90 - mixed declarations and code - warnings - Module parameters (new/changed): - Replaced zfetch_block_cap with zfetch_max_distance (Max bytes to prefetch per stream (default 8MB; 8 * 1024 * 1024)) - Preserved zfs_prefetch_disable as 'int' for consistency with existing Linux module options. - [include/sys/trace_arc.h] - Added new tracepoints - DEFINE_ARC_BUF_HDR_EVENT(zfs_arc__sync__wait__for__async); - DEFINE_ARC_BUF_HDR_EVENT(zfs_arc__demand__hit__predictive__prefetch); - [man/man5/zfs-module-parameters.5] - Updated man page Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 4891 - want zdb option to dump all metadataMatthew Ahrens2016-01-111-2/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 4891 want zdb option to dump all metadata Reviewed by: Sonu Pillai <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Christopher Siden <[email protected]> Reviewed by: Dan McDonald <[email protected]> Reviewed by: Richard Lowe <[email protected]> Approved by: Garrett D'Amore <[email protected]> We'd like a way for zdb to dump metadata in a machine-readable format, so that we can bring that back from a customer site for in-house diagnosis. Think of it as a crash dump for zpools, which can be used for post-mortem analysis of a malfunctioning pool References: https://www.illumos.org/issues/4891 https://github.com/illumos/illumos-gate/commit/df15e41 Porting notes: - [cmd/zdb/zdb.c] - a5778ea zdb: Introduce -V for verbatim import - In main() getopt 'opt' variable removed and the code was brought back in line with illumos. - [lib/libzpool/kernel.c] - 1e33ac1 Fix Solaris thread dependency by using pthreads - f0e324f Update utsname support - 4d58b69 Fix vn_open/vn_rdwr error handling - In vn_open() allocate 'dumppath' on heap instead of stack - Properly handle 'dump_fd == -1' error path - Free 'realpath' after added vn_dumpdir_code block Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]>
* Man page whitespacenathancheek2016-01-111-0/+2
| | | | | | Signed-off-by: nathancheek <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4184
* Illumos 5960, 5925Paul Dagnelie2016-01-081-4/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 5960 zfs recv should prefetch indirect blocks 5925 zfs receive -o origin= Reviewed by: Prakash Surya <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> References: https://www.illumos.org/issues/5960 https://www.illumos.org/issues/5925 https://github.com/illumos/illumos-gate/commit/a2cdcdd Porting notes: - [lib/libzfs/libzfs_sendrecv.c] - b8864a2 Fix gcc cast warnings - 325f023 Add linux kernel device support - 5c3f61e Increase Linux pipe buffer size on 'zfs receive' - [module/zfs/zfs_vnops.c] - 3558fd7 Prototype/structure update for Linux - c12e3a5 Restructure zfs_readdir() to fix regressions - [module/zfs/zvol.c] - Function @zvol_map_block() isn't needed in ZoL - 9965059 Prefetch start and end of volumes - [module/zfs/dmu.c] - Fixed ISO C90 - mixed declarations and code - Function dmu_prefetch() 'int i' is initialized before the following code block (c90 vs. c99) - [module/zfs/dbuf.c] - fc5bb51 Fix stack dbuf_hold_impl() - 9b67f60 Illumos 4757, 4913 - 34229a2 Reduce stack usage for recursive traverse_visitbp() - [module/zfs/dmu_send.c] - Fixed ISO C90 - mixed declarations and code - b58986e Use large stacks when available - 241b541 Illumos 5959 - clean up per-dataset feature count code - 77aef6f Use vmem_alloc() for nvlists - 00b4602 Add linux kernel memory support Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 5745 - zfs set allows only one dataset property to be set at a timeChris Williamson2015-12-291-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 5745 zfs set allows only one dataset property to be set at a time Reviewed by: Christopher Siden <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Bayard Bell <[email protected]> Reviewed by: Richard PALO <[email protected]> Reviewed by: Steven Hartland <[email protected]> Approved by: Rich Lowe <[email protected]> References: https://www.illumos.org/issues/5745 https://github.com/illumos/illumos-gate/commit/3092556 Porting notes: - Fix the missing braces around initializer, zfs_cmd_t zc = {"\0"}; - Remove extra format argument in zfs_do_set() - Declare at the top: - zfs_prop_t prop; - nvpair_t *elem; - nvpair_t *next; - int i; - Additionally initialize: - int added_resv = 0; - zfs_prop_t prop = 0; - Assign 0 install of NULL for uint64_t types. - zc->zc_nvlist_conf = '\0'; - zc->zc_nvlist_src = '\0'; - zc->zc_nvlist_dst = '\0'; Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]> Closes #3574
* Make zio_taskq_batch_pct user configurableDHE2015-12-181-0/+17
| | | | | | | | | Adds zio_taskq_batch_pct as an exported module parameter, allowing users to modify it at module load time. Signed-off-by: DHE <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4110
* Man page white space and spelling correctionsNed Bass2015-12-187-136/+136
| | | | | | | | | Correct some misspelled words and grammatical errors, and remove trailing white space in the man pages. Signed-off-by: Ned Bass <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4115
* Remove shareiscsi description and example from zfs(8).Turbo Fredriksson2015-10-131-47/+9
| | | | | Signed-off-by: Turbo Fredriksson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
* Unmount is part of the shutdown process, not the boot process.Turbo Fredriksson2015-10-131-1/+1
| | | | | | Signed-off-by: Turbo Fredriksson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes: #3762
* Add large block support to zpios(1) benchmarkDon Brady2015-09-221-2/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of the large block support effort, it makes sense to add support for large blocks to **zpios(1)**. The specifying of a zfs block size for zpios is optional and will default to 128K if the block size is not specified. `zpios ... -S size | --blocksize size ...` This will use *size* ZFS blocks for each test, specified as a comma delimited list with an optional unit suffix. The supported range is powers of two from 128K through 16M. A range of block sizes can be tested as follows: `-S 128K,256K,512K,1M` Example run below (non realistic results from a VM and output abbreviated for space) ``` --regioncount=750 --regionsize=8M --chunksize=1M --offset=4K --threaddelay=0 --cleanup --human-readable --verbose --cleanup --blocksize=128K,256K,512K,1M th-cnt rg-cnt rg-sz ch-sz blksz wr-data wr-bw rd-data rd-bw --------------------------------------------------------------------- 4 750 8m 1m 128k 5g 90.06m 5g 93.37m 4 750 8m 1m 256k 5g 79.71m 5g 99.81m 4 750 8m 1m 512k 5g 42.20m 5g 93.14m 4 750 8m 1m 1m 5g 35.51m 5g 89.36m 8 750 8m 1m 128k 5g 85.49m 5g 90.81m 8 750 8m 1m 256k 5g 61.42m 5g 99.24m 8 750 8m 1m 512k 5g 49.09m 5g 108.78m 16 750 8m 1m 128k 5g 86.28m 5g 88.73m 16 750 8m 1m 256k 5g 64.34m 5g 93.47m 16 750 8m 1m 512k 5g 68.84m 5g 124.47m 16 750 8m 1m 1m 5g 53.97m 5g 97.20m --------------------------------------------------------------------- ``` Signed-off-by: Don Brady <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3795 Closes #2071
* Honor xattr=sa dataset propertyNed Bass2015-09-191-3/+11
| | | | | | | | | | | | | | | | | | | | | | | | | ZFS incorrectly uses directory-based extended attributes even when xattr=sa is specified as a dataset property or mount option. Support to honor temporary mount options including "xattr" was added in commit 0282c4137e7409e6d85289f4955adf07fac834f5. There are two issues with the mount option handling: * Libzfs has historically included "xattr" in its list of default mount options. This overrides the dataset property, so the dataset is always configured to use directory-based xattrs even when the xattr dataset property is set to off or sa. Address this by removing "xattr" from the set of default mount options in libzfs. * There was no way to enable system attribute-based extended attributes using temporary mount options. Add the mount options "saxattr" and "dirxattr" which enable the xattr behavior their names suggest. This approach has the advantages of mirroring the valid xattr dataset property values and following existing conventions for mount option names. Signed-off-by: Ned Bass <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3787
* Prefetch start and end of volumesBrian Behlendorf2015-09-091-0/+15
| | | | | | | | | | | | When adding a zvol to the system prefetch zvol_prefetch_bytes from the start and end of the volume. Prefetching these regions of the volume is desirable because they are likely to be accessed immediately by blkid(8), the kernel scanning for a partition table, or another task which probes the devices. Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #3659
* Add dbgmsg kstatBrian Behlendorf2015-09-041-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Internally ZFS keeps a small log to facilitate debugging. By default the log is disabled, to enable it set zfs_dbgmsg_enable=1. The contents of the log can be accessed by reading the /proc/spl/kstat/zfs/dbgmsg file. Writing 0 to this proc file clears the log. $ echo 1 >/sys/module/zfs/parameters/zfs_dbgmsg_enable $ echo 0 >/proc/spl/kstat/zfs/dbgmsg $ zpool import tank $ cat /proc/spl/kstat/zfs/dbgmsg 1 0 0x01 -1 0 2492357525542 2525836565501 timestamp message 1441141408 spa=tank async request task=1 1441141408 txg 70 open pool version 5000; software version 5000/5; ... 1441141409 spa=tank async request task=32 1441141409 txg 72 import pool version 5000; software version 5000/5; ... 1441141414 command: lt-zpool import tank Note the zfs_dbgmsg() and dprintf() functions are both now mapped to the same log. As mentioned above the kernel debug log can be accessed though the /proc/spl/kstat/zfs/dbgmsg kstat. For user space consumers log messages are immediately written to stdout after applying the ZFS_DEBUG environment variable. $ ZFS_DEBUG=on ./cmd/ztest/ztest -V Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Ned Bass <[email protected]> Closes #3728
* Support accessing .zfs/snapshot via NFSBrian Behlendorf2015-09-041-0/+15
| | | | | | | | | | | | | | | | | | | | This patch is based on the previous work done by @andrey-ve and @yshui. It triggers the automount by using kern_path() to traverse to the known snapshout mount point. Once the snapshot is mounted NFS can access the contents of the snapshot. Allowing NFS clients to access to the .zfs/snapshot directory would normally mean that a root user on a client mounting an export with 'no_root_squash' would be able to use mkdir/rmdir/mv to manipulate snapshots on the server. To prevent configuration mistakes a zfs_admin_snapshot module option was added which disables the mkdir/rmdir/mv functionally. System administators desiring this functionally must explicitly enable it. Signed-off-by: Brian Behlendorf <[email protected]> Closes #2797 Closes #1655 Closes #616
* Merge branch 'zvol'Brian Behlendorf2015-09-041-11/+0
|\ | | | | | | | | | | | | | | Performance improvements for zvols. Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3720
| * zvol processing should use struct bioRichard Yao2015-09-041-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Internally, zvols are files exposed through the block device API. This is intended to reduce overhead when things require block devices. However, the ZoL zvol code emulates a traditional block device in that it has a top half and a bottom half. This is an unnecessary source of overhead that does not exist on any other OpenZFS platform does this. This patch removes it. Early users of this patch reported double digit performance gains in IOPS on zvols in the range of 50% to 80%. Comments in the code suggest that the current implementation was done to obtain IO merging from Linux's IO elevator. However, the DMU already does write merging while arc_read() should implicitly merge read IOs because only 1 thread is permitted to fetch the buffer into ARC. In addition, commercial ZFSOnLinux distributions report that regular files are more performant than zvols under the current implementation, and the main consumers of zvols are VMs and iSCSI targets, which have their own elevators to merge IOs. Some minor refactoring allows us to register zfs_request() as our ->make_request() handler in place of the generic_make_request() function. This eliminates the layer of code that broke IO requests on zvols into a top half and a bottom half. This has several benefits: 1. No per zvol spinlocks. 2. No redundant IO elevator processing. 3. Interrupts are disabled only when actually necessary. 4. No redispatching of IOs when all taskq threads are busy. 5. Linux's page out routines will properly block. 6. Many autotools checks become obsolete. An unfortunate consequence of eliminating the layer that generic_make_request() is that we no longer calls the instrumentation hooks for block IO accounting. Those hooks are GPL-exported, so we cannot call them ourselves and consequently, we lose the ability to do IO monitoring via iostat. Since zvols are internally files mapped as block devices, this should be okay. Anyone who is willing to accept the performance penalty for the block IO layer's accounting could use the loop device in between the zvol and its consumer. Alternatively, perf and ftrace likely could be used. Also, tools like latencytop will still work. Tools such as latencytop sometimes provide a better view of performance bottlenecks than the traditional block IO accounting tools do. Lastly, if direct reclaim occurs during spacemap loading and swap is on a zvol, this code will deadlock. That deadlock could already occur with sync=always on zvols. Given that swap on zvols is not yet production ready, this is not a blocker. Signed-off-by: Richard Yao <[email protected]>
* | Add temporary mount optionsBrian Behlendorf2015-09-031-0/+3
|/ | | | | | | | | | | | | Add the required kernel side infrastructure to parse arbitrary mount options. This enables us to support temporary mount options in largely the same way it is handled on other platforms. See the 'Temporary Mount Point Properties' section of zfs(8) for complete details. Signed-off-by: Brian Behlendorf <[email protected]> Closes #985 Closes #3351
* Add spa_slop_shift module optionBrian Behlendorf2015-09-021-0/+16
| | | | | | | | | | Allow for easy turning of a pools reserved free space. Previous versions of ZFS (v0.6.4 and earlier) held 1/64 of the pools capacity in reserve. Commits 3d45fdd and 0c60cc3 increased this to 1/32. Setting spa_slop_shift=6 will restore the previous default setting. Signed-off-by: Brian Behlendorf <[email protected]> Closes #3724
* Add extra keyword 'slot' to vdev_id.confAndreas Buschmann2015-08-301-0/+14
| | | | | | | | | | | Add new keyword 'slot' to vdev_id.conf This selects from where to get the slot number for a SAS/SATA disk Needed to enable access to the physical position of a disk in a Supermicro 2027R-AR24NV . Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Ned Bass <[email protected]> Closes #3693
* Update arc_memory_throttle() to check pageoutBrian Behlendorf2015-07-301-6/+7
| | | | | | | | | | | | | | | | | | | | | | | | This brings the behavior of arc_memory_throttle() back in sync with illumos. The updated memory throttling policy roughly goes like this: * Never throttle if more than 10% of memory is free. This threshold is configurable with the zfs_arc_lotsfree_percent module option. * Minimize any throttling of kswapd even when free memory is below the set threshold. Allow it to write out pages as quickly as possible to help alleviate the memory pressure. * Delay all other threads when free memory is below the set threshold in order to avoid compounding the memory pressure. Buffers will be evicted from the ARC to reduce the issue. The Linux specific zfs_arc_memory_throttle_disable module option has been removed in favor of the existing zfs_arc_lotsfree_percent tuning. Setting zfs_arc_lotsfree_percent=0 will have the same effect as zfs_arc_memory_throttle_disable and it was therefore redundant. Signed-off-by: Brian Behlendorf <[email protected]> Closes #3637
* Update arc_available_memory() to check freememBrian Behlendorf2015-07-301-0/+13
| | | | | | | | | | | | | | | | | | | | | | While Linux doesn't provide detailed information about the state of the VM it does provide us total free pages. This information should be incorporated in to the arc_available_memory() calculation rather than solely relying on a signal from direct reclaim. Conceptually this brings arc_available_memory() back in sync with illumos. It is also desirable that the target amount of free memory be tunable on a system. While the default values are expected to work well for most workloads there may be cases where custom values are needed. The zfs_arc_sys_free module option was added for this purpose. zfs_arc_sys_free - The target number of bytes the ARC should leave as free memory on the system. This value can checked in /proc/spl/kstat/zfs/arcstats and setting this module option will override the default value. Signed-off-by: Brian Behlendorf <[email protected]> Closes #3637
* Bound zvol_threads module optionBrian Behlendorf2015-07-291-1/+1
| | | | | | | | | | The zvol_threads module option should be bounded to a reasonable range. The taskq must have at least 1 thread and shouldn't have more than 1,024 at most. The default value of 32 is a reasonable default. Signed-off-by: Brian Behlendorf <[email protected]> Closes #3614
* Reinstate zfs_arc_p_min_shiftBrian Behlendorf2015-07-231-0/+12
| | | | | | | | | | | | | | | Commit f521ce1 removed the minimum value for "arc_p" allowing it to drop to zero or grow to "arc_c". This was done to improve specific workload which constantly dirties new "metadata" but also frequently touches a "small" amount of mfu data (e.g. mkdir's). This change may still be desirable but it needs to be re-investigated. in the context of the recent ARC changes from upstream. Therefore this code is being restored to facilitate benchmarking. By setting "zfs_arc_p_min_shift=64" we easily compare the performance. Signed-off-by: Brian Behlendorf <[email protected]> Issue #3533
* Illumos 5764 - "zfs send -nv" directs output to stderrManoj Joseph2015-07-141-1/+3
| | | | | | | | | | | | | | | | | | 5764 "zfs send -nv" directs output to stderr Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Paul Dagnelie <[email protected]> Reviewed by: Basil Crow <[email protected]> Reviewed by: Steven Hartland <[email protected]> Reviewed by: Bayard Bell <[email protected]> Approved by: Dan McDonald <[email protected]> References: https://github.com/illumos/illumos-gate/commit/dc5f28a https://www.illumos.org/issues/5764 Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]> Closes #3585
* Illumos 5661 - ZFS: "compression = on" should use lz4 if feature is enabledJustin T. Gibbs2015-07-101-5/+23
| | | | | | | | | | | | | | | | 5661 ZFS: "compression = on" should use lz4 if feature is enabled Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Josef 'Jeff' Sipek <[email protected]> Reviewed by: Xin LI <[email protected]> Approved by: Robert Mustacchi <[email protected]> References: https://github.com/illumos/illumos-gate/commit/db1741f https://www.illumos.org/issues/5661 Ported-by: kernelOfTruth [email protected] Signed-off-by: Brian Behlendorf <[email protected]> Closes #3571
* man: fix spelling mistakes in manualColin Ian King2015-07-014-6/+6
| | | | | | | | | | | | | | | | | | | | | A few minor mistakes than should be fixed: zpool: compatability -> compatibility zfs: accessable -> accessible availible -> available zfs-events: availible -> available zfs-module-parameters: proceding -> proceeding Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3544
* Make metaslab_aliquot a module parameter.Etienne Dechamps2015-06-221-0/+14
| | | | | | | | | | This seems generally useful. metaslab_aliquot is the ZFS allocation granularity, which is roughly equivalent to what is called the stripe size in traditional RAID arrays. It seems relevant to performance tuning. Signed-off-by: Etienne Dechamps <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
* Add -y option to `zpool iostat`Hajo Möller2015-06-171-2/+13
| | | | | | | | | | sysstat's iostat omits the first report when the -y option is used. This patch adds that functionality and omits the first report with statistics since system boot. Signed-off-by: Hajo Möller <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3439
* Illumos 5497 - lock contention on arcs_mtxPrakash Surya2015-06-111-0/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Reviewed by: George Wilson <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Richard Elling <[email protected]> Approved by: Dan McDonald <[email protected]> Porting notes and other significant code changes: The illumos 5368 patch (ARC should cache more metadata), which was never picked up by ZoL, is mostly reverted by this patch. Since ZoL relies on the kernel asynchronously calling the shrinker to actually reap memory, the shrinker wakes up arc_reclaim_waiters_cv every time it runs. The arc_adapt_thread() function no longer calls arc_do_user_evicts() since the newly-added arc_user_evicts_thread() calls it periodically. Notable conflicting ZoL commits which conflicted with this patch or whose effects are either duplicated or un-done by this patch: 302f753 - Integrate ARC more tightly with Linux 39e055c - Adjust arc_p based on "bytes" in arc_shrink f521ce1 - Allow "arc_p" to drop to zero or grow to "arc_c" 77765b5 - Remove "arc_meta_used" from arc_adjust calculation 94520ca - Prune metadata from ghost lists in arc_adjust_meta Trace support for multilist_insert() and multilist_remove() has been added and produces the following output: fio-12498 [077] .... 112936.448324: zfs_multilist__insert: ml { offset 240 numsublists 80 sublistidx 63 } fio-12498 [077] .... 112936.448347: zfs_multilist__remove: ml { offset 240 numsublists 80 sublistidx 29 } The following arcstats have been removed: recycle_miss - Used by arcstat.py and arc_summary.py, both of which have been updated appropriately. l2_writes_hdr_miss The following arcstats have been added: evict_not_enough - Number of times arc_evict_state() was unable to evict enough buffers to reach its target amount. evict_l2_skip - Number of times arc_evict_hdr() skipped eviction because it was being written to the l2arc. l2_writes_lock_retry - Replaces l2_writes_hdr_miss. Number of times l2arc_write_done() failed to acquire hash_lock (and re-tries). arc_meta_min - Shows the value of the zfs_arc_meta_min module parameter (see below). The "index" column of the "dbuf" kstat has been removed since it doesn't have a direct analog in the new multilist scheme. Additional multilist- related stats could be added in the future but would likely require extensions to the mulilist API. The following module parameters have been added: zfs_arc_evict_batch_limit - Number of ARC headers to free per sub-list before moving on to the next sub-list. zfs_arc_meta_min - Enforce a floor on the amount of metadata in the ARC. zfs_arc_num_sublists_per_state - Number of multilist sub-lists per ARC state. zfs_arc_overflow_shift - Controls amount by which the ARC must exceed the target size to be considered "overflowing". Ported-by: Tim Chase <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]
* Improve on the ZFS events documentationTurbo Fredriksson2015-06-092-11/+212
| | | | | | | | | | | * Add information about the 'zpool events' command in zpool(8). * More events and payloads defined in zfs-events(5). * I/O Stages and I/O Flags sections added. * Remove unused legacy "zio_deadline" payload define. Signed-off-by: Turbo Fredriksson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3467
* Zdb should be able to open the root datasetTim Chase2015-05-151-0/+5
| | | | | | | | | If the pool/dataset command-line argument is specified with a trailing slash, for example, "tank/", it is interpreted as the root dataset. Signed-off-by: Tim Chase <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3415
* Update ZED copyright boilerplateChris Dunlap2015-05-111-26/+12
| | | | | | | | | | | | | | | | | | | | | | This commit updates the copyright boilerplate within the ZED subtree. The instructions for appending a contributor copyright line have been removed. Manually maintaining copyright notices in this manner is error-prone, imprecise at a file-scope granularity, and oftentimes inaccurate. These lines can become a pernicious source of merge conflicts. A commit log is better suited to maintaining this information. Consequently, a line has been added to the boilerplate to refer to the git commit log for authoritative copyright attribution. To account for the scenario where a file may become separated from the codebase and commit history (i.e., it is copied somewhere else), a line has been added to identify the file's origin. http://softwarefreedom.org/resources/2012/ManagingCopyrightInformation.html Signed-off-by: Chris Dunlap <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3384
* Illumos 5027 - zfs large block supportMatthew Ahrens2015-05-113-5/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 5027 zfs large block support Reviewed by: Alek Pinchuk <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Josef 'Jeff' Sipek <[email protected]> Reviewed by: Richard Elling <[email protected]> Reviewed by: Saso Kiselkov <[email protected]> Reviewed by: Brian Behlendorf <[email protected]> Approved by: Dan McDonald <[email protected]> References: https://www.illumos.org/issues/5027 https://github.com/illumos/illumos-gate/commit/b515258 Porting Notes: * Included in this patch is a tiny ISP2() cleanup in zio_init() from Illumos 5255. * Unlike the upstream Illumos commit this patch does not impose an arbitrary 128K block size limit on volumes. Volumes, like filesystems, are limited by the zfs_max_recordsize=1M module option. * By default the maximum record size is limited to 1M by the module option zfs_max_recordsize. This value may be safely increased up to 16M which is the largest block size supported by the on-disk format. At the moment, 1M blocks clearly offer a significant performance improvement but the benefits of going beyond this for the majority of workloads are less clear. * The illumos version of this patch increased DMU_MAX_ACCESS to 32M. This was determined not to be large enough when using 16M blocks because the zfs_make_xattrdir() function will fail (EFBIG) when assigning a TX. This was immediately observed under Linux because all newly created files must have a security xattr created and that was failing. Therefore, we've set DMU_MAX_ACCESS to 64M. * On 32-bit platforms a hard limit of 1M is set for blocks due to the limited virtual address space. We should be able to relax this one the ABD patches are merged. Ported-by: Brian Behlendorf <[email protected]> Closes #354
* Add the '-a' option to 'zpool export'Turbo Fredriksson2015-05-041-3/+16
| | | | | | | | | | | | | Support exporting all imported pools in one go, using 'zpool export -a'. This is accomplished by moving the export parts from zpool_do_export() in to the new function zpool_export_one(). The for_each_pool() function is used to enumerate the list of pools to be exported. Passing an argc of 0 implies the function should be called on all pools. Signed-off-by: Turbo Fredriksson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes: #3203
* Illumos 3897 - zfs filesystem and snapshot limitsJerry Jelinek2015-04-282-1/+80
| | | | | | | | | | | | | | | | | | 3897 zfs filesystem and snapshot limits Author: Jerry Jelinek <[email protected]> Reviewed by: Matthew Ahrens <[email protected]> Approved by: Christopher Siden <[email protected]> References: https://www.illumos.org/issues/3897 https://github.com/illumos/illumos-gate/commit/a2afb61 Porting Notes: dsl_dataset_snapshot_check(): reduce stack usage using kmem_alloc(). Ported-by: Chris Dunlop <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
* 5410 Document -S option to zfs inheritPaul B. Henson2015-04-241-5/+26
| | | | | | | | | | | | | | | 5410 Document -S option to zfs inherit 5412 Mention -S option when zfs inherit fails on quota Reviewed by: Matthew Ahrens <[email protected]> Approved by: Richard Lowe <[email protected]> References: https://www.illumos.org/issues/5410 https://github.com/illumos/illumos-gate/commit/5ff8cfa9 Ported-by: DHE <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3279
* align zfs_autoimport_disable manpage with realitycburroughs2015-04-241-1/+1
| | | | | | | | The default was changed in #2820. Signed-off-by: cburroughs <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3341
* Fix formatting error in zfs(8)DHE2015-04-231-0/+2
| | | | | | | | | | Commit b1a3e93217e6e474e86345010469994c066cf875 accidentally introduced an intentation error between the 'zfs receive' and 'zfs allow' detailed documentation sections. Signed-off-by: DHE <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3312
* Document bookmarks a little better in zfs(8)Turbo Fredriksson2015-04-141-0/+18
| | | | | | | | Add a basic summary to zfs(8) describing bookmarks. Signed-off-by: Turbo Fredriksson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3268
* Update zfs_pd_bytes_max default in zfs(8)Brian Behlendorf2015-03-311-1/+1
| | | | | | | Commit b738bc5 should have updated the default value of zfs_pd_bytes_max in the zfs(8) man page. The correct default value is 50*1024*1024. Signed-off-by: Brian Behlendorf <[email protected]>
* Illumos 5694 - traverse_prefetcher does not prefetch enoughGeorge Wilson2015-03-271-2/+2
| | | | | | | | | | | | | | | | | | 5694 traverse_prefetcher does not prefetch enough Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: Alex Reece <[email protected]> Reviewed by: Christopher Siden <[email protected]> Reviewed by: Josef 'Jeff' Sipek <[email protected]> Reviewed by: Bayard Bell <[email protected]> Approved by: Garrett D'Amore <[email protected]> References: https://www.illumos.org/issues/5694 https://github.com/illumos/illumos-gate/commit/34d7ce05 Ported-by: Chris Dunlop <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3230
* zio_injection_enabled should not be a module optionIsaac Huang2015-03-241-11/+0
| | | | | | | | | | | | | The zio_inject.c keeps zio_injection_enabled as a counter of fault handlers, so it should not be exported to user space as a module option. Several EXPORT_SYMBOLs are moved from zio.c to zio_inject.c, where the symbols are defined. Signed-off-by: Isaac Huang <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3199
* Fix arc_adjust_meta() behaviorBrian Behlendorf2015-03-201-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | The goal of this function is to evict enough meta data buffers from the ARC in order to enforce the arc_meta_limit. Achieving this is slightly more complicated than it appears because it is common for data buffers to have holds on meta data buffers. In addition, dnode meta data buffers will be held by the dnodes in the block preventing them from being freed. This means we can't simply traverse the ARC and expect to always find enough unheld meta data buffer to release. Therefore, this function has been updated to make alternating passes over the ARC releasing data buffers and then newly unheld meta data buffers. This ensures forward progress is maintained and arc_meta_used will decrease. Normally this is sufficient, but if required the ARC will call the registered prune callbacks causing dentry and inodes to be dropped from the VFS cache. This will make dnode meta data buffers available for reclaim. The number of total restarts in limited by zfs_arc_meta_adjust_restarts to prevent spinning in the rare case where all meta data is pinned. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Pavel Snajdr <[email protected]> Issue #3160
* Restructure per-filesystem reclaimBrian Behlendorf2015-03-201-3/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally when the ARC prune callback was introduced the idea was to register a single callback for the ZPL. The ARC could invoke this call back if it needed the ZPL to drop dentries, inodes, or other cache objects which might be pinning buffers in the ARC. The ZPL would iterate over all ZFS super blocks and perform the reclaim. For the most part this design has worked well but due to limitations in 2.6.35 and earlier kernels there were some problems. This patch is designed to address those issues. 1) iterate_supers_type() is not provided by all kernels which makes it impossible to safely iterate over all zpl_fs_type filesystems in a single callback. The most straight forward and portable way to resolve this is to register a callback per-filesystem during mount. The arc_*_prune_callback() functions have always supported multiple callbacks so this is functionally a very small change. 2) Commit 050d22b removed the non-portable shrink_dcache_memory() and shrink_icache_memory() functions and didn't replace them with equivalent functionality. This meant that for Linux 3.1 and older kernels the ARC had no mechanism to drop dentries and inodes from the caches if needed. This patch adds that missing functionality by calling shrink_dcache_parent() to release dentries which may be pinning inodes. This will result in all unused cache entries being dropped which is a bit heavy handed but it's the only interface available for old kernels. 3) A zpl_drop_inode() callback is registered for kernels older than 2.6.35 which do not support the .evict_inode callback. This ensures that when the last reference on an inode is dropped it is immediately removed from the cache. If this isn't done than inode can end up on the global unused LRU with no mechanism available to ZFS to drop them. Since the ARC buffers are not dropped the hottest inodes can still be recreated without performing disk IO. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Pavel Snajdr <[email protected]> Issue #3160
* Move duplicate information about the 'zfs send -e' option.Turbo Fredriksson2015-03-191-18/+18
| | | | | | | | | The extra one was under the 'zfs receive' command (which isn't relevant). Instead, it should have been further up (still in the 'zfs send' option). Signed-off-by: Turbo Fredriksson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3194
* Retire zio_bulk_flagsBrian Behlendorf2015-02-101-11/+0
| | | | | | | | | | | | Long ago the zio_bulk_flags module parameter was introduced to facilitate debugging and profiling the zio_buf_caches. Today this code works well and there's no compelling reason to keep this functionality. In fact it's preferable to revert this so the code is more consistent with other ZFS implementations. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Ned Bass <[email protected]> Issue #3063
* Document zfs_flags module parameterNed Bass2015-01-071-2/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a table describing the debugging flags that can be set in the zfs_flags module parameter. Also change the module_param type to 'uint' so users aren't shown a negative value. The updated man page text is reproduced below for convenience. zfs_flags (int) Set additional debugging flags. The following flags may be bitwise-or'd together. +-------------------------------------------------------+ |Value Symbolic Name | | Description | +-------------------------------------------------------+ | 1 ZFS_DEBUG_DPRINTF | | Enable dprintf entries in the debug log. | +-------------------------------------------------------+ | 2 ZFS_DEBUG_DBUF_VERIFY * | | Enable extra dbuf verifications. | +-------------------------------------------------------+ | 4 ZFS_DEBUG_DNODE_VERIFY * | | Enable extra dnode verifications. | +-------------------------------------------------------+ | 8 ZFS_DEBUG_SNAPNAMES | | Enable snapshot name verification. | +-------------------------------------------------------+ | 16 ZFS_DEBUG_MODIFY | | Check for illegally modified ARC buffers. | +-------------------------------------------------------+ | 32 ZFS_DEBUG_SPA | | Enable spa_dbgmsg entries in the debug log. | +-------------------------------------------------------+ | 64 ZFS_DEBUG_ZIO_FREE | | Enable verification of block frees. | +-------------------------------------------------------+ | 128 ZFS_DEBUG_HISTOGRAM_VERIFY | | Enable extra spacemap histogram verifications. | +-------------------------------------------------------+ * Requires debug build. Default value: 0. Signed-off-by: Ned Bass <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #2988
* Fix small spelling mistakeRandall Mason2014-11-141-1/+1
| | | | | | | | recieve becomes receive Signed-off-by: Randall Mason <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #2877
* Illumos 4924 - LZ4 Compression for metadataDaniil Lunev2014-10-201-5/+8
| | | | | | | | | | | | | | | | | | | | | Reviewed by Matthew Ahrens <[email protected]> Reviewed by Saso Kiselkov <[email protected]> Approved by: Christopher Siden <[email protected]> References: https://github.com/illumos/illumos-gate/commit/b8289d2 https://www.illumos.org/issues/3756 Porting notes: The static function zfs_prop_activate_feature() was removed because this change removes the only caller. The function was not removed from Illumos but instead left as dead code. However, to keep gcc happy it was removed from Linux and may be easily restored if needed. Ported by: DHE <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #1540
* Add a stern warning about dedupTurbo Fredriksson2014-10-081-0/+12
| | | | | | | | | | | Users intending to use dedup should be clearly advised about its memory requirements and the risks involved. Thanx to Sachiru for comments and suggestions. Signed-off-by: Turbo Fredriksson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #2754