path: root/cmd
Commit message | Author | Age | Files | Lines
* Do not force VDEV_NAME_TYPE_ID in max_width()Håkan Johansson2016-11-301-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Do not force VDEV_NAME_TYPE_ID in max_width(), instead add it in the relevant calls to max_width(). The first location of max_width() where VDEV_NAME_TYPE_ID is now added in show_import() is followed by print_import_config() and print_logs(). Both these print children vdev names that have been retrieved using an explicit VDEV_NAME_TYPE_ID added. The second location is in status_callback(). This is followed by print_status_config(), print_logs(), print_l2cache(), and print_spares(). For l2cache and spares it should not matter as there are no mirror-X or raidz-X involved. print_status_config() as above retrieves the name using explicit VDEV_NAME_TYPE_ID before calling itself to print children. The call of max_width() in get_namewidth() is not changed, as this is used by zpool_do_iostat(), followed by print_iostat(), which does not add VDEV_NAME_TYPE_ID. Overall, we should consider adding VDEV_NAME_TYPE_ID to the relevant name_flags / cb_name_flags fields, and remove the explicit adding in called routines. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Haakan T Johansson <[email protected]> Closes #5401
* Introduce ARC Buffer Data (ABD)Brian Behlendorf2016-11-306-82/+110
ZFS currently uses ARC buffers which are backed by virtual memory. While functional, there are some major problems with this approach which can be observed on all OpenZFS platforms. ABD was designed to address these issues and includes contributions from OpenZFS developers from multiple platforms.

While all OpenZFS platforms will benefit from ABD, this functionality is critical for Linux. Unlike the other OpenZFS platforms, the Linux kernel discourages extensive use of virtual memory. The provided interfaces are not optimized for frequent allocations from the virtual address space. To maintain good performance a kmem cache is used which contains relatively long-lived slabs backed by virtual memory. The downside to this approach is that those slabs can become highly fragmented, resulting in an inefficient use of memory.

Another issue is that on 32-bit systems the available virtual address space in the kernel is only a small fraction of total system memory. This means the ARC size is highly constrained, which hurts performance, makes allocating memory difficult, and makes OOMs more likely.

ABD is designed to address these issues by using scatter lists of pages for data buffers. This removes the need for slabs, which resolves the fragmentation issue. It also allows high memory pages to be allocated, which alleviates the virtual address space pressure on 32-bit systems.

For metadata buffers, which are small, linear ABDs are allocated from the slab. This is preferable because there are many places in the code which expect to be able to read from a given offset in the buffer. Using linear ABDs means none of that code needs to be modified. The majority of these buffers are allocated with kmalloc, so there's minimal impact on the virtual address space.
Tested-by: Kash Pande <[email protected]> Tested-by: kernelOfTruth <[email protected]> Tested-by: RageLtMan <rageltman@sempervictus> Tested-by: DHE <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Dan Kimmel <[email protected]> Reviewed-by: David Quigley <[email protected]> Reviewed-by: Gvozden Neskovic <[email protected]> Reviewed-by: Tom Caputi <[email protected]> Reviewed-by: Isaac Huang <[email protected]> Reviewed-by: Jinshan Xiong <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #3441 Closes #5135
| * ABD raidz avx512f supportGvozden Neskovic2016-11-291-15/+1
Implement shift based multiplication for 512f. Higher IPC over lookup based methods yields up to 40% better performance on the current hardware.

Results on Xeon Phi(TM) CPU 7210:

implementation      gen_p     gen_pq    gen_pqr      rec_p      rec_q      rec_r     rec_pq     rec_pr     rec_qr    rec_pqr
original        142232671   24411492   12948205  283053705   22348167    4215911    9171609    2265548    2378370    1648495
scalar          295711162   49851491   33253815  293198109   88179448   61866752   27941684   25764416   17384442   12138153
sse2            410055998  199642658  117973654  406240463  152688682  121092250   84968180   79291076   47473657   20779719
ssse3           411641595  199669571  117937647  406211024  137638508  117050346   81263322   76120405   46281559   32696722
avx2            616485806  311515332  188595628  605455115  260602390  230554476  148198817  138800254   92273356   62937819
avx512f         832191523  408509425  253599522  810094481  404325734  317590971  218235687  197204920  133101937   94001219
fastest           avx512f    avx512f    avx512f    avx512f    avx512f    avx512f    avx512f    avx512f    avx512f    avx512f

Signed-off-by: Gvozden Neskovic <[email protected]>
| * ABD Vectorized raidzGvozden Neskovic2016-11-293-46/+49
| | | | | | | | | | | | | | | | | | Enable vectorized raidz code on ABD buffers. The avx512f, avx512bw, neon and aarch64_neonx2 are disabled in this commit. With the exception of avx512bw these implementations are updated for ABD in the subsequent commits. Signed-off-by: Gvozden Neskovic <[email protected]>
| * DLPX-44812 integrate EP-220 large memory scalabilityDavid Quigley2016-11-295-50/+89
| |
* | Fix coverity defects: CID 154591luozhengzheng2016-11-301-1/+1
|/ | | | | | | | CID 154591: Incorrect expression (SIZEOF_MISMATCH) Reviewed-by: Gvozden Neskovic <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: luozhengzheng <[email protected]> Closes #5435
* zstreamdump needs to initialize fletcher 4 supportTim Chase2016-11-291-0/+2
| | | | | | | Otherwise, the checksum function pointer isn't initialized. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tim Chase <[email protected]> Closes #5411
* Add -c to zpool iostat & status to run commandTony Hutter2016-11-295-10/+244
This patch adds a command (-c) option to zpool status and zpool iostat. The -c option allows you to run an arbitrary command on each vdev and display the first line of output in zpool status/iostat. The environment vars VDEV_PATH and VDEV_UPATH are set to the vdev's path and "underlying path" before running the command. For device mapper, multipath, or partitioned vdevs, VDEV_UPATH is the actual underlying /dev/sd* disk. This can be useful if the command you're running requires a /dev/sd* device.

The patch also uses /sys/block/<dev>/slaves/ to look up the underlying device instead of using libdevmapper. This not only removes the libdevmapper requirement at build time, but also allows you to resolve device mapper devices without being root. This means that VDEV_UPATH is set correctly when running zpool status/iostat as an unprivileged user.

Example:

    $ zpool status -c 'echo I am $VDEV_PATH, $VDEV_UPATH'
    NAME        STATE   READ WRITE CKSUM
    mypool      ONLINE     0     0     0
      mirror-0  ONLINE     0     0     0
        mpatha  ONLINE     0     0     0  I am /dev/mapper/mpatha, /dev/sdc
        sdb     ONLINE     0     0     0  I am /dev/sdb1, /dev/sdb

Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #5368
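The `/sys/block/<dev>/slaves/` lookup described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function name `underlying_devices` and the `sysfs_block` parameter are hypothetical, and the sketch does not map a partition such as `sdb1` back to its parent disk, which the real implementation would also need to do.

```python
import os


def underlying_devices(dev, sysfs_block="/sys/block"):
    """Return the leaf devices beneath `dev` by walking <sysfs_block>/<dev>/slaves.

    Hypothetical helper: a device with no readable, non-empty slaves
    directory is treated as its own underlying device.
    """
    slaves_dir = os.path.join(sysfs_block, dev, "slaves")
    try:
        slaves = sorted(os.listdir(slaves_dir))
    except OSError:
        # No slaves directory (or not readable): treat dev as a leaf.
        slaves = []
    if not slaves:
        return [dev]
    leaves = []
    for slave in slaves:
        # Stacked devices (e.g. multipath on top of dm) are resolved recursively.
        leaves.extend(underlying_devices(slave, sysfs_block))
    return leaves


# Example call on a real system (result depends on the machine):
# underlying_devices("dm-0")
```

Because the walk only reads directory entries under sysfs, it needs no special privileges, which matches the commit's point about resolving device mapper devices without being root.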
* Allow zfs unshare <protocol> -aLOLi2016-11-291-5/+23
Allow `zfs unshare <protocol> -a` command to share or unshare all datasets of a given protocol, nfs or smb.

Additionally, enable most of the ZFS Test Suite zfs_share/zfs_unshare test cases. To work around some Illumos-specific functionality ($SHARE/$UNSHARE), function wrappers were added around them.

Finally, fix an issue in smb_is_share_active() that would leave SMB shares exported when invoking 'zfs unshare -a'.

Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: loli10K <[email protected]>
Closes #3238
Closes #5367
* Add a statechange notify zedletDon Brady2016-11-106-82/+125
| | | | | | | | | | Now that ZED has internal fault diagnosis and the statechange event is generated for faulted states, we can replace the io-notify and checksum-notify zedlets with one based on statechange. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Don Brady <[email protected]> Closes #5383
* Fix symlinks for {vdev_clear,statechange}-led.shOlaf Faaland2016-11-091-2/+2
| | | | | | | | | These were named in the zed/Makefile.am as vdev_clear-blinkled.sh and statechange-blinkled.sh causing bad symlinks to be created. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Olaf Faaland <[email protected]> Closes #5384
* Fix coverity defects: CID 147586cao2016-11-081-3/+3
| | | | | | | | CID 147586: function:allow_usage Type:out-of-bounds read Reviewed-by: Chunwei Chen <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: cao.xuewen <[email protected]> Closes #5364
* Fix coverity defects: 154021luozhengzheng2016-11-081-0/+3
| | | | | | | | CID 154021: Null pointer dereference Reviewed-by: Giuseppe Di Natale <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: luozhengzheng <[email protected]> Closes #5380
* Add illumos FMD ZFS logic to ZED -- phase 2Don Brady2016-11-0716-338/+3581
The phase 2 work primarily entails the Diagnosis Engine and the Retire Agent modules. It also includes infrastructure to support a crude FMD environment to host these modules.

The Diagnosis Engine consumes I/O and checksum ereports and feeds them into a SERD engine which will generate a corresponding fault diagnosis when the SERD engine fires. All the diagnosis state data is collected into cases, one case per vdev being tracked.

The Retire Agent responds to diagnosed faults by isolating the faulty VDEV. It will notify the ZFS kernel module of the new VDEV state (degraded or faulted). This agent is also responsible for managing hot spares across pools. When it encounters a device fault or a device removal it replaces the device with an appropriate spare if available.

Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Don Brady <[email protected]>
Closes #5343
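To make the "SERD engine fires" step above concrete, here is a simplified sketch of the general idea: an engine fires once N events have been recorded within a sliding window of T seconds. The class name, the parameter names `n` and `t`, and the exact boundary handling are assumptions for illustration; the FMD code ported in this commit may differ in detail.

```python
from collections import deque


class SerdEngine:
    """Simplified N-events-within-T-seconds detector (illustrative only)."""

    def __init__(self, n, t):
        self.n = n              # number of events needed to fire
        self.t = t              # window length in seconds
        self.events = deque()   # timestamps still inside the window

    def record(self, timestamp):
        """Record one ereport; return True if the engine fires."""
        self.events.append(timestamp)
        # Discard events that have fallen out of the window.
        while timestamp - self.events[0] > self.t:
            self.events.popleft()
        return len(self.events) >= self.n


# One engine per tracked vdev would mirror the "one case per vdev" model.
engine = SerdEngine(n=3, t=10.0)
results = [engine.record(ts) for ts in (0, 1, 20, 25, 29)]
```

In this run the first two events age out before a third arrives, so the engine only fires on the last event, when three reports fall inside a ten-second span.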
* Allow autoreplace even when enclosure LED sysfs entries don't existTony Hutter2016-11-041-2/+2
| | | | | | | | | | | | The previous autoreplace code assumed that if you were using autoreplace, then you also had the enclosure SES driver loaded. This could lead to autoreplace not working if the SES driver wasn't loaded, or if it wasn't creating the proper enclosure_device symlinks (which has happened). This patch removes that assumption. Reviewed by: Don Brady <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #5363
* Add parity generation/rebuild using AVX-512 for x86-64Romain Dolbeau2016-11-021-0/+2
avx512f should work on all AVX512 hardware, since it only uses Foundation instructions.

avx512bw should be faster on hardware supporting the AVX512BW extension. We can use full-width pshufb (instead of relying on the 256-bit AVX2 pshufb). As a side effect, the code is also unrolled more.

Reviewed-by: Richard Laager <[email protected]>
Reviewed-by: Gvozden Neskovic <[email protected]>
Reviewed-by: Jinshan Xiong <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Romain Dolbeau <[email protected]>
Closes #5219
* Allow for '-o feature@<feature>=disabled' on the command lineLOLi2016-10-251-14/+19
| | | | | | | | | | | | | Sometimes it is desirable to specifically disable one or several features directly on the 'zpool create' command line. $ zpool create -o feature@<feature>=disabled ... Original-patch-by: Turbo Fredriksson <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #3460 Closes #5142 Closes #5324
* Fix statechange-led.sh & unnecessary libdevmapper warningTony Hutter2016-10-252-20/+4
- Fix autoreplace behaviour on statechange-led.sh script.

  ZED sends the following events on an auto-replace:

  1. statechange: Disk goes UNAVAIL->ONLINE
  2. statechange: Disk goes ONLINE->UNAVAIL
  3. vdev_attach: Disk goes ONLINE

  Events 1-2 happen when ZED first attempts to do an auto-online. When that fails, ZED then tries an auto-replace, generating the vdev_attach event in #3.

  In the previous code, statechange-led was only looking at the UNAVAIL->ONLINE transition to turn off the LED. It ignored the #2 ONLINE->UNAVAIL transition, assuming it was just the "old" VDEV going offline. This is problematic, as a drive can go from ONLINE->UNAVAIL when it's malfunctioning, and we don't want to ignore that.

  This new patch correctly turns on the fault LED every time a drive becomes UNAVAIL. It also monitors vdev_attach events to trigger turning off the LED when an auto-replaced disk comes online.

- Remove unnecessary libdevmapper warning with --with-config=kernel

  This fixes an unnecessary libdevmapper warning when building --with-config=kernel. Kernel code does not use libdevmapper, so the warning is not needed.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #2375
Closes #5312
Closes #5331
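The LED rules described in the commit above can be summarized as a small decision function. This is a sketch only: the real logic lives in the statechange-led.sh shell script, and the function name `led_action`, the event strings, and the inclusion of FAULTED and DEGRADED alongside UNAVAIL are assumptions made for illustration.

```python
def led_action(event, vdev_state):
    """Return "on", "off", or None for the enclosure fault LED.

    Illustrative only: event and state names are assumed, not taken
    from the actual zedlet.
    """
    if event == "statechange":
        # Turn the LED on every time a drive goes bad, including the
        # ONLINE->UNAVAIL step that the older script ignored.
        if vdev_state in ("UNAVAIL", "FAULTED", "DEGRADED"):
            return "on"
        if vdev_state == "ONLINE":
            return "off"
    elif event == "vdev_attach" and vdev_state == "ONLINE":
        # An auto-replaced disk coming online clears the LED.
        return "off"
    return None


# The three-event auto-replace sequence from the commit message.
sequence = [
    ("statechange", "ONLINE"),
    ("statechange", "UNAVAIL"),
    ("vdev_attach", "ONLINE"),
]
actions = [led_action(event, state) for event, state in sequence]
```

Walking the auto-replace sequence through the function shows why the vdev_attach handler matters: without it, the LED would stay on after step 2 even though the replacement disk is healthy.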
* Fix coverity defects: CID 147511, 147513cao2016-10-241-1/+1
| | | | | | | | CID 147511: Type:Dereference before null check CID 147513: Type:Dereference before null check Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: cao.xuewen <[email protected]> Closes #5306
* Turn on/off enclosure slot fault LED even when disk isn't presentTony Hutter2016-10-244-74/+62
| | | | | | | | | | | | | | | | | Previously when a drive faulted, the statechange-led.sh script would lookup the drive's LED sysfs entry in /sys/block/sd*/device/enclosure_device, and turn it on. During testing we noticed that if you pulled out a drive, or if the drive was so badly broken that it no longer appeared to Linux, that the /sys/block/sd* path would be removed, and the script could not lookup the LED entry. To fix this, this patch looks up the disks's more persistent "/sys/class/enclosure/X:X:X:X/Slot N" LED sysfs path at pool import. It then passes that path to the statechange-led script to use, rather than having the script look it up on the fly. This allows the script to turn on/off the slot LEDs even when the drive is missing. Closes #5309 Closes #2375
* Multipath autoreplace, control enclosure LEDs, event rate limitingTony Hutter2016-10-197-31/+216
1. Enable multipath autoreplace support for FMA.

   This extends FMA autoreplace to work with multipath disks. This requires libdevmapper to be installed at build time.

2. Turn on/off fault LEDs when VDEVs become degraded/faulted/online

   Set ZED_USE_ENCLOSURE_LEDS=1 in zed.rc to have ZED turn on/off the enclosure LED for a drive when a drive becomes FAULTED/DEGRADED. Your enclosure must be supported by the Linux SES driver for this to work. The enclosure LED scripts work for multipath devices as well. The scripts will clear the LED when the fault is cleared.

3. Rate limit ZIO delay and checksum events so as not to flood ZED

   ZIO delay and checksum events are rate limited to 5/sec in the zfs module.

Reviewed-by: Richard Laager <[email protected]>
Reviewed by: Don Brady <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Hutter <[email protected]>
Closes #2449
Closes #3017
Closes #5159
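As a rough illustration of the 5/sec event limit mentioned in item 3, the sketch below uses a fixed one-second window with a counter. The class name `EventRateLimiter` and its parameters are hypothetical, and the windowing strategy is an assumption; the kernel module's actual rate limiter may be structured differently.

```python
class EventRateLimiter:
    """Allow at most `burst` events per `interval` seconds (fixed window)."""

    def __init__(self, burst=5, interval=1.0):
        self.burst = burst
        self.interval = interval
        self.window_start = None  # start time of the current window
        self.count = 0            # events seen in the current window

    def allow(self, now):
        """Return True if an event at time `now` should be posted."""
        if self.window_start is None or now - self.window_start >= self.interval:
            # Start a fresh window and reset the counter.
            self.window_start = now
            self.count = 0
        self.count += 1
        return self.count <= self.burst


limiter = EventRateLimiter(burst=5, interval=1.0)
times = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 1.0, 1.1]
posted = [limiter.allow(t) for t in times]
```

Events beyond the fifth inside a window are simply dropped, so a misbehaving disk producing thousands of checksum errors per second cannot flood the daemon.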
* Fix coverity defects: CID 147643, 152204, 49339GeLiXin2016-10-181-1/+2
CID 147643: Type: String not null terminated - make sure that the string is null terminated before strlen and fprintf.

CID 152204: Type: Copy into fixed size buffer - since strlcpy isn't available here, use strncpy and terminate the string manually.

CID 49339: Type: Buffer not null terminated - since strlcpy isn't available here, terminate the string manually before fprintf.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: GeLiXin <[email protected]>
Closes #5283
* Pass status_cbdata_t to print_status_config() and friendsHåkan Johansson2016-10-171-68/+66
| | | | | | | | | | | | | | | | | | First rename spare_cbdata_t cb -> spare_cb in print_status_config(), to free up cb. Using the structure removes the explicit parameters namewidth and name_flags from several functions. Also use status_cbdata_t for print_import_config(). This simplifies print_logs(). Remove the parameter 'verbose' for print_logs(). It does not really mean verbose, it selected between the print_status_config and print_import_config() paths. This selection is now done by cb_print_config of spare_cbdata_t. Reviewed-by: Tony Hutter <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Håkan Johansson <[email protected]> Closes #5259
* Fix coverity defects: CID 147606, 147609cao2016-10-121-1/+4
| | | | | | | | coverity scan CID:147606, Type:resource leak coverity scan CID:147609, Type:resource leak Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: cao.xuewen <[email protected]> Closes #5245
* Fix coverity defects: CID 147639GeLiXin2016-10-101-5/+6
When an array is passed as a parameter it decays into a pointer, so sizeof(path) in is_shorthand_path() always evaluates to 8, the size of the pointer, instead of the string length we want.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: GeLiXin <[email protected]>
Closes #5198
* Fix file permissionsBrian Behlendorf2016-10-081-0/+0
The following new test cases need to have execute permissions set:

  userquota/groupspace_003_pos.ksh
  userquota/userquota_013_pos.ksh
  userquota/userspace_003_pos.ksh
  upgrade/upgrade_userobj_001_pos.ksh
  upgrade/setup.ksh
  upgrade/cleanup.ksh

The following source files accidentally were marked executable:

  lib/libzpool/kernel.c
  lib/libshare/nfs.c
  lib/libzfs/libzfs_dataset.c
  lib/libzfs/libzfs_util.c
  tests/zfs-tests/cmd/rm_lnkcnt_zero_file/rm_lnkcnt_zero_file.c
  tests/zfs-tests/cmd/dir_rd_update/dir_rd_update.c
  cmd/zed/zed_exec.c
  module/icp/core/kcf_sched.c
  module/zfs/dsl_pool.c
  module/zfs/arc.c
  module/nvpair/nvpair.c
  man/man5/zfs-module-parameters.5

Reviewed-by: GeLiXin <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Reviewed-by: Jinshan Xiong <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #5241
* Fletcher4: Incremental updates and ctx calculationBrian Behlendorf2016-10-071-0/+78
|\ | | | | | | | | | | | | | | | | Fixes ABI issues with fletcher4 code, adds support for incremental updates, and adds ztest method for testing. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Chunwei Chen <[email protected]> Signed-off-by: Gvozden Neskovic <[email protected]> Closes #5164
| * Fletcher4: Incremental using SIMDGvozden Neskovic2016-10-051-0/+78
Combine incrementally computed fletcher4 checksums. Checksums are combined a posteriori, allowing for parallel computation on chunks to be implemented if required. The algorithm is general, and does not add changes in each SIMD implementation. New test in ztest verifies incremental fletcher computations.

Checksum combining matrix for two buffers `a` and `b`, where `Ca` and `Cb` are respective fletcher4 checksums, `Cab` is combined checksum, `s` is size of buffer `b` (divided by sizeof(uint32_t)) is:

  Cab[A] = Cb[A] + Ca[A]
  Cab[B] = Cb[B] + Ca[B] + s * Ca[A]
  Cab[C] = Cb[C] + Ca[C] + s * Ca[B] + s(s+1)/2 * Ca[A]
  Cab[D] = Cb[D] + Ca[D] + s * Ca[C] + s(s+1)/2 * Ca[B] + s(s+1)(s+2)/6 * Ca[A]

NOTE: this calculation overflows for larger buffers. Thus, internally, the calculation is performed on 8MiB chunks.

Signed-off-by: Gvozden Neskovic <[email protected]>
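The combining formulas in the commit message above can be checked with a short script. This is a sketch under stated assumptions, not the ZFS implementation: `fletcher4` here is a plain scalar version over a list of 32-bit words with 64-bit wraparound, and the helper names are made up for the example. Python integers do not overflow, so the intermediate products are exact and only the final sums are masked to 64 bits; the C code has to chunk the input to avoid the overflow the commit mentions.

```python
MASK64 = (1 << 64) - 1


def fletcher4(words):
    """Scalar fletcher4 over 32-bit words, accumulators wrap at 64 bits."""
    a = b = c = d = 0
    for w in words:
        a = (a + w) & MASK64
        b = (b + a) & MASK64
        c = (c + b) & MASK64
        d = (d + c) & MASK64
    return (a, b, c, d)


def fletcher4_combine(ca, cb, s):
    """Combine checksum `ca` of buffer a with `cb` of buffer b (s words long)."""
    t2 = s * (s + 1) // 2            # s(s+1)/2
    t3 = s * (s + 1) * (s + 2) // 6  # s(s+1)(s+2)/6
    return (
        (cb[0] + ca[0]) & MASK64,
        (cb[1] + ca[1] + s * ca[0]) & MASK64,
        (cb[2] + ca[2] + s * ca[1] + t2 * ca[0]) & MASK64,
        (cb[3] + ca[3] + s * ca[2] + t2 * ca[1] + t3 * ca[0]) & MASK64,
    )


buf_a = [0xDEADBEEF, 1, 2, 0xFFFFFFFF]
buf_b = [7, 0x80000000, 42]
combined = fletcher4_combine(fletcher4(buf_a), fletcher4(buf_b), len(buf_b))
whole = fletcher4(buf_a + buf_b)
```

Here `combined` and `whole` should be equal, which is what allows the two halves to be checksummed independently, for instance on separate threads, and merged afterwards.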
* | Add python style checkingBrian Behlendorf2016-10-073-60/+53
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a make recipe for flake8 to enable python style checking. Ensure all python scripts pass flake8. Return an error code of 0 for arcstat.py -v and dbufstat.py -v. Add test cases for python scripts. Reviewed by: Richard Laager <[email protected]> Reviewed-by: Richard Elling <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Ian Lee <[email protected]> Closes #5230
| * | Correct exit code for dbufstat -v and arcstat -vGiuseppe Di Natale2016-10-062-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | Both scripts were returning an error code of 1 when using the -v argument. -v should exit with an error code of 0. Signed-off-by: Giuseppe Di Natale <[email protected]>
| * | Correct style in arcstat and arc_summaryGiuseppe Di Natale2016-10-062-58/+51
| | | | | | | | | | | | | | | | | | | | | Fix arcstat and arc_summary so they pass flake8 python code style checks. Signed-off-by: Giuseppe Di Natale <[email protected]>
* | | Add support for user/group dnode accounting & quotaJinshan Xiong2016-10-072-13/+78
| |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch tracks dnode usage for each user/group in the DMU_USER/GROUPUSED_OBJECT ZAPs. ZAP entries dedicated to dnode accounting have the key prefixed with "obj-" followed by the UID/GID in string format (as done for the block accounting). A new SPA feature has been added for dnode accounting as well as a new ZPL version. The SPA feature must be enabled in the pool before upgrading the zfs filesystem. During the zfs version upgrade, a "quotacheck" will be executed by marking all dnode as dirty. ZoL-bug-id: https://github.com/zfsonlinux/zfs/issues/3500 Signed-off-by: Jinshan Xiong <[email protected]> Signed-off-by: Johann Lombardi <[email protected]>
* | Fix coverity defects: CID 150953, 147603, 147610luozhengzheng2016-10-041-2/+11
|/ | | | | | | | | coverity scan CID:150953,type: uninitialized scalar variable coverity scan CID:147603,type: Resource leak coverity scan CID:147610,type: Resource leak Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: luozhengzheng <[email protected]> Closes #5209
* OpenZFS 4185 - add new cryptographic checksums to ZFS: SHA-512, Skein, Edon-RTony Hutter2016-10-031-4/+4
Reviewed by: George Wilson <[email protected]>
Reviewed by: Prakash Surya <[email protected]>
Reviewed by: Saso Kiselkov <[email protected]>
Reviewed by: Richard Lowe <[email protected]>
Approved by: Garrett D'Amore <[email protected]>
Ported by: Tony Hutter <[email protected]>

OpenZFS-issue: https://www.illumos.org/issues/4185
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/45818ee

Porting Notes:

This code is ported on top of the Illumos Crypto Framework code:

  https://github.com/zfsonlinux/zfs/pull/4329/commits/b5e030c8dbb9cd393d313571dee4756fbba8c22d

The list of porting changes includes:

- Copied module/icp/include/sha2/sha2.h directly from illumos
- Removed from module/icp/algs/sha2/sha2.c: #pragma inline(SHA256Init, SHA384Init, SHA512Init)
- Added 'ctx' to lib/libzfs/libzfs_sendrecv.c:zio_checksum_SHA256() since it now takes in an extra parameter.
- Added CTASSERT() to assert.h for module/zfs/edonr_zfs.c
- Added skein & edonr to libicp/Makefile.am
- Added sha512.S. It was generated from sha512-x86_64.pl in Illumos.
- Updated ztest.c with new fletcher_4_*() args; used NULL for new CTX argument.
- In icp/algs/edonr/edonr_byteorder.h, removed the #if defined(__linux) section to not #include the nonexistent endian.h.
- In skein_test.c, rename NULL to 0 in "no test vector" array entries to get around a compiler warning.
- Fixup test files:
  - Rename <sys/varargs.h> -> <varargs.h>, <strings.h> -> <string.h>,
  - Remove <note.h> and define NOTE() as NOP.
  - Define u_longlong_t
  - Rename "#!/usr/bin/ksh" -> "#!/bin/ksh -p"
  - Rename NULL to 0 in "no test vector" array entries to get around a compiler warning.
  - Remove "for isa in $($ISAINFO); do" stuff
  - Add/update Makefiles
  - Add some userspace headers like stdio.h/stdlib.h in places of sys/types.h.
- EXPORT_SYMBOL *_Init/*_Update/*_Final... routines in ICP modules.
- Update scripts/zfs2zol-patch.sed
- include <sys/sha2.h> in sha2_impl.h
- Add sha2.h to include/sys/Makefile.am
- Add skein and edonr dirs to icp Makefile
- Add new checksums to zpool_get.cfg
- Move checksum switch block from zfs_secpolicy_setprop() to zfs_check_settable()
- Fix -Wuninitialized error in edonr_byteorder.h on PPC
- Fix stack frame size errors on ARM32
  - Don't unroll loops in Skein on 32-bit to save stack space
  - Add memory barriers in sha2.c on 32-bit to save stack space
- Add filetest_001_pos.ksh checksum sanity test
- Add option to write pseudorandom data in file_write utility
* Add parity generation/rebuild using 128-bits NEON for Aarch64Romain Dolbeau2016-10-031-0/+2
This reuses the framework established for SSE2, SSSE3 and AVX2. However, GCC is using FP registers on Aarch64, so unlike SSE/AVX2 we can't rely on the registers being left alone between ASM statements. So instead, the NEON code uses C variables and GCC extended ASM syntax. Note that since the kernel explicitly disables vector registers, they have to be locally re-enabled explicitly.

As we use the variable's number to define the symbolic name, and GCC won't allow duplicate symbolic names, numbers have to be unique, even when the code is not going to be used (e.g. the case for 4 registers when using the macro with only 2). Only the actually used variables should be declared, otherwise the build will fail in debug mode.

This requires the replacement of the XOR(X,X) syntax by a new ZERO(X) macro, which does the same thing but without repeating the argument. And perhaps someday there will be a machine where there is a more efficient way to zero a register than XOR with itself. This affects scalar, SSE2, SSSE3 and AVX2 as they need the new macro.

It's possible to write faster implementations (different scheduling, different unrolling, interleaving NEON and scalar, ...) for various cores, but this one has the advantage of fitting in the current state of the code, and thus is likely easier to review/check/merge.

The only difference between aarch64-neon and aarch64-neonx2 is that aarch64-neonx2 unrolls some functions some more.

Reviewed-by: Gvozden Neskovic <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Romain Dolbeau <[email protected]>
Closes #4801
* Fix coverity defects: CID 147448, 147449, 147450, 147453, 147454luozhengzheng2016-10-022-2/+2
| | | | | | | | | | | coverity scan CID:147448,type: unchecked return value coverity scan CID:147449,type: unchecked return value coverity scan CID:147450,type: unchecked return value coverity scan CID:147453,type: unchecked return value coverity scan CID:147454,type: unchecked return value Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: luozhengzheng <[email protected]> Closes #5206
* Fix coverity defects: CID 147536, 147537, 147538GeLiXin2016-09-301-4/+5
| | | | | | | | | | | | coverity scan CID:147536, type: Argument cannot be negative - may write or close fd which is negative coverity scan CID:147537, type: Argument cannot be negative - may call dup2 with a negative fd coverity scan CID:147538, type: Argument cannot be negative - may read or fchown with a negative fd Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: GeLiXin <[email protected]> Closes #5185
* raidz_test: respect wall timeGvozden Neskovic2016-09-302-10/+28
| | | | | | | | When timeout is specified (-t), stop worker threads in the middle of work units. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Gvozden Neskovic <[email protected]> Issue #5180 Closes #5190
* Fix coverity defects: CID 147443, 147656, 147655, 147441, 147653BearBabyLiu2016-09-292-2/+2
| | | | | | | | | | | | coverity scan CID:147443, Type: Buffer not null terminated coverity scan CID:147656, Type: Copy into fixed size buffer coverity scan CID:147655, Type: Copy into fixed size buffer coverity scan CID:147441, Type: Buffer not null terminated coverity scan CID:147653, Type: Copy into fixed size buffer Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: liuhuang <[email protected]> Closes #5165
* Fix coverity defects: CID 147610, 147608, 147607cao2016-09-292-16/+19
| | | | | | | | | | coverity scan CID:147610, Type: Resource leak. coverity scan CID:147608, Type: Resource leak. coverity scan CID:147607, Type: Resource leak. Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: cao.xuewen <[email protected]> Closes #5143
* Fix coverity defects: CID 147602 147604cao2016-09-231-4/+8
coverity scan CID:147604, Type: Resource leak.
coverity scan CID:147602, Type: Resource leak.
Reason: calcvs is allocated with safe_malloc(), but the goto to the children label does not free it.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: cao.xuewen <[email protected]>
Closes #5155
* Fix coverity defects: CID 147613 147614 147616 147617luozhengzheng2016-09-231-4/+8
| | | | | | | | | | coverity scan CID:147617,type: resource leaks coverity scan CID:147616,type: resource leaks coverity scan CID:147614,type: resource leaks coverity scan CID:147613,type: resource leaks Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: luozhengzheng <[email protected]> Closes #5150
* Fix coverity defectsluozhengzheng2016-09-223-14/+57
| | | | | | | | | | | | | | | | | | | | | | | 1.coverity scan CID:147445 function zfs_do_send in zfs_main.c Buffer not null terminated (BUFFER_SIZE_WARNING) 2.coverity scan CID:147443 function zfs_do_bookmark in zfs_main.c Buffer not null terminated (BUFFER_SIZE_WARNING) 3.coverity scan CID:147660 function main in zinject.c Passing string argv[0] of unknown size to strcpy By the way, the leak of g_zfs is fixed. 4.coverity scan CID: 147442 function make_disks in zpool_vdev.c Buffer not null terminated (BUFFER_SIZE_WARNING) 5.coverity scan CID: 147661 function main in dir_rd_update.c passing string cp1 of unknown size to strcpy Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: luozhengzheng <[email protected]> Closes #5130
* Fix coverity defectscao2016-09-202-1/+4
Fix coverity defects:

coverity scan CID:147623, Type: Resource leak.
coverity scan CID:147622, Type: Resource leak.
Reason: zhp is opened with zpool_open() but never closed with zpool_close(), so the resource leaks.

coverity scan CID:147621, Type: Resource fd leak.
coverity scan CID:147620, Type: Resource fd leak.
Reason: do_write and do_read open a file descriptor but do not close it on the error path.

Delete the unused definition DMU_OS_IS_L2COMPRESSIBLE.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: cao.xuewen <[email protected]>
Closes #5137
* Change /etc/mtab to /proc/self/mountsslashdd2016-09-203-19/+20
| | | | | | | | | | | Fix misleading error message: "The /dev/zfs device is missing and must be created.", if /etc/mtab is missing. Reviewed-by: Richard Laager <[email protected]> Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: Eric Desrochers <[email protected]> Closes #4680 Closes #5029
* Fix Coverity defectsluozhengzheng2016-09-171-1/+1
| | | | | | | CID 147659, 150952 and 147645 Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: luozhengzheng <[email protected]> Closes #5103
* DLPX-40252 integrate EP-476 compressed zfs send/receiveDan Kimmel2016-09-132-10/+29
| | | | | | | | Authored by: Dan Kimmel <[email protected]> Reviewed by: Tom Caputi <[email protected]> Reviewed by: Brian Behlendorf <[email protected]> Ported by: David Quigley <[email protected]> Issue #5078
* OpenZFS 6950 - ARC should cache compressed dataGeorge Wilson2016-09-132-1/+8
Authored by: George Wilson <[email protected]>
Reviewed by: Prakash Surya <[email protected]>
Reviewed by: Dan Kimmel <[email protected]>
Reviewed by: Matt Ahrens <[email protected]>
Reviewed by: Paul Dagnelie <[email protected]>
Reviewed by: Tom Caputi <[email protected]>
Reviewed by: Brian Behlendorf <[email protected]>
Ported by: David Quigley <[email protected]>

This review covers the reading and writing of compressed arc headers, sharing data between the arc_hdr_t and the arc_buf_t, and the implementation of a new dbuf cache to keep frequently accessed data uncompressed.

I've added a new member to l1 arc hdr called b_pdata. The b_pdata always hangs off the arc_buf_hdr_t (if an L1 hdr is in use) and points to the physical block for that DVA. The physical block may or may not be compressed. If compressed arc is enabled and the block on-disk is compressed, then the b_pdata will match the block on-disk and remain compressed in memory. If the block on disk is not compressed, then neither will the b_pdata. Lastly, if compressed arc is disabled, then b_pdata will always be an uncompressed version of the on-disk block.

Typically the arc will cache only the arc_buf_hdr_t and will aggressively evict any arc_buf_t's that are no longer referenced. This means that the arc will primarily have compressed blocks as the arc_buf_t's are considered overhead and are always uncompressed. When a consumer reads a block we first look to see if the arc_buf_hdr_t is cached. If the hdr is cached then we allocate a new arc_buf_t and decompress the b_pdata contents into the arc_buf_t's b_data. If the hdr already has an arc_buf_t, then we will allocate an additional arc_buf_t and bcopy the uncompressed contents from the first arc_buf_t to the new one.

Writing to the compressed arc requires that we first discard the b_pdata since the physical block is about to be rewritten. The new data contents will be passed in via an arc_buf_t (uncompressed) and during the I/O pipeline stages we will copy the physical block contents to a newly allocated b_pdata.

When an l2arc is in use it will also take advantage of the b_pdata. Now the l2arc will always write the contents of b_pdata to the l2arc. This means that when compressed arc is enabled, the l2arc blocks are identical to those stored in the main data pool. This provides a significant advantage since we can leverage the bp's checksum when reading from the l2arc to determine if the contents are valid. If the compressed arc is disabled, then we must first transform the read block to look like the physical block in the main data pool before comparing the checksum and determining it's valid.

OpenZFS-issue: https://www.illumos.org/issues/6950
OpenZFS-commit: https://github.com/openzfs/openzfs/commit/7fc10f0
Issue #5078
* Fix memleak in zfs_do_* and zpool_do_*luozhengzheng2016-09-122-15/+58
| | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: luozhengzheng <[email protected]> Closes #5056
* Allow ZVOL bookmarks to be listed recursivelyloli10K2016-09-121-3/+3
| | | | | | | Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: loli10K <[email protected]> Closes #4503 Closes #5072