Columns: commit message, author, date, files changed, lines removed/added
* Enable PF_FSTRANS for ioctl secpolicy callbacks (#4571) (Tim Chase, 2016-05-02; 1 file, -1/+4)
  At the very least, the zfs_secpolicy_write_perms ioctl security policy callback, which calls dsl_dataset_hold(), can require freeing memory and, therefore, re-enter ZFS. This patch enables PF_FSTRANS for all of the security policy callbacks similarly to the manner in which it's enabled for the actual ioctl callback. Signed-off-by: Brian Behlendorf <[email protected]> Closes #4554
* module/.gitignore: Add *.dwo (#4580) (Vitaut Bajaryn, 2016-05-02; 1 file, -0/+1)
  These files get generated when CONFIG_DEBUG_INFO_DWARF4 is enabled in Linux .config. Signed-off-by: Brian Behlendorf <[email protected]> Closes #4580
* Fix user namespaces uid/gid mapping (Brian Behlendorf, 2016-04-30; 1 file, -4/+4)
  As described in torvalds/linux@5f3a4a2 the &init_user_ns, and not the current user_ns, should be passed to posix_acl_from_xattr() and posix_acl_to_xattr(). Conveniently the init_user_ns is available through the init credential (kcred). Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Massimo Maggi <[email protected]> Closes #4177
* Add support for libtirpc (Brian Behlendorf, 2016-04-28; 9 files, -132/+61)
  While OpenSolaris libc and glibc both include XDR support, the musl libc does not, in favor of depending on the BSD-licensed libtirpc library. Adding support is a simple matter of detecting the library, including the headers and linking against it. By default libtirpc will be checked for and, if available, used. Otherwise, configure will fall back to using the xdr implementation provided by libc if available. The options --with-tirpc/--without-tirpc can be used to disable this checking. In addition, the xdr_control() function has been simplified to only handle ZFS's specific use case. Original-patch-by: stf <[email protected]> Original-patch-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Carlo Landmeter <[email protected]> Closes #2254 Closes #4559
* Illumos 6844 - dnode_next_offset can detect fictional holes (Alex Reece, 2016-04-27; 2 files, -5/+23)
  6844 dnode_next_offset can detect fictional holes Reviewed by: Matthew Ahrens <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Brian Behlendorf <[email protected]>
  dnode_next_offset is used in a variety of places to iterate over the holes or allocated blocks in a dnode. It operates under the premise that it can iterate over the blockpointers of a dnode in open context while holding only the dn_struct_rwlock as reader. Unfortunately, this premise does not hold.
  When we create the zio for a dbuf, we pass in the actual block pointer in the indirect block above that dbuf. When we later zero the bp in zio_write_compress, we are directly modifying the bp. The state of the bp is now inconsistent from the perspective of dnode_next_offset: the bp will appear to be a hole until zio_dva_allocate finally finishes filling it in. In the meantime, dnode_next_offset can detect a hole in the dnode when none exists.
  I was able to experimentally demonstrate this behavior with the following setup:
    1. Create a file with 1 million dbufs.
    2. Create a thread that randomly dirties L2 blocks by writing to the first L0 block under them.
    3. Observe dnode_next_offset, waiting for it to skip over a hole in the middle of a file.
    4. Do dnode_next_offset in a loop until we skip over such a non-existent hole.
  The fix is to ensure that it is valid to iterate over the indirect blocks in a dnode while holding the dn_struct_rwlock by passing the zio a copy of the BP and updating the actual BP in dbuf_write_ready while holding the lock.
  References: https://www.illumos.org/issues/6844 https://github.com/openzfs/openzfs/pull/82 DLPX-35372 Ported-by: Brian Behlendorf <[email protected]> Closes #4548
* Illumos 6659 - nvlist_free(NULL) is a no-op (Josef 'Jeff' Sipek, 2016-04-27; 11 files, -39/+19)
  6659 nvlist_free(NULL) is a no-op Reviewed by: Toomas Soome <[email protected]> Reviewed by: Marcel Telka <[email protected]> Approved by: Robert Mustacchi <[email protected]> References: https://www.illumos.org/issues/6659 https://github.com/illumos/illumos-gate/commit/aab83bb Ported-by: David Quigley <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4566
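The practical effect of a NULL-tolerant free routine is that callers can drop their own NULL guards. The block below is a minimal sketch of that pattern only; `demo_list_t`, `demo_list_alloc()` and `demo_list_free()` are hypothetical stand-ins invented for illustration and are not the libnvpair API.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an nvlist; not the real libnvpair type. */
typedef struct demo_list {
	char *dl_buf;	/* optional payload owned by the list */
} demo_list_t;

/* Allocate an empty list; returns NULL on allocation failure. */
demo_list_t *
demo_list_alloc(void)
{
	return (calloc(1, sizeof (demo_list_t)));
}

/*
 * Free a list. Like free(3), passing NULL is a no-op, so callers do
 * not need to wrap the call in "if (dl != NULL)".
 */
void
demo_list_free(demo_list_t *dl)
{
	if (dl == NULL)
		return;
	free(dl->dl_buf);
	free(dl);
}
```

With the check inside the free routine, an error path can call `demo_list_free(dl)` unconditionally, whether or not the allocation ever happened.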
* Fix zfs_copies_001_pos/zfs_copies_004_neg (Brian Behlendorf, 2016-04-27; 2 files, -0/+5)
  Call block_device_wait when creating/destroying volumes in order to make the operations synchronous as expected by the test cases. Signed-off-by: Brian Behlendorf <[email protected]> Closes #4560
* Fix 'zpool import' blkid device names (Brian Behlendorf, 2016-04-25; 1 file, -19/+123)
  When importing a pool using the blkid cache only the device node path was added to the list of known paths for a device. This results in 'zpool import' always using the sdX names in preference to the 'path' name stored in the label. To fix the issue the blkid import path has been updated to add all of the 'path', 'devid', and 'devname' names from the label to the known paths. A sanity check is done to ensure these paths do refer to the same device identified by blkid. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #4523 Closes #3043
* Disable efi_debug in --enable-debug builds (Brian Behlendorf, 2016-04-25; 1 file, -4/+0)
  Disable the additional EFI debugging in all builds. Some users run debug builds in production and the extra log messages can cause confusion. Beyond that the log messages are rarely useful. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Closes #4523
* Use udev for partition detection (Brian Behlendorf, 2016-04-25; 4 files, -46/+166)
  When ZFS partitions a block device it must wait for udev to create both a device node and all the device symlinks. This process takes a variable length of time and depends on factors such as how many links must be created, the complexity of the rules, etc. Complicating the situation further, it is not uncommon for udev to create and then remove a link multiple times while processing the udev rules.
  Given the above, the existing scheme of waiting for an expected partition to appear by name isn't 100% reliable. At this point udev may still remove and recreate the link, resulting in the kernel modules being unable to open the device.
  In order to address this the zpool_label_disk_wait() function has been updated to use libudev. The function will wait until the registered system device acknowledges that it is fully initialized. Once fully initialized all device links are checked and allowed to settle for 50ms. This makes it far more likely that all the device nodes will exist when the kernel modules need to open them.
  For systems without libudev an alternate zpool_label_disk_wait() was updated to include a settle time. In addition, the kernel modules were updated to include retry logic for this ENOENT case. Due to the improved checks in the utilities it is unlikely this logic will be invoked. However, in the rare event it is needed it will prevent a failure.
  Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Signed-off-by: Richard Laager <[email protected]> Closes #4523 Closes #3708 Closes #4077 Closes #4144 Closes #4214 Closes #4517
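The "wait, then settle" idea for systems without libudev can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from zpool_label_disk_wait(): the helper name `wait_for_path()`, the 10ms polling step and the ENOENT return on timeout are all hypothetical choices.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/stat.h>
#include <time.h>

/* Sleep for the given number of milliseconds. */
static void
sleep_ms(long ms)
{
	struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

	(void) nanosleep(&ts, NULL);
}

/*
 * Hypothetical helper: poll until `path` exists, give it `settle_ms`
 * to settle, then confirm it is still there (udev may remove and
 * recreate links while processing rules). Returns 0 on success and
 * ENOENT if the path never became stable within `timeout_ms`.
 */
int
wait_for_path(const char *path, long timeout_ms, long settle_ms)
{
	const long step_ms = 10;	/* polling interval (assumption) */
	struct stat st;
	long waited = 0;

	while (waited <= timeout_ms) {
		if (stat(path, &st) == 0) {
			sleep_ms(settle_ms);
			if (stat(path, &st) == 0)
				return (0);
		}
		sleep_ms(step_ms);
		waited += step_ms;
	}
	return (ENOENT);
}
```

A caller might use something like `wait_for_path(partition_path, 3000, 50)`, where the 50ms settle value mirrors the figure in the commit message and the 3 second timeout is purely illustrative.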
* Create unique partition labels (Brian Behlendorf, 2016-04-25; 2 files, -1/+28)
  When partitioning a device a name may be specified for each partition. Internally zfs doesn't use this partition name for anything so it has always just been set to "zfs". However this isn't optimal because udev will create symlinks using this name in /dev/disk/by-partlabel/. If the name isn't unique then all the links cannot be created. Therefore a random 64-bit value has been added to the partition label, e.g. "zfs-1234567890abcdef". Additional information could be encoded here but since partitions may be reused that might result in confusion and it was decided against. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Tony Hutter <[email protected]> Signed-off-by: Richard Laager <[email protected]> Closes #4517
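A label of the form shown above can be produced with a single format string. The sketch below assumes the 64-bit value is printed as 16 zero-padded lowercase hex digits, which matches the example label; `format_partition_label()` is a hypothetical helper, and where the random value comes from is left to the caller.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: write "zfs-" followed by the 64-bit value as
 * 16 lowercase hex digits into buf. Returns the snprintf() result,
 * i.e. the length the full label needs (20 when it fits).
 */
int
format_partition_label(char *buf, size_t len, uint64_t rand64)
{
	return (snprintf(buf, len, "zfs-%016" PRIx64, rand64));
}
```

The buffer needs at least 21 bytes: 4 for the prefix, 16 hex digits, and the terminating NUL.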
* fix booting via dracut generated initramfs (Matthew Thode, 2016-04-25; 7 files, -2/+216)
  Dracut and Systemd updated how they integrate with each other; because of this our current integrations stopped working (around the time 4.1.13 came out). This patch addresses that issue and gets us booting again. Thanks to @Rudd-O for doing the work to get dracut working again and letting me submit this on his behalf. Signed-off-by: Manuel Amador (Rudd-O) <[email protected]> Signed-off-by: Matthew Thode <[email protected]> Closes #3605 Closes #4478
* Linux 4.5 compat: Use xattr_handler->name for acl (Chunwei Chen, 2016-04-25; 3 files, -20/+70)
  Linux 4.5 added the member "name" to xattr_handler. An xattr_handler which matches the whole name rather than a prefix should use "name" instead of "prefix". Otherwise, the kernel will return EINVAL when it tries to resolve handlers. Also, we remove the strcmp checks when xattr_handler has name, because xattr_resolve_name will do the check for us. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4549 Closes #4537
* Add pn_alloc()/pn_free() functions (Brian Behlendorf, 2016-04-21; 10 files, -32/+178)
  In order to remove the HAVE_PN_UTILS wrappers the pn_alloc() and pn_free() functions must be implemented. The existing illumos implementations were used for this purpose. The `flags` argument which was used in places wrapped by the HAVE_PN_UTILS condition has been added back to the zfs_remove() and zfs_link() functions. This removes a small point of divergence between the ZoL code and upstream. Signed-off-by: Brian Behlendorf <[email protected]> Closes #4522
* Rework zpool import excluded devices check (Nikolay Borisov, 2016-04-18; 1 file, -26/+16)
  Current zpool import code skips directory entries which have prefixes similar to some system files on Linux such as "fd", "core" etc. However, this means one cannot have one's zpools hosted inside files which are named e.g. core-1 or lp. Furthermore, apart from the string checks there is already a check which makes the zpool_open_func work only with regular files and block devices. To fix this problem remove most of the checks since they are redundant but leave the checks for the 'hpet' and 'watchdog' names. Furthermore, change the checks to strcmp which, albeit less safe than strncmp, allows devices whose names are prefixed by 'hpet' or 'watchdog'. Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4438
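The difference between the prefix check and the exact-name check can be seen in a small sketch. The function names below are hypothetical and the code is not taken from the zpool import path; it only contrasts strncmp-based prefix matching with strcmp-based exact matching for the two names mentioned above.

```c
#include <string.h>

/* Prefix match: also rejects names such as "hpet-backup.img". */
int
is_excluded_by_prefix(const char *name)
{
	return (strncmp(name, "hpet", 4) == 0 ||
	    strncmp(name, "watchdog", 8) == 0);
}

/* Exact match: rejects only the names "hpet" and "watchdog". */
int
is_excluded_exact(const char *name)
{
	return (strcmp(name, "hpet") == 0 ||
	    strcmp(name, "watchdog") == 0);
}
```

With the exact form, a file-backed vdev named `watchdog0.img` is no longer skipped, while the special device names themselves still are.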
* Fix ZPL miswrite of default POSIX ACL (Ned Bass, 2016-04-18; 4 files, -3/+65)
  Commit 4967a3e introduced a typo that caused the ZPL to store the intended default ACL as an access ACL. Due to caching this problem may not become visible until the filesystem is remounted or the inode is evicted from the cache. Fix the typo and add a regression test. Signed-off-by: Ned Bass <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4520
* Fix inverted logic on none elevator comparison (Colin Ian King, 2016-04-15; 1 file, -1/+1)
  Commit d1d7e2689db9e03f1 ("cstyle: Resolve C style issues") inverted the logic on the none elevator comparison. Fix this and make it cstyle warning clean. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4507
* remove sanity check in replacement test (Jinshan Xiong, 2016-04-13; 3 files, -9/+0)
  The replacement test spawns a background process to truncate a file and makes sure that the process still exists 1 second later. However, the process may have finished its work and exited by then, so the check can report a false alarm. This patch removes those sanity checks. Signed-off-by: Jinshan Xiong <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4516
* Make zfs test easier to run in local install (Jinshan Xiong, 2016-04-12; 3 files, -22/+23)
  When ZFS is installed by 'make install', programs will be installed into '/usr/local'. The ZFS test scripts can't locate programs such as 'zpool' there, which caused test failures. Fix a typo in the help message. Add a sanity check for ksh and generate a useful error message. Signed-off-by: Jinshan Xiong <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4495
* Add zfs-tests for relatime (Chunwei Chen, 2016-04-05; 4 files, -2/+75)
  Add atime_003_pos to test relatime=on. We run check_atime_updated twice; the first time should succeed and the second time should fail. We also modify atime_001_pos to run check_atime_updated twice, and both times should succeed. Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4482
* Make zfs mount according to relatime config in dataset (Chunwei Chen, 2016-04-05; 3 files, -3/+15)
  Also enable lazytime in mount.zfs Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4482
* Enable lazytime semantic for atime (Chunwei Chen, 2016-04-05; 3 files, -72/+110)
  Linux 4.0 introduces lazytime. The idea is that when we update the atime, we delay writing it to disk for as long as it is reasonably possible.
  When lazytime is enabled, dirty_inode will be called with only the I_DIRTY_TIME flag whenever i_atime is updated. Under that condition, we will set z_atime_dirty. We will only write it to disk if the file is closed, the inode is evicted or setattr is called. Ideally, we should also write it whenever SA is going to be updated, but that is left for future improvement.
  There's one thing that we should take care of now that we allow i_atime to be dirty. In the original implementation, whenever SA is modified, zfs_inode_update will be called to overwrite everything in the inode. This will cause a dirty i_atime to be discarded. We fix this by not overwriting i_atime in zfs_inode_update. We only overwrite i_atime when allocating a new inode or doing zfs_rezget with zfs_inode_update_new.
  Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4482
* Fix atime handling and relatime (Chunwei Chen, 2016-04-05; 9 files, -108/+37)
  The problem for atime: We have 3 places for atime: inode->i_atime, znode->z_atime and SA. And its handling is a mess. A huge part of the mess regarding atime comes from zfs_tstamp_update_setup, zfs_inode_update, and zfs_getattr, which behave inconsistently with those three values. zfs_tstamp_update_setup clears z_atime_dirty unconditionally as long as you don't pass ATTR_ATIME, which means every write(2) operation which only updates ctime and mtime will cause atime changes to not be written to disk. Also zfs_inode_update from write(2) will replace inode->i_atime with what's inside SA (stale), but doesn't touch z_atime. So after read(2) and write(2), you'll have i_atime(stale), z_atime(new), SA(stale) and z_atime_dirty=0. Now, if you do stat(2), zfs_getattr will actually replace i_atime with what's inside z_atime. So now you'll have i_atime(new), z_atime(new), SA(stale) and z_atime_dirty=0. These will all be gone after umount, and you'll be left with a stale atime.
  The problem for relatime: We do have a relatime config inside the ZFS dataset, but how it should interact with the mount flag MS_RELATIME is not well defined. It seems it wanted the relatime mount option to override the dataset config by showing it as temporary in `zfs get`. But at the same time, `zfs set relatime=on|off` would also seem to want to override the mount option. Not to mention that the MS_RELATIME flag is actually never passed into ZFS, so it never really worked.
  How Linux handles atime: The Linux kernel actually handles atime completely in VFS, except for writing it to disk. So if we remove the atime handling in ZFS, things would just work, no matter whether it's strictatime, relatime, noatime, or even O_NOATIME. And whenever VFS updates the i_atime, it will notify the underlying filesystem via sb->dirty_inode().
  There's also one thing to note about atime flags like MS_RELATIME and other flags like MS_NODEV, etc. They are mount point flags rather than filesystem (sb) flags. Since a native Linux filesystem can be mounted at multiple places at the same time, they can all have different atime settings. So these flags are never passed down to filesystem drivers.
  What this patch tries to do: We remove znode->z_atime, since we won't gain anything from it. We remove most of the atime handling and leave it to VFS. The only thing we do with atime is to write it when dirty_inode() or setattr() is called. We also add file_accessed() in zpl_read() since it's not provided in vfs_read(). After this patch, only the MS_RELATIME flag will have effect. The setting in the dataset won't do anything. We will make zfsutil mount ZFS with MS_RELATIME set according to the setting in the dataset in a future patch.
  Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #4482
* Linux 4.6 compat: PAGE_CACHE_SIZE removal (Brian Behlendorf, 2016-04-05; 2 files, -24/+23)
  As described in torvalds/linux@4a2d057e the PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were originally introduced to make it possible to add bigger chunks to the page cache. This never panned out and they have therefore been removed from the kernel. ZFS has been updated to use the PAGE_{SIZE,SHIFT,MASK,ALIGN} macros and calls to page_cache_release() have been replaced with put_page(). There was no need to introduce a configure check for this because these interfaces have existed for a very long time. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4489
* Fix WANT_DEVNAME2DEVID configure error (Brian Behlendorf, 2016-04-01; 2 files, -6/+7)
  Accidentally introduced by commit e4023e4. The AM_CONDITIONAL cannot be located where it can be invoked conditionally, as in the `--with-config=user` case. Relocate it to the top level ZFS_AC_CONFIG macro along with the other AM_CONDITIONALs. Signed-off-by: Brian Behlendorf <[email protected]> Issue #4416
* Add support 32 bit FS_IOC32_{GET|SET}FLAGS compat ioctls (Colin Ian King, 2016-03-31; 1 file, -1/+14)
  We need 32 bit userspace FS_IOC32_GETFLAGS and FS_IOC32_SETFLAGS compat ioctls for systems such as powerpc64. We use the normal compat ioctl idiom as used by a variety of file systems to provide this support. Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4477
* Only build devname2devid when libudev headers are available (Brian Behlendorf, 2016-03-31; 2 files, -3/+12)
  Accidentally introduced by commit 39fc0cb. The devname2devid utility which depends on libudev must only be built when libudev headers are available. This is accomplished through an AM_CONDITIONAL. Signed-off-by: Brian Behlendorf <[email protected]> Issue #4416
* Add support for devid and phys_path keys in vdev disk labels (Don Brady, 2016-03-31; 11 files, -163/+476)
  This is foundational work for ZED. Updates a leaf vdev's persistent device strings on the Linux platform:
    * only applies for a dedicated leaf vdev (aka whole disk)
    * updated during pool create|add|attach|import
    * used for device matching during auto-{online,expand,replace}
    * stored in a leaf disk config label (i.e. alongside 'path' NVP)
    * can opt-out using env var ZFS_VDEV_DEVID_OPT_OUT=YES
  Some examples:
    path: '/dev/sdb1' devid: 'scsi-350000394a8ca4fbc-part1' phys_path: 'pci-0000:04:00.0-sas-0x50000394a8ca4fbf-lun-0'
    path: '/dev/mapper/mpatha' devid: 'dm-uuid-mpath-35000c5006304de3f'
  Signed-off-by: Don Brady <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #2856 Closes #3978 Closes #4416
* Expand EDQUOT variable (Andriy Gapon, 2016-03-31; 1 file, -3/+3)
  Results in failures with ksh version 93v- 2014-06-25. This appears to not be an issue with ksh version 93u+ 2012-08-01. The expanded version works correctly for both. Signed-off-by: Andriy Gapon <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4452
* zfs_main: fix `zfs userspace` squashing unresolved entries (Pavel Boldin, 2016-03-30; 1 file, -4/+7)
  `zfs userspace` squashes all entries with unresolved numeric values into a single output entry because the comparison is always made by the string name, which is empty in the case of unresolved IDs. Fix this by falling back to a numerical comparison when either one of the string values is not found. This then compares any numerical values after all entries with a name resolved. Signed-off-by: Pavel Boldin <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4440
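One way to keep unresolved IDs distinct is a comparator that only uses names when both are resolved. The sketch below is not the code from zfs_main.c: the `us_entry_t` type and `us_entry_compare()` are hypothetical, and placing named entries ahead of unresolved ones is an assumption made here so the ordering stays consistent for sorting.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical entry: name is "" when the ID could not be resolved. */
typedef struct us_entry {
	const char *name;
	uint64_t id;
} us_entry_t;

/*
 * Compare two entries:
 *   - both resolved: order by name
 *   - exactly one resolved: the resolved entry sorts first (assumption)
 *   - neither resolved: order by numeric ID, so distinct IDs are
 *     never treated as equal and squashed together
 */
int
us_entry_compare(const us_entry_t *a, const us_entry_t *b)
{
	int a_named = (a->name[0] != '\0');
	int b_named = (b->name[0] != '\0');

	if (a_named && b_named)
		return (strcmp(a->name, b->name));
	if (a_named != b_named)
		return (a_named ? -1 : 1);
	if (a->id < b->id)
		return (-1);
	if (a->id > b->id)
		return (1);
	return (0);
}
```

The key property is the last branch: two entries with empty names but different IDs compare as unequal, which is what prevents them from collapsing into one output row.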
* Remove complicated libspl assert wrappers (Brian Behlendorf, 2016-03-30; 2 files, -51/+36)
  Effectively provide our own version of assert()/verify() for use in user space. This minimizes our dependencies and aligns the user space assertion handling with what's used in the kernel. Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4449
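A self-contained pair of assertion macros with no dependency on the libc assert machinery can be sketched as below. The names `DEMO_VERIFY` and `DEMO_ASSERT` are hypothetical and this is not the libspl implementation; it assumes the common split where the verify form is always evaluated and the assert form is compiled out when NDEBUG is defined.

```c
#include <stdio.h>
#include <stdlib.h>

/* Always evaluated; prints the failing expression and aborts. */
#define DEMO_VERIFY(cond) \
	do { \
		if (!(cond)) { \
			fprintf(stderr, "%s:%d: VERIFY(%s) failed\n", \
			    __FILE__, __LINE__, #cond); \
			abort(); \
		} \
	} while (0)

/* Debug-only form; compiled out when NDEBUG is defined. */
#ifdef NDEBUG
#define DEMO_ASSERT(cond) ((void)0)
#else
#define DEMO_ASSERT(cond) DEMO_VERIFY(cond)
#endif
```

Because `DEMO_VERIFY` always evaluates its argument, it is safe to use around calls with side effects, whereas `DEMO_ASSERT` should only wrap pure checks.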
* gcc build error: -Wbool-compare in metaslab.c (DHE, 2016-03-30; 1 file, -1/+2)
  When debugging is enabled on a very recent version of gcc (tested with 5.3.0), DVA_SET_GANG(dva, !!(flags)) fails because an assertion causes a comparison between what is technically a boolean and an integer. Signed-off-by: DHE <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4465
* Fix zpool_scrub_* test cases (Brian Behlendorf, 2016-03-30; 6 files, -5/+29)
  The zpool_scrub_002, zpool_scrub_003, zpool_scrub_004 test cases fail reliably when running against small pools or fast storage. This occurs because the scrub/resilver operation completes before subsequent commands can be run. A one second delay has been added to 10% of zio's in order to ensure the scrub/resilver operation will run for at least several seconds. Signed-off-by: Brian Behlendorf <[email protected]> Closes #4450
* Use the correct macro to include backtrace (Carlo Landmeter, 2016-03-29; 1 file, -2/+2)
  execinfo.h and backtrace() are GNU extensions provided by glibc and not by gcc, see: http://www.gnu.org/software/libc/manual/html_mono/libc.html#Backtraces Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4453
* Move hrtime_t timestruc_t and timespec_t (Carlo Landmeter, 2016-03-29; 2 files, -4/+5)
  hrtime_t, timestruc_t and timespec_t should have originally been included in sys/time.h, so let's move them. longlong_t is not defined by any standard, so change it to long long. Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4459
* Set _DATE_FMT to '%+' if not defined in libspl/timestamp.c (Carlo Landmeter, 2016-03-29; 1 file, -0/+4)
  Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4458
* Ensure correct return value type (Carlo Landmeter, 2016-03-29; 1 file, -1/+1)
  When compiling with musl libc the return type will be incorrect. Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4454
* Add missing fcntl.h to includes in mount_zfs.c (Carlo Landmeter, 2016-03-29; 1 file, -0/+1)
  This is needed for musl libc Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4456
* Include sys/types.h in devid.h (Carlo Landmeter, 2016-03-29; 1 file, -0/+1)
  This is needed for musl libc Signed-off-by: Carlo Landmeter <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4454
* Correct typo in spa_load_verify_metadata docs (Richard Laager, 2016-03-29; 1 file, -1/+1)
  Signed-off-by: Richard Laager <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4471
* zloop.sh requires bash (Brian Behlendorf, 2016-03-25; 1 file, -1/+1)
  The zloop.sh script requires bash. It will require further improvements to be compatible with alternatives such as dash. This resolves the ztest failures observed under Ubuntu in the automated testing. Signed-off-by: Brian Behlendorf <[email protected]> Closes #4441
* write_dirs: set_partition expects zero-based partition indices (Andriy Gapon, 2016-03-25; 1 file, -1/+4)
  ... despite partition names being 1-based. Signed-off-by: Andriy Gapon <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4446
* zfs_copies: do_vol_test must wait for device (Brian Behlendorf, 2016-03-25; 1 file, -0/+2)
  Occasionally zfs_copies_* tests which rely on do_vol_test() will fail because udev hasn't yet created the minor device. Wait for it. Signed-off-by: Brian Behlendorf <[email protected]>
* Add zloop.sh test script (Brian Behlendorf, 2016-03-23; 4 files, -0/+216)
  Add Chris Williamson's "new" zloop script so that it may be integrated with ZoL's automated testing. The original script may be found in the openzfs-build repository on GitHub. Minor modifications were made to the script so it can be run directly from the ZoL source tree or from installed packages. Additionally it was updated to use gdb instead of mdb to extract debugging information from a core dump. References: https://github.com/openzfs/openzfs-build/commit/7fb5d8b https://github.com/openzfs/openzfs-build/blob/master/ansible/roles/openzfs-jenkins-slave/files/usr/local/zloop.sh Signed-off-by: Brian Behlendorf <[email protected]> Closes #4441
* Fix zdb -e and zhack thread_init() (Brian Behlendorf, 2016-03-21; 3 files, -18/+23)
  This issue was caused by calling `thread_init()` and `thread_fini()` multiple times resulting in `kthread_key` being invalid. To resolve the issue the explicit calls to `thread_init()` and `thread_fini()` required by the `zpool` command have been moved into the command. Consumers such as `zdb` and `zhack` perform the same initialization through `kernel_init()` and `kernel_fini()`. Resolving this issue allows multiple additional test cases to be enabled. Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Tim Chase <[email protected]> Closes #4331
* Support for vectorized algorithms on x86 (Gvozden Neskovic, 2016-03-21; 6 files, -2/+583)
  This is initial support for x86 vectorized implementations of ZFS parity and checksum algorithms.
  For the compilation phase, the configure step checks if the toolchain supports the relevant instruction sets. Each implementation must ensure that the code is not passed to the compiler if the relevant instruction set is not supported. For this purpose, the following new defines are provided if the instruction set is supported:
    - HAVE_SSE,
    - HAVE_SSE2,
    - HAVE_SSE3,
    - HAVE_SSSE3,
    - HAVE_SSE4_1,
    - HAVE_SSE4_2,
    - HAVE_AVX,
    - HAVE_AVX2.
  For detecting if an instruction set can be used at runtime, the following functions are provided in (include/linux/simd_x86.h):
    - zfs_sse_available()
    - zfs_sse2_available()
    - zfs_sse3_available()
    - zfs_ssse3_available()
    - zfs_sse4_1_available()
    - zfs_sse4_2_available()
    - zfs_avx_available()
    - zfs_avx2_available()
    - zfs_bmi1_available()
    - zfs_bmi2_available()
  These functions should be called once, on module load, or initialization. They are safe to use from user and kernel space. If an implementation is using more than a single instruction set, both compiler and runtime support for all relevant instruction sets should be checked.
  Kernel fpu methods:
    - kfpu_begin()
    - kfpu_end()
  Use __get_cpuid_max and __cpuid_count from <cpuid.h>. Both gcc and clang have support for these. They also handle the ebx register in case it is used for PIC code.
  Signed-off-by: Gvozden Neskovic <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Chunwei Chen <[email protected]> Closes #4381
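Runtime detection with the two <cpuid.h> helpers named above can be sketched as follows. This is not the code from simd_x86.h: the function names are hypothetical, and the sketch only reads the CPUID feature bits. Real availability checks for AVX-class instructions also need to confirm that the OS saves the extended register state, which is omitted here. On non-x86 builds the sketch simply reports 0.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

/* CPUID leaf 1: EDX bit 26 is the SSE2 feature flag. */
int
cpu_reports_sse2(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (__get_cpuid_max(0, NULL) < 1)
		return (0);
	__cpuid_count(1, 0, eax, ebx, ecx, edx);
	return ((edx >> 26) & 1);
}

/* CPUID leaf 7, subleaf 0: EBX bit 5 is the AVX2 feature flag. */
int
cpu_reports_avx2(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (__get_cpuid_max(0, NULL) < 7)
		return (0);
	__cpuid_count(7, 0, eax, ebx, ecx, edx);
	return ((ebx >> 5) & 1);
}
#else
/* Not an x86 build: report the instruction sets as unavailable. */
int
cpu_reports_sse2(void)
{
	return (0);
}

int
cpu_reports_avx2(void)
{
	return (0);
}
#endif
```

As the commit message suggests, a result like this would normally be computed once at initialization and cached, rather than re-queried on every call into a vectorized routine.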
* Cleanup linking (Richard Yao, 2016-03-18; 9 files, -15/+8)
  I noticed during code review of zfsonlinux/zfs#4385 that the author of a commit had peppered the various Makefile.am files with `$(TIRPC_LIBS)` when putting it into `lib/libspl/Makefile.am` should have sufficed. Upon further examination, it seems that he had copied what we do with `$(ZLIB)`. We also have a bit of that with `-ldl` too. Unfortunately, what we do is wrong, so let's fix it to set a good example for future contributors.
  In addition, we have multiple `-lz` and `-luuid` passed to the compiler because each `AC_CHECK_LIB` adds it to `$LIBS`. That is somewhat annoying to see, so we switch to `AC_SEARCH_LIBS` to avoid it. This is consistent with the recommendation to use `AC_SEARCH_LIBS` over `AC_CHECK_LIB` by autotools upstream: https://www.gnu.org/software/autoconf/manual/autoconf-2.66/html_node/Libraries.html
  In an ideal world, this would translate into improvements in ELF's `DT_NEEDED` entries, but that is not the case because of a couple of bugs in libtool. The first bug causes libtool to overlink by using static link dependencies for dynamic linking: https://wiki.mageia.org/en/Overlinking_issues_in_packaging#libtool_issues
  The workaround for this should be to pass `-Wl,--as-needed` in `LDFLAGS`. That leads us to the second bug, where libtool passes `LDFLAGS` after the libraries are specified and `ld` will only honor `--as-needed` on libraries specified before it: https://sigquit.wordpress.com/2011/02/16/why-asneeded-doesnt-work-as-expected-for-your-libraries-on-your-autotools-project/
  There are a few possible workarounds for the second bug. One is to either patch the compiler spec file to specify `-Wl,--as-needed` or pass `-Wl,--as-needed` via `CC` like `CC='gcc -Wl,--as-needed'` so that it is specified early. Another is to patch ltmain.sh like Gentoo does: https://gitweb.gentoo.org/repo/gentoo.git/tree/eclass/ELT-patches/as-needed
  Without one of those workarounds, this cleanup provides no benefit in terms of `DT_NEEDED` entry generation. It should still be an improvement because it nicely simplifies the code while encouraging good habits when patching autotools scripts.
  Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4426
* Add support for s390[x]. (Dimitri John Ledkov, 2016-03-17; 2 files, -2/+17)
  Signed-off-by: Dimitri John Ledkov <[email protected]> Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #4425
* Disable zpool_add_004_pos test case (Brian Behlendorf, 2016-03-17; 1 file, -1/+1)
  This test case adds a zvol as a vdev to an existing pool. This use case is currently known to be racy. Signed-off-by: Brian Behlendorf <[email protected]>
* Add the ZFS Test Suite (Brian Behlendorf, 2016-03-16; 1243 files, -1042/+89497)
  Add the ZFS Test Suite and test-runner framework from illumos. This is a continuation of the work done by Turbo Fredriksson to port the ZFS Test Suite to Linux. While this work was originally conceived as a stand alone project, integrating it directly with the ZoL source tree has several advantages:
    * Allows the ZFS Test Suite to be packaged in the zfs-test package.
    * Facilitates easy integration with the CI testing.
    * Users can locally run the ZFS Test Suite to validate ZFS. This testing should ONLY be done on a dedicated test system because the ZFS Test Suite in its current form is destructive.
    * Allows the ZFS Test Suite to be run directly in the ZoL source tree, enabling developers to iterate quickly during development.
    * Developers can easily add/modify tests in the framework as features are added or functionality is changed. The tests will then always be in sync with the implementation.
  Full documentation for how to run the ZFS Test Suite is available in the tests/README.md file.
  Warning: This test suite is designed to be run on a dedicated test system. It will make modifications to the system including, but not limited to, the following.
    * Adding new users
    * Adding new groups
    * Modifying the following /proc files:
      * /proc/sys/kernel/core_pattern
      * /proc/sys/kernel/core_uses_pid
    * Creating directories under /
  Notes:
    * Not all of the test cases are expected to pass and by default these test cases are disabled. The failures are primarily due to assumptions made for illumos which are invalid under Linux.
    * When updating these test cases it should be done in as generic a way as possible so the patch can be submitted back upstream. Most existing library functions have been updated to be Linux aware, and the following functions and variables have been added.
      * Functions:
        * is_linux - Used to wrap a Linux specific section.
        * block_device_wait - Waits for block devices to be added to /dev/.
      * Variables:            Linux         Illumos
        * ZVOL_DEVDIR         "/dev/zvol"   "/dev/zvol/dsk"
        * ZVOL_RDEVDIR        "/dev/zvol"   "/dev/zvol/rdsk"
        * DEV_DSKDIR          "/dev"        "/dev/dsk"
        * DEV_RDSKDIR         "/dev"        "/dev/rdsk"
        * NEWFS_DEFAULT_FS    "ext2"        "ufs"
    * Many of the disabled test cases fail because 'zfs/zpool destroy' returns EBUSY. This is largely caused by the asynchronous nature of device handling on Linux and is expected; the impacted test cases will need to be updated to handle this.
    * There are several test cases which have been disabled because they can trigger a deadlock. A primary example of this is to recursively create zpools within zpools. These tests have been disabled until the root issue can be addressed.
    * Illumos specific utilities such as (mkfile) should be added to the tests/zfs-tests/cmd/ directory. Custom programs required by the test scripts can also be added here.
    * SELinux should be either in permissive mode or disabled when running the tests. The test cases should be updated to conform to a standard policy.
    * Redundant test functionality has been removed (zfault.sh).
    * Existing test scripts (zconfig.sh) should be migrated to use the framework for consistency and ease of testing.
    * The DISKS environment variable currently only supports loopback devices because of how the ZFS Test Suite expects partitions to be named (p1, p2, etc). Support must be added to generate the correct partition name based on the device location and name.
    * The ZFS Test Suite is part of the illumos code base at: https://github.com/illumos/illumos-gate/tree/master/usr/src/test
  Original-patch-by: Turbo Fredriksson <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Olaf Faaland <[email protected]> Closes #6 Closes #1534