Commit message | Author | Age | Files | Lines
* Add basic uio support | Brian Behlendorf | 2011-02-10 | 8 | -6/+325
| | | | | | | | This code originates in OpenSolaris and was modified by KQ Infotech to be compatible with Linux. While supporting uios in the short term is useful to get something working, this is not an abstraction we want to keep. This code is expected to be short lived and removed as soon as all the remaining uio based APIs are updated.
* Add trivial acl helpers | Brian Behlendorf | 2011-02-10 | 1 | -1/+120
| | | | | | | | | | | | The zfs acl code makes use of the two OpenSolaris helper functions acl_trivial_access_masks() and ace_trivial_common(). Since they are only called from zfs_acl.c I've brought them over from OpenSolaris and added them as static functions to this file. This way I don't need to reimplement this functionality from scratch in the SPL. Long term, once I take a more careful look at the acl implementation, it may be the case that these functions really aren't needed. If that turns out to be the case they can then be removed.
* Remove dead ACL code | Brian Behlendorf | 2011-02-10 | 1 | -75/+1
| | | | | The following code was unused which caused gcc to complain. Since it was dead code it has simply been removed.
* Remove zfs_parse_bootfs() support | Brian Behlendorf | 2011-02-10 | 1 | -55/+0
| | | | | | Remove unneeded bootfs functions. This support shouldn't be required for the Linux port, and even if it is it would need to be reworked to integrate cleanly with Linux.
* VFS: Wrap with HAVE_SHARE | Brian Behlendorf | 2011-02-10 | 2 | -18/+22
| | | | | | | | | | Certain NFS/SMB share functionality is not yet in place. These functions used to be wrapped with the generic HAVE_ZPL to prevent them from being compiled. I still don't want them compiled but I'm working toward eliminating the use of HAVE_ZPL. So I'm just renaming the wrapper here to HAVE_SHARE. They still won't be compiled until all the share issues are worked through. Share support is the last missing piece from zfs_ioctl.c.
* Wrap with HAVE_MLSLABEL | Brian Behlendorf | 2011-02-10 | 1 | -0/+2
| | | | | | | The zfs_check_global_label() function is part of the HAVE_MLSLABEL support which was previously commented out by a HAVE_ZPL check. Since we're still deciding what to do about mls labels wrap it with the preexisting macro to keep it compiled out.
* Remove znode move functionality | Brian Behlendorf | 2011-02-10 | 1 | -184/+0
| | | | | | | | Unlike Solaris the Linux implementation embeds the inode in the znode, and has no use for a vnode. So while it's true that fragmentation of the znode cache may occur it should not be worse than any of the other Linux FS inode caches. Until proven that this is a problem it's just added complexity we don't need.
* Conserve stack in zfs_mkdir() | Brian Behlendorf | 2011-02-10 | 1 | -1/+3
| | | | | Move the sa_attrs array from the stack to the heap to minimize stack space usage.
* Conserve stack in zfs_sa_upgrade() | Brian Behlendorf | 2011-02-10 | 1 | -4/+8
| | | | | | As always under Linux stack space is at a premium. Relocate two 20 element sa_bulk_attr_t arrays in zfs_sa_upgrade() from the stack to the heap.
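The stack-to-heap move described in the two commits above can be sketched in user-space C. This is only an illustration: `bulk_attr_t` is a hypothetical stand-in for `sa_bulk_attr_t`, and `calloc()`/`free()` stand in for the kernel allocator the real code would use.

```c
#include <stdlib.h>

/* Hypothetical stand-in for sa_bulk_attr_t; the real layout differs. */
typedef struct bulk_attr {
	unsigned long attr;
	void *data;
	size_t length;
} bulk_attr_t;

#define ATTR_COUNT 20

/*
 * Before: "bulk_attr_t attrs[ATTR_COUNT];" would consume
 * ATTR_COUNT * sizeof (bulk_attr_t) bytes of stack.
 * After: only a pointer lives on the stack.
 * Returns 0 on success, -1 if the allocation fails.
 */
int
upgrade_sketch(void)
{
	bulk_attr_t *attrs = calloc(ATTR_COUNT, sizeof (bulk_attr_t));

	if (attrs == NULL)
		return (-1);

	for (size_t i = 0; i < ATTR_COUNT; i++)
		attrs[i].attr = i;	/* placeholder work */

	free(attrs);
	return (0);
}
```

The cost of the change is one extra failure path (the allocation can fail), which a stack array never had.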
* Export required vfs/vn symbols | Brian Behlendorf | 2011-02-10 | 6 | -29/+153
|
* Add HAVE_SCANSTAMP | Brian Behlendorf | 2011-02-10 | 1 | -2/+6
| | | | | This functionality is not supported under Linux, perhaps it will be some day if it's decided it's useful.
* Add initial rw_uio functions to the dmu | Brian Behlendorf | 2011-02-04 | 2 | -5/+117
| | | | | | | These functions were dropped originally because I felt they would need to be rewritten anyway to avoid using uios. However, this patch re-adds them with the idea that they can just be reworked and the uio bits dropped.
* Remove HAVE_ZPL from commands and libraries | Brian Behlendorf | 2011-02-04 | 7 | -139/+0
| | | | | Thanks to the previous few commits we can now build all of the user space commands and libraries with support for the zpl.
* Documentation updates | Brian Behlendorf | 2011-02-04 | 6 | -17/+17
| | | | | Minor Linux specific documentation updates to the comments and man pages.
* Minimal libshare infrastructure | Brian Behlendorf | 2011-02-04 | 49 | -234/+24
| | | | | | | | | | | | | | | | | | ZFS even under Solaris does not strictly require libshare to be available. The current implementation attempts to dlopen() the library to access the needed symbols. If this fails libshare support is simply disabled. This means that on Linux we only need the most minimal libshare implementation. In fact just enough to prevent the build from failing. Longer term we can decide if we want to implement a libshare library like Solaris. At best this would be an abstraction layer between ZFS and NFS/SMB. Alternately, we can drop libshare entirely and directly integrate ZFS with Linux's NFS/SMB. Finally the bare bones user-libshare.m4 test was dropped. If we do decide to implement libshare at some point it will surely be as part of this package so the check is not needed.
* Add 'zfs mount' support | Brian Behlendorf | 2011-02-04 | 5 | -132/+169
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | By design the zfs utility is supposed to handle mounting and unmounting a zfs filesystem. We could allow zfs to do this directly. There are system calls available to mount/umount a filesystem. And there are library calls available to manipulate /etc/mtab. But there are a couple of very good reasons not to take this approach... for now. Instead of directly calling the system and library calls to (u)mount the filesystem we fork and exec a (u)mount process. The principal reason for this is to delegate the responsibility for locking and updating /etc/mtab to (u)mount(8). This ensures maximum portability and ensures the right locking scheme for your version of (u)mount will be used. If we didn't do this we would have to resort to an autoconf test to determine what locking mechanism is used. The downside to using mount(8) instead of mount(2) is that we lose the exact errno which was returned by the kernel. The return code from mount(8) provides some insight into what went wrong but it is not quite as good. For the moment this is translated as a best guess into an errno for the higher layers of zfs. In the long term a shared library called libmount is under development which provides a common API to address the locking and errno issues. Once the standard mount utility has been updated to use this library we can then leverage it. Until then this is the only safe solution. http://www.kernel.org/pub/linux/utils/util-linux/libmount-docs/index.html
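The fork-and-exec approach and the best-guess errno translation described above can be sketched as follows. Both function names are hypothetical and this is not the libzfs code. The exit-status bits are the ones documented in the mount(8) man page, while the errno chosen for each bit is purely an assumption made for this sketch.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork and exec argv[0], wait for it, and return its exit status.
 * Returns -1 if the child could not be started or did not exit normally.
 */
int
run_process_sketch(char *const argv[])
{
	pid_t pid = fork();
	int status;

	if (pid < 0)
		return (-1);

	if (pid == 0) {
		execv(argv[0], argv);
		_exit(127);	/* only reached if execv() failed */
	}

	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return (-1);
	}

	return (WIFEXITED(status) ? WEXITSTATUS(status) : -1);
}

/*
 * Best-guess translation of a mount(8) exit status to an errno.
 * The errno picked for each status bit is an assumption.
 */
int
mount_status_to_errno(int rc)
{
	if (rc == 0)
		return (0);
	if (rc < 0)		/* child never ran or was killed */
		return (EIO);
	if (rc & 0x01)		/* incorrect invocation or permissions */
		return (EINVAL);
	if (rc & 0x02)		/* system error (out of memory, cannot fork) */
		return (ENOMEM);
	if (rc & 0x10)		/* problems writing or locking /etc/mtab */
		return (EIO);
	if (rc & 0x20)		/* mount failure */
		return (ENXIO);
	return (EIO);		/* anything else: generic failure */
}
```

A caller would build an argument vector such as `{ "/bin/mount", "-t", "zfs", dataset, mountpoint, NULL }` and pass the result of `run_process_sketch()` through `mount_status_to_errno()`. The information lost in that translation is exactly the downside the commit message describes.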
* Open up libzfs_run_process/libzfs_load_module | Brian Behlendorf | 2011-01-28 | 2 | -2/+9
| | | | | | | Recently helper functions were added to libzfs_util to load a kernel module or execute a process. Initially this functionality was limited to libzfs but it has become clear there will be other consumers. This change opens up the interface so it may be used where appropriate.
* Disable umount.zfs helper | Brian Behlendorf | 2011-01-28 | 1 | -37/+103
| | | | | | | | | | | | | | | | | For the moment, the only advantage in registering a umount helper would be to automatically unshare a zfs filesystem. Since under Linux this would be unexpected (but nice) behavior there is no harm in disabling it. This is desirable because the 'zfs unmount' path invokes the system umount. This is done to ensure correct mtab locking but has the side effect that the umount.zfs helper would be called if it exists. By default this helper calls back into zfs to do the unmount on Solaris which we don't want under Linux. Once libmount is available and we have a safe way to correctly lock and update the /etc/mtab file we can reconsider the need for a umount helper. Using libmount is the preferred solution.
* Enable mount.zfs helper | Brian Behlendorf | 2011-01-28 | 1 | -32/+188
| | | | | | | | | | | | | | | | | | While not strictly required to mount a zfs filesystem, using a mount helper has certain advantages. First, we need it if we want to honor the mount behavior as found on Solaris. As part of the mount we need to validate that the dataset has the legacy mount property set if we are using 'mount' instead of 'zfs mount'. Secondly, by using a mount helper we can automatically load the zpl kernel module. This way you can just issue a 'mount' or 'zfs mount' and it will just work. Finally, it gives us a common hook in user space to add any zfs specific mount options we might want. At the moment we don't have any but now the infrastructure is at least in place.
* Autoconf selinux support | Brian Behlendorf | 2011-01-28 | 49 | -16/+678
| | | | | | | | | | | | | | | | | If libselinux is detected on your system at configure time link against it. This allows us to use a library call to detect if selinux is enabled and if it is to pass the mount option: "context=\"system_u:object_r:file_t:s0\"" For now this is required because none of the existing selinux policies are aware of the zfs filesystem type. Because of this they do not properly enable xattr based labeling even though zfs supports all of the required hooks. Until distros add zfs as a known xattr friendly fs type we must use mntpoint labeling. Alternately, end users could modify their existing selinux policy with a little guidance.
* Fix ZVOL rename minor devices | Brian Behlendorf | 2011-01-07 | 1 | -3/+9
| | | | | | | During a rename we need to be careful to destroy and create a new minor for the ZVOL _only_ if the rename succeeded. The previous code would destroy your minor device unconditionally, and it would also fail to create the new minor device on success.
* Fix minor compiler warnings | Brian Behlendorf | 2011-01-06 | 8 | -65/+68
| | | | | | | These compiler warnings were introduced when code which was previously #ifdef'ed out by HAVE_ZPL was re-added for use by the posix layer. All of the following changes should be obviously correct and will cause no semantic changes.
* Add missing mkdirp prototype | Brian Behlendorf | 2010-12-14 | 1 | -0/+34
| | | | | | For a while now mkdirp has been built as part of libspl, however the prototype was never added to libgen.h. This went unnoticed until enabling the mount support which uses mkdirp().
* Use cv_timedwait_interruptible in arc | Brian Behlendorf | 2010-12-14 | 2 | -3/+4
| | | | | | | | | | | The issue is that cv_timedwait() sleeps uninterruptibly to block signals and avoid waking up early. Under Linux this counts against the load average keeping it artificially high. This change allows the arc to sleep interruptibly which means it may be woken up early due to a signal. Normally this means some extra care must be taken to handle a potential signal. But for the arc's usage of cv_timedwait() there is no harm in waking up before the timeout expires so no extra handling is required.
* Fix block device-related issues in zdb. | Ricardo M. Correia | 2010-12-14 | 5 | -23/+58
| | | | | | | | | | | Specifically, this fixes the two following errors in zdb when a pool is composed of block devices: 1) 'Value too large for defined data type' when running 'zdb <dataset>'. 2) 'character device required' when running 'zdb -l <block-device>'. Signed-off-by: Ricardo M. Correia <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
* Enable rrwlock.c compilation | Brian Behlendorf | 2010-12-07 | 1 | -3/+0
| | | | | | With the addition of the thread specific data interfaces to the SPL it is safe to enable compilation of the re-entrant reader/writer locks.
* Refresh autogen.sh products | Brian Behlendorf | 2010-12-07 | 2 | -6/+6
| Refresh the autogen.sh products based on the versions which are installed by default in the GA RHEL6.0 release.
|
|   autoconf (GNU Autoconf) 2.63
|   automake (GNU automake) 1.11.1
|   ltmain.sh (GNU libtool) 2.2.6b
* Remove partition from vdev name in zfault.sh | Ned Bass | 2010-11-29 | 1 | -1/+1
| | | | | | | | | As of the 0.5.2 tag, names of whole-disk vdevs must be specified to the command line tools without partition identifiers. This commit fixes a 'zpool online' command in zfault.sh that incorrectly includes the partition in the vdev name, causing test 9 to fail. Signed-off-by: Brian Behlendorf <[email protected]>
* Skip /dev/hpet during 'zpool import' (tag: zfs-0.5.2) | Brian Behlendorf | 2010-11-12 | 1 | -1/+2
| | | | | | | | | | | | | | | | | | | If libblkid does not contain ZFS support, then 'zpool import' will scan all block devices in /dev/ to determine which ones are components of a ZFS filesystem. It does this by opening all the devices and stat'ing them to determine which ones are block devices. If the device turns out not to be a block device it is skipped. Usually, this whole process is pretty harmless (although slow). But there are certain devices in /dev/ which must be handled in a very specific way or your system may crash. For example, if /dev/watchdog is simply opened the watchdog timer will be started and your system will panic when the timer expires. It turns out that /dev/hpet causes similar problems, although only when accessed under a virtual machine. For some reason accessing /dev/hpet causes qemu to crash. To address this issue this commit adds /dev/hpet to the device blacklist; it will be skipped solely based on its name.
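Skipping a device purely by name, as the commit above describes, amounts to a table lookup done before the device is ever opened. A minimal sketch, in which the two device names come from the commit message and the table layout and function name are assumptions rather than the actual zpool code:

```c
#include <string.h>

/*
 * Device names that must never be opened during a scan.
 * The names are relative to /dev/.
 */
static const char *const dev_blacklist[] = {
	"watchdog",
	"hpet",
	NULL
};

/* Return 1 if the /dev entry called name should be skipped, else 0. */
int
dev_is_blacklisted(const char *name)
{
	for (size_t i = 0; dev_blacklist[i] != NULL; i++) {
		if (strcmp(name, dev_blacklist[i]) == 0)
			return (1);
	}
	return (0);
}
```

The check has to run before `open()`, since for these devices the open itself is what triggers the problem.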
* Add '-ts' options to zconfig.sh/zfault.sh usage | Brian Behlendorf | 2010-11-11 | 2 | -2/+4
| | | | | | | When adding this functionality originally the options to only run specific tests (-t), or conversely skip specific tests (-s) were omitted from the usage page. This commit adds the missing documentation.
* Remove spl/zfs modules as part of cleanup | Brian Behlendorf | 2010-11-11 | 4 | -0/+4
| | | | | | | | | | The idea behind the '-c' flag is to clean up everything from a previous test run which might cause the test script to fail. This should also include removing the previously loaded module. This makes it a little easier to run 'zconfig.sh -c', however, remember this is a test script and it will take all of your other zpools offline for the purposes of the test. This notion has also been extended to the default 'make check' behavior.
* Unconditionally load core kernel modules | Brian Behlendorf | 2010-11-11 | 2 | -3/+8
| Loading and unloading the zlib modules as part of the zfs.sh script has proven a little problematic for a few reasons.
|
| * First, your kernel may not need to load either zlib_inflate or zlib_deflate. This functionality may be built directly in to your kernel. It depends entirely on what your distribution decided was the right thing to do.
| * Second, even if you do manage to load the correct modules you may not be able to unload them. There may be other consumers of the modules with a reference preventing the unload.
|
| To avoid both of these issues the test scripts have been updated to attempt to unconditionally load all modules listed in KERNEL_MODULES. If the module is successfully loaded you must have needed it. If the module can't be loaded that almost certainly means either it is built in to your kernel or is already being used by another consumer. In both cases this is not an issue and we can move on to the spl/zfs modules.
|
| Finally, by removing these kernel modules from the MODULES list we ensure they are never unloaded during 'zfs.sh -u'. This avoids the issue of the script failing because there is another consumer using the module we were not aware of. In other words the script restricts unloading modules to only the spl/zfs modules.
|
| Closes #78
* Fix for access beyond end of device error | Ned Bass | 2010-11-10 | 5 | -11/+13
| This commit fixes a sign extension bug affecting l2arc devices. Extremely large offsets may be passed down to the low level block device driver on reads, generating errors similar to
|
|   attempt to access beyond end of device
|   sdbi1: rw=14, want=36028797014862705, limit=125026959
|
| The unwanted sign extension occurs because the function arc_read_nolock() stores the offset as a daddr_t, a 32-bit signed int type in the Linux kernel. This offset is then passed to zio_read_phys() as a uint64_t argument, causing sign extension for values of 0x80000000 or greater. To avoid this, we store the offset in a uint64_t.
|
| This change also changes a few daddr_t struct members to uint64_t in the libspl headers to avoid similar bugs cropping up in the future. We also add an ASSERT to __vdev_disk_physio() to check for invalid offsets.
|
| Closes #66
|
| Signed-off-by: Brian Behlendorf <[email protected]>
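The sign extension described above is easy to reproduce in user space. In this sketch `daddr32_t` is a stand-in for the 32-bit signed `daddr_t`, and the function names are hypothetical. Note that narrowing an out-of-range value to a signed type is implementation-defined in C; the sketch assumes the usual two's-complement wrap that GCC and Clang perform.

```c
#include <stdint.h>

/* Stand-in for the 32-bit signed daddr_t described above. */
typedef int32_t daddr32_t;

/*
 * Buggy path: the offset is narrowed to a signed 32-bit type first,
 * so any value with bit 31 set is sign-extended when widened again.
 */
uint64_t
offset_via_daddr(uint64_t offset)
{
	daddr32_t narrow = (daddr32_t)offset;	/* wraps on common compilers */

	return ((uint64_t)narrow);
}

/* Fixed path: keep the offset in a uint64_t from start to finish. */
uint64_t
offset_via_uint64(uint64_t offset)
{
	return (offset);
}
```

With these definitions `offset_via_daddr(0x80000000)` yields `0xFFFFFFFF80000000`, which is the kind of enormous offset that shows up in the "access beyond end of device" message, while offsets below `0x80000000` pass through unchanged.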
* Linux 2.6.36 compat, use fops->unlocked_ioctl() | Brian Behlendorf | 2010-11-10 | 1 | -6/+5
| | | | | | | | | As of linux-2.6.36 the last in-tree consumer of fops->ioctl() has been removed and thus fops->ioctl() has also been removed. The replacement hook is fops->unlocked_ioctl() which has existed in the kernel since 2.6.12. Since the ZFS code only contains support back to 2.6.18 vintage kernels, I'm not adding an autoconf check for this and simply moving everything to use fops->unlocked_ioctl().
* Linux 2.6.36 compat, blk_* macros removed | Brian Behlendorf | 2010-11-10 | 1 | -0/+10
| | | | | | | Most of the blk_* macros were removed in 2.6.36. Ostensibly this was done to improve readability and allow easier grepping. However, from a portability standpoint the macros are helpful. Therefore the needed macros are redefined here if they are missing from the kernel.
* Linux 2.6.36 compat, synchronous bio flag | Brian Behlendorf | 2010-11-10 | 6 | -13/+330
| The name of the flag used to mark a bio as synchronous has changed again in the 2.6.36 kernel due to the unification of the BIO_RW_* and REQ_* flags. The new flag is called REQ_SYNC. To simplify checking this flag I have introduced the vdev_disk_dio_is_sync() helper function. Based on the results of several new autoconf tests it uses the correct mask to check for a synchronous bio.
|
| Preferred interface for flagging a synchronous bio:
|
|   2.6.12-2.6.29: BIO_RW_SYNC
|   2.6.30-2.6.35: BIO_RW_SYNCIO
|   2.6.36-2.6.xx: REQ_SYNC
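The version table above maps naturally onto a preprocessor chain driven by configure results. The following is a user-space sketch only: the `HAVE_*` macro names, the `SKETCH_*` bit values and the function name are all assumptions made so that the example compiles outside the kernel, and none of them are taken from the actual patch.

```c
/*
 * Hypothetical configure results. In a real build these HAVE_* macros
 * would come from autoconf tests; one is defined by hand here.
 */
#define HAVE_REQ_SYNC 1

/* Stand-in bit values so the sketch compiles outside the kernel. */
#define SKETCH_BIO_RW_SYNC	(1UL << 0)
#define SKETCH_BIO_RW_SYNCIO	(1UL << 1)
#define SKETCH_REQ_SYNC		(1UL << 2)

/* Pick the mask matching whichever interface configure detected. */
#if defined(HAVE_REQ_SYNC)		/* 2.6.36 and newer */
#define DIO_SYNC_MASK	SKETCH_REQ_SYNC
#elif defined(HAVE_BIO_RW_SYNCIO)	/* 2.6.30 - 2.6.35 */
#define DIO_SYNC_MASK	SKETCH_BIO_RW_SYNCIO
#elif defined(HAVE_BIO_RW_SYNC)		/* 2.6.12 - 2.6.29 */
#define DIO_SYNC_MASK	SKETCH_BIO_RW_SYNC
#else
#error "no known synchronous bio flag"
#endif

/* Return nonzero if the request flags mark the I/O as synchronous. */
int
dio_is_sync_sketch(unsigned long flags)
{
	return ((flags & DIO_SYNC_MASK) != 0);
}
```

Keeping the version-specific choice inside one helper means callers never have to mention any of the three flag names directly.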
* Linux 2.6.36 compat, use REQ_FAILFAST_MASK | Brian Behlendorf | 2010-11-10 | 5 | -13/+329
| As of linux-2.6.36 the BIO_RW_FAILFAST and REQ_FAILFAST flags have been unified under the REQ_* names. These flags always had to be kept in-sync so this is a nice step forward. Unfortunately it means we need to be careful to only use the new unified flags when the BIO_RW_* flags are not defined. Additional autoconf checks were added for this and if it is ever unclear which method to use no flags are set. This is safe but may result in longer delays before a disk is failed.
|
| Preferred interface for setting FAILFAST on a bio:
|
|   2.6.12-2.6.27: BIO_RW_FAILFAST
|   2.6.28-2.6.35: BIO_RW_FAILFAST_{DEV|TRANSPORT|DRIVER}
|   2.6.36-2.6.xx: REQ_FAILFAST_{DEV|TRANSPORT|DRIVER}
* Remove inconsistent use of EOPNOTSUPP | Ned Bass | 2010-11-10 | 1 | -1/+1
| | | | | | | | | | | Commit 3ee56c292bbcd7e6b26e3c2ad8f0e50eee236bcc changed an ENOTSUP return value in one location to ENOTSUPP to fix user programs seeing an invalid ioctl() error code. However, use of ENOTSUP is widespread in the zfs module. Instead of changing all of those uses, we fixed the ENOTSUP definition in the SPL to be consistent with user space. The changed return value in the above commit is therefore no longer needed, so this commit reverses it to maintain consistency. Signed-off-by: Brian Behlendorf <[email protected]>
* Add lustre zpios-test workload | Brian Behlendorf | 2010-11-08 | 3 | -20/+88
| | | | | | | | The lustre zpios-test simulates a reasonable lustre workload. It will create 128 threads, the same as a Lustre OSS, and then 4096 individual objects. Each object is 16MiB in size and will be written/read in 1MiB chunks from a random thread. This is fundamentally how we expect Lustre to behave for large IO intensive workloads.
* Prep for 0.5.2 tag | Brian Behlendorf | 2010-11-08 | 1 | -1/+1
| | | | Update META file to prep for 0.5.2 tag.
* Replace custom zpool configs with generic configs | Brian Behlendorf | 2010-11-08 | 17 | -276/+261
| To streamline testing I have in the past added several custom configs to the zpool-config directory. This change reverts those custom configs and replaces them with three generic configs which can do the same thing. The generic config behavior can be set by setting various environment variables when calling either the zpool-create.sh or zpios.sh scripts.
|
| For example if you wanted to create and test a single 4-disk Raid-Z2 configuration using disks [A-D]1 with dedicated ZIL and L2ARC devices you could run the following.
|
| $ ZIL="log A2" L2ARC="cache B2" RANKS=1 CHANNELS=4 LEVEL=2 \
|   zpool-create.sh -c zpool-raidz
|
| $ zpool status tank
|   pool: tank
|  state: ONLINE
|   scan: none requested
| config:
|
|     NAME        STATE     READ WRITE CKSUM
|     tank        ONLINE       0     0     0
|       raidz2-0  ONLINE       0     0     0
|         A1      ONLINE       0     0     0
|         B1      ONLINE       0     0     0
|         C1      ONLINE       0     0     0
|         D1      ONLINE       0     0     0
|     logs
|       A2        ONLINE       0     0     0
|     cache
|       B2        ONLINE       0     0     0
|
| errors: No known data errors
* Make rollbacks fail gracefully | Ned Bass | 2010-11-08 | 2 | -1/+7
| Support for rolling back datasets requires a functional ZPL, which we currently do not have. The zfs command does not check for ZPL support before attempting a rollback, and in preparation for rolling back a zvol it removes the minor node of the device. To prevent the zvol device node from disappearing after a failed rollback operation, this change wraps the zfs_do_rollback() function in an #ifdef HAVE_ZPL and returns ENOSYS in the absence of a ZPL. This is consistent with the behavior of other ZPL dependent commands such as mount.
|
| The original error message observed with this bug was rather confusing:
|
|   internal error: Unknown error 524
|   Aborted
|
| This was because zfs_ioc_rollback() returns ENOTSUP if we don't HAVE_ZPL, but Linux actually has no such error code. It should instead return EOPNOTSUPP, as that is how ENOTSUP is defined in user space. With that we would have gotten the somewhat more helpful message
|
|   cannot rollback 'tank/fish': unsupported version
|
| This is rather a moot point with the above changes since we will no longer make that ioctl call without a ZPL. But, this change updates the error code just in case.
|
| Signed-off-by: Brian Behlendorf <[email protected]>
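The "Unknown error 524" message comes from an error number that only has meaning inside the kernel: 524 is the kernel-internal ENOTSUPP, and user space has no name for it. The sketch below only illustrates that mismatch with a hypothetical translation helper. It is not how the commit addresses the problem, since the commit changes the code returned at the source instead.

```c
#include <errno.h>

/*
 * 524 is the kernel-internal ENOTSUPP value; it has no user-space
 * errno name, so strerror() cannot describe it.
 */
#define KERNEL_ENOTSUPP	524

/* Translate error codes that user space cannot interpret. */
int
user_visible_errno(int err)
{
	if (err == KERNEL_ENOTSUPP)
		return (EOPNOTSUPP);
	return (err);
}
```

Any other error code passes through untouched, so the helper is safe to apply to every ioctl result.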
* Increase zio write interrupt thread count. | Brian Behlendorf | 2010-11-08 | 1 | -1/+1
| | | | | | | Increasing the default zio_wr_int thread count from 8 to 16 improves write performance by 13% on large systems. More testing needs to be done but I suspect the ideal tuning here is ZTI_BATCH() with a minimum of 8 threads.
* Shorten zio_* thread names | Brian Behlendorf | 2010-11-08 | 2 | -3/+2
| | | | | | Linux kernel thread names are expected to be short. This change shortens the zio thread names to 10 characters leaving a few characters to append the /<cpuid> to which the thread is bound. For example: z_wr_iss/0.
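The length budget behind this change can be checked with a few lines of C. Linux stores a task's name in a 16-byte buffer (TASK_COMM_LEN), including the terminating NUL, so the base name plus "/<cpuid>" must fit in 15 characters. The helper name below is hypothetical.

```c
#include <stdio.h>

/* Linux limits a task's comm name to 16 bytes including the NUL. */
#define COMM_LEN 16

/*
 * Build "<base>/<cpu>" into buf (COMM_LEN bytes).
 * Returns 0 if the full name fit, -1 if it would be truncated.
 */
int
format_thread_name(char buf[COMM_LEN], const char *base, unsigned int cpu)
{
	int n = snprintf(buf, COMM_LEN, "%s/%u", base, cpu);

	return ((n >= 0 && n < COMM_LEN) ? 0 : -1);
}
```

A 10 character base name leaves room for the slash and a CPU id of up to four digits.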
* Fix panic mounting unformatted zvol | Ned Bass | 2010-10-29 | 1 | -0/+4
| | | | | | | | | | On some older kernels, i.e. 2.6.18, zvol_ioctl_by_inode() may get passed a NULL file pointer if the user tries to mount a zvol without a filesystem on it. This change adds checks to prevent a null pointer dereference. Closes #73. Signed-off-by: Brian Behlendorf <[email protected]>
* Call modprobe with absolute path | Ned Bass | 2010-10-22 | 1 | -1/+1
| | | | | | | | | Some sudo configurations may not include /sbin in the PATH. libzfs_load_module() currently does not call modprobe with an absolute path, so it may fail under such configurations if called under sudo. This change adds the absolute path to modprobe so we no longer rely on how PATH is set. Signed-off-by: Brian Behlendorf <[email protected]>
* Fix intermittent 'zpool add' failures | Ned Bass | 2010-10-22 | 1 | -15/+27
| | | | | | | | | | | | | | | Creating whole-disk vdevs can intermittently fail if a udev-managed symlink to the disk partition is already in place. To avoid this, we now remove any such symlink before partitioning the disk. This makes zpool_label_disk_wait() truly wait for the new link to show up instead of returning if it finds an old link still in place. Otherwise there is a window between when udev deletes and recreates the link during which access attempts will fail with ENOENT. Also, clean up a comment about waiting for udev to create symlinks. It no longer needs to describe the special cases for the link names, since that is now handled in a separate helper function. Signed-off-by: Brian Behlendorf <[email protected]>
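The remove-then-wait sequence described above can be sketched as two small helpers. These are hypothetical user-space illustrations, not the actual zpool_label_disk_wait() implementation; the 10 ms polling interval is an arbitrary choice.

```c
#include <errno.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/*
 * Remove a stale link at path, if any. A missing link is not an error.
 * Returns 0 on success, -1 with errno set otherwise.
 */
int
remove_stale_link(const char *path)
{
	if (unlink(path) != 0 && errno != ENOENT)
		return (-1);
	return (0);
}

/*
 * Poll until path exists or timeout_ms elapses.
 * Returns 0 once the path is visible, ENOENT on timeout.
 */
int
wait_for_link(const char *path, int timeout_ms)
{
	const struct timespec delay = { 0, 10 * 1000 * 1000 };	/* 10 ms */
	struct stat st;

	for (int waited = 0; waited <= timeout_ms; waited += 10) {
		if (stat(path, &st) == 0)
			return (0);
		nanosleep(&delay, NULL);
	}
	return (ENOENT);
}
```

Calling `remove_stale_link()` before partitioning is what makes the later `wait_for_link()` meaningful: once the old link is gone, a successful `stat()` can only be seeing the link udev created for the new partition.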
* Add zconfig test for adding and removing vdevs | Ned Bass | 2010-10-22 | 1 | -0/+101
| | | | | | | | | | | | | This test performs a sanity check of the zpool add and remove commands. It tests adding and removing both a cache disk and a log disk to and from a zpool. Usage of both a shorthand device path and a full path is covered. The test uses a scsi_debug device as the disk to be added and removed. This is done so that zpool will see it as a whole disk and partition it, which it does not currently do for loopback devices. We want to verify that the manipulation done to whole disk paths to hide the partition information does not break the add/remove interface. Signed-off-by: Brian Behlendorf <[email protected]>
* Remove solaris-specific code from make_leaf_vdev() | Ned Bass | 2010-10-22 | 1 | -37/+0
| | | | | | | Portability between Solaris and Linux isn't really an issue for us anymore, and removing sections like this one helps simplify the code. Signed-off-by: Brian Behlendorf <[email protected]>
* Support shorthand names with zpool remove | Ned Bass | 2010-10-22 | 1 | -64/+12
| zpool status displays abbreviated vdev names without leading path components and, in the case of whole disks, without partition information. Also, the zpool subcommands 'create' and 'add' support using shorthand device names without qualified paths. Prior to this change, however, removing a device generally required specifying its name as it is stored in the vdev label. So while zpool status might list a cache disk with a name like A16, removing it would require a full path such as /dev/disk/zpool/A16-part1, which is non-intuitive.
|
| This change adds support for shorthand device names with the remove subcommand so one can simply type, for example,
|
|   zpool remove tank A16
|
| A consequence of this change is that including the partition information when removing a whole-disk vdev now results in an error. While this is arguably the correct behavior, it is a departure from how zpool previously worked in this project.
|
| This change removes the only reference to ctd_check_path(), so that function is also removed to avoid compiler warnings.
|
| Signed-off-by: Brian Behlendorf <[email protected]>
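The relationship between a stored vdev path and its shorthand name can be illustrated with a small string helper. This is a hypothetical sketch, not the zpool code: it handles only the udev-style "-part<N>" suffix seen in the example path, and it strips that suffix unconditionally, whereas the commit message says partition information is hidden only for whole disks.

```c
#include <string.h>

/*
 * Reduce a vdev path such as "/dev/disk/zpool/A16-part1" to "A16":
 * drop leading directories, then drop a trailing "-part<N>" suffix.
 * Returns 0 on success, -1 if out is too small.
 */
int
vdev_short_name(const char *path, char *out, size_t outlen)
{
	const char *base = strrchr(path, '/');
	const char *suffix;
	size_t len;

	base = (base != NULL) ? base + 1 : path;
	len = strlen(base);

	/* Only strip "-part" when it is followed by digits to the end. */
	suffix = strstr(base, "-part");
	if (suffix != NULL && suffix[5] != '\0' &&
	    strspn(suffix + 5, "0123456789") == strlen(suffix + 5))
		len = (size_t)(suffix - base);

	if (len + 1 > outlen)
		return (-1);

	memcpy(out, base, len);
	out[len] = '\0';
	return (0);
}
```

Comparing the shortened form of each stored path against the name typed on the command line is one way a lookup such as `zpool remove tank A16` could find the matching vdev.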