aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* Correct man page section numbers for LinuxRichard Laager2011-04-193-22/+22
| | | | Signed-off-by: Brian Behlendorf <[email protected]>
* Set -Wno-unused-but-set-variable globallyBrian Behlendorf2011-04-1920-62/+83
| | | | | | | | | | | | | As of gcc-4.6 the option -Wunused-but-set-variable is enabled by default. While this is a useful warning there are numerous places in the ZFS code when a variable is set and then only checked in an ASSERT(). To avoid having to update every instance of this in the code we now set -Wno-unused-but-set-variable to suppress the warning. Additionally, when building with --enable-debug and -Werror set these warning also become fatal. We can reevaluate the suppression of these error at a later time if it becomes an issue. For now we are basically just reverting to the previous gcc behavior.
* Fix gcc configure warningsBrian Behlendorf2011-04-1910-45/+45
| | | | | | | | | | | | | | | | | | Newer versions of gcc are getting smart enough to detect the sloppy syntax used for the autoconf tests. It is now generating warnings for unused/undeclared variables. Newer version of gcc even have the -Wunused-but-set-variable option set by default. This isn't a problem except when -Werror is set and they get promoted to an error. In this case the autoconf test will return an incorrect result which will result in a build failure latter on. To handle this I'm tightening up many of the autoconf tests to explicitly mark variables as unused to suppress the gcc warning. Remember, all of the autoconf code can never actually be run we just want to get a clean build error to detect which APIs are available. Never using a variable is absolutely fine for this. Closes #176
* Fix gcc compiler warning, parse_option()Brian Behlendorf2011-04-191-0/+1
| | | | | | | | | | | | | | | When compiling ZFS in user space gcc-4.6.0 correctly identifies the variable 'value' as being set but never used. This generates a warning and a build failure when using --enable-debug. Once again this is correct but I'm reluctant to remove 'value' because we are breaking the string in to name/value pairs. While it is not used now there's a good chance it will be soon and I'd rather not have to reinvent this. To suppress the warning with just as a VERIFY(). This was observed under Fedora 15. cmd/mount_zfs/mount_zfs.c: In function ‘parse_option’: cmd/mount_zfs/mount_zfs.c:112:21: error: variable ‘value’ set but not used [-Werror=unused-but-set-variable]
* Fix gcc compiler warning, dsl_pool_create()Brian Behlendorf2011-04-191-2/+2
| | | | | | | | | | | | | | When compiling ZFS in user space gcc-4.6.0 correctly identifies the variable 'os' as being set but never used. This generates a warning and a build failure when using --enable-debug. However, the code is correct we only want to use 'os' for the kernel space builds. To suppress the warning the call was wrapped with a VERIFY() which has the nice side effect of ensuring the 'os' actually never is NULL. This was observed under Fedora 15. module/zfs/dsl_pool.c: In function ‘dsl_pool_create’: module/zfs/dsl_pool.c:229:12: error: variable ‘os’ set but not used [-Werror=unused-but-set-variable]
* Linux 2.6.39 compat, invalidate_inodes()Brian Behlendorf2011-04-191-1/+1
| | | | | | | | | Update code to use the spl_invalidate_inodes() wrapper. This hides some of the complexity of determining if invalidate_inodes() was exported, and if so what is its prototype. The second argument of spl_invalidate_inodes() determined the behavior of how dirty inodes are handled. By passing a zero we are indicated that we want those inodes to be treated as busy and skipped.
* Autogen refresh for kernel-insert-inode-locked.m4Brian Behlendorf2011-04-185-0/+5
| | | | | | Several Makefile.in's were accidentally not updated when the kernel-insert-inode-locked.m4 check was added. This change simply refreshes the missed files.
* Fix rebuildable RPMs for el6/ch5zfs-0.6.0-rc3Brian Behlendorf2011-04-081-2/+6
| | | | | | | When rebuilding the source RPM under el5 you need to append the target_cpu. However, under el6/ch5 things are packaged correctly and the arch is already part of kver. For this reason it also needs to be stripped from kver when setting kverpkg.
* Align closing fi in mount-zfs.shNed Bass2011-04-081-1/+1
|
* Use consistent indentation in mount-zfs.shNed Bass2011-04-081-1/+1
|
* Fix a couple commentsRichard Laager2011-04-072-2/+2
| | | | Signed-off-by: Brian Behlendorf <[email protected]>
* Linux 2.6.29 compat, credentialsBrian Behlendorf2011-04-071-3/+3
| | | | | | | The .sync_fs fix as applied did not use the updated SPL credential API. This broke builds on Debian Lenny, this change applies the needed fix to use the portable API. The original credential changes are part of commit 81e97e21872a9c38ad66c37fafe1436ee25abee3.
* Prep zfs-0.6.0-rc3 tagBrian Behlendorf2011-04-071-1/+1
| | | | Create the third 0.6.0 release candidate tag (rc3).
* Update zfs.fedora init scriptManuel Amador (Rudd-O)2011-04-071-51/+178
| | | | | | | | | | | | | | Apply all of Rudd-O's changes for the Fedora init script. The initial init script was one I threw together based on Rudd-O's original work. It worked for me but it has some flaws. Rudd-O has invested considerable time updating it to be significantly smarter. It now handles using ZFS as your root filesystem plus various other quirks. Since he is familiar with the right way to do things on Fedora and has tested this init script we are integrating all of his changes. Signed-off-by: Brian Behlendorf <[email protected]>
* Permit both mountpoint=legacy and mountpoint=/ in initrdManuel Amador (Rudd-O)2011-04-071-1/+7
| | | | Signed-off-by: Brian Behlendorf <[email protected]>
* Added .gitignore for mount.zfs and zvol_idManuel Amador (Rudd-O)2011-04-072-0/+2
| | | | Signed-off-by: Brian Behlendorf <[email protected]>
* Fix ASSERTION(!dsl_pool_sync_context(tx->tx_pool))Brian Behlendorf2011-04-071-0/+13
| | | | | | | | | | Disable the normal reclaim path for the txg_sync thread. This ensures the thread will never enter dmu_tx_assign() which can otherwise occur due to direct reclaim. If this is allowed to happen the system can deadlock. Direct reclaim call path: ->shrink_icache_memory->prune_icache->dispose_list-> clear_inode->zpl_clear_inode->zfs_inactive->dmu_tx_assign
* Add direct+indirect ARC reclaimBrian Behlendorf2011-04-071-0/+59
| | | | | | | | | | | | | | | | | | | | | Under OpenSolaris all memory reclaim is done asyncronously. Under Linux memory reclaim is done asynchronously _and_ synchronously. When a process allocates memory with GFP_KERNEL it explicitly allows the kernel to do reclaim on its behalf to satify the allocation. If that GFP_KERNEL allocation fails the kernel may take more drastic measures to reclaim the memory such as killing user space processes. This was observed to happen with ZFS because the ARC could consume a large fraction of the system memory but no synchronous reclaim could be performed on it. The result was GFP_KERNEL allocations could fail resulting in OOM events, and only moments latter the arc_reclaim thread would free unused memory from the ARC. This change leaves the arc_thread in place to manage the fundamental ARC behavior. But it adds a synchronous (direct) reclaim path for the ARC which can be called when memory is badly needed. It also adds an asynchronous (indirect) reclaim path which is called much more frequently to prune the ARC slab caches.
* Add missing arcstatsBrian Behlendorf2011-04-071-8/+20
| | | | | | | | | | | | The following useful values were missing the arcstats. This change adds them in to provide greater visibility in to the arcs behavior. arc_no_grow 4 0 arc_tempreserve 4 0 arc_loaned_bytes 4 0 arc_meta_used 4 624774592 arc_meta_limit 4 400785408 arc_meta_max 4 625594176
* Call d_instantiate before unlocking inodeBrian Behlendorf2011-04-073-27/+12
| | | | | | | | | Under Linux a dentry referencing an inode must be instantiated before the inode is unlocked. To accomplish this without overly modifing the core ZFS code the dentry it passed via the vattr_t. There are cases such as replay when a dentry is not available. In which case it is obviously not initialized at inode creation time, if a dentry is needed it will be spliced as when required via d_lookup().
* Fix `make distclean` for `./configure --with-config=userBrian Behlendorf2011-04-051-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Making distclean in module make[1]: Entering directory `/zfs/module' make -C SUBDIRS=`pwd` clean make: Entering an unknown directory make: *** SUBDIRS=/zfs/module: No such file or directory. Stop. When using --with-config=user the 'distclean' target would fail because it assumes the kernel configuration infrastrure is set up. This is not the case, nor does it need to be, because the '--with-config=user' option will prune the entire ./module subtree from SUBDIRS. This prevents most build rules from operating in the ./module directory. However, the 'dist*' rules will still traverse this directory because it is listed in DIST_SUBDIRS. This is correct because we need to ensure the dist rules package the directory contents regardless of the configuration for the 'dist' rule. The correct way to handle this is to only invoke the kernel build system as part of the 'clean' rule when CONFIG_KERNEL_TRUE is set. Initial fix provided by Darik Horn <[email protected]>. This commit is a slightly refined form of the original.
* Call udevadm trigger more safelyNed Bass2011-04-052-2/+2
| | | | | | | | | | | | | | | | | | | | | | | Some udev hooks are not designed to be idempotent, so calling udevadm trigger outside of the distribution's initialization scripts can have unexpected (and potentially dangerous) side effects. For example, the system time may change or devices may appear multiple times. See Ubuntu launchpad bug 320200 and this mailing list post for more details: https://lists.ubuntu.com/archives/ubuntu-devel/2009-January/027260.html To avoid these problems we call udevadm trigger with --action=change --subsystem-match=block. The first argument tells udev just to refresh devices, and make sure everything's as it should be. The second argument limits the scope to block devices, so devices belonging to other subsystems cannot be affected. This doesn't fix the problem on older udev implementations that don't provide udevadm but instead have udevtrigger as a standalone program. In this case the above options aren't available so there's no way to call call udevtrigger safely. But we can live with that since this issue only exists in optional test and helper scripts, and most zfs-on-linux users are running newer systems anyways.
* Update CHAOS 5 PackagingBrian Behlendorf2011-03-311-4/+4
| | | | | | The CHAOS 5 kernels are now packaged identially to the RHEL6 kernels. Therefore we can simply use the RHEL6 rules in the spec file when building packages.
* Fix libzpool cv_* build errorBrian Behlendorf2011-03-311-1/+2
| | | | | | | | | | | | | This build failure was accidentally introduced by previous commit bfd214a which fixed the load average. Unfortunately, the wrapper for cv_wait_interruptible was not available in the zfs_context.h user compatibility code. I failed to notice this because I didn't rebuild everything cleanly before committing. undefined reference to `cv_wait_interruptible' collect2: ld returned 1 exit status Closes #181
* Fix inflated load averageBrian Behlendorf2011-03-311-2/+2
| | | | | | | | | | | | | | | Kernel threads which sleep uninterruptibly on Linux are marked in the (D) state. These threads are usually in the process of performing IO and are thus counted against the load average. The txg_quiesce and txg_sync threads were always sleeping uninterruptibly and thus inflating the load average. This change makes them sleep interruptibly. Some care is required however because these threads may now be woken early by signals. In this case the callers are all careful to check that the required conditions are met after waking up. If we're woken early due to a signal they will simply go back to sleep. In this case these changes are safe. Closes #175
* Spec file compat, %{datadir}Fajar A. Nugraha2011-03-251-1/+1
| | | | | | | | The dracut change caused an error during "make rpm". The cause is simple, RHEL5 does not recognize the %{datarootdir} macro in zfs.spec. It was changed to %{datadir} which fixes the build. Signed-off-by: Brian Behlendorf <[email protected]>
* Set cmd paths in udev rules using --prefixBrian Behlendorf2011-03-257-19/+37
| | | | | | | | | | | | | The udev/rules.d scripts must use absolute paths to their support binaries. However, where those binaries get installed depends on what --prefix was set to when the package was configured. This change makes the udev/rules.d helpers to *.in files which are processed by configure. This allows them to be dynamically updated to include the specified --prefix. Additionally, this change updates 60-zvol.rules to handle both the 'add' and 'change' actions. This ensures that that all valid zvol devices are correctly linked.
* Fixes to enable zvol symlink creationFajar A. Nugraha2011-03-242-3/+6
| | | | | | | | | | | | This commit fixes issue on https://github.com/behlendorf/zfs/issues/#issue/172 Changes: - update BLKZNAME to use _IOR instead of _IO. Kernel 2.6.32 allows read parameters (copy_to_user) with _IO, while newer kernels (tested Archlinux's 2.6.37 kernel) enforces _IOR (which is correct) - fix return code and message on error Signed-off-by: Brian Behlendorf <[email protected]>
* Linux 2.6.29 compat, .freeze_fs/.unfreeze_fsBrian Behlendorf2011-03-221-2/+0
| | | | | | The .freeze_fs/.unfreeze_fs hooks were not added until Linux 2.6.29 Since these hooks are currently unused they are being removed to allow support of older kernels.
* Linux 2.6.29 compat, credentialsBrian Behlendorf2011-03-223-81/+81
| | | | | | | As of Linux 2.6.29 a clean credential API was added to the Linux kernel. Previously the credential was embedded in the task_struct. Because the SPL already has considerable support for handling this API change the ZPL code has been updated to use the Solaris credential API.
* Linux 2.6.28 compat, insert_inode_locked()Brian Behlendorf2011-03-2251-1/+158
| | | | | | | Added insert_inode_locked() helper function, prior to this most callers used insert_inode_hash(). The older method doesn't check for collisions in the inode_hashtable but it still acceptible for use. Fallback to using insert_inode_hash() when insert_inode_locked() is unavailable.
* Linux 2.6.27 compat, blk_queue_stackable()Brian Behlendorf2011-03-221-0/+11
| | | | | | | The blk_queue_stackable() queue flag was added in 2.6.27 to handle dm stacking drivers. Prior to this request stacking drivers were detected by checking (q->request_fn == NULL), for earlier kernels we revert to this legacy behavior.
* Linux compat, umount2(2) flagsBrian Behlendorf2011-03-221-2/+17
| | | | | | | | Older glibc <sys/mount.h> headers did not define all the available umount2(2) flags. Both MNT_FORCE and MNT_DETACH are supported in the kernel back to 2.4.11 so we define them correctly if they are missing. Closes #95
* Fix evict() deadlockBrian Behlendorf2011-03-222-4/+22
| | | | | | | | | | | | | | | | | | | | | Now that KM_SLEEP is not defined as GFP_NOFS there is the possibility of synchronous reclaim deadlocks. These deadlocks never existed in the original OpenSolaris code because all memory reclaim on Solaris is done asyncronously. Linux does both synchronous (direct) and asynchronous (indirect) reclaim. This commit addresses a deadlock caused by inode eviction. A KM_SLEEP allocation may trigger direct memory reclaim and shrink the inode cache. This can occur while a mutex in the array of ZFS_OBJ_HOLD mutexes is held. Through the ->shrink_icache_memory()->evict()->zfs_inactive()-> zfs_zinactive() call path the same mutex may be reacquired resulting in a deadlock. To avoid this deadlock the process must not reacquire the mutex when it is already holding it. This is a reasonable fix for now but longer term the ZFS_OBJ_HOLD mutex locking should be reevaluated. This infrastructure already prevents us from ever using the Linux lock dependency analysis tools, and it may limit scalability.
* Use KM_PUSHPAGE instead of KM_SLEEPBrian Behlendorf2011-03-223-9/+9
| | | | | | | | | | | | | | | | | | | | | | | It used to be the case that all KM_SLEEP allocations were GFS_NOFS. Unfortunately this often resulted in the kernel being unable to reclaim the ARC, inode, and dentry caches in a timely manor. The fix was to make KM_SLEEP a GFP_KERNEL allocation in the SPL. However, this increases the posibility of deadlocking the system on a zfs write thread. If a zfs write thread attempts to perform an allocation it may trigger synchronous reclaim. This reclaim may attempt to flush dirty data/inode to disk to free memory. Unforunately, this write cannot finish because the write thread which would handle it is holding the previous transaction open. Deadlock. To avoid this all allocations in the zfs write thread path must use KM_PUSHPAGE which prohibits synchronous reclaim for that thread. In this way forward progress in ensured. The risk with this change is I missed updating an allocation for the write threads leaving an increased posibility of deadlock. If any deadlocks remain they will be unlikely but we'll have to make sure they all get fixed.
* Merge branch 'dracut'Brian Behlendorf2011-03-2281-148/+4178
|\
| * Add dracut supportManuel Amador (Rudd-O)2011-03-1719-8/+1355
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To simplify the process of using zfs as your root filesystem a zfs-drucat sub-package has been added. This sub-package adds a zfs dracut module which allows your initramfs to be rebuilt with zfs support. The process for doing this is still complicated but there is clearly interest from the community about getting this working well and documented. This should help lay some of the groundwork. Longer term these changes should be pushed in the upstream dracut package. Once that occurs this subpackage will no longer be required for new systems, however we may want to conditionally build this package in the future for systems running older dracut versions. Signed-off-by: Brian Behlendorf <[email protected]>
| * Add init scriptsBrian Behlendorf2011-03-1761-113/+2707
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | To support automatically mounting your zfs on filesystem on boot a basic init script is needed. Unfortunately, every distribution has their own idea of the _right_ way to do things. Rather than write one very complicated portable init script, which would be invariably replaced by the distributions own anyway. I have instead added support to provide multiple distribution specific init scripts. The correct init script for your distribution will be selected by ZFS_AC_DEFAULT_PACKAGE which will set DEFAULT_INIT_SCRIPT. During 'make install' the correct script for your system will be installed from zfs/etc/init.d/zfs.DEFAULT_INIT_SCRIPT to the usual /etc/init.d/zfs location. Currently, there is zfs.fedora and a more generic zfs.lsb init script. Hopefully, the distribution maintainers who know best how they want their init scripts to function will feedback their approved versions to be included in the project. This change does not consider upstart jobs but I'm not at all opposed to add that sort of thing.
| * Register .remount_fs handlerBrian Behlendorf2011-03-153-1/+52
| | | | | | | | | | | | | | | | | | Register the missing .remount_fs handler. This handler isn't strictly required because the VFS does a pretty good job updating most of the MS_* flags. However, there's no harm in using the hook to call the registered zpl callback for various MS_* flags. Additionaly, this allows us to lay the ground work for more complicated argument parsing in the future.
| * Register .sync_fs handlerBrian Behlendorf2011-03-153-9/+27
| | | | | | | | | | | | | | | | | | Register the missing .sync_fs handler. This is a noop in most cases because the usual requirement is that sync just be initiated. As part of the DMU's normal transaction processing txgs will be frequently synced. However, when the 'wait' flag is set the requirement is that .sync_fs must not return until the data is safe on disk. With the addition of the .sync_fs handler this is now properly implemented.
| * Strip 'zfsutil,remount' from /etc/mtabBrian Behlendorf2011-03-151-15/+28
| | | | | | | | | | | | | | | | When updating /etc/mtab we should be careful and strip certain options. In particular, we need to strip 'zfsutil' because if we don't the mount utility will helpfull provide it to the mount helper when we issue mount(8) again. This subverts the check that the caller is zfs(8) and not mount(8).
| * Always allow '-o remount,ro'Brian Behlendorf2011-03-151-3/+10
| | | | | | | | | | | | | | | | | | | | | | Allow the mount(8) utility to always operate on all datasets when remounting them read-only. This critical for rc.sysinit/umountroot which remounts the root filesystem read-only during shutdown to ensure everything is correctly flushed to disk. Fix minor typo, the check to set zfsutil should use the bitwise '&'. I must have accidentally hit the adjacent '*' and obviously neither the compiler or my code review caught this. Fix it now.
* | Fix 'LDFLAGS=-Wl,--as-needed' build errorBrian Behlendorf2011-03-184-3/+3
| | | | | | | | | | | | | | | | | | | | Compiling with 'LDFLAGS=-Wl,--as-needed' exposed the fact that there were some library linking problems introduced by mount_zfs. In particular, the libzfs library does use nvpair symbols, and mount_zfs contains no dependencies on libzpool. Closes #161 Closes #162
* | Fix getcwd() warningBrian Behlendorf2011-03-181-1/+3
|/ | | | | | | | | | | | | New versions glibc declare getcwd() with the warn_unused_result attribute. This results in a warning because the updated mount helper was not checking this return value. This issue was fixed by checking the return type and in the case of an error simply returning the passed dataset. One possible, but unlikely, error would be having your cwd directory unlinked while the mount command was running. cmd/mount_zfs/mount_zfs.c: In function ‘parse_dataset’: cmd/mount_zfs/mount_zfs.c:223:2: error: ignoring return value of ‘getcwd’, declared with attribute warn_unused_result
* Don't set I/O Scheduler for PartitionsBrian Behlendorf2011-03-101-0/+4
| | | | | | | | | ZFS should only change the i/o scheduler for a disk when it has ownership of the whole disk. This is basically the same logic as adjusting the write cache behavior on a disk. This change updates the vdev disk code to skip partitions when setting the i/o scheduler. Closes #152
* Check for trailing '/' in mount.zfsBrian Behlendorf2011-03-101-2/+6
| | | | | | | | | | | | | | | | When run with a root '/' cwd the mount.zfs helper would strip not only the '/' but also the next character from the dataset name. For example, '/tank' was changed to 'ank' instead of just 'tank'. Originally, this was done for the '/tmp' cwd case where we needed to strip the '/' following the cwd. For example '/tmp/tank' needed to remove the '/tmp' cwd plus 1 character for the '/'. This change fixes the problem by checking the cwd and if it ends in a '/' it does not strip and extra character. Otherwise it will strip the next character. I believe this should only ever be true for the root directory. Closes #148
* Prep zfs-0.6.0-rc2 tagzfs-0.6.0-rc2Brian Behlendorf2011-03-091-1/+1
| | | | Create the second 0.6.0 release candidate tag (rc2).
* Print mount/umount errorsBrian Behlendorf2011-03-093-9/+18
| | | | | | | | | | | | | | | | Because we are dependent of the system mount/umount utilities to ensure correct mtab locking, we should not suppress their error output. During a successful mount/umount they will be silent, but during a failure the error message they print is the only sure way to know why a mount failed. This is because the (u)mount(8) return code does not contain the result of the system call issued. The only way to clearly idenify why thing failed is to rely on the error message printed by the tool. Longer term once libmount is available we can issue the mount/umount system calls within the tool and still be ensured correct mtab locking. Closed #107
* Fix mount helperBrian Behlendorf2011-03-0911-448/+1224
| | | | | | | | | | | | | | | | | Several issues related to strange mount/umount behavior were reported and this commit should address most of them. The original idea was to put in place a zfs mount helper (mount.zfs). This helper is used to enforce 'legacy' mount behavior, and perform any extra mount argument processing (selinux, zfsutil, etc). This helper wasn't ready for the 0.6.0-rc1 release but with this change it's functional but needs to extensively tested. This change addresses the following open issues. Closes #101 Closes #107 Closes #113 Closes #115 Closes #119
* Fix O_APPEND CorruptionBrian Behlendorf2011-03-091-0/+1
| | | | | | | | | | | | | | | | | | | Due to an uninitialized variable files opened with O_APPEND may overwrite the start of the file rather than append to it. This was introduced accidentally when I removed the Solaris vnodes. The zfs_range_lock_writer() function used to key off zf->z_vnode to determine if a znode_t was for a zvol of zpl object. With the removal of vnodes this was replaced by the flag zp->z_is_zvol. This flag was used to control the append behavior for range locks. Unfortunately, this value was never properly initialized after the vnode removal. However, because most of memory is usually zeros it happened to be set correctly most of the time making the bug appear racy. Properly initializing zp->z_is_zvol to zero completely resolves the problem with O_APPEND. Closes #126