openzfs/zfs.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	Fix evict() deadlock	Brian Behlendorf	2011-03-22	2	-4/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Now that KM_SLEEP is not defined as GFP_NOFS there is the possibility of synchronous reclaim deadlocks. These deadlocks never existed in the original OpenSolaris code because all memory reclaim on Solaris is done asyncronously. Linux does both synchronous (direct) and asynchronous (indirect) reclaim. This commit addresses a deadlock caused by inode eviction. A KM_SLEEP allocation may trigger direct memory reclaim and shrink the inode cache. This can occur while a mutex in the array of ZFS_OBJ_HOLD mutexes is held. Through the ->shrink_icache_memory()->evict()->zfs_inactive()-> zfs_zinactive() call path the same mutex may be reacquired resulting in a deadlock. To avoid this deadlock the process must not reacquire the mutex when it is already holding it. This is a reasonable fix for now but longer term the ZFS_OBJ_HOLD mutex locking should be reevaluated. This infrastructure already prevents us from ever using the Linux lock dependency analysis tools, and it may limit scalability.
*	Use KM_PUSHPAGE instead of KM_SLEEP	Brian Behlendorf	2011-03-22	3	-9/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It used to be the case that all KM_SLEEP allocations were GFS_NOFS. Unfortunately this often resulted in the kernel being unable to reclaim the ARC, inode, and dentry caches in a timely manor. The fix was to make KM_SLEEP a GFP_KERNEL allocation in the SPL. However, this increases the posibility of deadlocking the system on a zfs write thread. If a zfs write thread attempts to perform an allocation it may trigger synchronous reclaim. This reclaim may attempt to flush dirty data/inode to disk to free memory. Unforunately, this write cannot finish because the write thread which would handle it is holding the previous transaction open. Deadlock. To avoid this all allocations in the zfs write thread path must use KM_PUSHPAGE which prohibits synchronous reclaim for that thread. In this way forward progress in ensured. The risk with this change is I missed updating an allocation for the write threads leaving an increased posibility of deadlock. If any deadlocks remain they will be unlikely but we'll have to make sure they all get fixed.
*	Merge branch 'dracut'	Brian Behlendorf	2011-03-22	81	-148/+4178
\|\
\| *	Add dracut support	Manuel Amador (Rudd-O)	2011-03-17	19	-8/+1355
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To simplify the process of using zfs as your root filesystem a zfs-drucat sub-package has been added. This sub-package adds a zfs dracut module which allows your initramfs to be rebuilt with zfs support. The process for doing this is still complicated but there is clearly interest from the community about getting this working well and documented. This should help lay some of the groundwork. Longer term these changes should be pushed in the upstream dracut package. Once that occurs this subpackage will no longer be required for new systems, however we may want to conditionally build this package in the future for systems running older dracut versions. Signed-off-by: Brian Behlendorf <[email protected]>
\| *	Add init scripts	Brian Behlendorf	2011-03-17	61	-113/+2707
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To support automatically mounting your zfs on filesystem on boot a basic init script is needed. Unfortunately, every distribution has their own idea of the _right_ way to do things. Rather than write one very complicated portable init script, which would be invariably replaced by the distributions own anyway. I have instead added support to provide multiple distribution specific init scripts. The correct init script for your distribution will be selected by ZFS_AC_DEFAULT_PACKAGE which will set DEFAULT_INIT_SCRIPT. During 'make install' the correct script for your system will be installed from zfs/etc/init.d/zfs.DEFAULT_INIT_SCRIPT to the usual /etc/init.d/zfs location. Currently, there is zfs.fedora and a more generic zfs.lsb init script. Hopefully, the distribution maintainers who know best how they want their init scripts to function will feedback their approved versions to be included in the project. This change does not consider upstart jobs but I'm not at all opposed to add that sort of thing.
\| *	Register .remount_fs handler	Brian Behlendorf	2011-03-15	3	-1/+52
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Register the missing .remount_fs handler. This handler isn't strictly required because the VFS does a pretty good job updating most of the MS_* flags. However, there's no harm in using the hook to call the registered zpl callback for various MS_* flags. Additionaly, this allows us to lay the ground work for more complicated argument parsing in the future.
\| *	Register .sync_fs handler	Brian Behlendorf	2011-03-15	3	-9/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Register the missing .sync_fs handler. This is a noop in most cases because the usual requirement is that sync just be initiated. As part of the DMU's normal transaction processing txgs will be frequently synced. However, when the 'wait' flag is set the requirement is that .sync_fs must not return until the data is safe on disk. With the addition of the .sync_fs handler this is now properly implemented.
\| *	Strip 'zfsutil,remount' from /etc/mtab	Brian Behlendorf	2011-03-15	1	-15/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When updating /etc/mtab we should be careful and strip certain options. In particular, we need to strip 'zfsutil' because if we don't the mount utility will helpfull provide it to the mount helper when we issue mount(8) again. This subverts the check that the caller is zfs(8) and not mount(8).
\| *	Always allow '-o remount,ro'	Brian Behlendorf	2011-03-15	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Allow the mount(8) utility to always operate on all datasets when remounting them read-only. This critical for rc.sysinit/umountroot which remounts the root filesystem read-only during shutdown to ensure everything is correctly flushed to disk. Fix minor typo, the check to set zfsutil should use the bitwise '&'. I must have accidentally hit the adjacent '*' and obviously neither the compiler or my code review caught this. Fix it now.
* \|	Fix 'LDFLAGS=-Wl,--as-needed' build error	Brian Behlendorf	2011-03-18	4	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Compiling with 'LDFLAGS=-Wl,--as-needed' exposed the fact that there were some library linking problems introduced by mount_zfs. In particular, the libzfs library does use nvpair symbols, and mount_zfs contains no dependencies on libzpool. Closes #161 Closes #162
* \|	Fix getcwd() warning	Brian Behlendorf	2011-03-18	1	-1/+3
\|/ \| \| \| \| \| \| \| \| \| \| \| \|	New versions glibc declare getcwd() with the warn_unused_result attribute. This results in a warning because the updated mount helper was not checking this return value. This issue was fixed by checking the return type and in the case of an error simply returning the passed dataset. One possible, but unlikely, error would be having your cwd directory unlinked while the mount command was running. cmd/mount_zfs/mount_zfs.c: In function ‘parse_dataset’: cmd/mount_zfs/mount_zfs.c:223:2: error: ignoring return value of ‘getcwd’, declared with attribute warn_unused_result
*	Don't set I/O Scheduler for Partitions	Brian Behlendorf	2011-03-10	1	-0/+4
\| \| \| \| \| \| \| \| \|	ZFS should only change the i/o scheduler for a disk when it has ownership of the whole disk. This is basically the same logic as adjusting the write cache behavior on a disk. This change updates the vdev disk code to skip partitions when setting the i/o scheduler. Closes #152
*	Check for trailing '/' in mount.zfs	Brian Behlendorf	2011-03-10	1	-2/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When run with a root '/' cwd the mount.zfs helper would strip not only the '/' but also the next character from the dataset name. For example, '/tank' was changed to 'ank' instead of just 'tank'. Originally, this was done for the '/tmp' cwd case where we needed to strip the '/' following the cwd. For example '/tmp/tank' needed to remove the '/tmp' cwd plus 1 character for the '/'. This change fixes the problem by checking the cwd and if it ends in a '/' it does not strip and extra character. Otherwise it will strip the next character. I believe this should only ever be true for the root directory. Closes #148
*	Prep zfs-0.6.0-rc2 tagzfs-0.6.0-rc2	Brian Behlendorf	2011-03-09	1	-1/+1
\| \| \| \|	Create the second 0.6.0 release candidate tag (rc2).
*	Print mount/umount errors	Brian Behlendorf	2011-03-09	3	-9/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Because we are dependent of the system mount/umount utilities to ensure correct mtab locking, we should not suppress their error output. During a successful mount/umount they will be silent, but during a failure the error message they print is the only sure way to know why a mount failed. This is because the (u)mount(8) return code does not contain the result of the system call issued. The only way to clearly idenify why thing failed is to rely on the error message printed by the tool. Longer term once libmount is available we can issue the mount/umount system calls within the tool and still be ensured correct mtab locking. Closed #107
*	Fix mount helper	Brian Behlendorf	2011-03-09	11	-448/+1224
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Several issues related to strange mount/umount behavior were reported and this commit should address most of them. The original idea was to put in place a zfs mount helper (mount.zfs). This helper is used to enforce 'legacy' mount behavior, and perform any extra mount argument processing (selinux, zfsutil, etc). This helper wasn't ready for the 0.6.0-rc1 release but with this change it's functional but needs to extensively tested. This change addresses the following open issues. Closes #101 Closes #107 Closes #113 Closes #115 Closes #119
*	Fix O_APPEND Corruption	Brian Behlendorf	2011-03-09	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Due to an uninitialized variable files opened with O_APPEND may overwrite the start of the file rather than append to it. This was introduced accidentally when I removed the Solaris vnodes. The zfs_range_lock_writer() function used to key off zf->z_vnode to determine if a znode_t was for a zvol of zpl object. With the removal of vnodes this was replaced by the flag zp->z_is_zvol. This flag was used to control the append behavior for range locks. Unfortunately, this value was never properly initialized after the vnode removal. However, because most of memory is usually zeros it happened to be set correctly most of the time making the bug appear racy. Properly initializing zp->z_is_zvol to zero completely resolves the problem with O_APPEND. Closes #126
*	Conserve stack in zfs_setattr()	Brian Behlendorf	2011-03-09	1	-1/+6
\| \| \| \| \| \| \| \|	Move 'bulk' and 'xattr_bulk' from the stack to the heap to minimize stack space usage. These two arrays consumed 448 bytes on the stack and have been replaced by two 8 byte points for a total stack space saving of 432 bytes. The zfs_setattr() path had been previously observed to overrun the stack in certain circumstances.
*	Range lock performance improvements	Brian Behlendorf	2011-03-08	2	-6/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The original range lock implementation had to be modified by commit 8926ab7 because it was unsafe on Linux. In particular, calling cv_destroy() immediately after cv_broadcast() is dangerous because the waiters may still be asleep. Thus the following cv_destroy() will free memory which may still be in use. This was fixed by updating cv_destroy() to block on waiters but this in turn introduced a deadlock. The deadlock was resolved with the use of a taskq to move the offending free outside the range lock. This worked well but using the taskq for the free resulted in a serious performace hit. This is somewhat ironic because at the time I felt using the taskq might improve things by making the free asynchronous. This patch refines the original fix and moves the free from the taskq to a private free list. Then items which must be free'd are simply inserted in to the list. When the range lock is dropped it's safe to free the items. The list is walked and all rl_t entries are freed. This change improves small cached read performance by 26x. This was expected because for small reads the number of locking calls goes up significantly. More surprisingly this change significantly improves large cache read performance. This probably attributable to better cpu/memory locality. Very likely the same processor which allocated the memory is now freeing it. bs ext3 zfs zfs+fix faster ---------------------------------------------- 512 435 3 79 26x 1k 820 7 160 22x 2k 1536 14 305 21x 4k 2764 28 572 20x 8k 3788 50 1024 20x 16k 4300 86 1843 21x 32k 4505 138 2560 18x 64k 5324 252 3891 15x 128k 5427 276 4710 17x 256k 5427 413 5017 12x 512k 5427 497 5324 10x 1m 5427 521 5632 10x Closes #142
*	Add zfs_open()/zfs_close()	Brian Behlendorf	2011-03-08	3	-1/+105
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the original implementation the zfs_open()/zfs_close() hooks were dropped for simplicity. This was functional but not 100% correct with the expected ZFS sematics. Updating and re-adding the zfs_open()/zfs_close() hooks resolves the following issues. 1) The ZFS_APPENDONLY file attribute is once again honored. While there are still no Linux tools to set/clear these attributes once there are it should behave correctly. 2) Minimal virus scan file attribute hooks were added. Once again this support in disabled but the infrastructure is back in place. 3) Most importantly correctly handle assigning files which were opened syncronously to the intent log. Without this change O_SYNC modifications could be lost during a system crash even though they were marked synchronous.
*	Set stat->st_dev and statfs->f_fsid	Brian Behlendorf	2011-03-07	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Filesystems like ZFS must use what the kernel calls an anonymous super block. Basically, this is just a filesystem which is not backed by a single block device. Normally this block device's dev_t is stored in the super block. For anonymous super blocks a unique reserved dev_t is assigned as part of get_sb(). This sb->s_dev must then be set in the returned stat structures as stat->st_dev. This allows userspace utilities to easily detect the boundries of a specific filesystem. Tools such as 'du' depend on this for proper accounting. Additionally, under OpenSolaris the statfs->f_fsid is set to the device id. To preserve consistency with OpenSolaris we also set the fsid to the device id. Other Linux filesystem (ext) set the fsid to a unique value determined by the filesystems uuid. This value is unique but maintains no relationship to the device id. This may be desirable when exporting NFS filesystem because it minimizes to chance of a client observing the same fsid from two different servers. Closes #140
*	Make Missing Modules.symvers Fatal	Brian Behlendorf	2011-03-07	2	-0/+36
\| \| \| \| \| \| \| \| \| \| \|	Detect early on in configure if the Modules.symvers file is missing. Without this file there will be build failures later and it's best to catch this early and provide a useful error. In this case the most likely problem is the kernel-devel packages are not installed. It may also be possible that they are using an unbuilt custom kernel in which case they must build the kernel first. Closes #127
*	Make CONFIG_PREEMPT Fatal	Brian Behlendorf	2011-03-07	2	-0/+155
\| \| \| \| \| \|	Until support is added for preemptible kernels detect this at configure time and make it fatal. Otherwise, it is possible to have a successful build and kernel modules with flakey behavior.
*	Add missing libspl+libzpool libs to libzfs	Brian Behlendorf	2011-03-03	4	-6/+12
\| \| \| \| \| \| \| \| \| \|	The libspl and libzpool libraries were missing from the libzfs Makefile.am. They should be explicitly listed to avoid build issues when compiling static libraries and binaries. Additionally, ensure libzpool is built before libzfs because libzfs is dependent on libzpool. This was also exposed as an issue when forcing static linking.
*	Use Linux ATTR_ versions	Brian Behlendorf	2011-03-03	1	-2/+2
\| \| \| \| \| \|	The AT_ versions of these macros are used on Solaris and while they map to their Linux equivilants the code has been updated to use the ATTR_ versions.
*	Conserve stack in zfs_setattr()	Brian Behlendorf	2011-03-02	1	-44/+39
\| \| \| \| \| \| \| \|	Move 'tmpxvattr' from the stack to the heap to minimize stack space usage. This is enough to get us below the 1024 byte stack frame warning. That however is still a large stack frame and it should be further reduced by moving the 'bulk' and 'xattr_bulk' sa_bulk_attr_t variables to the heap in a future patch.
*	Drop HAVE_XVATTR macros	Brian Behlendorf	2011-03-02	9	-188/+536
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When I began work on the Posix layer it immediately became clear to me that to integrate cleanly with the Linux VFS certain Solaris specific things would have to go. One of these things was to elimate as many Solaris specific types from the ZPL layer as possible. They would be replaced with their Linux equivalents. This would not only be good for performance, but for the general readability and health of the code. The Solaris and Linux VFS are different beasts and should be treated as such. Most of the code remains common for constructing transactions and such, but there are subtle and important differenced which need to be repsected. This policy went quite for for certain types such as the vnode_t, and it initially seemed to be working out well for the vattr_t. There was a relatively small amount of related xvattr_t code I was forced to comment out with HAVE_XVATTR. But it didn't look that hard to come back soon and replace it all with a native Linux type. However, after going doing this path with xvattr some distance it clear that this code was woven in the ZPL more deeply than I thought. In particular its hooks went very deep in to the ZPL replay code and replacing it would not be as easy as I originally thought. Rather than continue persuing replacing and removing this code I've taken a step back and reevaluted things. This commit reverts many of my previous commits which removed xvattr related code. It restores much of the code to its original upstream state and now relies on improved xvattr_t support in the zfs package itself. The result of this is that much of the code which I had commented out, which accidentally broke things like replay, is now back in place and working. However, there may be a small performance impact for getattr/setattr operations because they now require a translation from native Linux to Solaris types. For now that's a price I'm willing to pay. Once everything is completely functional we can revisting the issue of removing the vattr_t/xvattr_t types. Closes #111
*	Add xvattr support	Brian Behlendorf	2011-03-02	7	-1/+339
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With the removal of the minimal xvattr support from the spl this support needs to be replaced in the zfs package. This is fairly easily accomplished by directly adding portions of the sys/vnode.h header from OpenSolaris. These xvattr additions have been placed in the sys/xvattr.h header file and included as needed where simply a sys/vnode.h was included before. In additon to the xvattr types and helper macros two functions were also included. The xva_init() and xva_getxoptattr() functions were included as static inline functions in xvattr.h. They are simple enough and it was simpler to place them here rather than in their own .c file.
*	Remove caller_context_t	Brian Behlendorf	2011-03-02	1	-12/+7
\| \| \| \| \|	Remove the remaining callers of caller_context_t. This type has been removed because it is not needed for the Linux port.
*	Add the zpool and filesystem versions	Darik Horn	2011-02-28	1	-3/+5
\| \| \| \| \| \| \| \| \| \| \| \| \|	Print the supported zpool and filesystem versions at module load time. This change removes an ambiguity and adds information that system administrators care about. The phrase "ZFS pool version %s" is the same as zpool upgrade -v so that the operator is familiar with the message. ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5 ZFS: Unloaded module v0.6.0 Signed-off-by: Brian Behlendorf <[email protected]>
*	Fix set block scheduler warnings	Brian Behlendorf	2011-02-25	1	-6/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There were two cases when attempting to set the vdev block device scheduler which would causes console warnings. The first case was when the vdev used a loop, ram, dm, or other such device which doesn't support a configurable scheduler. In these cases attempting to set a scheduler is pointless and can be safely skipped. The secord case is slightly more troubling. We were seeing transient cases where setting the elevator would return -EFAULT. On retry everything is fine so there appears to be a small window where this is possible. To handle that case we silently retry up to three times before reporting the warning. In all of the above cases the warning is harmless and at worse you may see slightly different performance characteristics from one or more of your vdevs.
*	Use udev to create /dev/zvol/[dataset_name] links	Fajar A. Nugraha	2011-02-25	10	-9/+761
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit allows zvols with names longer than 32 characters, which fixes issue on https://github.com/behlendorf/zfs/issues/#issue/102. Changes include: - use /dev/zd* device names for zvol, where * is the device minor (include/sys/fs/zfs.h, module/zfs/zvol.c). - add BLKZNAME ioctl to get dataset name from userland (include/sys/fs/zfs.h, module/zfs/zvol.c, cmd/zvol_id). - add udev rule to create /dev/zvol/[dataset_name] and the legacy /dev/[dataset_name] symlink. For partitions on zvol, it will create /dev/zvol/[dataset_name]-part* (etc/udev/rules.d/60-zvol.rules, cmd/zvol_id). Signed-off-by: Brian Behlendorf <[email protected]>
*	Add the new blkdev_compat.h header to the DIST target.	Darik Horn	2011-02-24	2	-2/+4
\| \| \| \|	Signed-off-by: Brian Behlendorf <[email protected]>
*	Remove rdev packing	Brian Behlendorf	2011-02-23	1	-35/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Remove custom code to pack/unpack dev_t's. Under Linux all dev_t's are an unsigned 32-bit value even on 64-bit platforms. The lower 20 bits are used for the minor number and the upper 12 for the major number. This means if your importing a pool from Solaris you may get strange major/minor numbers. But it doesn't really matter because even if we add compatibility code to translate the encoded Solaris major/minor they won't do you any good under Linux. You will still need to recreate the dev_t with a major/minor which maps to reserved major numbers used under Linux. Dropping this code also resolves 32-bit builds by removing the offending 32-bit compatibility code.
*	Use correct ASSERT3* variant	Brian Behlendorf	2011-02-23	1	-1/+1
\| \| \| \| \| \|	ASSERT3P should be used instead of ASSERT3U when comparing pointers. Using ASSERT3U with the cast causes a compiler warning for 32-bit builds which is fatal with --enable-debug.
*	Increase fragment size to block size	Brian Behlendorf	2011-02-23	1	-4/+8
\| \| \| \| \| \| \| \| \| \| \| \|	The underlying storage pool actually uses multiple block size. Under Solaris frsize (fragment size) is reported as the smallest block size we support, and bsize (block size) as the filesystem's maximum block size. Unfortunately, under Linux the fragment size and block size are often used interchangeably. Thus we are forced to report both of them as the filesystem's maximum block size. Closes #112
*	Fix 'statement with no effect' warning	Brian Behlendorf	2011-02-23	1	-1/+1
\| \| \| \| \| \| \| \|	Because the secpolicy_* macros are all currently defined to (0). And because the caller of this function does not check the return code. The compiler complains that this statement has no effect which is correct and OK. To suppress the warning explictly cast the result to (void).
*	Fix uninitialized variable	Brian Behlendorf	2011-02-23	1	-1/+1
\| \| \| \| \| \|	It was possible for rc to be unitialized in the parse_options() function which triggered a compiler warning. Ensure rc is always initialized.
*	Fix enum compiler warning	Brian Behlendorf	2011-02-23	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Generally it's a good idea to use enums for switch statements, but in this case it causes warning because the enum is really a set of flags. These flags are OR'ed together in some cases resulting in values which are not part of the original enum. This causes compiler warning such as this about invalid cases. error: case value ‘33’ not in enumerated type ‘zprop_source_t’ To handle this we simply case the enum to an int for the switch statement. This leaves all other enum type checking in place and effectively disabled these warnings.
*	Linux 2.6.38 compat, blkdev_get_by_path()	Brian Behlendorf	2011-02-23	49	-3/+159
\| \| \| \| \| \| \| \| \| \| \|	The open_bdev_exclusive() function has been replaced (again) by the more generic blkdev_get_by_path() function. Additionally, the counterpart function close_bdev_exclusive() has been replaced by blkdev_put(). Because these functions are more generic versions of the functions they replaced the compatibility macro must add the FMODE_EXCL mask to ensure they are exclusive. Closes #114
*	Linux 2.6.x compat, blkdev_compat.h	Brian Behlendorf	2011-02-23	7	-64/+69
\| \| \| \| \| \| \| \| \| \|	For legacy reasons the zvol.c and vdev_disk.c Linux compatibility code ended up in sys/blkdev.h and sys/vdev_disk.h headers. While there are worse places for this code to live it should be in a linux/blkdev_compat.h header. This change moves this block device Linux compatibility code in to the linux/blkdev_compat.h header and updates all the correct #include locations. This is not a functional change or bug fix, it is just code cleanup.
*	Prep zfs-0.6.0-rc1 tagzfs-0.6.0-rc1	Brian Behlendorf	2011-02-18	1	-1/+1
\| \| \| \| \| \| \| \|	Create the first 0.6.0 release candidate tag (rc1). The Posix layer is now functional and passes fstest and several other test suites cleanly. We now need this release candidate tag to broaden the test coverage before we can release the official zfs-0.6.0.
*	Merge branch 'zpl'	Brian Behlendorf	2011-02-18	128	-6848/+7785
\|\
\| *	Use provided uid/gid for setattr	Brian Behlendorf	2011-02-17	2	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When changing the uid/gid of a file via zfs_setattr() use the Posix id passed in iattr->ia_uid/gid. While the zfs_fuid_create() code already had the fuid support disabled for Linux it was returning the uid/gid from the credential. With this change the 'chown' command which relies on setxattr is now working properly. Also remove a little stray white space which was in front of zfs_update_inode() call and the end of zfs_setattr().
\| *	Fix symlink(2) inode reference count	Brian Behlendorf	2011-02-17	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Under Linux sys_symlink(2) should result in a inode being created with one reference for the inode itself, and a second reference on the inode which is held by the new dentry. Under Solaris this appears not to be the case. Their zfs_symlink() handler drops the inode reference before returning. The result of this under Linux is that the reference count for symlinks is always one smaller than it should have been. This results in a BUG() when the symlink is unlinked. To handle this the Linux port now keeps the inode reference which differs from the Solaris behavior. This results in correct reference counts. Closes #96
\| *	Use -zfs_readlink() error	Brian Behlendorf	2011-02-17	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The zfs_readlink() function returns a Solaris positive error value and that needs to be converted to a Linux negative error value. While in this case nothing would actually go wrong, it's still incorrect and should be fixed if for no other reason than clarity.
\| *	Improve 'zpool import' safety	Brian Behlendorf	2011-02-17	1	-9/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There are three improvements here to 'zpool import' proposed by Fajar in Github issue #98. They are all good so I'm commiting all three. 1) Add descriptions for "hpet" and "core" blacklist entries. 2) Add "core" to the blacklist, as described in the issue accessing this device will crash Xen dom0. 3) Refine probing behavior to use fstatat64(). This allows us to determine if a device is a block device or a regular file without having to open it. This is the safest appraoch when probing /dev/ because the simple act of opening a device may have unexpected consequences. Closes #98
\| *	Fix readlink(2)	Brian Behlendorf	2011-02-16	2	-28/+36
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch addresses three issues related to symlinks. 1) Revert the zfs_follow_link() function to a modified version of the original zfs_readlink(). The only changes from the original OpenSolaris version relate to using Linux types. For the moment this means no vnode's and no zfsvfs_t. The caller zpl_follow_link() was also updated accordingly. This change was reverted because it was slightly gratuitious. 2) Update zpl_follow_link() to use local variables for the link buffer. I'd forgotten that iov.iov_base is updated by uiomove() so after the call to zfs_readlink() it can not longer be used. We need our own private copy of the link pointer. 3) Allocate MAXPATHLEN instead of MAXPATHLEN+1. By default MAXPATHLEN is 4096 bytes which is a full page, adding one to it pushes it slightly over a page. That means you'll likely end up allocating 2 pages which is wasteful of memory and possibly slightly slower.
\| *	Update 'zfs.sh -u' to umount all zfs filesystems	Brian Behlendorf	2011-02-16	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Before it is safe to unload the zfs module stack all mounted zfs filesystems must be unmounted. If they are not unmounted, there will be references held on the modules and the stack cannot be removed. To handle this have 'zfs.sh -u' which is used by all of the test scripts umount all zfs filesystem before attempting to unload the module stack.
\| *	Suppress share error on mount	Brian Behlendorf	2011-02-16	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Until code is added to support automatically sharing datasets we should return success instead of failure. This prevents the command line tools from returning a non-zero error code. While a user likely won't notice this, test scripts like zconfig.sh do and correctly fail because of it.