openzfs/zfs.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	Linux 3.3 compat, sops->show_options()	Brian Behlendorf	2012-02-03	21	-0/+21
\|	The second argument of sops->show_options() was changed from a 'struct vfsmount *' to a 'struct dentry *'. Add an autoconf check to detect the API change and then conditionally define the expected interface. In either case we are only interested in the zfs_sb_t. Signed-off-by: Brian Behlendorf <[email protected]> Closes #549
*	Cleanup ZFS debug infrastructure	Brian Behlendorf	2012-02-02	1	-4/+0
\|	Historically the internal zfs debug infrastructure has been scattered throughout the code. Since we expect to start making more use of this code this patch performs some cleanup. * Consolidate the zfs debug infrastructure in the zfs_debug.[ch] files. This includes moving the zfs_flags and zfs_recover variables, plus moving the zfs_panic_recover() function. * Remove the existing unused functionality in zfs_debug.c and replace it with code which correctly utilizes the spl logging infrastructure. * Remove the __dprintf() function from zfs_ioctl.c. This is dead code; the dprintf() functionality in the kernel relies on the spl log support. * Remove dprintf() from hdr_recl(). This wasn't particularly useful and was missing the required format specifier anyway. * Subsequent patches should unify the dprintf() and zfs_dbgmsg() functions. Signed-off-by: Brian Behlendorf <[email protected]>
*	Ignore dataset if the dds_type is DMU_OST_OTHER	Prakash Surya	2012-01-19	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \|	Since the zpios and potentially other ZFS tests use the DMU_OST_OTHER type to label their datasets, the zpool and zfs commands should gracefully handle this type when it is encountered. This patch modifies the commands' behavior to ignore any datasets with a dds_type of DMU_OST_OTHER. Signed-off-by: Prakash Surya <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #536
*	Allow GPT+EFI vdev replacement in boot pools.	Darik Horn	2012-01-18	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Commit zfsonlinux/zfs@57a4eddc4d5e1e6c10d8d7dcf87a9fc27398adcd allows the bootfs property to be set on any pool, but does not accommodate subsequent vdev changes. For example: # zpool replace rpool /dev/sda /dev/sdb operation not supported on this type of pool property 'bootfs' is not supported on EFI labeled devices For non-Solaris builds, disable the check that emits this error. Signed-off-by: Brian Behlendorf <[email protected]>
*	Combine libraries: spl, avl, efi, share, unicode.	Darik Horn	2012-01-17	20	-404/+148
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	These libraries, which are an artifact of the ZoL development process, conflict with packages that are already in distribution: * libspl: SPL Programming Language * libavl: AVL for Linux * libefi: GRUB And these libraries are potential conflicts: * libshare: the Linux Mount Manager * libunicode: Perl and Python Recompose these five ZoL components into the four libraries that are conventionally provided by Solaris and FreeBSD systems: + libnvpair + libuutil + libzpool + libzfs This change resolves the name conflict, makes ZoL more compatible with existing software that uses autotools to detect ZFS, and allows pkg-zfs to better reflect the official Debian kFreeBSD packaging. Signed-off-by: Brian Behlendorf <[email protected]> Closes: #430
*	Add overlay(-O) mount option support	Suman Chakravartula	2012-01-12	2	-2/+11
\|	Linux supports mounting over non-empty directories by default. In Solaris this is not the case and the -O option is required for zfs mount to mount a zfs filesystem over a non-empty directory. For compatibility, I've added support for the -O option to mount zfs filesystems over non-empty directories if the user wants to, just like in Solaris. I've defined MS_OVERLAY to record it in the flags variable if the -O option is supplied. The flags variable passes through a few functions and it is checked before performing the empty directory check in the zfs_mount function. If -O is given, the check is not performed. Signed-off-by: Brian Behlendorf <[email protected]> Closes #473
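The flag check described above can be sketched in a few lines. This is an illustrative Python model, not the libzfs code: the numeric value of `MS_OVERLAY` and the helper names `dir_is_empty` and `may_mount` are assumptions made for the example.

```python
import os

# Hypothetical value, for illustration only. The commit defines MS_OVERLAY,
# but this log does not show which bit it uses.
MS_OVERLAY = 0x1


def dir_is_empty(path):
    """Return True when the directory has no entries besides '.' and '..'."""
    with os.scandir(path) as entries:
        return next(entries, None) is None


def may_mount(path, flags=0):
    """Model of the check: skip the empty-directory test when -O was given."""
    if flags & MS_OVERLAY:
        return True
    return dir_is_empty(path)
```

Without the flag, a populated mountpoint is rejected; with it, the emptiness test is never run, which matches the Solaris -O behavior the commit describes.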
*	Treat /dev/vd* as whole disks	Richard Laager	2012-01-11	1	-0/+6
\| \| \| \| \| \| \|	Correctly detect /dev/vd devices as whole disks and attempt to create an EFI partition table. Signed-off-by: Brian Behlendorf <[email protected]>
*	Linux 3.1 compat, super_block->s_shrink	Brian Behlendorf	2012-01-11	21	-0/+21
\|	The Linux 3.1 kernel has introduced the concept of per-filesystem shrinkers which are directly associated with a super block. Prior to this change there was one shared global shrinker. The zfs code relied on being able to call the global shrinker when the arc_meta_limit was exceeded. This would cause the VFS to drop references on a fraction of the dentries in the dcache. The ARC could then safely reclaim the memory used by these entries and honor the arc_meta_limit. Unfortunately, when per-filesystem shrinkers were added the old interfaces were made unavailable. This change adds support to use the new per-filesystem shrinker interface so we can continue to honor the arc_meta_limit. The major benefit of the new interface is that we can now target only the zfs filesystem for dentry and inode pruning. Thus we can minimize any impact on the caching of other filesystems. In the context of making this change several other important issues related to managing the ARC were addressed. They include: * The dnlc_reduce_cache() function which was called by the ARC to drop dentries for the POSIX layer was replaced with a generic zfs_prune_t callback. The ZPL layer now registers a callback to drop these dentries, removing a layering violation which dates back to the Solaris code. This callback can also be used by other ARC consumers such as Lustre. arc_add_prune_callback() arc_remove_prune_callback() * The arc_reduce_dnlc_percent module option has been changed to arc_meta_prune for clarity. The dnlc functions are specific to Solaris's VFS and have already been largely eliminated. The replacement tunable now represents the number of bytes the prune callback will request when invoked.
\|	* Less aggressively invoke the prune callback. We used to call this whenever we exceeded the arc_meta_limit, however that's not strictly correct since it results in overzealous reclaim of dentries and inodes. It is now only called once the arc_meta_limit is exceeded and every effort has been made to evict other data from the ARC cache. * More promptly manage exceeding the arc_meta_limit. When reading meta data in to the cache, if a buffer was unable to be recycled, notify the arc_reclaim thread to invoke the required prune. * Added arcstat_prune kstat which is incremented when the ARC is forced to request that a consumer prune its cache. Remember this will only occur when the ARC has no other choice. If it can evict buffers safely without invoking the prune callback it will. * This change is also expected to resolve the unexpected collapses of the ARC cache. This would occur because when just the arc_meta_limit was exceeded, reclaim pressure would be exerted on the arc_c value via arc_shrink(). This effectively shrunk the entire cache when really we just needed to reclaim meta data. Signed-off-by: Brian Behlendorf <[email protected]> Closes #466 Closes #292
*	Linux 3.2 compat: set_nlink()	Darik Horn	2011-12-16	21	-0/+21
\| \| \| \| \| \| \| \| \| \| \|	Directly changing inode->i_nlink is deprecated in Linux 3.2 by commit SHA: bfe8684869601dacfcb2cd69ef8cfd9045f62170 Use the new set_nlink() kernel function instead. Signed-off-by: Brian Behlendorf <[email protected]> Closes: #462
*	Add make rule for building Arch Linux packages	Prakash Surya	2011-12-14	21	-0/+126
\|	Added the necessary build infrastructure for building packages compatible with the Arch Linux distribution. As such, one can now run: $ ./configure $ make pkg # Alternatively, one can run 'make arch' as well on the Arch Linux machine to create two binary packages compatible with the pacman package manager, one for the zfs userland utilities and another for the zfs kernel modules. The new packages can then be installed by running: # pacman -U $package.pkg.tar.xz In addition, source-only packages suitable for an Arch Linux chroot environment or remote builder can also be built using the 'sarch' make rule. NOTE: Since the source dist tarball is created on the fly from the head of the build tree, its MD5 hash signature will be continually in flux. As a result, the md5sum variable was intentionally omitted from the PKGBUILD files, and the '--skipinteg' makepkg option is used. This may or may not have any serious security implications, as the source tarball is not being downloaded from an outside source. Signed-off-by: Prakash Surya <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #491
*	Illumos #734: Use taskq_dispatch_ent() interface	Garrett D'Amore	2011-12-14	1	-35/+91
\|	It has been observed that some of the hottest locks are those of the zio taskqs. Contention on these locks can limit the rate at which zios are dispatched which limits performance. This upstream change from Illumos uses a new interface to the taskqs which allows them to utilize a prealloc'ed taskq_ent_t. This removes the need to perform an allocation at dispatch time while holding the contended lock. This has the effect of improving system performance. Reviewed by: Albert Lee <[email protected]> Reviewed by: Richard Lowe <[email protected]> Reviewed by: Alexey Zaytsev <[email protected]> Reviewed by: Jason Brian King <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Adam Leventhal <[email protected]> Approved by: Gordon Ross <[email protected]> References to Illumos issue: https://www.illumos.org/issues/734 Ported-by: Prakash Surya <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #482
*	Added comments for libshare's NFS functions.	Gunnar Beutner	2011-12-05	1	-1/+57
\| \| \| \| \| \| \|	Some of the functions' purpose wasn't immediately obvious without additional explanations. This commit adds these missing comments. Signed-off-by: Brian Behlendorf <[email protected]>
*	Allow leading digits in userquota/groupquota names	Suman Chakravartula	2011-11-21	1	-39/+28
\|	While setting/getting userquota and groupquota properties, the input was not treated as a possible username or groupname if it had a leading digit. While useradd in Linux recommends the regexp [a-z_][a-z0-9_-]*[$]? , it is not enforced. This causes problems for usernames with leading digits in them. We need to be able to support getting and setting properties for this unconventional but possible input category. I've updated the code to validate the username or groupname directly via the API. Also, note that I moved this validation to the beginning before the check for SID names with @. This also supports usernames with the @ character in them, which are valid. Only when input with @ is not a valid username is it interpreted as a potential SID name. Signed-off-by: Brian Behlendorf <[email protected]> Closes #428
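The ordering described above (ask the system whether the name exists first, and only then consider the SID interpretation) can be sketched as follows. This is a Python illustration, not the libzfs implementation: `classify_quota_name` and its return labels are hypothetical, and `pwd.getpwnam` stands in for "the API" mentioned in the commit.

```python
import pwd


def system_user_exists(name):
    """Ask the password database directly instead of pattern-matching the name."""
    try:
        pwd.getpwnam(name)
        return True
    except KeyError:
        return False


def classify_quota_name(name, user_exists=system_user_exists):
    """Hypothetical classifier mirroring the order of checks in the commit text."""
    # 1. A name the system knows always wins, even with leading digits or '@'.
    if user_exists(name):
        return "name"
    # 2. Only an unknown name containing '@' is treated as a potential SID name.
    if "@" in name:
        return "sid"
    return "other"
```

Passing the lookup in as a parameter keeps the sketch testable without depending on which accounts exist on a given machine.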
*	Limit maximum ashift value to 12	Brian Behlendorf	2011-11-11	1	-1/+1
\|	While we initially allowed you to set your ashift as large as 17 (SPA_MAXBLOCKSIZE) that is actually unsafe. What wasn't considered at the time is that each uberblock written to the vdev label ring buffer will be of this size. Now the buffer is statically sized to 128k and we need to be able to fit several uberblocks in it. With a large ashift that becomes a problem. Therefore I'm reducing the maximum configurable ashift value to 12. This is large enough for the 4k sector drives and small enough that we can still keep the most recent 32 uberblocks in the vdev label ring buffer. Signed-off-by: Brian Behlendorf <[email protected]> Closes #425
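The arithmetic behind the limit is easy to check. The sketch below assumes, as the commit text states, a 128k ring buffer and one uberblock slot of 1 << ashift bytes. The function name is made up for the example, and ashift values below 12 are deliberately left out because this log does not say how smaller slots are sized.

```python
RING_BYTES = 128 * 1024  # size of the uberblock ring buffer given in the commit text


def uberblock_slots(ashift, ring_bytes=RING_BYTES):
    """Number of uberblocks that fit when each one occupies 1 << ashift bytes."""
    return ring_bytes // (1 << ashift)


if __name__ == "__main__":
    for ashift in range(12, 18):
        print(f"ashift={ashift:2d}  slot={1 << ashift:6d} B  slots={uberblock_slots(ashift)}")
```

At ashift=12 the ring holds 32 uberblocks, the figure quoted in the commit; at ashift=17 a single uberblock fills the whole ring, which is why the larger values were judged unsafe.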
*	Simplify BDI integration	Brian Behlendorf	2011-11-08	21	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Update the code to use the bdi_setup_and_register() helper to simplify the bdi integration code. The updated code now just registers the bdi during mount and destroys it during unmount. The only complication is that for 2.6.32 - 2.6.33 kernels the helper wasn't available so in these cases the zfs code must provide it. Luckily the bdi_setup_and_register() function is trivial. Signed-off-by: Brian Behlendorf <[email protected]> Closes #367
*	Make libefi-created GPT compatible with gptfdisk	Zachary Bedell	2011-09-26	1	-9/+16
\|	GPTs created by libefi set the HeaderSize attribute in the GPT header to 512 -- the size of the GPT header INCLUDING the 420 padding bytes at the end. Most other tools set the size to 92 -- the size of the actual header itself excluding the padding. Most tools check the recorded HeaderSize when verifying the CRC, but gptfdisk hardcodes 92 and thus reports CRC verification problems for full-disk vdevs created, i.e., with `zpool create pool sdc`. This commit changes libefi's behavior for GPT creation and also fixes several edge cases where libefi's behavior was similar (though in an incompatible manner) to gptfdisk. Libefi assumed HeaderSize was always 512 even if the GPT recorded a different value. Sanity checks of the GPT HeaderSize read from disk were added before applying the checksum calculation -- this will prevent a segfault in cases of bogus on-disk values. Zpools created with the resulting libefi are verified as correct by both parted and gptfdisk. Pools have also been tested to import correctly on ZFS on Linux as well as the Solaris Express 11 livecd. Signed-off-by: Zachary Bedell <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #344
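To make the mismatch concrete, here is a small Python model of the header check. It is a sketch, not libefi code. The field offsets (HeaderSize at byte 12, header CRC32 at byte 16) follow the UEFI GPT header layout, and every other header field is left zeroed for brevity. `build_header` and `header_crc_ok` are names invented for the example.

```python
import struct
import zlib

GPT_SIGNATURE = b"EFI PART"
MIN_HEADER_SIZE = 92  # size of the header proper, without padding


def build_header(header_size, sector_size=512):
    """Build a toy GPT header sector whose CRC covers header_size bytes."""
    sector = bytearray(sector_size)
    sector[0:8] = GPT_SIGNATURE
    struct.pack_into("<I", sector, 12, header_size)
    # The CRC field (offset 16) is still zero here, as required while computing.
    crc = zlib.crc32(bytes(sector[:header_size])) & 0xFFFFFFFF
    struct.pack_into("<I", sector, 16, crc)
    return bytes(sector)


def header_crc_ok(sector, assumed_size=None):
    """Verify the CRC using the recorded HeaderSize, or a hardcoded size."""
    if sector[0:8] != GPT_SIGNATURE:
        return False
    (recorded,) = struct.unpack_from("<I", sector, 12)
    size = recorded if assumed_size is None else assumed_size
    # Sanity check before touching the buffer: bogus sizes must not be trusted.
    if size < MIN_HEADER_SIZE or size > len(sector):
        return False
    (stored,) = struct.unpack_from("<I", sector, 16)
    scratch = bytearray(sector[:size])
    struct.pack_into("<I", scratch, 16, 0)  # zero the CRC field itself
    return (zlib.crc32(bytes(scratch)) & 0xFFFFFFFF) == stored
```

A header written with HeaderSize 512 verifies when the recorded size is honored but fails when the verifier assumes 92, which is the symptom reported for gptfdisk. A header written with HeaderSize 92 passes both ways.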
*	Autogen refresh for udev changes	Brian Behlendorf	2011-08-08	21	-0/+63
\| \| \| \| \| \| \| \|	Run autogen.sh using the same autotools versions as upstream: * autoconf-2.63 * automake-1.11.1 * libtool-2.2.6b
*	Add backing_device_info per-filesystem	Brian Behlendorf	2011-08-04	21	-0/+21
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	For a long time now the kernel has been moving away from using the pdflush daemon to write 'old' dirty pages to disk. The primary reason for this is because the pdflush daemon is single threaded and can be a limiting factor for performance. Since pdflush sequentially walks the dirty inode list for each super block any delay in processing can slow down dirty page writeback for all filesystems. The replacement for pdflush is called bdi (backing device info). The bdi system involves creating a per-filesystem control structure each with its own private sets of queues to manage writeback. The advantage is greater parallelism which improves performance and prevents a single filesystem from slowing writeback to the others. For a long time both systems co-existed in the kernel so it wasn't strictly required to implement the bdi scheme. However, as of Linux 2.6.36 kernels the pdflush functionality has been retired. Since ZFS already bypasses the page cache for most I/O this is only an issue for mmap(2) writes which must go through the page cache. Even then adding this missing support for newer kernels was overlooked because there are other mechanisms which can trigger writeback. However, there is one critical case where not implementing the bdi functionality can cause problems. If an application handles a page fault it can enter the balance_dirty_pages() callpath. This will result in the application hanging until the number of dirty pages in the system drops below the dirty ratio. Without a registered backing_device_info for the filesystem the dirty pages will not get written out. Thus the application will hang. As mentioned above this was less of an issue with older kernels because pdflush would eventually write out the dirty pages. This change adds a backing_device_info structure to the zfs_sb_t which is already allocated per-super block. 
\|	It is then registered when the filesystem is mounted and unregistered on unmount. It will not be registered for mounted snapshots which are read-only. This change will result in a flush-<pool> thread being dynamically created and destroyed per-mounted filesystem for writeback. Signed-off-by: Brian Behlendorf <[email protected]> Closes #174
*	Use libzfs_run_process() in libshare.	Gunnar Beutner	2011-08-01	1	-59/+29
\| \| \| \| \| \| \| \|	This should simplify the code a bit by re-using existing code to fork and exec a process. Signed-off-by: Brian Behlendorf <[email protected]> Issue #190
*	Use /dev/null for stdout/stderr in libzfs_run_process().	Gunnar Beutner	2011-08-01	1	-3/+10
\| \| \| \| \| \| \| \| \| \| \| \|	Simply closing the stdout and/or stderr file descriptors for the child process can have bad side effects if for example the child writes to stdout/stderr after open()ing a file. The open() call might have returned the same file descriptor one would usually expect for stdout/stderr (1 and 2), thereby causing mis-directed writes. Signed-off-by: Brian Behlendorf <[email protected]> Issue #190
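The hazard is that POSIX hands out the lowest unused descriptor number, so a closed descriptor 1 or 2 is the first candidate for the child's next open(). The Python sketch below demonstrates the reuse on an ordinary pipe descriptor rather than on the real stdout, and shows the safer pattern of pointing the descriptor at /dev/null. Both function names are invented for the example, and this is not the libzfs_run_process() code.

```python
import os


def closed_fd_is_reused():
    """Show that a just-closed descriptor number is handed out again by open()."""
    r, w = os.pipe()
    os.close(r)  # stands in for close(STDOUT_FILENO) in the child
    new_fd = os.open(os.devnull, os.O_WRONLY)  # stands in for the child's open()
    reused = new_fd == r
    os.close(new_fd)
    os.close(w)
    return reused


def redirect_to_devnull(fd):
    """Keep fd valid but make writes to it harmless."""
    devnull = os.open(os.devnull, os.O_WRONLY)
    if devnull != fd:
        os.dup2(devnull, fd)
        os.close(devnull)
```

After `redirect_to_devnull(1)` a later open() in the child can no longer land on descriptor 1, so stray writes to stdout are discarded instead of ending up in an unrelated file.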
*	Call exportfs -v once for NFS shares.	James H	2011-08-01	1	-90/+74
\|	At the moment we call exportfs -v every time we check whether an NFS share is active. This happens every time you run a zfs or zpool command, making them extremely slow when you have a lot of exports. The time taken is approximately O(n^2) in the number of shares. This commit stores the output from exportfs -v in a temporary file and uses this to speed up subsequent accesses. This mechanism is still too slow - if you have tens of thousands of NFS shares it will still be painful running ANY zfs/zpool command. Signed-off-by: Gunnar Beutner <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #341
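The idea of paying for the listing once and answering later checks from the stored result can be sketched as follows. Note the substitution: the commit keeps the exportfs -v output in a temporary file, whereas this illustration keeps a set in memory. `make_share_checker` and the `list_exports` callable are hypothetical, and no attempt is made here to parse real exportfs output.

```python
def make_share_checker(list_exports):
    """Return is_shared(path), which calls list_exports() at most once.

    list_exports is any callable returning an iterable of exported paths;
    in the real code that information comes from a single `exportfs -v` run.
    """
    cache = None

    def is_shared(path):
        nonlocal cache
        if cache is None:
            cache = frozenset(list_exports())  # the one expensive call
        return path in cache

    return is_shared
```

With n shares checked against n exports, running the listing once turns roughly n full listings into a single listing plus n cheap lookups.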
*	Illumos #278: get rid zfs of python and pyzfs dependencies	Alexander Stetsenko	2011-08-01	1	-0/+189
\|	Remove all python and pyzfs dependencies for consistency and to ensure full functionality even in a minimalist environment. Reviewed by: [email protected] Reviewed by: [email protected] Reviewed by: [email protected] Reviewed by: [email protected] Approved by: [email protected] References to Illumos issue and patch: - https://www.illumos.org/issues/278 - https://github.com/illumos/illumos-gate/commit/1af68beac3 Signed-off-by: Brian Behlendorf <[email protected]> Issue #340 Issue #160 Signed-off-by: Brian Behlendorf <[email protected]>
*	Illumos #1092: zfs refratio property	Matt Ahrens	2011-08-01	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add a "REFRATIO" property, which is the compression ratio based on data referenced. For snapshots, this is the same as COMPRESSRATIO, but for filesystems/volumes, the COMPRESSRATIO is based on the data "USED" (ie, includes blocks in children, but not blocks shared with the origin). This is needed to figure out how much space a filesystem would use if it were not compressed (ignoring snapshots). Reviewed by: George Wilson <[email protected]> Reviewed by: Adam Leventhal <[email protected]> Reviewed by: Dan McDonald <[email protected]> Reviewed by: Richard Elling <[email protected]> Reviewed by: Mark Musante <[email protected]> Reviewed by: Garrett D'Amore <[email protected]> Approved by: Garrett D'Amore <[email protected]> References to Illumos issue and patch: - https://www.illumos.org/issues/1092 - https://github.com/illumos/illumos-gate/commit/187d6ac08a Signed-off-by: Brian Behlendorf <[email protected]> Issue #340
*	Provide a rc.d script for archlinux [zfs-0.6.0-rc5]	Kyle Fuller	2011-07-11	21	-0/+21
\|	Unlike most other Linux distributions archlinux installs its init scripts in /etc/rc.d instead of /etc/init.d. This commit provides an archlinux rc.d script for zfs and extends the build infrastructure to ensure it gets installed in the correct place. Signed-off-by: Brian Behlendorf <[email protected]> Closes #322
*	Add proper library versioning	Brian Behlendorf	2011-07-06	16	-13/+43
\|	The zfs libraries were never properly versioned. Since the API has remained static for quite some time this was never an issue. However, going forward they should be versioned. This commit versions all of the libraries to 1.0.0. From here on out this version must be updated to reflect changes to the library.
*	Link libshare directly to libzfs	Gunnar Beutner	2011-07-06	3	-157/+18
\| \| \| \| \| \| \| \| \| \| \| \| \|	Drop usage of dlopen/dlsym for libshare. There is no need to do this because the zfs packages provide libshare. Unlike on Solaris we are guaranteed it will be available. This avoids possible problems with hardcoding the libshare path in the code (e.g. when users specify a different install path via configure options). It additionally simplifies the code which is good for maintainability. Signed-off-by: Brian Behlendorf <[email protected]>
*	Implemented sharing datasets via NFS using libshare.	Gunnar Beutner	2011-07-06	10	-18/+2375
\| \| \| \| \| \| \| \|	The sharenfs and sharesmb properties depend on the libshare library to export datasets via NFS and SMB. This commit implements the base libshare functionality as well as support for managing NFS shares. Signed-off-by: Brian Behlendorf <[email protected]>
*	Fix implicit declaration of 'mkdirp'	Brian Behlendorf	2011-07-01	2	-0/+2
\|	The lib/libspl/include/libgen.h header file was being mistakenly left out of the 'make dist' tarball. It just happens this doesn't cause a build failure when creating packages because the system libgen.h is included instead. This simply results in the following warning due to the missing forward declaration of mkdirp(). ../../lib/libzfs/libzfs_mount.c:417:3: warning: implicit declaration of function 'mkdirp' [-Wimplicit-function-declaration]
*	Linux compat 2.6.39: mount_nodev()	Brian Behlendorf	2011-07-01	23	-13/+131
\|	The .get_sb callback has been replaced by a .mount callback in the file_system_type structure. When using the new interface the caller must now use the mount_nodev() helper. Unfortunately, the new interface no longer passes the vfsmount down to the zfs layers. This poses a problem for the existing implementation because we currently save this pointer in the super block for later use. It provides our only entry point in to the namespace layer for manipulating certain mount options. This needed to be done originally to allow commands like 'zfs set atime=off tank' to work properly. It also allowed me to keep more of the original Solaris code unmodified. Under Solaris there is a 1-to-1 mapping between a mount point and a file system so this is a fairly natural thing to do. However, under Linux there may be multiple entries in the namespace which reference the same filesystem. Thus keeping a back reference from the filesystem to the namespace is complicated. Rather than introduce some ugly hack to get the vfsmount and continue as before, I'm leveraging this API change to update the ZFS code to do things in a more natural way for Linux. This has the upside that it resolves the compatibility issue for the long term and fixes several other minor bugs which have been reported. This commit updates the code to remove this vfsmount back reference entirely. All modifications to filesystem mount options are now passed in to the kernel via a '-o remount'. This is the expected Linux mechanism and allows the namespace to properly handle any options which apply to it before passing them on to the file system itself. Aside from fixing the compatibility issue, removing the vfsmount has had the benefit of simplifying the code. This change, while fairly involved, has turned out nicely. Closes #246 Closes #217 Closes #187 Closes #248 Closes #231
*	Linux compat 2.6.39: security_inode_init_security()	Brian Behlendorf	2011-07-01	20	-0/+20
\|	The security_inode_init_security() function now takes an additional qstr argument which must be passed in from the dentry if available. Passing a NULL is safe when no qstr is available; the relevant security checks will just be skipped. Closes #246 Closes #217 Closes #187
*	Tear down and flush the mmap region	Prasad Joshi	2011-06-27	20	-0/+20
\|	The inode eviction should unmap the pages associated with the inode. These pages should also be flushed to disk to avoid data loss. Therefore, use truncate_setsize() in evict_inode() to release the pagecache. The truncate_setsize() API was added in the 2.6.35 kernel. To ensure compatibility with older kernels, the patch defines its own truncate_setsize function. Signed-off-by: Prasad Joshi <[email protected]> Closes #255
*	Add "ashift" property to zpool create	Christian Kohlschütter	2011-06-17	1	-0/+19
\|	Some disks with internal sectors larger than 512 bytes (e.g., 4k) can suffer from bad write performance when ashift is not configured correctly. This is caused by the disk not reporting its actual sector size, but a sector size of 512 bytes. The drive may behave this way for compatibility reasons. For example, the WDC WD20EARS disks are known to exhibit this behavior. When creating a zpool, ZFS takes that wrong sector size and sets the "ashift" property accordingly (to 9: 1<<9=512), whereas it should be set to 12 for 4k sectors (1<<12=4096). This patch allows an administrator to manually specify the known correct ashift size at 'zpool create' time. This can significantly improve performance in certain cases. However, it will have an impact on your total pool capacity. See the updated ashift property description in the zpool.8 man page for additional details. Valid values for the ashift property range from 9 to 17 (512B-128KB). Additionally, you may set the ashift to 0 if you wish to auto-detect the sector size based on what the disk reports; this is the default behavior. The most common ashift values are 9 and 12. Example: zpool create -o ashift=12 tank raidz2 sda sdb sdc sdd Closes #280 Original-patch-by: Richard Laager <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]>
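Since ashift is just the base-2 logarithm of the sector size, the value to pass to 'zpool create -o ashift=...' can be derived from the true physical sector size. A minimal Python sketch, with a helper name chosen for this example:

```python
def ashift_for_sector_size(sector_bytes):
    """Return n such that 1 << n == sector_bytes (sector sizes are powers of two)."""
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


if __name__ == "__main__":
    for size in (512, 4096, 128 * 1024):
        print(f"{size:6d} bytes -> ashift={ashift_for_sector_size(size)}")
```

A drive that reports 512-byte sectors but uses 4096-byte sectors internally would be auto-detected as ashift=9, while the value computed from its real sector size is 12.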
*	Always check -Wno-unused-but-set-variable gcc support	Brian Behlendorf	2011-06-14	20	-20/+20
\|	The previous commit 8a7e1ceefa430988c8f888ca708ab307333b4464 wasn't quite right. This check applies to both the user and kernel space build and as such we must make sure it runs regardless of what the --with-config option is set to. For example, if --with-config=kernel then the autoconf test does not run and we generate build warnings when compiling the kernel packages.
*	Check for -Wno-unused-but-set-variable gcc support	Brian Behlendorf	2011-06-14	20	-9/+49
\|	Gcc versions 4.3.2 and earlier do not support the compiler flag -Wno-unused-but-set-variable. This can lead to build failures on older Linux platforms such as Debian Lenny. Since this is an optional build argument this change adds a new autoconf check for the option. If it is supported by the installed version of gcc then it is used, otherwise it is omitted. See commits 12c1acde76683108441827ae9affba1872f3afe5 and 79713039a2b6e0ed223d141b4a8a8455f282d2f2 for the reason the -Wno-unused-but-set-variable option was originally added.
*	Fix 'zfs send -D' segfault	Brian Behlendorf	2011-06-09	1	-1/+2
\|	Sending pools with dedup results in a segfault due to a Solaris portability issue. Under Solaris the pipe(2) library call creates a bidirectional data channel. Unfortunately, on Linux the pipe(2) call creates a unidirectional data channel. The fix is to use the socketpair(2) function to create the expected bidirectional channel. Seth Heeren did the original leg work on this issue for zfs-fuse. We finally just rediscovered the same portability issue and dfurphy was able to point me at the original issue for the fix. Closes #268
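The difference is easy to see from user space. The Python sketch below uses `socket.socketpair()`, which wraps socketpair(2), to pass data in both directions over a single pair of endpoints. It is only an illustration of the primitive, not the 'zfs send -D' code path, and the function name is invented.

```python
import socket


def bidirectional_roundtrip():
    """Send one message in each direction over a single socketpair."""
    a, b = socket.socketpair()
    try:
        a.sendall(b"ping")   # a -> b
        first = b.recv(4)
        b.sendall(b"pong")   # b -> a, over the same pair of endpoints
        second = a.recv(4)
        return first, second
    finally:
        a.close()
        b.close()
```

With a Linux pipe(2) only the first direction would be available: one descriptor is read-only and the other write-only, so code written for Solaris's bidirectional pipes breaks.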
*	Fix 'zfs set volsize=N pool/dataset'	Brian Behlendorf	2011-05-02	20	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This change fixes a kernel panic which would occur when resizing a dataset which was not open. The objset_t stored in the zvol_state_t will be set to NULL when the block device is closed. To avoid this issue we pass the correct objset_t as the third arg. The code has also been updated to correctly notify the kernel when the block device capacity changes. For 2.6.28 and newer kernels the capacity change will be immediately detected. For earlier kernels the capacity change will be detected when the device is next opened. This is a known limitation of older kernels. Online ext3 resize test case passes on 2.6.28+ kernels: $ dd if=/dev/zero of=/tmp/zvol bs=1M count=1 seek=1023 $ zpool create tank /tmp/zvol $ zfs create -V 500M tank/zd0 $ mkfs.ext3 /dev/zd0 $ mkdir /mnt/zd0 $ mount /dev/zd0 /mnt/zd0 $ df -h /mnt/zd0 $ zfs set volsize=800M tank/zd0 $ resize2fs /dev/zd0 $ df -h /mnt/zd0 Original-patch-by: Fajar A. Nugraha <[email protected]> Closes #68 Closes #84
*	Add zpl_export.c to the list of targets.	Alejandro R. Sedeño	2011-04-29	2	-0/+2
\|
*	Correct MAXUID	Brian Behlendorf	2011-04-29	1	-1/+1
\|	The uid_t on most systems is in fact an unsigned 32-bit value. This is almost always correct, however you could compile your kernel to use an unsigned 16-bit value for uid_t. In practice I've never encountered a distribution which does this so I'm willing to overlook this corner case for now. Closes #165
*	Implemented NFS export_operations.	Gunnar Beutner	2011-04-29	20	-0/+20
\| \| \| \| \|	Implemented the required NFS operations for exporting ZFS datasets using the in-kernel NFS daemon.
*	Use gethostid in the Linux convention.	Darik Horn	2011-04-25	2	-11/+2
\| \| \| \| \| \| \| \| \| \|	Disable the gethostid() override for Solaris behavior because Linux systems implement the POSIX standard in a way that allows a negative result. Mask the gethostid() result to the lower four bytes, like coreutils does in /usr/bin/hostid, to prevent junk bits or sign-extension on systems that have an eight byte long type. This can cause a spurious hostid mismatch that prevents zpool import on 64-bit systems.
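The masking step amounts to keeping the low 32 bits of whatever gethostid() returned. A small Python sketch of that arithmetic follows. The function name and the sample hostid values are made up for the example.

```python
def mask_hostid(raw):
    """Keep only the lower four bytes of a hostid value.

    raw models the C `long` returned by gethostid(): on a system with an
    eight-byte long it may be negative or carry sign-extension bits.
    """
    return raw & 0xFFFFFFFF


if __name__ == "__main__":
    sign_extended = 0xFFFFFFFFA7090101  # hypothetical hostid with junk upper bits
    print(hex(mask_hostid(sign_extended)))
    print(hex(mask_hostid(-1)))
```

Comparing masked values on both sides avoids the spurious mismatch the commit mentions, where the same host appears to have two different hostids depending on the width of long.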
*	Fix 32-bit MAXOFFSET_T definition	Brian Behlendorf	2011-04-22	1	-7/+2
\| \| \| \| \| \| \| \| \| \| \|	Having MAXOFFSET_T defined to 0x7fffffffl was artificially limiting the maximum file size on 32-bit systems. In reality MAXOFFSET_T is used when working with 'long long' types and as such we now define it as LLONG_MAX. This resolves the 2GB file size limit for files and additionally allows zvols greater than 2GB on 32-bit systems. Closes #136 Closes #81
*	Support IEC base-2 prefixes	Richard Laager	2011-04-19	1	-4/+7
\| \| \| \|	Signed-off-by: Brian Behlendorf <[email protected]>
*	Set -Wno-unused-but-set-variable globally	Brian Behlendorf	2011-04-19	9	-34/+43
\|	As of gcc-4.6 the option -Wunused-but-set-variable is enabled by default. While this is a useful warning there are numerous places in the ZFS code where a variable is set and then only checked in an ASSERT(). To avoid having to update every instance of this in the code we now set -Wno-unused-but-set-variable to suppress the warning. Additionally, when building with --enable-debug and -Werror set these warnings also become fatal. We can reevaluate the suppression of these errors at a later time if it becomes an issue. For now we are basically just reverting to the previous gcc behavior.
*	Linux 2.6.28 compat, insert_inode_locked()	Brian Behlendorf	2011-03-22	20	-0/+20
\| \| \| \| \| \| \|	Added insert_inode_locked() helper function, prior to this most callers used insert_inode_hash(). The older method doesn't check for collisions in the inode_hashtable but it is still acceptable for use. Fall back to using insert_inode_hash() when insert_inode_locked() is unavailable.
*	Linux compat, umount2(2) flags	Brian Behlendorf	2011-03-22	1	-2/+17
\| \| \| \| \| \| \| \|	Older glibc <sys/mount.h> headers did not define all the available umount2(2) flags. Both MNT_FORCE and MNT_DETACH are supported in the kernel back to 2.4.11 so we define them correctly if they are missing. Closes #95
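A minimal sketch of the define-if-missing approach, assuming the Linux kernel values for the two flags. The helper `demo_umount_flags()` is hypothetical and exists only to show the flags in use.

```c
#include <sys/mount.h>

/* Define the flags only when the libc header does not provide them. */
#ifndef MNT_FORCE
#define	MNT_FORCE	0x00000001	/* force the unmount */
#endif

#ifndef MNT_DETACH
#define	MNT_DETACH	0x00000002	/* lazy unmount */
#endif

/* Hypothetical helper: build a umount2(2) flag word from two booleans. */
int
demo_umount_flags(int force, int lazy)
{
	int flags = 0;

	if (force)
		flags |= MNT_FORCE;
	if (lazy)
		flags |= MNT_DETACH;

	return (flags);
}
```

The result would be passed as the second argument to umount2(2). On headers that already define the flags, the guards leave the existing definitions untouched.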
*	Merge branch 'dracut'	Brian Behlendorf	2011-03-22	20	-0/+20
\|\
\| *	Add init scripts	Brian Behlendorf	2011-03-17	20	-0/+20
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To support automatically mounting your zfs filesystem on boot a basic init script is needed. Unfortunately, every distribution has its own idea of the _right_ way to do things. Rather than write one very complicated portable init script, which would invariably be replaced by the distribution's own anyway, I have instead added support to provide multiple distribution-specific init scripts. The correct init script for your distribution will be selected by ZFS_AC_DEFAULT_PACKAGE which will set DEFAULT_INIT_SCRIPT. During 'make install' the correct script for your system will be installed from zfs/etc/init.d/zfs.DEFAULT_INIT_SCRIPT to the usual /etc/init.d/zfs location. Currently, there is zfs.fedora and a more generic zfs.lsb init script. Hopefully, the distribution maintainers who know best how they want their init scripts to function will feed back their approved versions to be included in the project. This change does not consider upstart jobs but I'm not at all opposed to adding that sort of thing.
* \|	Fix 'LDFLAGS=-Wl,--as-needed' build error	Brian Behlendorf	2011-03-18	2	-0/+3
\|/ \| \| \| \| \| \| \| \| \|	Compiling with 'LDFLAGS=-Wl,--as-needed' exposed the fact that there were some library linking problems introduced by mount_zfs. In particular, the libzfs library does use nvpair symbols, and mount_zfs contains no dependencies on libzpool. Closes #161 Closes #162
*	Print mount/umount errors	Brian Behlendorf	2011-03-09	2	-8/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Because we are dependent on the system mount/umount utilities to ensure correct mtab locking, we should not suppress their error output. During a successful mount/umount they will be silent, but during a failure the error message they print is the only sure way to know why a mount failed. This is because the (u)mount(8) return code does not contain the result of the system call issued. The only way to clearly identify why things failed is to rely on the error message printed by the tool. Longer term once libmount is available we can issue the mount/umount system calls within the tool and still be assured of correct mtab locking. Closed #107
*	Fix mount helper	Brian Behlendorf	2011-03-09	2	-16/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Several issues related to strange mount/umount behavior were reported and this commit should address most of them. The original idea was to put in place a zfs mount helper (mount.zfs). This helper is used to enforce 'legacy' mount behavior, and perform any extra mount argument processing (selinux, zfsutil, etc). This helper wasn't ready for the 0.6.0-rc1 release. With this change it's functional, but it needs to be extensively tested. This change addresses the following open issues. Closes #101 Closes #107 Closes #113 Closes #115 Closes #119