| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
The vmtruncate_range() support has been removed from the kernel in
favor of using the fallocate method in the file_operations table.
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #784
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The export_operations member ->encode_fh() has been updated to
take both the child and parent inodes. This interface used to
take the child dentry and a bool describing if the parent is needed.
NOTE: While updating this code I noticed that we do not currently
cleanly handle the case where we're passed a connectable parent.
This code should be audited to make sure we're doing the right thing.
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #784
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, zpool online -e (dynamic vdev expansion) doesn't work on
whole disks because we're invoking ioctl(BLKRRPART) from userspace
while ZFS still has a partition open on the disk, which results in
EBUSY.
This patch moves the BLKRRPART invocation from the zpool utility to the
module. Specifically, this is done just before opening the device in
vdev_disk_open() which is called inside vdev_reopen(). This requires
jumping through some hoops to get to the disk device from the partition
device, and to make sure we can still open the partition after the
BLKRRPART call.
Note that this new code path is triggered on dynamic vdev expansion
only; other actions, like creating a new pool, are unchanged and still
call BLKRRPART from userspace.
This change also depends on API changes which are available in 2.6.37
and latter kernels. The build system has been updated to detect this,
but there is no compatibility mode for older kernels. This means that
online expansion will NOT be available in older kernels. However, it
will still be possible to expand the vdev offline.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #808
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
torvalds/linux@adc0e91ab142abe93f5b0d7980ada8a7676231fe introduced
introduced d_make_root() as a replacement for d_alloc_root(). Further
commits appear to have removed d_alloc_root() from the Linux source
tree. This causes the following failure:
error: implicit declaration of function 'd_alloc_root'
[-Werror=implicit-function-declaration]
To correct this we update the code to use the current d_make_root()
interface for readability. Then we introduce an autotools check
to determine if d_make_root() is available. If it isn't then we
define some compatibility logic which used the older d_alloc_root()
interface.
Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #776
|
|
|
|
|
|
|
|
|
|
| |
The mode argument of iops->create()/mkdir()/mknod() was changed from
an 'int' to a 'umode_t'. To prevent a compiler warning an autoconf
check was added to detect the API change and then correctly set a
zpl_umode_t typedef. There is no functional change.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #701
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow rigorous (and expensive) tx validation to be enabled/disabled
indepentantly from the standard zfs debugging. When enabled these
checks ensure that all txs are constructed properly and that a dbuf
is never dirtied without taking the correct tx hold.
This checking is particularly helpful when adding new dmu consumers
like Lustre. However, for established consumers such as the zpl
with no known outstanding tx construction problems this is just
overhead.
--enable-debug-dmu-tx - Enable/disable validation of each tx as
--disable-debug-dmu-tx it is constructed. By default validation
is disabled due to performance concerns.
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for the .zfs control directory. This was accomplished
by leveraging as much of the existing ZFS infrastructure as posible
and updating it for Linux as required. The bulk of the core
functionality is now all there with the following limitations.
*) The .zfs/snapshot directory automount support requires a 2.6.37
or newer kernel. The exception is RHEL6.2 which has backported
the d_automount patches.
*) Creating/destroying/renaming snapshots with mkdir/rmdir/mv
in the .zfs/snapshot directory works as expected. However,
this functionality is only available to root until zfs
delegations are finished.
* mkdir - create a snapshot
* rmdir - destroy a snapshot
* mv - rename a snapshot
The following issues are known defeciences, but we expect them to
be addressed by future commits.
*) Add automount support for kernels older the 2.6.37. This should
be possible using follow_link() which is what Linux did before.
*) Accessing the .zfs/snapshot directory via NFS is not yet possible.
The majority of the ground work for this is complete. However,
finishing this work will require resolving some lingering
integration issues with the Linux NFS kernel server.
*) The .zfs/shares directory exists but no futher smb functionality
has yet been implemented.
Contributions-by: Rohan Puri <[email protected]>
Contributiobs-by: Andrew Barnes <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #173
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow a source rpm to be rebuilt with debugging enabled. This
avoids the need to have to manually modify the spec file. By
default debugging is still largely disabled. To enable specific
debugging features use the following options with rpmbuild.
'--with debug' - Enables ASSERTs
# For example:
$ rpmbuild --rebuild --with debug zfs-modules-0.6.0-rc6.src.rpm
Additionally, ZFS_CONFIG has been added to zfs_config.h for
packages which build against these headers. This is critical
to ensure both zfs and the dependant package are using the same
prototype and structure definitions.
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DISCARD (REQ_DISCARD, BLKDISCARD) is useful for thin provisioning.
It allows ZVOL clients to discard (unmap, trim) block ranges from
a ZVOL, thus optimizing disk space usage by allowing a ZVOL to
shrink instead of just grow.
We can't use zfs_space() or zfs_freesp() here, since these functions
only work on regular files, not volumes. Fortunately we can use the
low-level function dmu_free_long_range() which does exactly what we
want.
Currently the discard operation is not added to the log. That's not
a big deal since losing discard requests cannot result in data
corruption. It would however result in disk space usage higher than
it should be. Thus adding log support to zvol_discard() is probably
a good idea for a future improvement.
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently only the (FALLOC_FL_PUNCH_HOLE) flag combination is
supported, since it's the only one that matches the behavior of
zfs_space(). This makes it pretty much useless in its current
form, but it's a start.
To support other flag combinations we would need to modify
zfs_space() to make it more flexible, or emulate the desired
functionality in zpl_fallocate().
Signed-off-by: Brian Behlendorf <[email protected]>
Issue #334
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The recent zvol improvements have changed default suggested alignment
for zvols from 512b (default) to 8k (zvol blocksize). Because of this
the zconfig.sh tests which create paritions are now generating a
warning about non-optimal alignments.
This change updates the need zconfig.sh tests such that a partition
will be properly aligned. In the process, it shifts from using the
sfdisk utility to the parted utility to create partitions. It also
moves the creation of labels, partitions, and filesystems in to
generic functions in common.sh.in.
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Linux block device queue subsystem exposes a number of configurable
settings described in Linux block/blk-settings.c. The defaults for these
settings are tuned for hard drives, and are not optimized for ZVOLs. Proper
configuration of these options would allow upper layers (I/O scheduler) to
take better decisions about write merging and ordering.
Detailed rationale:
- max_hw_sectors is set to unlimited (UINT_MAX). zvol_write() is able to
handle writes of any size, so there's no reason to impose a limit. Let the
upper layer decide.
- max_segments and max_segment_size are set to unlimited. zvol_write() will
copy the requests' contents into a dbuf anyway, so the number and size of
the segments are irrelevant. Let the upper layer decide.
- physical_block_size and io_opt are set to the ZVOL's block size. This
has the potential to somewhat alleviate issue #361 for ZVOLs, by warning
the upper layers that writes smaller than the volume's block size will be
slow.
- The NONROT flag is set to indicate this isn't a rotational device.
Although the backing zpool might be composed of rotational devices, the
resulting ZVOL often doesn't exhibit the same behavior due to the COW
mechanisms used by ZFS. Setting this flag will prevent upper layers from
making useless decisions (such as reordering writes) based on incorrect
assumptions about the behavior of the ZVOL.
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
zvol_write() assumes that the write request must be written to stable storage
if rq_is_sync() is true. Unfortunately, this assumption is incorrect. Indeed,
"sync" does *not* mean what we think it means in the context of the Linux
block layer. This is well explained in linux/fs.h:
WRITE: A normal async write. Device will be plugged.
WRITE_SYNC: Synchronous write. Identical to WRITE, but passes down
the hint that someone will be waiting on this IO
shortly.
WRITE_FLUSH: Like WRITE_SYNC but with preceding cache flush.
WRITE_FUA: Like WRITE_SYNC but data is guaranteed to be on
non-volatile media on completion.
In other words, SYNC does not *mean* that the write must be on stable storage
on completion. It just means that someone is waiting on us to complete the
write request. Thus triggering a ZIL commit for each SYNC write request on a
ZVOL is unnecessary and harmful for performance. To make matters worse, ZVOL
users have no way to express that they actually want data to be written to
stable storage, which means the ZIL is broken for ZVOLs.
The request for stable storage is expressed by the FUA flag, so we must
commit the ZIL after the write if the FUA flag is set. In addition, we must
commit the ZIL before the write if the FLUSH flag is set.
Also, we must inform the block layer that we actually support FLUSH and FUA.
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The second argument of sops->show_options() was changed from a
'struct vfsmount *' to a 'struct dentry *'. Add an autoconf check
to detect the API change and then conditionally define the expected
interface. In either case we are only interested in the zfs_sb_t.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #549
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Linux 3.1 kernel has introduced the concept of per-filesystem
shrinkers which are directly assoicated with a super block. Prior
to this change there was one shared global shrinker.
The zfs code relied on being able to call the global shrinker when
the arc_meta_limit was exceeded. This would cause the VFS to drop
references on a fraction of the dentries in the dcache. The ARC
could then safely reclaim the memory used by these entries and
honor the arc_meta_limit. Unfortunately, when per-filesystem
shrinkers were added the old interfaces were made unavailable.
This change adds support to use the new per-filesystem shrinker
interface so we can continue to honor the arc_meta_limit. The
major benefit of the new interface is that we can now target
only the zfs filesystem for dentry and inode pruning. Thus we
can minimize any impact on the caching of other filesystems.
In the context of making this change several other important
issues related to managing the ARC were addressed, they include:
* The dnlc_reduce_cache() function which was called by the ARC
to drop dentries for the Posix layer was replaced with a generic
zfs_prune_t callback. The ZPL layer now registers a callback to
drop these dentries removing a layering violation which dates
back to the Solaris code. This callback can also be used by
other ARC consumers such as Lustre.
arc_add_prune_callback()
arc_remove_prune_callback()
* The arc_reduce_dnlc_percent module option has been changed to
arc_meta_prune for clarity. The dnlc functions are specific to
Solaris's VFS and have already been largely eliminated already.
The replacement tunable now represents the number of bytes the
prune callback will request when invoked.
* Less aggressively invoke the prune callback. We used to call
this whenever we exceeded the arc_meta_limit however that's not
strictly correct since it results in over zeleous reclaim of
dentries and inodes. It is now only called once the arc_meta_limit
is exceeded and every effort has been made to evict other data from
the ARC cache.
* More promptly manage exceeding the arc_meta_limit. When reading
meta data in to the cache if a buffer was unable to be recycled
notify the arc_reclaim thread to invoke the required prune.
* Added arcstat_prune kstat which is incremented when the ARC
is forced to request that a consumer prune its cache. Remember
this will only occur when the ARC has no other choice. If it
can evict buffers safely without invoking the prune callback
it will.
* This change is also expected to resolve the unexpect collapses
of the ARC cache. This would occur because when exceeded just the
arc_meta_limit reclaim presure would be excerted on the arc_c
value via arc_shrink(). This effectively shrunk the entire cache
when really we just needed to reclaim meta data.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #466
Closes #292
|
|
|
|
|
|
|
|
|
|
|
| |
Directly changing inode->i_nlink is deprecated in Linux 3.2 by commit
SHA: bfe8684869601dacfcb2cd69ef8cfd9045f62170
Use the new set_nlink() kernel function instead.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes: #462
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Added the necessary build infrastructure for building packages
compatible with the Arch Linux distribution. As such, one can now run:
$ ./configure
$ make pkg # Alternatively, one can run 'make arch' as well
on the Arch Linux machine to create two binary packages compatible with
the pacman package manager, one for the zfs userland utilities and
another for the zfs kernel modules. The new packages can then be
installed by running:
# pacman -U $package.pkg.tar.xz
In addition, source-only packages suitable for an Arch Linux chroot
environment or remote builder can also be build using the 'sarch' make
rule.
NOTE: Since the source dist tarball is created on the fly from the head
of the build tree, it's MD5 hash signature will be continually influx.
As a result, the md5sum variable was intentionally omitted from the
PKGBUILD files, and the '--skipinteg' makepkg option is used. This may
or may not have any serious security implications, as the source tarball
is not being downloaded from an outside source.
Signed-off-by: Prakash Surya <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #491
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update the code to use the bdi_setup_and_register() helper to
simplify the bdi integration code. The updated code now just
registers the bdi during mount and destroys it during unmount.
The only complication is that for 2.6.32 - 2.6.33 kernels the
helper wasn't available so in these cases the zfs code must
provide it. Luckily the bdi_setup_and_register() function
is trivial.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #367
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running the zconfig.sh, zpios-sanity.sh, and zfault.sh
from the installed packages the 90-zfs.rules can cause failures.
These will occur because the test suite assumes it has full
control over loading/unloading the module stack. If the stack
gets asynchronously loaded by the udev rule the test suite
will treat it as a failure. Resolve the issue by disabling
the offending rule during the tests and enabling it on exit.
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
|
| |
Run autogen.sh using the same autotools versions as upstream:
* autoconf-2.63
* automake-1.11.1
* libtool-2.2.6b
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a long time now the kernel has been moving away from using the
pdflush daemon to write 'old' dirty pages to disk. The primary reason
for this is because the pdflush daemon is single threaded and can be
a limiting factor for performance. Since pdflush sequentially walks
the dirty inode list for each super block any delay in processing can
slow down dirty page writeback for all filesystems.
The replacement for pdflush is called bdi (backing device info). The
bdi system involves creating a per-filesystem control structure each
with its own private sets of queues to manage writeback. The advantage
is greater parallelism which improves performance and prevents a single
filesystem from slowing writeback to the others.
For a long time both systems co-existed in the kernel so it wasn't
strictly required to implement the bdi scheme. However, as of
Linux 2.6.36 kernels the pdflush functionality has been retired.
Since ZFS already bypasses the page cache for most I/O this is only
an issue for mmap(2) writes which must go through the page cache.
Even then adding this missing support for newer kernels was overlooked
because there are other mechanisms which can trigger writeback.
However, there is one critical case where not implementing the bdi
functionality can cause problems. If an application handles a page
fault it can enter the balance_dirty_pages() callpath. This will
result in the application hanging until the number of dirty pages in
the system drops below the dirty ratio.
Without a registered backing_device_info for the filesystem the
dirty pages will not get written out. Thus the application will hang.
As mentioned above this was less of an issue with older kernels because
pdflush would eventually write out the dirty pages.
This change adds a backing_device_info structure to the zfs_sb_t
which is already allocated per-super block. It is then registered
when the filesystem mounted and unregistered on unmount. It will
not be registered for mounted snapshots which are read-only. This
change will result in flush-<pool> thread being dynamically created
and destroyed per-mounted filesystem for writeback.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #174
|
|
|
|
|
|
|
|
|
|
|
| |
Unlike most other Linux distributions archlinux installs its
init scripts in /etc/rc.d insead of /etc/init.d. This commit
provides an archlinux rc.d script for zfs and extends the
build infrastructure to ensure it get's installed in the
correct place.
Signed-off-by: Brian Behlendorf <[email protected]>
Closes #322
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The .get_sb callback has been replaced by a .mount callback
in the file_system_type structure. When using the new
interface the caller must now use the mount_nodev() helper.
Unfortunately, the new interface no longer passes the vfsmount
down to the zfs layers. This poses a problem for the existing
implementation because we currently save this pointer in the
super block for latter use. It provides our only entry point
in to the namespace layer for manipulating certain mount options.
This needed to be done originally to allow commands like
'zfs set atime=off tank' to work properly. It also allowed me
to keep more of the original Solaris code unmodified. Under
Solaris there is a 1-to-1 mapping between a mount point and a
file system so this is a fairly natural thing to do. However,
under Linux they many be multiple entries in the namespace
which reference the same filesystem. Thus keeping a back
reference from the filesystem to the namespace is complicated.
Rather than introduce some ugly hack to get the vfsmount and
continue as before. I'm leveraging this API change to update
the ZFS code to do things in a more natural way for Linux.
This has the upside that is resolves the compatibility issue
for the long term and fixes several other minor bugs which
have been reported.
This commit updates the code to remove this vfsmount back
reference entirely. All modifications to filesystem mount
options are now passed in to the kernel via a '-o remount'.
This is the expected Linux mechanism and allows the namespace
to properly handle any options which apply to it before passing
them on to the file system itself.
Aside from fixing the compatibility issue, removing the
vfsmount has had the benefit of simplifying the code. This
change which fairly involved has turned out nicely.
Closes #246
Closes #217
Closes #187
Closes #248
Closes #231
|
|
|
|
|
|
|
|
|
|
|
| |
The security_inode_init_security() function now takes an additional
qstr argument which must be passed in from the dentry if available.
Passing a NULL is safe when no qstr is available the relevant
security checks will just be skipped.
Closes #246
Closes #217
Closes #187
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The inode eviction should unmap the pages associated with the inode.
These pages should also be flushed to disk to avoid the data loss.
Therefore, use truncate_setsize() in evict_inode() to release the
pagecache.
The API truncate_setsize() was added in 2.6.35 kernel. To ensure
compatibility with the old kernel, the patch defines its own
truncate_setsize function.
Signed-off-by: Prasad Joshi <[email protected]>
Closes #255
|
|
|
|
|
|
|
|
|
|
|
| |
The previous commit 8a7e1ceefa430988c8f888ca708ab307333b4464 wasn't
quite right. This check applies to both the user and kernel space
build and as such we must make sure it runs regardless of what
the --with-config option is set too.
For example, if --with-config=kernel then the autoconf test does
not run and we generate build warnings when compiling the kernel
packages.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gcc versions 4.3.2 and earlier do not support the compiler flag
-Wno-unused-but-set-variable. This can lead to build failures
on older Linux platforms such as Debian Lenny. Since this is
an optional build argument this changes add a new autoconf check
for the option. If it is supported by the installed version of
gcc then it is used otherwise it is omited.
See commit's 12c1acde76683108441827ae9affba1872f3afe5 and
79713039a2b6e0ed223d141b4a8a8455f282d2f2 for the reason the
-Wno-unused-but-set-variable options was originally added.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When your kernel is built with kernel stack tracing enabled and you
have the debugfs filesystem mounted. Then the zfs.sh script will clear
the worst observed kernel stack depth on module load and check the worst
case usage on module removal. If the stack depth ever exceeds 7000
bytes the full stack will be printed for debugging. This is dangerously
close to overrunning the default 8k stack.
This additional advisory debugging is particularly valuable when running
the regression tests on a kernel built with 16k stacks. In this case,
almost no matter how bad the stack overrun is you will see be able to
get a clean stack trace for debugging. Since the worst case stack usage
can be highly variable it's helpful to always check the worst case usage.
|
|
|
|
|
|
|
|
| |
If a pool was not cleanly exported passing the -f flag may be required
at 'zpool import' time. Since this test is simply validating that the
pool can be successfully imported in the absense of the cache file
always pass the -f to ensure it succeeds. This failure was observed
under RHEL6.1.
|
|
|
|
|
|
|
|
|
|
| |
Just like zconfig.sh the zpios-sanity.sh tests should run in a
sanatized environment. This ensures they never conflict with an
installed /etc/zfs/zpool.cache file.
This commit additionally improves the -c cleanup option. It now
removes the modules stack if loaded and destroys relevant md devices.
This behavior is now identical to zconfig.sh.
|
|
|
|
|
|
|
|
|
|
| |
Generally I don't approve of just adding an arbitrary delay to
avoid a problem but in this case I'm going to let it slide. We
may need to delay briefly after 'zpool destroy' returns to ensure
the loopback devices are closed. If they aren't closed than
losetup -d will not be able to destroy them. Unfortunately,
there's no easy state the check so we'll have to make due with
a simple delay.
|
|
|
|
|
|
|
|
| |
We should always unload zpios.ko on exit. This ensures
that subsequent calls to 'zfs.sh -u' from other utilities
will be able to unload the module stack and properly
cleanup. This is important for the the --cleanup option
which can be passed to zconfig.sh and zfault.sh.
|
|
|
|
|
|
| |
The zpios-sanity.sh script should return failure when any
of the individual zpios.sh tests fail. The previous code
would always return success suppressing real failures.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change fixes a kernel panic which would occur when resizing
a dataset which was not open. The objset_t stored in the
zvol_state_t will be set to NULL when the block device is closed.
To avoid this issue we pass the correct objset_t as the third arg.
The code has also been updated to correctly notify the kernel
when the block device capacity changes. For 2.6.28 and newer
kernels the capacity change will be immediately detected. For
earlier kernels the capacity change will be detected when the
device is next opened. This is a known limitation of older
kernels.
Online ext3 resize test case passes on 2.6.28+ kernels:
$ dd if=/dev/zero of=/tmp/zvol bs=1M count=1 seek=1023
$ zpool create tank /tmp/zvol
$ zfs create -V 500M tank/zd0
$ mkfs.ext3 /dev/zd0
$ mkdir /mnt/zd0
$ mount /dev/zd0 /mnt/zd0
$ df -h /mnt/zd0
$ zfs set volsize=800M tank/zd0
$ resize2fs /dev/zd0
$ df -h /mnt/zd0
Original-patch-by: Fajar A. Nugraha <[email protected]>
Closes #68
Closes #84
|
|
|
|
|
| |
Implemented the required NFS operations for exporting ZFS datasets
using the in-kernel NFS daemon.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change should have occured when we commited the new udev
rules for zvols. Basically, the test script is just out of date.
We need to update it to use the /dev/zvol/ device names, and
to expect the more common -partN suffixes.
I added a udev_trigger() call in zconfig_partition() and
zconfig_zvol_device_stat() to ensure that all the udev rules have
run before. This ensures the devices are available to subsequent
commands and closes a small race.
Finally, I was forced added a small 'sleep 1' to test 10. I
was observing occassional failures in my VM due to the device
still claiming to be busy. Delaying betwen the various methods
of adding/removing a vdev avoids the issue.
Closes #207
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some udev hooks are not designed to be idempotent, so calling udevadm
trigger outside of the distribution's initialization scripts can have
unexpected (and potentially dangerous) side effects. For example, the
system time may change or devices may appear multiple times. See Ubuntu
launchpad bug 320200 and this mailing list post for more details:
https://lists.ubuntu.com/archives/ubuntu-devel/2009-January/027260.html
To avoid these problems we call udevadm trigger with --action=change
--subsystem-match=block. The first argument tells udev just to refresh
devices, and make sure everything's as it should be. The second
argument limits the scope to block devices, so devices belonging to
other subsystems cannot be affected.
This doesn't fix the problem on older udev implementations that don't
provide udevadm but instead have udevtrigger as a standalone program.
In this case the above options aren't available so there's no way to
call call udevtrigger safely. But we can live with that since this
issue only exists in optional test and helper scripts, and most
zfs-on-linux users are running newer systems anyways.
|
|
|
|
|
|
|
| |
Added insert_inode_locked() helper function, prior to this most callers
used insert_inode_hash(). The older method doesn't check for collisions
in the inode_hashtable but it still acceptible for use. Fallback to
using insert_inode_hash() when insert_inode_locked() is unavailable.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To support automatically mounting your zfs on filesystem on boot
a basic init script is needed. Unfortunately, every distribution
has their own idea of the _right_ way to do things. Rather than
write one very complicated portable init script, which would be
invariably replaced by the distributions own anyway. I have
instead added support to provide multiple distribution specific
init scripts.
The correct init script for your distribution will be selected
by ZFS_AC_DEFAULT_PACKAGE which will set DEFAULT_INIT_SCRIPT.
During 'make install' the correct script for your system will
be installed from zfs/etc/init.d/zfs.DEFAULT_INIT_SCRIPT to the
usual /etc/init.d/zfs location.
Currently, there is zfs.fedora and a more generic zfs.lsb init
script. Hopefully, the distribution maintainers who know best
how they want their init scripts to function will feedback their
approved versions to be included in the project.
This change does not consider upstart jobs but I'm not at all
opposed to add that sort of thing.
|
|
|
|
|
|
|
|
|
|
|
| |
The open_bdev_exclusive() function has been replaced (again) by the
more generic blkdev_get_by_path() function. Additionally, the
counterpart function close_bdev_exclusive() has been replaced by
blkdev_put(). Because these functions are more generic versions
of the functions they replaced the compatibility macro must add
the FMODE_EXCL mask to ensure they are exclusive.
Closes #114
|
|
|
|
|
|
|
|
|
| |
Before it is safe to unload the zfs module stack all mounted
zfs filesystems must be unmounted. If they are not unmounted,
there will be references held on the modules and the stack cannot
be removed. To handle this have 'zfs.sh -u' which is used by all
of the test scripts umount all zfs filesystem before attempting
to unload the module stack.
|
|
|
|
|
|
| |
The new prefered inteface for evicting an inode from the inode cache
is the ->evict_inode() callback. It replaces both the ->delete_inode()
and ->clear_inode() callbacks which were previously used for this.
|
|
|
|
|
|
|
|
|
| |
The fsync() callback in the file_operations structure used to take
3 arguments. The callback now only takes 2 arguments because the
dentry argument was determined to be unused by all consumers. To
handle this a compatibility prototype was added to ensure the right
prototype is used. Our implementation never used the dentry argument
either so it's just a matter of using the right prototype.
|
|
|
|
|
|
|
| |
The const keyword was added to the 'struct xattr_handler' in the
generic Linux super_block structure. To handle this we define an
appropriate xattr_handler_t typedef which can be used. This was
the preferred solution because it keeps the code clean and readable.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ZFS even under Solaris does not strictly require libshare to be
available. The current implementation attempts to dlopen() the
library to access the needed symbols. If this fails libshare
support is simply disabled.
This means that on Linux we only need the most minimal libshare
implementation. In fact just enough to prevent the build from
failing. Longer term we can decide if we want to implement a
libshare library like Solaris. At best this would be an abstraction
layer between ZFS and NFS/SMB. Alternately, we can drop libshare
entirely and directly integrate ZFS with Linux's NFS/SMB.
Finally the bare bones user-libshare.m4 test was dropped. If we
do decide to implement libshare at some point it will surely be
as part of this package so the check is not needed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If libselinux is detected on your system at configure time link
against it. This allows us to use a library call to detect if
selinux is enabled and if it is to pass the mount option:
"context=\"system_u:object_r:file_t:s0"
For now this is required because none of the existing selinux
policies are aware of the zfs filesystem type. Because of this
they do not properly enable xattr based labeling even though
zfs supports all of the required hooks.
Until distro's add zfs as a known xattr friendly fs type we
must use mntpoint labeling. Alternately, end users could modify
their existing selinux policy with a little guidance.
|
|
|
|
|
|
|
|
|
| |
As of the 0.5.2 tag, names of whole-disk vdevs must be specified to
the command line tools without partition identifiers. This commit
fixes a 'zpool online' command in zfault.sh that incorrectly includes
he partition in the vdev name, causing test 9 to fail.
Signed-off-by: Brian Behlendorf <[email protected]>
|
|
|
|
|
|
|
| |
When adding this functionality originally the options to only
run specific tests (-t), or conversely skip specific tests (-s)
were omitted from the usage page. This commit adds the missing
documentation.
|
|
|
|
|
|
|
|
|
|
| |
The idea behind the '-c' flag is to cleanup everything from a
previous test run which might cause the test script to fail.
This should also include removing the previously loaded module.
This makes it a little easier to run 'zconfig.sh -c', however
remember this is a test script and it will take all of your
other zpools offline for the purposes of the test. This notion
has also been extended to the default 'make check' behavior.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Loading and unloading the zlib modules as part of the zfs.sh
script has proven a little problematic for a few reasons.
* First, your kernel may not need to load either zlib_inflate
or zlib_deflate. This functionality may be built directly in
to your kernel. It depends entirely on what your distribution
decided was the right thing to do.
* Second, even if you do manage to load the correct modules you
may not be able to unload them. There may other consumers
of the modules with a reference preventing the unload.
To avoid both of these issues the test scripts have been updated to
attempt to unconditionally load all modules listed in KERNEL_MODULES.
If the module is successfully loaded you must have needed it. If
the module can't be loaded that almost certainly means either it is
built in to your kernel or is already being used by another consumer.
In both cases this is not an issue and we can move on to the spl/zfs
modules.
Finally, by removing these kernel modules from the MODULES list
we ensure they are never unloaded during 'zfs.sh -u'. This avoids
the issue of the script failing because there is another consumer
using the module we were not aware of. In other words the script
restricts unloading modules to only the spl/zfs modules.
Closes #78
|