aboutsummaryrefslogtreecommitdiffstats
path: root/include/sys/zfs_sa.h
Commit message (Collapse)AuthorAgeFilesLines
* Illumos 5027 - zfs large block supportMatthew Ahrens2015-05-111-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 5027 zfs large block support Reviewed by: Alek Pinchuk <[email protected]> Reviewed by: George Wilson <[email protected]> Reviewed by: Josef 'Jeff' Sipek <[email protected]> Reviewed by: Richard Elling <[email protected]> Reviewed by: Saso Kiselkov <[email protected]> Reviewed by: Brian Behlendorf <[email protected]> Approved by: Dan McDonald <[email protected]> References: https://www.illumos.org/issues/5027 https://github.com/illumos/illumos-gate/commit/b515258 Porting Notes: * Included in this patch is a tiny ISP2() cleanup in zio_init() from Illumos 5255. * Unlike the upstream Illumos commit this patch does not impose an arbitrary 128K block size limit on volumes. Volumes, like filesystems, are limited by the zfs_max_recordsize=1M module option. * By default the maximum record size is limited to 1M by the module option zfs_max_recordsize. This value may be safely increased up to 16M which is the largest block size supported by the on-disk format. At the moment, 1M blocks clearly offer a significant performance improvement but the benefits of going beyond this for the majority of workloads are less clear. * The illumos version of this patch increased DMU_MAX_ACCESS to 32M. This was determined not to be large enough when using 16M blocks because the zfs_make_xattrdir() function will fail (EFBIG) when assigning a TX. This was immediately observed under Linux because all newly created files must have a security xattr created and that was failing. Therefore, we've set DMU_MAX_ACCESS to 64M. * On 32-bit platforms a hard limit of 1M is set for blocks due to the limited virtual address space. We should be able to relax this one the ABD patches are merged. Ported-by: Brian Behlendorf <[email protected]> Closes #354
* cstyle: Resolve C style issuesMichael Kjorling2013-12-181-2/+2
| | | | | | | | | | | | | | | | | | The vast majority of these changes are in Linux specific code. They are the result of not having an automated style checker to validate the code when it was originally written. Others were caused when the common code was slightly adjusted for Linux. This patch contains no functional changes. It only refreshes the code to conform to style guide. Everyone submitting patches for inclusion upstream should now run 'make checkstyle' and resolve any warning prior to opening a pull request. The automated builders have been updated to fail a build if when 'make checkstyle' detects an issue. Signed-off-by: Brian Behlendorf <[email protected]> Closes #1821
* Implement SA based xattrsBrian Behlendorf2011-11-281-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current ZFS implementation stores xattrs on disk using a hidden directory. In this directory a file name represents the xattr name and the file contexts are the xattr binary data. This approach is very flexible and allows for arbitrarily large xattrs. However, it also suffers from a significant performance penalty. Accessing a single xattr can requires up to three disk seeks. 1) Lookup the dnode object. 2) Lookup the dnodes's xattr directory object. 3) Lookup the xattr object in the directory. To avoid this performance penalty Linux filesystems such as ext3 and xfs try to store the xattr as part of the inode on disk. When the xattr is to large to store in the inode then a single external block is allocated for them. In practice most xattrs are small and this approach works well. The addition of System Attributes (SA) to zfs provides us a clean way to make this optimization. When the dataset property 'xattr=sa' is set then xattrs will be preferentially stored as System Attributes. This allows tiny xattrs (~100 bytes) to be stored with the dnode and up to 64k of xattrs to be stored in the spill block. If additional xattr space is required, which is unlikely under Linux, they will be stored using the traditional directory approach. This optimization results in roughly a 3x performance improvement when accessing xattrs which brings zfs roughly to parity with ext4 and xfs (see table below). When multiple xattrs are stored per-file the performance improvements are even greater because all of the xattrs stored in the spill block will be cached. However, by default SA based xattrs are disabled in the Linux port to maximize compatibility with other implementations. If you do enable SA based xattrs then they will not be visible on platforms which do not support this feature. ---------------------------------------------------------------------- Time in seconds to get/set one xattr of N bytes on 100,000 files ------+--------------------------------+------------------------------ | setxattr | getxattr bytes | ext4 xfs zfs-dir zfs-sa | ext4 xfs zfs-dir zfs-sa ------+--------------------------------+------------------------------ 1 | 2.33 31.88 21.50 4.57 | 2.35 2.64 6.29 2.43 32 | 2.79 30.68 21.98 4.60 | 2.44 2.59 6.78 2.48 256 | 3.25 31.99 21.36 5.92 | 2.32 2.71 6.22 3.14 1024 | 3.30 32.61 22.83 8.45 | 2.40 2.79 6.24 3.27 4096 | 3.57 317.46 22.52 10.73 | 2.78 28.62 6.90 3.94 16384 | n/a 2342.39 34.30 19.20 | n/a 45.44 145.90 7.55 65536 | n/a 2941.39 128.15 131.32* | n/a 141.92 256.85 262.12* Legend: * ext4 - Stock RHEL6.1 ext4 mounted with '-o user_xattr'. * xfs - Stock RHEL6.1 xfs mounted with default options. * zfs-dir - Directory based xattrs only. * zfs-sa - Prefer SAs but spill in to directories as needed, a trailing * indicates overflow in to directories occured. NOTE: Ext4 supports 4096 bytes of xattr name/value pairs per file. NOTE: XFS and ZFS have no limit on xattr name/value pairs per file. NOTE: Linux limits individual name/value pairs to 65536 bytes. NOTE: All setattr/getattr's were done after dropping the cache. NOTE: All tests were run against a single hard drive. Signed-off-by: Brian Behlendorf <[email protected]> Issue #443
* Export symbols for the full SA APIBrian Behlendorf2011-10-051-3/+1
| | | | | | | | | | | | | Export all the symbols for the system attribute (SA) API. This allows external module to cleanly manipulate the SAs associated with a dnode. Documention for the SA API can be found in the module/zfs/sa.c source. This change also removes the zfs_sa_uprade_pre, and zfs_sa_uprade_post prototypes. The functions themselves were dropped some time ago. Signed-off-by: Brian Behlendorf <[email protected]>
* Support custom build directories and move includesBrian Behlendorf2010-09-081-0/+143
One of the neat tricks an autoconf style project is capable of is allow configurion/building in a directory other than the source directory. The major advantage to this is that you can build the project various different ways while making changes in a single source tree. For example, this project is designed to work on various different Linux distributions each of which work slightly differently. This means that changes need to verified on each of those supported distributions perferably before the change is committed to the public git repo. Using nfs and custom build directories makes this much easier. I now have a single source tree in nfs mounted on several different systems each running a supported distribution. When I make a change to the source base I suspect may break things I can concurrently build from the same source on all the systems each in their own subdirectory. wget -c http://github.com/downloads/behlendorf/zfs/zfs-x.y.z.tar.gz tar -xzf zfs-x.y.z.tar.gz cd zfs-x-y-z ------------------------- run concurrently ---------------------- <ubuntu system> <fedora system> <debian system> <rhel6 system> mkdir ubuntu mkdir fedora mkdir debian mkdir rhel6 cd ubuntu cd fedora cd debian cd rhel6 ../configure ../configure ../configure ../configure make make make make make check make check make check make check This change also moves many of the include headers from individual incude/sys directories under the modules directory in to a single top level include directory. This has the advantage of making the build rules cleaner and logically it makes a bit more sense.