path: root/module/spl
Commit message | Author | Age | Files | Lines
* Add slab usage summaries to /proc | Brian Behlendorf | 2011-04-06 | 1 | -1/+119
  One of the most common things you want to know when looking at the slab is how much memory is being used. This information was available in /proc/spl/kmem/slab but only on a per-slab basis. This commit adds the following /proc/sys/kernel/spl/kmem/slab* entries to make total slab usage easily available at a glance.

    slab_kmem_total - Total kmem slab size
    slab_kmem_avail - Alloc'd kmem slab size
    slab_kmem_max   - Max observed kmem slab size
    slab_vmem_total - Total vmem slab size
    slab_vmem_avail - Alloc'd vmem slab size
    slab_vmem_max   - Max observed vmem slab size

  NOTE: The slab_*_max values are expected to over-report because they show maximum values since boot, not current values.
* Update /proc/spl/kmem/slab output | Brian Behlendorf | 2011-04-06 | 1 | -22/+24
  The 'slab_fail', 'slab_create', and 'slab_destroy' columns in the slab output have been removed because they are virtually always zero and not very useful. The much more useful 'size' and 'alloc' columns have been added, which show the total slab size and how much of the total size has been allocated to objects. Finally, the formatting has been updated to be much more human readable while still being friendly for tools like awk to parse.
* Linux shrinker compat | Brian Behlendorf | 2011-04-06 | 1 | -41/+9
  The Linux shrinker has gone through three API changes since 2.6.22. Rather than force every caller to understand all three APIs, this change consolidates the compatibility code into the mm-compat.h header. The caller can then use a single spl-provided shrinker API which does the right thing for your kernel.

    SPL_SHRINKER_CALLBACK_PROTO(shrinker_callback, cb, nr_to_scan, gfp_mask);
    SPL_SHRINKER_DECLARE(shrinker_struct, shrinker_callback, seeks);
    spl_register_shrinker(&shrinker_struct);
    spl_unregister_shrinker(&shrinker_struct);
    spl_exec_shrinker(&shrinker_struct, nr_to_scan, gfp_mask);
* Add crgetfsuid()/crgetfsgid() helpers | Brian Behlendorf | 2011-03-22 | 1 | -78/+58
  Solaris credentials don't have an fsuid/fsgid field but Linux credentials do. To handle this case the Solaris API is being modestly extended to include the crgetfsuid()/crgetfsgid() helper functions. Additionally, because the crget*() helpers are implemented identically regardless of HAVE_CRED_STRUCT, they have been moved outside the #ifdef to common code. This simplification means we only have one version of each helper to keep up to date.
* Disable vmalloc() direct reclaim | Brian Behlendorf | 2011-03-20 | 1 | -2/+22
  As part of vmalloc() a __pte_alloc_kernel() allocation may occur. This internal allocation does not honor the gfp flags passed to vmalloc(). This means even when vmalloc(GFP_NOFS) is called it is possible that a synchronous reclaim will occur. This reclaim can trigger file IO which can result in a deadlock. This issue can be avoided by explicitly setting PF_MEMALLOC on the process to subvert synchronous reclaim when vmalloc() is called with !__GFP_FS. An example stack of the deadlock can be found here (1), along with the upstream kernel bug (2), and the original bug discussion on the linux-mm mailing list (3). This code can be properly autoconf'ed when the upstream bug is fixed.

    1) http://github.com/behlendorf/zfs/issues/labels/Vmalloc#issue/133
    2) http://bugzilla.kernel.org/show_bug.cgi?id=30702
    3) http://marc.info/?l=linux-mm&m=128942194520631&w=4
* Remove xvattr support | Brian Behlendorf | 2011-03-02 | 1 | -2/+2
  The xvattr support in the spl has always simply consisted of defining a couple of structures and a few #defines. This was enough to enable compilation of code which just passed xvattr types around but not enough to effectively manipulate them. This change removes even this minimal support, leaving it up to packages which leverage the spl to provide the full xvattr support. By removing it from the spl we ensure no conflict with the higher level packages. This just leaves minimal vnode support for basic manipulation of files. This code does have the proper support functions in the spl and a set of regression tests. Additionally, this change removes the unused 'caller_context_t *' type and replaces it with a 'void *'.
* Fix zlib compression | Brian Behlendorf | 2011-02-25 | 4 | -3/+230
  While portions of the code needed to support z_compress_level() and z_uncompress() were in place, in reality the current implementation was non-functional; it merely compiled. The critical missing component was to set up a workspace for the compress/uncompress stream structures to use. A kmem_cache was added for the workspace area because we require a large chunk of memory. This avoids the need to continually alloc/free this memory and vmap() the pages, which is very slow. Several objects will reside in the per-cpu kmem_cache making them quick to acquire and release. A further optimization would be to adjust the implementation to additionally ensure the memory is local to the cpu. Currently that may not be the case.
* Linux compat 2.6.37, invalidate_inodes() | Brian Behlendorf | 2011-02-23 | 1 | -0/+14
  In the 2.6.37 kernel the function invalidate_inodes() is no longer exported for use by modules. This memory management functionality is needed to invalidate the inodes attached to a super block without unmounting the filesystem. Because this function still exists in the kernel and the prototype is available in a common header, all we strictly need is the symbol address. The address is obtained using spl_kallsyms_lookup_name() and assigned to the variable invalidate_inodes_fn. Then a #define is used to replace all instances of invalidate_inodes() with a call to the acquired address. All the complexity is hidden behind HAVE_INVALIDATE_INODES and invalidate_inodes() can be used as usual. Long term we should try to get this, or another, interface made available to modules again.
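The lookup-and-redirect pattern described in this commit can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the SPL code: demo_lookup_name() and its one-entry table are hypothetical stand-ins for spl_kallsyms_lookup_name(), demo_invalidate_inodes() stands in for the kernel function, and the signature is simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Signature is simplified for the sketch; the kernel prototype differs. */
typedef int (*invalidate_inodes_t)(void *sb);

/* Hypothetical stand-in for the unexported kernel function. */
static int demo_invalidate_inodes(void *sb)
{
    return sb != NULL ? 0 : -1;
}

/* Hypothetical one-entry symbol table playing the role of kallsyms. */
static const struct {
    const char *name;
    invalidate_inodes_t fn;
} demo_symtab[] = {
    { "invalidate_inodes", demo_invalidate_inodes },
};

static invalidate_inodes_t demo_lookup_name(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof(demo_symtab) / sizeof(demo_symtab[0]); i++)
        if (strcmp(demo_symtab[i].name, name) == 0)
            return demo_symtab[i].fn;
    return NULL;
}

/* Resolved once at init time, then used through the macro below. */
static invalidate_inodes_t invalidate_inodes_fn;

static int demo_init(void)
{
    invalidate_inodes_fn = demo_lookup_name("invalidate_inodes");
    return invalidate_inodes_fn != NULL ? 0 : -1;
}

/* Callers keep writing invalidate_inodes(sb) as usual. */
#define invalidate_inodes(sb) invalidate_inodes_fn(sb)
```

Once demo_init() has succeeded, a call such as invalidate_inodes(&sb) goes through the stored pointer without the call site changing.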
* Block in cv_destroy() on all waiters | Brian Behlendorf | 2011-02-04 | 1 | -3/+23
  Previously we would ASSERT in cv_destroy() if it was ever called with active waiters. However, I've now seen several instances in OpenSolaris code where they do the following:

    cv_broadcast();
    cv_destroy();

  This leaves no time for active waiters to be woken up and scheduled and we trip the ASSERT. This has not been observed to be an issue on OpenSolaris because their cv_destroy() basically does nothing. They still do run the risk of the memory being free'd after the cv_destroy() and hitting a bad paging request. But in practice this race window is so small that it either doesn't happen, or happens so rarely that the root cause has not yet been identified. Rather than risk the same issue in our code, this change updates cv_destroy() to block until all waiters have been woken and scheduled. This may take some time because each waiter must acquire the mutex. This change may have an impact on performance for frequently created and destroyed condition variables. That however is a price worth paying to avoid crashing your system. If performance issues are observed they can be addressed by the caller.
* Remove VN_HOLD/VN_RELE/VOP_PUTPAGE | Brian Behlendorf | 2011-01-12 | 1 | -1/+0
  Previously these were defined as no-ops, but rather than give the misleading impression that these are actually implemented I'm removing them entirely for clarity.
* Make vn_cache|vn_file_cache kmem caches | Brian Behlendorf | 2011-01-12 | 1 | -2/+2
  Both of these caches were previously allowed to be either a vmem or kmem cache based on the size of the object involved. Since we know the objects won't be too large, and performance is much better for a kmem cache, it makes sense for them to be kmem backed.
* Clean vattr_t and vsecattr_t types | Brian Behlendorf | 2011-01-12 | 1 | -9/+6
  Minor cleanup for the vattr_t and vsecattr_t types.
* Add vn_mode_to_vtype/vn_vtype_to_mode helpers | Brian Behlendorf | 2011-01-12 | 1 | -6/+35
  Add simple helpers to convert a vnode->v_type to an inode->i_mode. These should be used sparingly but they are handy to have.
* Add cv_timedwait_interruptible() function | Neependra Khare | 2011-01-11 | 1 | -4/+17
  The cv_timedwait() function by definition must wait unconditionally for cv_signal()/cv_broadcast() before waking. This causes processes to go into the D state, which increases the load average. The load average counts processes in the D state as well as those on the run queue. To avoid this it can be desirable to sleep interruptibly. These processes do not count against the load average but may be woken by a signal. It is up to the caller to determine why the process was woken; it may be for one of three reasons:

    1) cv_signal()/cv_broadcast()
    2) the timeout expired
    3) a signal was received

  Signed-off-by: Brian Behlendorf <[email protected]>
* Linux Compat: inode->i_mutex/i_sem | Brian Behlendorf | 2011-01-11 | 1 | -10/+3
  Create spl_inode_lock/spl_inode_unlock compatibility macros to simplify access to the inode mutex/sem. This avoids the need to clutter the code with the required #define's at every call site. At the moment the SPL only uses this in one place but higher layers can benefit from the macro.
* Add Thread Specific Data (TSD) Implementation | Brian Behlendorf | 2010-12-07 | 5 | -3/+653
  Thread specific data has been implemented using a hash table; this avoids the need to add a member to the task structure and allows maximum portability between kernels. This implementation has been optimized to keep the tsd_set() and tsd_get() times as small as possible.

  The majority of the entries in the hash table are for specific tsd entries. These entries are hashed by the product of their key and pid because by design the key and pid are guaranteed to be unique. Their product also has the desirable property that it will be uniformly distributed over the hash bins providing neither the pid nor key is zero. Under Linux the zero pid is always the idle (swapper) task and thus won't be used, and this implementation is careful never to assign a zero key. By default the hash table is sized to 512 bins, which is expected to be sufficient for light to moderate usage of thread specific data.

  The hash table contains two additional types of entries. The first type is called a 'key' entry and it is added to the hash during tsd_create(). It is used to store the address of the destructor function and it is used as an anchor point. All tsd entries which use the same key will be linked to this entry. This is used during tsd_destroy() to quickly call the destructor function for all tsd associated with the key. The 'key' entry may be looked up with tsd_hash_search() by passing the key you wish to look up and the DTOR_PID constant as the pid.

  The second type of entry is called a 'pid' entry and it is added to the hash the first time a process sets a key. The 'pid' entry is also used as an anchor and all tsd for the process will be linked to it. This list is used during tsd_exit() to ensure all registered destructors are run for the process. The 'pid' entry may be looked up with tsd_hash_search() by passing the PID_KEY constant as the key, and the process pid.

  Note that tsd_exit() is called by thread_exit(), so if you're using the Solaris thread API you should not need to call tsd_exit() directly.
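As a rough userspace sketch of hashing a tsd entry by the product of its key and pid: the name demo_tsd_hash() is hypothetical, and reducing the product to a bin with a plain modulo is an assumption made for illustration; the real implementation may mix the bits differently.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TSD_HASH_BINS 512u  /* default table size quoted above */

/* Map a (key, pid) pair to a bin using the product of the two values. */
static unsigned demo_tsd_hash(unsigned key, unsigned pid)
{
    /* The scheme relies on neither value being zero. */
    assert(key != 0 && pid != 0);

    /* Widen first so the product cannot wrap in 32 bits. */
    uint64_t product = (uint64_t)key * pid;

    /* Assumption: a plain modulo picks the bin. */
    return (unsigned)(product % DEMO_TSD_HASH_BINS);
}
```

The non-zero requirement matters because a zero key or pid would collapse every such entry into bin 0.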
* Clear cv->cv_mutex when not in use | Brian Behlendorf | 2010-11-29 | 1 | -10/+10
  For debugging purposes the condition variables keep track of the mutex used during a wait. The idea is to validate that all callers always use the same mutex. Unfortunately, we have seen cases where the caller reuses the condition variable with a different mutex but in a way which is known to be safe. My reading of the man pages suggests you should not do this and should instead always cv_destroy()/cv_init() before using a new mutex. However, there is overhead in doing this and it does appear to be allowed under Solaris.

  To accommodate this behavior cv_wait_common() and __cv_timedwait() have been modified to clear the associated mutex when the last waiter is dropped. This ensures that while the condition variable is in use the incorrect mutex case is detected. It also allows the condition variable to be safely recycled without requiring the overhead of a cv_destroy()/cv_init() as long as it isn't currently in use.

  Finally, the spin lock cv->cv_lock was removed because it is not required. When the condition variable is used properly the caller will always be holding the mutex, so the spin lock is redundant. The lock was originally added because I expected to need to protect more than just the cv->cv_mutex. It turns out that was not the case.

  Signed-off-by: Brian Behlendorf <[email protected]>
* Linux 2.6.36 compat, use fops->unlocked_ioctl() | Brian Behlendorf | 2010-11-10 | 1 | -5/+6
  As of linux-2.6.36 the last in-tree consumer of fops->ioctl() has been removed and thus fops->ioctl() has also been removed. The replacement hook is fops->unlocked_ioctl() which has existed in the kernel since 2.6.12. Since the SPL only contains support back to 2.6.18 vintage kernels, I'm not adding an autoconf check for this and am simply moving everything to use fops->unlocked_ioctl().
* Linux 2.6.36 compat, fs_struct->lock type change | Brian Behlendorf | 2010-11-09 | 1 | -10/+18
  In the linux-2.6.36 kernel the fs_struct lock was changed from a rwlock_t to a spinlock_t. If the kernel exported the set_fs_pwd() symbol by default this would not have caused us any issues, but it doesn't. So we're forced to add a new autoconf check which sets the HAVE_FS_STRUCT_SPINLOCK define when a spinlock_t is used. We can then correctly use either spin_lock or write_lock in our custom set_fs_pwd() implementation.
* Fix 2.6.35 shrinker callback API change | Brian Behlendorf | 2010-10-22 | 1 | -3/+18
  As of linux-2.6.35 the shrinker callback API now takes an additional argument. The shrinker struct is passed to the callback so that users can embed the shrinker structure in private data and use container_of() to access it. This removes the need to always use global state for the shrinker. To handle this we add the SPL_AC_3ARGS_SHRINKER_CALLBACK autoconf check to properly detect the API. Then we simply set up a callback function with the correct number of arguments. For now we do not make use of the new 3rd argument.
* Support custom build directories | Brian Behlendorf | 2010-09-05 | 1 | -19/+21
  One of the neat tricks an autoconf style project is capable of is to allow configuration/building in a directory other than the source directory. The major advantage to this is that you can build the project various different ways while making changes in a single source tree.

  For example, this project is designed to work on various different Linux distributions, each of which works slightly differently. This means that changes need to be verified on each of those supported distributions, preferably before the change is committed to the public git repo. Using nfs and custom build directories makes this much easier. I now have a single source tree in nfs mounted on several different systems, each running a supported distribution. When I make a change to the source base I suspect may break things, I can concurrently build from the same source on all the systems, each in their own subdirectory.

    wget -c http://github.com/downloads/behlendorf/spl/spl-x.y.z.tar.gz
    tar -xzf spl-x.y.z.tar.gz
    cd spl-x.y.z

    ------------------------- run concurrently ----------------------
    <ubuntu system>  <fedora system>  <debian system>  <rhel6 system>
    mkdir ubuntu     mkdir fedora     mkdir debian     mkdir rhel6
    cd ubuntu        cd fedora        cd debian        cd rhel6
    ../configure     ../configure     ../configure     ../configure
    make             make             make             make
    make check       make check       make check       make check

  This is something the project has almost supported for a long time but finishing this support should save me lots of time.
* Stub out kmem cache defrag API | Brian Behlendorf | 2010-08-27 | 1 | -0/+12
  At some point we are going to need to implement the kmem cache move callbacks to allow for kmem cache defragmentation. This commit simply lays a small part of the API groundwork; it does not actually implement any of this feature. This is safe for now because the move callbacks are just an optimization. Even if they are registered we don't ever really have to call them.
* Fix stack overflow in vn_rdwr() due to memory reclaim | Li Wei | 2010-08-12 | 1 | -0/+6
  Unless __GFP_IO and __GFP_FS are removed from the file mapping gfp mask we may enter memory reclaim during IO. In this case shrink_slab() entered another file system which is notoriously hungry for stack. This additional stack usage may cause a stack overflow. This patch removes __GFP_IO and __GFP_FS from the mapping gfp mask of each file during vn_open() to avoid any reclaim in the vn_rdwr() IO path. The original mask is then restored at vn_close() time. Hats off to the loop driver which does something similar for the same reason.

    [...]
    shrink_slab+0xdc/0x153
    try_to_free_pages+0x1da/0x2d7
    __alloc_pages+0x1d7/0x2da
    do_generic_mapping_read+0x2c9/0x36f
    file_read_actor+0x0/0x145
    __generic_file_aio_read+0x14f/0x19b
    generic_file_aio_read+0x34/0x39
    do_sync_read+0xc7/0x104
    vfs_read+0xcb/0x171
    :spl:vn_rdwr+0x2b8/0x402
    :zfs:vdev_file_io_start+0xad/0xe1
    [...]

  Signed-off-by: Brian Behlendorf <[email protected]>
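The save, clear, and restore sequence this commit describes can be modelled in userspace as follows. Everything here is a stand-in: the DEMO_GFP_* bit values, the demo_mapping/demo_file structures, and the demo_vn_open()/demo_vn_close() helpers are hypothetical and only mirror the order of operations, not the kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits; the kernel's real values differ by version. */
#define DEMO_GFP_IO 0x40u
#define DEMO_GFP_FS 0x80u

struct demo_mapping { unsigned gfp_mask; };

struct demo_file {
    struct demo_mapping *mapping;
    unsigned saved_gfp;              /* mask to restore on close */
};

/* On open: remember the mask, then forbid IO/FS reclaim for this file. */
static void demo_vn_open(struct demo_file *f)
{
    assert(f != NULL && f->mapping != NULL);
    f->saved_gfp = f->mapping->gfp_mask;
    f->mapping->gfp_mask = f->saved_gfp & ~(DEMO_GFP_IO | DEMO_GFP_FS);
}

/* On close: put the original mask back. */
static void demo_vn_close(struct demo_file *f)
{
    assert(f != NULL && f->mapping != NULL);
    f->mapping->gfp_mask = f->saved_gfp;
}
```

Between open and close, any allocation made on behalf of the mapping would see a mask without the two reclaim-enabling bits.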
* Fix taskq code to not drop tasks when TQ_SLEEP is used. | Ricardo M. Correia | 2010-08-02 | 1 | -23/+24
  When TQ_SLEEP is used, taskq_dispatch() should always succeed even if the number of pending tasks is above tq->tq_maxalloc. This semantic is similar to KM_SLEEP in kmem allocations, which also always succeed. However, we cannot block forever, otherwise there is a risk of deadlock. Therefore, we still allow the number of pending tasks to go above tq->tq_maxalloc with TQ_SLEEP, but we may sleep up to 1 second per task dispatch, thereby throttling the task dispatch rate. One of the existing splat tests was also augmented to test for this scenario. The test would fail with the previous implementation but now it succeeds.

  Signed-off-by: Brian Behlendorf <[email protected]>
* Strfree() should call kfree() not kmem_free() | Brian Behlendorf | 2010-07-30 | 1 | -1/+1
  Using kmem_free() results in deducting X bytes from the memory accounting when --enable-debug is set. Unfortunately, the counterpart kmem_asprintf() and friends currently do not account for the memory they allocate, so we must skip the accounting on free as well. If we don't, we end up with a negative number of lost bytes reported when the module is unloaded. A better long term fix would be to add the accounting to the allocation side but that's a project for another day.
* Ensure kmem_alloc() and vmem_alloc() never fail | Brian Behlendorf | 2010-07-26 | 1 | -105/+143
  The Solaris semantics for kmem_alloc() and vmem_alloc() are that they must never fail when called with KM_SLEEP. They may only fail if called with KM_NOSLEEP; otherwise they must block until memory is available. This is quite different from how the Linux memory allocators work: under Linux a memory allocation failure is always possible and must be dealt with.

  At one point in the past the kmem code did properly implement this behavior, however as the code evolved this behavior was overlooked in places. This patch goes through all three implementations of the kmem/vmem allocation functions and ensures that they will all block in the KM_SLEEP case when memory is not available. They may still fail in the KM_NOSLEEP case, in which case the caller is responsible for handling the failure.

  Special care is taken in vmalloc_nofail() to avoid thrashing the system on the virtual address space spin lock. The downside of course is that if you do see a failure here, which is unlikely for 64-bit systems, your allocation will delay for an entire second. Still this is preferable to locking up your system and it is the best we can do given the constraints.

  Additionally, the code was cleaned up to be much more readable and comments were added to describe the various kmem-debug-* configure options. The default configure options remain: "--enable-debug-kmem --disable-debug-kmem-tracking"
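The KM_SLEEP versus KM_NOSLEEP contract can be sketched in userspace as a retry loop. This is an illustration only: the DEMO_KM_* flag values, demo_try_alloc() with its simulated failure counter, and demo_backoff() are all hypothetical; the real code blocks or delays inside the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag values, for the sketch only. */
#define DEMO_KM_SLEEP   0x0
#define DEMO_KM_NOSLEEP 0x1

static int demo_fail_count;   /* simulated allocation failures remaining */

/* One allocation attempt which may fail, like a Linux allocator. */
static void *demo_try_alloc(size_t size)
{
    if (demo_fail_count > 0) {
        demo_fail_count--;
        return NULL;
    }
    return malloc(size);
}

/* Placeholder for sleeping before the next attempt. */
static void demo_backoff(void)
{
}

/* Retry until memory is available unless the caller asked not to sleep. */
static void *demo_kmem_alloc(size_t size, int flags)
{
    void *ptr;

    assert(size > 0);
    while ((ptr = demo_try_alloc(size)) == NULL) {
        if (flags & DEMO_KM_NOSLEEP)
            return NULL;      /* caller must handle the failure */
        demo_backoff();
    }
    return ptr;
}
```

Only the DEMO_KM_NOSLEEP path can return NULL, which matches the division of responsibility described above.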
* Fix two minor compiler warnings | Brian Behlendorf | 2010-07-26 | 1 | -1/+2
  In cmd/splat.c there was a comparison between an __u32 and an int. To resolve the issue simply use a __u32 and strtoul() when converting the provided user string. In module/spl/spl-vnode.c we should explicitly cast nd->last.name to a const char * which is what is expected by the prototype.
* Remove dead code caused by removal of format1 arg | Brian Behlendorf | 2010-07-21 | 1 | -14/+5
  Commit 55abb0929e4fbe326a9737650a167a1a988ad86b removed the never used format1 argument of spl_debug_msg(). That in turn resulted in some dead code which should be removed since it's now useless.
* Display DEBUG keyword during module load when --enable-debug is used. | Ricardo M. Correia | 2010-07-20 | 1 | -4/+6
  Signed-off-by: Ricardo M. Correia <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
* Fix buggy kmem_{v}asprintf() functions | Ricardo M. Correia | 2010-07-20 | 1 | -4/+4
  When the kvasprintf() call fails they should reset the arguments by calling va_start()/va_copy() and va_end() inside the loop, otherwise they'll try to read more arguments rather than starting over and reading them from the beginning.

  Signed-off-by: Ricardo M. Correia <[email protected]>
  Signed-off-by: Brian Behlendorf <[email protected]>
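A minimal userspace sketch of the corrected retry pattern: each attempt gets a fresh va_copy() of the caller's argument list, so a failed attempt cannot leave the list half consumed. The demo_* names and the simulated failure counter are hypothetical, and vsnprintf()/malloc() stand in for kvasprintf().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

static int demo_failures_left = 2;   /* simulated allocation failures */

/* One formatting attempt; it consumes the va_list it is handed. */
static char *demo_try_vasprintf(const char *fmt, va_list ap)
{
    va_list again;
    char *buf = NULL;

    va_copy(again, ap);
    int len = vsnprintf(NULL, 0, fmt, ap);    /* consumes ap */
    if (len >= 0 && demo_failures_left > 0) {
        demo_failures_left--;                 /* pretend malloc() failed */
    } else if (len >= 0) {
        buf = malloc((size_t)len + 1);
        if (buf != NULL)
            vsnprintf(buf, (size_t)len + 1, fmt, again);
    }
    va_end(again);
    return buf;
}

/* Retry until an attempt succeeds, restarting the arguments every time. */
static char *demo_vasprintf(const char *fmt, va_list ap)
{
    char *ptr;

    assert(fmt != NULL);
    do {
        va_list aq;
        va_copy(aq, ap);      /* fresh copy for this attempt */
        ptr = demo_try_vasprintf(fmt, aq);
        va_end(aq);
    } while (ptr == NULL);    /* a real version would bound or delay retries */
    return ptr;
}

static char *demo_asprintf(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    char *ptr = demo_vasprintf(fmt, ap);
    va_end(ap);
    return ptr;
}
```

Without the va_copy() inside the loop, the second attempt would continue reading where the failed attempt stopped.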
* Prefix all SPL debug macros with 'S' | Brian Behlendorf | 2010-07-20 | 13 | -424/+456
  To avoid conflicts with symbols defined by dependent packages all debugging symbols have been prefixed with an 'S' for SPL. Any dependent package needing to integrate with the SPL debug should include the spl-debug.h header and use the 'S' prefixed macros. They must also build with DEBUG defined.
* Split <sys/debug.h> header | Brian Behlendorf | 2010-07-20 | 13 | -68/+58
  To avoid symbol conflicts with dependent packages the debug header must be split into several parts. The <sys/debug.h> header now only contains the Solaris macros such as ASSERT and VERIFY. The spl-debug.h header contains the spl specific debugging infrastructure and should be included by any package which needs to use the spl logging. Finally the spl-trace.h header contains internal data structures only used for the log facility and should not be included by anything but spl-debug.c.

  This way dependent packages can include the standard Solaris headers without picking up any SPL debug macros. However, if the dependent package wants to integrate with the SPL debugging subsystem it can then explicitly include spl-debug.h.

  Along with this change I have dropped the CHECK_STACK macros because the upstream Linux kernel now has much better stack depth checking built in and we don't need this complexity. Additionally SBUG has been replaced with PANIC and provided as part of the Solaris macro set. While the Solaris version is really panic(), that conflicts with the Linux kernel so we'll just have to make do with PANIC. It should rarely be called directly; the preferred usage would be an ASSERT or VERIFY.

  There's lots of change here but this cleanup was overdue.
* Fix -Werror=format-security compiler option | Brian Behlendorf | 2010-07-14 | 1 | -1/+2
  Noticed under Ubuntu kernel builds: we should be passing a format specifier and the string, not just the string.
* Linux 2.6.35 compat: filp_fsync() dropped 'struct dentry *' | Brian Behlendorf | 2010-07-14 | 2 | -28/+7
  The prototype for filp_fsync() dropped the unused argument 'struct dentry *'. I've fixed this by adding the needed autoconf check and moving all of those filp related functions to file_compat.h. This will simplify handling any further API changes in the future.
* Add __divdi3(), remove __udivdi3() kernel dependency | Brian Behlendorf | 2010-07-13 | 1 | -23/+96
  Up until now no SPL consumer attempted to perform signed 64-bit division so there was no need to support this. That has now changed so I am adding 64-bit division support for 32-bit platforms. The signed implementation is based on the unsigned version.

  Since there have been several bug reports in the past concerning correct 64-bit division on 32-bit platforms I added some long overdue regression tests. Much to my surprise the unsigned 64-bit division regression tests failed. This was surprising because __udivdi3() was implemented by simply calling div64_u64() which is provided by the kernel. This meant that the Linux kernel's 64-bit division algorithm on 32-bit platforms was flawed. After some investigation this turned out to be exactly the case.

  Because of this I was forced to abandon the kernel helper and instead fully implement 64-bit division in the spl. There are several published implementations out there on how to do this properly and I settled on one proposed in the book Hacker's Delight. Their proposed algorithm is freely available without restriction and I have just modified it to be Linux kernel friendly.

  The updated implementation now passes all the unsigned and signed regression tests. This should be functional, but not fast, which is good enough for our purposes. If you want fast too I'd strongly suggest you upgrade to a 64-bit platform. I have also reported the kernel bug and we'll see if we can't get it fixed upstream.
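To illustrate how a signed divide can be layered on an unsigned one, here is a small userspace sketch. The demo_* names are hypothetical, and the unsigned helper uses plain shift-and-subtract long division rather than the Hacker's Delight algorithm the commit refers to; only the sign handling is the point.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned 64-bit divide by shift-and-subtract (not the Hacker's Delight version). */
static uint64_t demo_udiv64(uint64_t n, uint64_t d)
{
    uint64_t q = 0, r = 0;

    assert(d != 0);
    for (int i = 63; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1);   /* bring down the next bit */
        if (r >= d) {
            r -= d;
            q |= (uint64_t)1 << i;
        }
    }
    return q;
}

/* Signed divide: strip the signs, divide unsigned, then reapply the sign. */
static int64_t demo_div64(int64_t n, int64_t d)
{
    /* Negating in unsigned arithmetic avoids overflow on INT64_MIN. */
    uint64_t un = n < 0 ? 0 - (uint64_t)n : (uint64_t)n;
    uint64_t ud = d < 0 ? 0 - (uint64_t)d : (uint64_t)d;
    uint64_t uq = demo_udiv64(un, ud);

    /* The quotient is negative only when the operand signs differ. */
    return ((n < 0) != (d < 0)) ? (int64_t)(0 - uq) : (int64_t)uq;
}
```

The quotient truncates toward zero, as C's built-in signed division does.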
* Require gawk, the usermode helper fails with awk | Brian Behlendorf | 2010-07-01 | 1 | -1/+1
  For some reason when awk is invoked by the usermode helper the command always fails. Interestingly gawk does not suffer from this problem, which is why I never observed this failure: the distros I tested with all had gawk installed instead of awk. Anyway, the simplest thing to do here is to just make gawk mandatory. I've added a configure check for gawk specifically and have updated the command to call gawk, not awk.
* Add configure check for user_path_dir() | Brian Behlendorf | 2010-07-01 | 1 | -1/+17
  I didn't notice at the time but user_path_dir() was not introduced at the same time as the set_fs_pwd() change. I had lumped the two together but in fact user_path_dir() was introduced in 2.6.27 and set_fs_pwd() taking 2 args was introduced in 2.6.25. This means builds against 2.6.25-2.6.26 kernels were broken. To fix this I've added a check for user_path_dir() and no longer assume that if set_fs_pwd() takes 2 args then user_path_dir() is also available.
* Initialize the /dev/splatctl device buffer | Ned Bass | 2010-07-01 | 1 | -0/+1
  On open() initialize the buffer with the SPL version string. The user space splat utility expects to find the SPL version string when it opens and reads from /dev/splatctl.

  Signed-off-by: Brian Behlendorf <[email protected]>
* Implementation of the TQ_FRONT flag. | Ned Bass | 2010-07-01 | 1 | -21/+74
  Adds a task queue to receive tasks dispatched with TQ_FRONT. Worker threads pull tasks from this high priority queue before the default pending queue.

  Executing tasks out of FIFO order potentially breaks taskq_lowest_id() if we do not preserve the ordering of the work list by taskqid. Therefore, instead of always appending to the work list, we search for the appropriate place to insert a task. The common case is to append to the list, so we make this operation efficient by searching the work list in reverse order.

  Signed-off-by: Brian Behlendorf <[email protected]>
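The reverse-order search can be sketched with a small circular doubly linked list. The demo_task structure and helper names are hypothetical and only model a work list kept sorted by id; the real taskq code uses the kernel's list primitives.

```c
#include <assert.h>
#include <stddef.h>

struct demo_task {
    unsigned long id;                 /* plays the role of the taskqid */
    struct demo_task *prev, *next;
};

/* Circular list with a sentinel head, in the spirit of the kernel's list_head. */
static void demo_list_init(struct demo_task *head)
{
    head->prev = head->next = head;
}

/* Insert t so ids stay ascending, scanning backwards from the tail. */
static void demo_insert_sorted(struct demo_task *head, struct demo_task *t)
{
    struct demo_task *pos = head->prev;    /* start at the tail */

    assert(head != NULL && t != NULL && t != head);

    /* In the common case the tail already has a smaller id: zero steps. */
    while (pos != head && pos->id > t->id)
        pos = pos->prev;

    /* Link t immediately after pos. */
    t->prev = pos;
    t->next = pos->next;
    pos->next->prev = t;
    pos->next = t;
}
```

Appending a task with the largest id costs a single comparison, while an out-of-order task walks back only as far as needed.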
* Linux-2.6.33 compat, .ctl_name removed from struct ctl_table | Brian Behlendorf | 2010-06-30 | 1 | -38/+40
  As of linux-2.6.33 the ctl_name member of the ctl_table struct has been entirely removed. The upstream code has been updated to depend entirely on the procname member. To handle this all references to ctl_name are wrapped in a CTL_NAME macro which simply expands to nothing for newer kernels. Older kernels are supported by having it expand to .ctl_name = X just as before.
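A userspace sketch of such a wrapper macro follows. The demo_ctl_table structure, the DEMO_HAVE_CTL_NAME switch, and the "demo_entry" name are hypothetical; placing the trailing comma inside the macro is an assumption about how the initializer stays valid when the macro expands to nothing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Uncomment to model an older kernel that still has the member. */
/* #define DEMO_HAVE_CTL_NAME 1 */

struct demo_ctl_table {
#ifdef DEMO_HAVE_CTL_NAME
    int ctl_name;
#endif
    const char *procname;
    int mode;
};

#ifdef DEMO_HAVE_CTL_NAME
#define CTL_NAME(cname) .ctl_name = (cname),
#else
#define CTL_NAME(cname)               /* expands to nothing */
#endif

static const struct demo_ctl_table demo_table[] = {
    {
        CTL_NAME(1)
        .procname = "demo_entry",
        .mode = 0444,
    },
};

/* Look an entry up by procname, the only identifier newer kernels keep. */
static const struct demo_ctl_table *demo_find(const char *procname)
{
    assert(procname != NULL);
    for (size_t i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++)
        if (strcmp(demo_table[i].procname, procname) == 0)
            return &demo_table[i];
    return NULL;
}
```

The same table source compiles for both layouts, so the version check lives in one place instead of at every entry.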
* Add kmem_vasprintf function | Brian Behlendorf | 2010-06-24 | 1 | -4/+20
  We might as well have both asprintf() variants. This allows us to safely pass a va_list through several levels of the stack using va_copy() instead of va_start().
* Revert "Support TQ_FRONT flag used by taskq_dispatch()" | Brian Behlendorf | 2010-06-21 | 1 | -7/+1
  This reverts commit eb12b3782c94113d2d40d2da22265dc4111a672b.
* Update warnings in kmem debug code | Brian Behlendorf | 2010-06-16 | 1 | -36/+49
  This fix was long overdue. Most of the groundwork was laid long ago to include the exact function and line number in the error message when there was an issue with a memory allocation call. However, probably due to lack of time at the moment, that information never made it into the error message. This patch fixes that and tries to standardize the kmem debug messages as well.
* Support TQ_FRONT flag used by taskq_dispatch() | Brian Behlendorf | 2010-06-11 | 1 | -1/+7
  Allow taskq_dispatch() to insert work items at the head of the queue instead of just the tail by passing the TQ_FRONT flag.
* Add kmem_asprintf(), strfree(), strdup(), and minor cleanup. | Brian Behlendorf | 2010-06-11 | 1 | -0/+46
  This patch adds three missing Solaris functions: kmem_asprintf(), strfree(), and strdup(). They are all implemented as a thin layer which just calls their Linux counterparts. As part of this an autoconf check for kvasprintf was added because it does not appear in older kernels. If the kernel does not provide it then spl-generic implements it.

  Additionally the dead DEBUG_KMEM_UNIMPLEMENTED code was removed to clean things up and make kmem.h a little more readable.
* Cleanly split Linux proc.h (fs) from conflicting Solaris proc.h (process) | Brian Behlendorf | 2010-06-11 | 5 | -4/+10
  Under Linux the proc.h header is for the /proc filesystem, and under Solaris the proc.h header is for processes. This patch correctly moves the Linux proc functionality into a linux/proc_compat.h header and leaves sys/proc.h for use by Solaris. Minor updates were of course required at all the call sites where it was included.
* Stack overflow on 64-bit modulus operations on 32-bit architectures. | Alex Zhuravlev | 2010-06-03 | 1 | -3/+5
  Running 'zpool create' on a 32-bit machine with an SPL compiled with gcc 4.4.4 led to a stack overflow. This turned out to be due to some sort of 'optimization' by gcc:

    uint64_t __umoddi3(uint64_t dividend, uint64_t divisor)
    {
            return dividend - divisor * (dividend / divisor);
    }

  This code was supposed to be using __udivdi3 to implement /, but gcc instead implemented it via __umoddi3 itself, so the function called itself recursively until the stack overflowed.

  Signed-off-by: Brian Behlendorf <[email protected]>
* Minor 32-bit fix: cast to hrtime_t before the multiply. | Brian Behlendorf | 2010-05-23 | 1 | -1/+1
  It's important to cast to hrtime_t before doing the multiply because the ts.tv_sec type is only 32-bits and we need to promote it to 64-bits.
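The effect of the cast can be shown with a short userspace sketch. The demo_* names and the 32-bit timespec stand-in are hypothetical; the truncated variant uses unsigned 32-bit arithmetic purely to model the lost high bits in a well-defined way.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t demo_hrtime_t;                 /* stand-in for hrtime_t */
#define DEMO_NSEC_PER_SEC INT32_C(1000000000)

/* Models a timespec on a platform where tv_sec is only 32 bits wide. */
struct demo_timespec32 {
    int32_t tv_sec;
    int32_t tv_nsec;
};

/* Correct: promote tv_sec to 64 bits before multiplying. */
static demo_hrtime_t demo_gethrtime(struct demo_timespec32 ts)
{
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < DEMO_NSEC_PER_SEC);
    return (demo_hrtime_t)ts.tv_sec * DEMO_NSEC_PER_SEC + ts.tv_nsec;
}

/* Model of the bug: the multiply happens in 32 bits and loses the high bits. */
static demo_hrtime_t demo_gethrtime_truncated(struct demo_timespec32 ts)
{
    uint32_t low = (uint32_t)ts.tv_sec * (uint32_t)DEMO_NSEC_PER_SEC;
    return (demo_hrtime_t)low + ts.tv_nsec;
}
```

Any tv_sec of 5 or more already exceeds what a 32-bit product of nanoseconds can hold, so the two functions disagree almost immediately.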
* Use KM_NODEBUG macro in preference to __GFP_NOWARN. | Brian Behlendorf | 2010-05-20 | 1 | -5/+5
* Disable spl_debug_panic_on_bug by default. | Brian Behlendorf | 2010-05-20 | 1 | -1/+1
  While I may prefer to have the system panic on an SBUG and to get a crash dump for analysis, I suspect most people's systems are not configured for crash dumps, and the best thing to do is to simply halt the thread and print an error to the console. This way they have a good chance of actually saving the stack trace and debug log.