path: root/module/splat
* Support post-3.13 kthread_create() semantics. (Tim Chase, 2014-04-08; 2 files, -6/+6)
  Provide spl_kthread_create() as a wrapper around the kernel's kthread_create() to preserve pre-3.13 semantics: retry if the call is interrupted or would have returned -ENOMEM, otherwise return NULL.
  Signed-off-by: Chunwei Chen <[email protected]> Signed-off-by: Tim Chase <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #339
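  For illustration, a minimal sketch of such a retrying wrapper, assuming this general shape (the actual SPL implementation may differ in detail):

      #include <linux/kthread.h>
      #include <linux/sched.h>
      #include <linux/err.h>

      /* Sketch only: retry kthread_create() if it was interrupted by a
       * signal or failed with -ENOMEM, otherwise give up and return NULL. */
      static struct task_struct *
      spl_kthread_create_sketch(int (*func)(void *), void *data, const char *name)
      {
          struct task_struct *tsk;

          while (1) {
              tsk = kthread_create(func, data, "%s", name);
              if (!IS_ERR(tsk))
                  return tsk;                    /* success */

              if (signal_pending(current)) {
                  clear_thread_flag(TIF_SIGPENDING);
                  continue;                      /* interrupted, retry */
              }
              if (PTR_ERR(tsk) == -ENOMEM)
                  continue;                      /* low memory, retry */

              return NULL;                       /* any other error */
          }
      }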
* splat cred:groupmember: Fix false positives (Brian Behlendorf, 2014-04-08; 1 file, -23/+71)
  Due to certain assumptions made in the cred:groupmember test it could produce false positives when run on specific distributions. This was solely a bug in the test case and not in the groupmember() function which the test case was validating. To prevent future false positives the test case has been rewritten to be both more rigorous and to make fewer assumptions about the system. Minor style cleanup was done to the cr_groups_search() and groupmember() functions.
  Signed-off-by: Brian Behlendorf <[email protected]>
* splat kmem:slab_reclaim: Test cleanup (Brian Behlendorf, 2014-04-08; 1 file, -4/+5)
  By setting __GFP_NORETRY the kernel memory reclaim logic was allowed to abort early and dump a failed allocation stack to the console. Since this was done in a tight loop to fill memory it could result in a large number of stacks being dumped to the console. This in turn slowed down the test sufficiently so it exceeded the time limit and failed. To resolve this issue the __GFP_NORETRY flag is being removed. This is how it should have been called originally to ensure we're simulating the behavior of most callers, which will use the GFP_KERNEL flag.
  In addition, the reclaim granularity of 1000 objects was far too coarse for this to be a realistic test. For kmem:slab_reclaim there might only be a few thousand objects total in the cache. Therefore, the SPLAT_KMEM_OBJ_RECLAIM constant for these tests was lowered. This will cause the reclaim callback to run more frequently, which makes for a better test case. The frequency of the cache reaping in kmem:slab_reap was increased to accommodate the reduced number of objects released during the reclaim.
  These changes only impact the test cases and were done to remove false positives caused by the test case itself.
  Signed-off-by: Brian Behlendorf <[email protected]>
* Add module versioning (Brian Behlendorf, 2013-12-06; 1 file, -0/+1)
  Use the standard Linux MODULE_VERSION macro to expose the installed spl and splat module versions. This will also automatically add a checksum of the .c files and headers in "srcversion". See:
    /sys/module/spl/version
    /sys/module/spl/srcversion
    /sys/module/splat/version
    /sys/module/splat/srcversion
  Signed-off-by: Brian Behlendorf <[email protected]> Closes zfsonlinux/zfs#1923 Signed-off-by: Brian Behlendorf <[email protected]>
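  For reference, exposing a version is a one-line addition per module; the version string below is purely illustrative:

      #include <linux/module.h>

      /* Illustrative only: the real modules pass the package release string. */
      MODULE_VERSION("0.6.2");

  After loading, the string can be read back with: cat /sys/module/splat/version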
* Replace current_kernel_time() with getnstimeofday() (Richard Yao, 2013-10-09; 1 file, -5/+6)
  current_kernel_time() is used by SPLAT, but it is not meant for performance measurement. We modify SPLAT to use getnstimeofday(), which is equivalent to the gethrestime() function on Solaris. Additionally, we update gethrestime() to invoke getnstimeofday().
  Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #279
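  A sketch of the updated gethrestime(), assuming timestruc_t is a typedef of struct timespec as it is in the SPL headers:

      #include <linux/time.h>
      #include <sys/time.h>          /* SPL header providing timestruc_t */

      /* Wall-clock time with nanosecond resolution, via getnstimeofday(). */
      void
      gethrestime(timestruc_t *ts)
      {
          getnstimeofday(ts);
      }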
* Fix KMC_OFFSLAB type caches (Brian Behlendorf, 2013-07-30; 1 file, -2/+43)
  Because spl_slab_size() was always returning -ENOSPC for caches of type KMC_OFFSLAB, the cache could never be created. Additionally the slab size is rounded up to a page, which is what kv_alloc() expects. The kv_alloc() code will minimally allocate a page; in the KMC_OFFSLAB case this could be reduced. The basic regression tests kmem:slab_small, kmem:slab_large, and kmem:slab_align were updated to test KMC_OFFSLAB.
  Signed-off-by: Brian Behlendorf <[email protected]> Signed-off-by: Ying Zhu <[email protected]> Closes #266
* Linux 3.10 compat: add missing include of linux/slab.h (Yuxuan Shui, 2013-07-08; 3 files, -0/+3)
  Linux kernel commit torvalds/linux@0d01ff2 changes some includes we were depending on through linux/proc_fs.h.
  Signed-off-by: Yuxuan Shui <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #257
* Refresh links to web site (Ned Bass, 2013-03-04; 18 files, -18/+18)
  Update links to refer to the official ZFS on Linux website instead of @behlendorf's personal fork on github.
  Signed-off-by: Brian Behlendorf <[email protected]>
* Add spl_kmem_cache_expire module option (Brian Behlendorf, 2013-01-28; 1 file, -0/+14)
  Cache aging was implemented because it was part of the default Solaris kmem_cache behavior. The idea is that per-cpu objects which haven't been accessed in several seconds should be returned to the cache. On the other hand Linux slabs never move objects back to the slabs unless there is memory pressure on the system.
  This behavior is now configurable through the 'spl_kmem_cache_expire' module option. The value is a bit mask with the following meaning:
    0x1 - Solaris style cache aging eviction is enabled.
    0x2 - Linux style low memory eviction is enabled.
  Both methods may be safely enabled simultaneously, but by default both are disabled. It has never been clear if the kmem cache aging (which has been around from day one) actually does any good. It has however been the source of numerous bugs so I wouldn't mind retiring it entirely.
  Signed-off-by: Brian Behlendorf <[email protected]> Closes zfsonlinux/zfs#1227 Closes #210
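  An illustrative sketch of how such a bit-mask module option is declared; the flag names below mirror the meanings listed above and are assumptions, not necessarily the SPL's exact identifiers:

      #include <linux/module.h>

      #define KMC_EXPIRE_AGE  0x1     /* Solaris style cache aging eviction */
      #define KMC_EXPIRE_MEM  0x2     /* Linux style low memory eviction */

      static unsigned int spl_kmem_cache_expire = 0;   /* both disabled by default */
      module_param(spl_kmem_cache_expire, uint, 0644);
      MODULE_PARM_DESC(spl_kmem_cache_expire,
          "By age (0x1) or low memory (0x2), zero to disable (default)");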
* call_usermodehelper() should wait for process (Ned Bass, 2013-01-09; 2 files, -2/+2)
  As of Linux 3.4 the UMH_WAIT_* constants were renumbered. In particular, the meaning of "1" changed from UMH_WAIT_PROC (wait for process to complete) to UMH_WAIT_EXEC (wait for the exec, but not the process). A number of call sites used the number 1 instead of the constant name, so the behavior was not as expected on kernels with this change.
  Signed-off-by: Brian Behlendorf <[email protected]>
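  A minimal sketch of the corrected call, using the named constant so the intent survives any future renumbering (the helper function name is hypothetical):

      #include <linux/kmod.h>

      /* Run a user-space helper and wait for the process to complete. */
      static int
      run_helper_and_wait(char *path, char **argv, char **envp)
      {
          return call_usermodehelper(path, argv, envp, UMH_WAIT_PROC);
      }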
* splat taskq:front: Reduce stack frame (Brian Behlendorf, 2012-12-12; 1 file, -1/+6)
  The slightly increased size of the taskq_ent_t when debugging is enabled has pushed the taskq:front splat test over the frame size limit. To resolve this, dynamically allocate the taskq_ent_t structures so they are part of the heap instead of the stack.
    In function 'splat_taskq_test6_impl'
    error: the frame size of 1648 bytes is larger than 1024 bytes
  Signed-off-by: Brian Behlendorf <[email protected]>
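  The shape of the fix, sketched with an illustrative entry count (the taskq:order change below uses the same pattern):

      #include <sys/taskq.h>
      #include <sys/kmem.h>
      #include <linux/errno.h>

      #define NR_TQES 8   /* illustrative count, not the test's actual value */

      static int
      run_front_test(void)
      {
          taskq_ent_t *tqes;

          /* Allocate the entries from the heap rather than declaring
           * "taskq_ent_t tqes[NR_TQES]" on the limited kernel stack. */
          tqes = kmem_alloc(sizeof (taskq_ent_t) * NR_TQES, KM_SLEEP);
          if (tqes == NULL)
              return (-ENOMEM);

          /* ... dispatch work items using &tqes[i] ... */

          kmem_free(tqes, sizeof (taskq_ent_t) * NR_TQES);
          return (0);
      }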
* splat taskq:order: Reduce stack frame (Brian Behlendorf, 2012-12-12; 1 file, -1/+6)
  The slightly increased size of the taskq_ent_t when debugging is enabled has pushed the taskq:order splat test over the frame size limit. To resolve this, dynamically allocate the taskq_ent_t structures so they are part of the heap instead of the stack.
    In function 'splat_taskq_test5_impl'
    error: the frame size of 1680 bytes is larger than 1024 bytes
  Signed-off-by: Brian Behlendorf <[email protected]>
* splat taskq:cancel: Add test case (Brian Behlendorf, 2012-12-12; 1 file, -0/+183)
  Add a test case for taskq_cancel_id() to verify it is working properly. Just like taskq:delay we start by dispatching 100 tasks. However, this time 1/3 of the tasks use taskq_dispatch() and will be run immediately, and 2/3 use taskq_dispatch_delay(). The idea is to create a busy taskq with active, pending, and delayed tasks.
  After all the items have been successfully dispatched the test begins randomly canceling known task ids. It will do this for 5 seconds, randomly canceling a task id and then sleeping for a few milliseconds. The task being canceled may have already run, may still be on the pending list, or may currently be executing on a worker thread. The idea is to ensure we catch any subtle race conditions.
  Once all the non-canceled tasks have completed we cross check the number of tasks which ran with the number of tasks which were successfully canceled. Additionally, we verify that the taskq_cancel_id() function never blocks longer than needed. This time is bounded by the longest run time of the task which was dispatched.
  Signed-off-by: Brian Behlendorf <[email protected]>
* splat taskq:delay: Add test case (Brian Behlendorf, 2012-12-12; 1 file, -11/+113)
  Add a test case for taskq_dispatch_delay() to verify it is working properly. The test dispatches 100 tasks to a taskq with random expiration times spread over 5 seconds. As each task expires and gets executed by a worker thread it verifies that it was run at the correct time. Once all the delayed tasks have been executed we double check that all the dispatched tasks were successful.
  Signed-off-by: Brian Behlendorf <[email protected]>
* taskq delay/cancel functionality (Brian Behlendorf, 2012-12-12; 1 file, -9/+9)
  Add the ability to dispatch a delayed task to a taskq. The desired behavior is for the task to be queued but not executed by a worker thread until the expiration time is reached. To achieve this two new functions were added.
    * taskq_dispatch_delay() - This function behaves exactly like taskq_dispatch(), however it takes a third 'expire_time' argument. The caller should pass the desired time the task should be executed as an absolute value in jiffies. The task is guaranteed not to run before this time; it may run slightly later if all the worker threads are busy.
    * taskq_cancel_id() - Given a task id, attempt to cancel the task before it gets executed. This is primarily useful for canceling delayed tasks but can be used for canceling any previously dispatched task. There are three possible return values.
        0      - The task was found and canceled before it was executed.
        ENOENT - The task was not found; either it already ran or an invalid task id was supplied by the caller.
        EBUSY  - The task is currently executing and may not be canceled. This function will block until the task has been completed.
    * taskq_wait_all() - The taskq_wait_id() function was renamed taskq_wait_all() to more clearly reflect its actual behavior. It is currently only used by the splat taskq regression tests.
    * taskq_wait_id() - Historically, the only difference between this function and taskq_wait() was that you passed the task id. In both functions you would block until ALL lower task ids had executed. This was semantically correct but could be very slow, particularly if there were delayed tasks submitted. To better accommodate the delayed tasks this function was reimplemented. It will now only block until the passed task id has been completed.
  This is actually a fairly low risk change for a few reasons.
    * Only new ZFS callers will make use of the new interfaces and very little common code was changed to support the new functions.
    * The existing taskq_wait() implementation was not changed, just slightly refactored.
    * The newly optimized taskq_wait_id() implementation was never used by ZFS, so we can't accidentally introduce a new bug there.
  NOTE: This functionality does not exist in the Illumos taskqs.
  Signed-off-by: Brian Behlendorf <[email protected]>
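  A hedged usage sketch of the two new interfaces described above; the exact prototypes may differ slightly from the SPL headers:

      #include <sys/taskq.h>
      #include <linux/jiffies.h>

      static void
      my_task(void *arg)
      {
          /* work item body */
      }

      static void
      delay_then_cancel(taskq_t *tq, void *arg)
      {
          taskqid_t id;

          /* Queue my_task() to run no earlier than two seconds from now
           * (the expire time is absolute, expressed in jiffies). */
          id = taskq_dispatch_delay(tq, my_task, arg, TQ_SLEEP, jiffies + 2 * HZ);

          /* Try to cancel it: 0 if canceled before running, ENOENT if it
           * already ran or the id is stale, EBUSY (after blocking) if it
           * was executing when the cancel was attempted. */
          if (id != 0)
              (void) taskq_cancel_id(tq, id);
      }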
* splat linux:shrinker: Fix fail-safe (Steven Johnson, 2012-12-12; 1 file, -4/+21)
  Ensure the fail-safe is reset between successive tests.
  Signed-off-by: Brian Behlendorf <[email protected]>
* splat linux:shrinker: Fix race condition (Steven Johnson, 2012-12-12; 1 file, -1/+27)
  Ensure the test thread blocks until the shrinker has completed its work. This is done by putting the test thread to sleep and waking it each time the shrinker callback runs. Once the shrinker size drops to zero or we time out, the test is allowed to proceed.
  Signed-off-by: Brian Behlendorf <[email protected]> Closes #96 Closes #125 Closes #182
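  A sketch of the synchronization described above; the names and counts are illustrative, not the test's actual symbols:

      #include <linux/wait.h>
      #include <linux/atomic.h>
      #include <linux/jiffies.h>
      #include <linux/errno.h>

      static DECLARE_WAIT_QUEUE_HEAD(shrink_wq);
      static atomic_t cache_objects = ATOMIC_INIT(1024);

      /* Called from the shrinker callback each time it frees objects. */
      static void
      shrinker_progress(int nr_freed)
      {
          atomic_sub(nr_freed, &cache_objects);
          wake_up(&shrink_wq);
      }

      /* Test thread: block until the cache is empty or ten seconds pass. */
      static int
      wait_for_shrinker(void)
      {
          if (wait_event_timeout(shrink_wq,
              atomic_read(&cache_objects) == 0, 10 * HZ) == 0)
              return (-ETIMEDOUT);
          return (0);
      }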
* splat taskq:front: Fix race (Steven Johnson, 2012-12-05; 1 file, -30/+32)
  The taskq:front test has a race condition where tasks 4 and 8 race to complete, due to an incorrectly calculated set of delay "factors" (T). If task 4 wins and actually finishes first, the verification of the order of completion will fail.
  The delays calculated to order task completion do not take into account the terminal line in the table, and so are all off by a factor of 1. This causes all the tasks in all queues to finish sooner than expected and the accumulated error is the root cause of tasks 4 and 8 racing to complete first. Before the change the "actual" table looks like I commented in #130.
  I changed:
    * the table in the comment to correctly reflect the test and the factor timings needed.
    * the individual task delay factors of T so that ONLY 1 task will expire every 2T (on average).
    * 1T was reduced from 100ms to 50ms. This halves the duration of the test and makes any remaining raciness more likely to cause failures, but it did not cause the test to fail.
    * simplified the delay factor logic by using a table look-up instead of a switch.
    * Added a "task started" message so that with -v it is possible to see the order tasks are started.
    * Moved the "task completed" message inside the spinlock so that with -v the message truly reflects the absolute order of completion as guaranteed by the spinlock.
  Signed-off-by: Brian Behlendorf <[email protected]> Closes #130
* Linux compat 3.7, kernel_thread() (Brian Behlendorf, 2012-12-03; 2 files, -66/+61)
  The preferred kernel interface for creating threads has been kthread_create() for a long time now. However, several of the SPLAT tests still use the legacy kernel_thread() function which has finally been dropped (mostly). Update the condvar and rwlock SPLAT tests to use the modern interface. Frankly this is something we should have done a long time ago.
  Signed-off-by: Brian Behlendorf <[email protected]> Closes #194
* splat kmem:slab_overcommit: Disabled (Brian Behlendorf, 2012-11-06; 1 file, -8/+8)
  Disable this test because it may result in an OOM event on the system which can result in the test infrastructure being killed.
  Signed-off-by: Brian Behlendorf <[email protected]>
* splat atomic:64-bit: Create thread outside spin lock (Brian Behlendorf, 2012-11-06; 1 file, -7/+7)
  The Fedora 3.6 debug kernel identified the following issue where we create a thread under a spin lock. This isn't safe because thread creation can sleep, and sleeping under a spin lock could result in a deadlock. Therefore the lock is changed to a mutex so it's safe to sleep.
    BUG: sleeping function called from invalid context at mm/slub.c:930
    in_atomic(): 1, irqs_disabled(): 0, pid: 10583, name: splat
    1 lock held by splat/10583:
  Signed-off-by: Brian Behlendorf <[email protected]>
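  The shape of the fix, sketched with illustrative names; kthread_create() may sleep, so the shared test state is now protected by a mutex:

      #include <linux/mutex.h>
      #include <linux/kthread.h>
      #include <linux/err.h>

      static DEFINE_MUTEX(ap_lock);

      static int
      start_atomic_worker(int (*fn)(void *), void *data, int id)
      {
          struct task_struct *tsk;

          mutex_lock(&ap_lock);           /* may sleep, which is now fine */
          tsk = kthread_create(fn, data, "atomic-%d", id);
          if (!IS_ERR(tsk))
              wake_up_process(tsk);
          mutex_unlock(&ap_lock);

          return IS_ERR(tsk) ? (int)PTR_ERR(tsk) : 0;
      }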
* splat: Fix log buffer locking (Brian Behlendorf, 2012-11-06; 2 files, -14/+15)
  The Fedora 3.6 debug kernel identified the following issue where we call copy_to_user() under a spin lock. This used to be safe in older kernels but no longer appears to be true, so the spin lock was changed to a mutex. None of this code is performance critical so allowing the process to sleep is harmless.
  Signed-off-by: Brian Behlendorf <[email protected]>
* splat: Cleanup headers (Brian Behlendorf, 2012-11-06; 18 files, -42/+35)
  Restructure the SPLAT headers such that each test only includes the minimal set of headers it requires.
  Signed-off-by: Brian Behlendorf <[email protected]>
* Linux 3.7 compat, __clear_close_on_exec() removed (Brian Behlendorf, 2012-10-18; 1 file, -136/+0)
  Commit torvalds/linux@b8318b0 moved the __clear_close_on_exec() function out of include/linux/fdtable.h and into fs/file.c, making it unavailable to the SPL. Now as it turns out we only used this function to tear down some test infrastructure for the vn_getf()/vn_releasef() SPLAT regression tests. Rather than implement even more autoconf compatibility code to handle this we just remove the test case. This also allows us to drop three existing autoconf tests.
  This does mean the SPLAT tests will no longer verify these functions, but historically they have never been a problem. And if we feel we absolutely need this test coverage I'm sure a more portable version of the test case could be added.
  Signed-off-by: Brian Behlendorf <[email protected]> Closes #183
* Enhance SPLAT kmem:slab_overcommit test (Brian Behlendorf, 2012-08-30; 1 file, -193/+208)
  After the emergency slab objects were merged I started observing timeout failures in the kmem:slab_overcommit test. These were due to the inefficient way the slab_overcommit reclaim function was implemented, and due to the additional cost of potentially allocating tens of thousands of emergency objects and tracking them on a single list.
  This patch addresses the first concern by enhancing the test case to track all of the allocated objects on a linked list. This allows for a cleaner version of the reclaim function to simply release SPLAT_KMEM_OBJ_RECLAIM objects. Since this touches some common code, all the tests which share these data structures were also updated. After making these changes slab_overcommit is reliably passing. However, there is certainly additional cleanup which could be done here.
  Signed-off-by: Brian Behlendorf <[email protected]>
* Remove Makefile from non-toplevel .gitignore files (Richard Yao, 2012-08-23; 1 file, -1/+0)
  When building SPL support into the kernel, ./copy-builtin will copy non-toplevel .gitignore files. These files list /Makefile, which causes git-archive to omit ./module/{spl,splat}/Makefile. The absence of these files results in build failures when SPL is selected. ZFS is unaffected because it puts Makefile in the toplevel .gitignore, which is not copied. We fix SPL by emulating that behavior.
  Reported-by: Fabio Erculiani <[email protected]> Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #152
* Add script for builtin module building. (Etienne Dechamps, 2012-07-26; 1 file, -4/+2)
  This commit introduces a "copy-builtin" script designed to prepare a kernel source tree for building SPL as a builtin module. The script makes a full copy of all needed files, thus making the kernel source tree fully independent of the spl source package.
  To achieve that, some compilation flags (-include, -I) have been moved to module/Makefile. This Makefile is only used when compiling external modules; when compiling builtin modules, a Kbuild file generated by the configure-builtin script is used instead. This makes sure Makefiles inside the kernel source tree do not contain references to the spl source package.
  Signed-off-by: Brian Behlendorf <[email protected]> Issue zfsonlinux/zfs#851
* Use MODULE variable in module Makefile like zfs. (Etienne Dechamps, 2012-07-26; 1 file, -19/+19)
  In zfs, each module Makefile contains a MODULE variable which contains the name of the module, and the following declarations reference this variable. In spl, there is a MODULES variable which is never used. Rename it to MODULE and use it like in zfs. This improves consistency between the two build systems.
  Signed-off-by: Prakash Surya <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue zfsonlinux/zfs#851
* Linux 3.4 compat, __clear_close_on_exec replaces FD_CLR (Richard Yao, 2012-06-13; 1 file, -1/+1)
  torvalds/linux@1dce27c5aa6770e9d195f2bb7db1db3d4dde5591 introduced __clear_close_on_exec() as a replacement for FD_CLR. Further commits appear to have removed FD_CLR from the Linux source tree. This causes the following failure:
    error: implicit declaration of function '__FD_CLR' [-Werror=implicit-function-declaration]
  To correct this we update the code to use the current __clear_close_on_exec() interface for readability. Then we introduce an autotools check to determine if __clear_close_on_exec() is available. If it isn't, then we define some compatibility logic which uses the older FD_CLR() interface.
  Signed-off-by: Richard Yao <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #124
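  The compatibility logic presumably reduces to a shim along these lines; the HAVE___CLEAR_CLOSE_ON_EXEC guard is a hypothetical autoconf result, not necessarily the macro the SPL defines:

      #include <linux/fdtable.h>

      #ifndef HAVE___CLEAR_CLOSE_ON_EXEC
      /* Older kernels: close_on_exec is an fd_set, cleared with FD_CLR(). */
      #define __clear_close_on_exec(fd, fdt)  FD_CLR(fd, (fdt)->close_on_exec)
      #endif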
* Fix uninit variable in slab reclaim test (Brian Behlendorf, 2012-06-13; 1 file, -1/+1)
  Gcc version 4.7.0 reports the delta.tv_sec in the slab reclaim test as potentially uninitialized. In practice this will never occur, but to keep gcc happy we initialize the variable to zero.
  Signed-off-by: Brian Behlendorf <behlendo@fedora-17-amd64.(none)>
* Add SPLAT test to exercise slab direct reclaim (Prakash Surya, 2012-05-07; 1 file, -30/+165)
  This test is designed to verify that direct reclaim is functioning as expected. We allocate a large number of objects, thus creating a large number of slabs. We then apply memory pressure and expect that the direct reclaim path can easily recover those slabs. The registered reclaim function will free the objects and the slab shrinker will call it repeatedly until at least a single slab can be freed.
  Note it may not be possible to reclaim every last slab via direct reclaim without a failure because the shrinker_rwsem may be contended. For this reason, quickly reclaiming 3/4 of the slabs is considered a success. This should all be possible within 10 seconds. For reference, on a system with 2G of memory this test takes roughly 0.2 seconds to run. It may take longer on larger memory systems but should still easily complete in the allotted 10 seconds.
  Signed-off-by: Prakash Surya <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #107
* Remove condition variable names (Brian Behlendorf, 2012-04-06; 1 file, -5/+5)
  Long ago I added support to the spl for condition variable names because I thought they might be needed. It turns out they aren't. In fact the official Solaris cv_init(9F) man page discourages their use in the kernel.
    cv_init(9F) Parameters
      name - Descriptive string. This is obsolete and should be NULL. (Non-NULL strings are legal, but they're a waste of kernel memory.)
  Therefore, I'm removing them from the spl to reclaim this memory and adding an ASSERT() to ensure no new consumers are added which make use of the name.
  Signed-off-by: Brian Behlendorf <[email protected]>
* Add SPL_META_RELEASE to module load/unload messages (Brian Behlendorf, 2012-03-23; 1 file, -4/+4)
  Include the SPL_META_RELEASE in the module load/unload messages to more clearly indicate exactly what version of the SPL has been loaded.
  Signed-off-by: Brian Behlendorf <[email protected]>
* Add taskq contention splat test (Ned Bass, 2012-01-18; 1 file, -0/+115)
  Add a test designed to generate contention on the taskq spinlock by using a large number of threads (100) to perform a large number (131072) of trivial work items from a single queue. This simulates conditions that may occur with the zio free taskq when a 1TB file is removed from a ZFS filesystem, for example. This test should always pass. Its purpose is to provide a benchmark to easily measure the effectiveness of taskq optimizations using statistics from the kernel lock profiler.
  Signed-off-by: Brian Behlendorf <[email protected]> Issue #32
* Exercise new taskq interface in splat-taskq tests (Prakash Surya, 2011-12-13; 1 file, -49/+280)
  The splat-taskq test functions were slightly modified to exercise the new taskq interface in addition to the old interface. If the old interface passes each of its tests, the new interface is exercised. Both sub tests (old interface and new interface) must pass for each test as a whole to pass.
  Signed-off-by: Prakash Surya <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #65
* Add Test: "Single task queue, recursive dispatch"Prakash Surya2011-12-131-0/+77
| | | | | | | | | | | | | Added another splat taskq test to ensure tasks can be recursively submitted to a single task queue without issue. When the taskq_dispatch_prealloc() interface is introduced, this use case can potentially cause a deadlock if a taskq_ent_t is dispatched while its tqent_list field is not empty. This _should_ never be a problem with the existing taskq_dispatch() interface. Signed-off-by: Prakash Surya <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Issue #65
* Add SPLAT_TEST_FINI call for SPLAT_TASKQ_TEST6_ID (Prakash Surya, 2011-12-13; 1 file, -0/+1)
  This change adds the neglected SPLAT_TEST_FINI call for the SPLAT_TASKQ_TEST6_ID, just as is done for the other 5 SPLAT_TASKQ_* tests.
  Signed-off-by: Prakash Surya <[email protected]> Signed-off-by: Brian Behlendorf <[email protected]> Closes #64
* Fix a typo referencing an incorrect symbol (Prakash Surya, 2011-11-21; 1 file, -1/+1)
  The splat_taskq_test4_common function was incorrectly referencing the splat_taskq-test13_func symbol, when it meant to be using the splat_taskq_test4_func symbol.
  Signed-off-by: Brian Behlendorf <[email protected]> Closes #61
* Add linux compatibility tests (Brian Behlendorf, 2011-06-21; 4 files, -0/+247)
  While the splat tests were originally designed to stress test the Solaris primitives, I am extending them to include some kernel compatibility tests. Certain linux APIs have changed frequently. These tests ensure that the added compatibility is working properly and no unnoticed regressions have slipped in.
  Tests 1 and 2 add basic regression tests for shrink_icache_memory and shrink_dcache_memory. These are simply functional tests to ensure we can call these functions safely. Checking for correct behavior is more difficult since other running processes will influence the behavior. However, these functions are provided by the kernel, so if we can successfully call them we assume they are working correctly.
  Test 3 checks that shrinker functions are being registered and called correctly. As of Linux 3.0 the shrinker API has changed four different times, so I felt the need to add a trivial test case to ensure each variant works as expected.
* Make the SPL kernel messages consistent with ZFS. (Darik Horn, 2011-04-21; 1 file, -4/+4)
  Change the SPL kernel messages for module loading and module unloading so that they are similar to the ZFS kernel messages.
  Signed-off-by: Brian Behlendorf <[email protected]>
* Add zlib regression test (Brian Behlendorf, 2011-02-25; 4 files, -0/+169)
  A zlib regression test has been added to verify the correct behavior of z_compress_level() and z_uncompress(). The test case simply takes a 128k buffer, compresses it, then uncompresses it, and finally compares the buffers after the transform. If the buffers match then everything is fine and no data was lost. It performs this test for all 9 zlib compression levels.
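  A hedged sketch of the round trip the test performs at each of the nine compression levels; buffer sizes and the helper name are illustrative:

      #include <sys/zmod.h>
      #include <sys/kmem.h>
      #include <linux/string.h>

      /* Returns 0 if compress + uncompress reproduces the source exactly. */
      static int
      zlib_round_trip(const void *src, size_t src_len, int level)
      {
          size_t dst_len = src_len * 2, out_len = src_len;
          void *dst = kmem_alloc(src_len * 2, KM_SLEEP);
          void *out = kmem_alloc(src_len, KM_SLEEP);
          int rc = -1;

          if (z_compress_level(dst, &dst_len, src, src_len, level) == 0 &&
              z_uncompress(out, &out_len, dst, dst_len) == 0 &&
              out_len == src_len && memcmp(src, out, src_len) == 0)
              rc = 0;   /* no data was lost in the transform */

          kmem_free(out, src_len);
          kmem_free(dst, src_len * 2);
          return (rc);
      }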
* Remove VN_HOLD/VN_RELE/VOP_PUTPAGE (Brian Behlendorf, 2011-01-12; 1 file, -7/+0)
  Previously these were defined to noops, but rather than give the misleading impression that these are actually implemented I'm removing the type entirely for clarity.
* Add Thread Specific Data (TSD) Regression Test (Brian Behlendorf, 2010-12-07; 1 file, -1/+184)
  To validate the correct behavior of the TSD interfaces it's important that we add a regression test. This test is designed to minimally exercise the fundamental TSD behavior; it does not attempt to validate all potential corner cases.
  The test will first create 32 keys via tsd_create() and register a common destructor. Next 16 wait threads will be created, each of which sets/verifies a random value for all 32 keys and then blocks, waiting to be released by the control thread. Meanwhile the control thread verifies that none of the destructors have been run prematurely.
  The next phase of the test is to create 16 exit threads which set/verify a random value for all 32 keys. They then immediately exit. This is designed to verify tsd_exit(), which will be called via thread_exit(). This must result in all registered destructors being run and the memory for the tsd being freed.
  After this, tsd_destroy() is verified by destroying all 32 keys. Once again we must see the expected number of destructors run and the tsd memory freed. At this point the blocked threads are released and they exit calling tsd_exit(), which should do very little since all the tsd has already been destroyed.
  If this all goes off without a hitch the test passes. To ensure no memory has been leaked, I have manually verified that after spl module unload no memory is reported leaked.
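  A minimal usage sketch of the interfaces the test exercises; error handling is omitted and the value size is illustrative:

      #include <sys/tsd.h>
      #include <sys/kmem.h>

      static uint_t tsd_key;

      static void
      my_dtor(void *value)
      {
          kmem_free(value, 64);   /* runs for every thread's stored value */
      }

      static void
      tsd_example(void)
      {
          void *val;

          tsd_create(&tsd_key, my_dtor);         /* allocate key, register dtor */
          (void) tsd_set(tsd_key, kmem_alloc(64, KM_SLEEP));  /* this thread's value */
          val = tsd_get(tsd_key);                /* read it back in this thread */
          (void) val;
          tsd_destroy(&tsd_key);                 /* runs destructors, frees tsd */
      }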
* Linux 2.6.36 compat, use fops->unlocked_ioctl() (Brian Behlendorf, 2010-11-10; 1 file, -5/+4)
  As of linux-2.6.36 the last in-tree consumer of fops->ioctl() has been removed, and thus fops->ioctl() has also been removed. The replacement hook is fops->unlocked_ioctl(), which has existed in the kernel since 2.6.12. Since the SPL only contains support back to 2.6.18 vintage kernels, I'm not adding an autoconf check for this and simply moving everything to use fops->unlocked_ioctl().
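  A sketch of the replacement hook: unlocked_ioctl() takes no inode argument and is called without the big kernel lock (the function and structure names are illustrative):

      #include <linux/fs.h>
      #include <linux/module.h>

      static long
      splat_sketch_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
      {
          return 0;   /* dispatch on cmd here */
      }

      static const struct file_operations splat_sketch_fops = {
          .owner          = THIS_MODULE,
          .unlocked_ioctl = splat_sketch_ioctl,
      };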
* Fix incorrect krw_type_t type (Brian Behlendorf, 2010-11-09; 1 file, -1/+1)
  Flagged by the default compile options on archlinux 2010.05, we should be using the krw_t type, not the krw_type_t type, in the private data.
    module/splat/splat-rwlock.c: In function ‘splat_rwlock_test4_func’:
    module/splat/splat-rwlock.c:432:6: warning: case value ‘1’ not in enumerated type ‘krw_type_t’
* Support custom build directories (Brian Behlendorf, 2010-09-05; 1 file, -15/+17)
  One of the neat tricks an autoconf style project is capable of is allowing configuration and building in a directory other than the source directory. The major advantage to this is that you can build the project various different ways while making changes in a single source tree.
  For example, this project is designed to work on various different Linux distributions, each of which work slightly differently. This means that changes need to be verified on each of those supported distributions, preferably before the change is committed to the public git repo.
  Using nfs and custom build directories makes this much easier. I now have a single source tree in nfs mounted on several different systems, each running a supported distribution. When I make a change to the source base I suspect may break things I can concurrently build from the same source on all the systems, each in their own subdirectory.
    wget -c http://github.com/downloads/behlendorf/spl/spl-x.y.z.tar.gz
    tar -xzf spl-x.y.z.tar.gz
    cd spl-x-y-z
    ------------------------- run concurrently -------------------------
    <ubuntu system>   <fedora system>   <debian system>   <rhel6 system>
    mkdir ubuntu      mkdir fedora      mkdir debian      mkdir rhel6
    cd ubuntu         cd fedora         cd debian         cd rhel6
    ../configure      ../configure      ../configure      ../configure
    make              make              make              make
    make check        make check        make check        make check
  This is something the project has almost supported for a long time, but finishing this support should save me lots of time.
* Fix taskq code to not drop tasks when TQ_SLEEP is used. (Ricardo M. Correia, 2010-08-02; 1 file, -5/+24)
  When TQ_SLEEP is used, taskq_dispatch() should always succeed even if the number of pending tasks is above tq->tq_maxalloc. This semantic is similar to KM_SLEEP in kmem allocations, which also always succeed.
  However, we cannot block forever otherwise there is a risk of deadlock. Therefore, we still allow the number of pending tasks to go above tq->tq_maxalloc with TQ_SLEEP, but we may sleep up to 1 second per task dispatch, thereby throttling the task dispatch rate.
  One of the existing splat tests was also augmented to test for this scenario. The test would fail with the previous implementation but now it succeeds.
  Signed-off-by: Brian Behlendorf <[email protected]>
* Split <sys/debug.h> header (Brian Behlendorf, 2010-07-20; 2 files, -1/+2)
  To avoid symbol conflicts with dependent packages the debug header must be split into several parts. The <sys/debug.h> header now only contains the Solaris macros such as ASSERT and VERIFY. The spl-debug.h header contains the spl specific debugging infrastructure and should be included by any package which needs to use the spl logging. Finally the spl-trace.h header contains internal data structures only used for the log facility and should not be included by anything but spl-debug.c.
  This way dependent packages can include the standard Solaris headers without picking up any SPL debug macros. However, if the dependent package wants to integrate with the SPL debugging subsystem it can then explicitly include spl-debug.h.
  Along with this change I have dropped the CHECK_STACK macros because the upstream Linux kernel now has much better stack depth checking built in and we don't need this complexity. Additionally SBUG has been replaced with PANIC and provided as part of the Solaris macro set. While the Solaris version is really panic(), that conflicts with the Linux kernel so we'll just have to make do with PANIC. It should rarely be called directly; the preferred usage would be an ASSERT or VERIFY.
  There's lots of change here but this cleanup was overdue.
* Proposed fix for oops on SIGINT in splat atomic:64-bit test. (Ned Bass, 2010-07-15; 1 file, -1/+1)
  The threads in the splat atomic:64-bit test share the data structure atomic_priv_t ap, which lives on the kernel stack of the splat user-space utility. If splat terminates before the threads, accesses to that memory location by the other threads become invalid. Splat synchronizes with the threads with the call:
    wait_event_interruptible(ap.ap_waitq, splat_atomic_test1_cond(&ap, i));
  Apparently, the SIGINT wakes and terminates splat prematurely, so that GPFs or other bad things happen when the threads subsequently access ap. This commit prevents this by using the uninterruptible form:
    wait_event(ap.ap_waitq, splat_atomic_test1_cond(&ap, i));
* Add __divdi3(), remove __udivdi3() kernel dependency (Brian Behlendorf, 2010-07-13; 1 file, -0/+133)
  Up until now no SPL consumer attempted to perform signed 64-bit division, so there was no need to support this. That has now changed, so I am adding 64-bit division support for 32-bit platforms. The signed implementation is based on the unsigned version.
  Since there have been several bug reports in the past concerning correct 64-bit division on 32-bit platforms, I added some long overdue regression tests. Much to my surprise the unsigned 64-bit division regression tests failed. This was surprising because __udivdi3() was implemented by simply calling div64_u64(), which is provided by the kernel. This meant that the linux kernel's 64-bit division algorithm on 32-bit platforms was flawed. After some investigation this turned out to be exactly the case.
  Because of this I was forced to abandon the kernel helper and instead fully implement 64-bit division in the spl. There are several published implementations out there on how to do this properly and I settled on one proposed in the book Hacker's Delight. Their proposed algorithm is freely available without restriction and I have just modified it to be linux kernel friendly. The updated implementation now passes all the unsigned and signed regression tests.
  This should be functional, but not fast, which is good enough for our purposes. If you want fast too, I'd strongly suggest you upgrade to a 64-bit platform. I have also reported the kernel bug and we'll see if we can't get it fixed upstream.
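  A hedged sketch of the signed helper built on the unsigned one, as the commit describes: divide the magnitudes, then fix up the sign of the quotient.

      #include <linux/types.h>

      extern uint64_t __udivdi3(uint64_t u, uint64_t v);  /* unsigned helper */

      int64_t
      __divdi3(int64_t u, int64_t v)
      {
          int64_t q, t;

          q = __udivdi3(u < 0 ? -u : u, v < 0 ? -v : v);  /* |u| / |v| */
          t = (u ^ v) >> 63;          /* -1 if the signs differ, else 0 */

          return ((q ^ t) - t);       /* conditionally negate the quotient */
      }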