path: root/src/hash/sha1_sse2
Commit message  (author, date, files changed, lines removed/added)
* Un-internal loadstor.h (and its header deps, rotate.h and bswap.h); too many external apps rely on loadstor.h existing.  (lloyd, 2009-12-21, 1 file, -1/+1)
|   Define 64-bit generic bswap in terms of 32-bit bswap, since it's not much slower if 32-bit is also generic, and much faster if it's not. This may be quite helpful on 32-bit x86 in particular.
|   Change formulation of generic 32-bit bswap. It may be faster or slower depending on the CPU, especially the latency and throughput of rotate instructions, but should be faster on an ideally superscalar processor with rotate instructions (i.e., what I expect future CPUs to look more like).
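A minimal sketch of the two ideas in this message, assuming a rotate-based formulation for the generic 32-bit swap (the commit only says the new version depends on rotate latency and throughput; the exact expression used in bswap.h is not shown here). All function names below are hypothetical.

```cpp
#include <cstdint>

// Hypothetical rotate helpers; n must be in the range 1..31.
inline std::uint32_t rotl32_sketch(std::uint32_t x, unsigned n)
{
   return (x << n) | (x >> (32 - n));
}

inline std::uint32_t rotr32_sketch(std::uint32_t x, unsigned n)
{
   return (x >> n) | (x << (32 - n));
}

// Generic 32-bit byte swap built from two rotates and two masks.
// The two rotates are independent, so a superscalar CPU can issue
// them in parallel.
inline std::uint32_t bswap32_sketch(std::uint32_t x)
{
   return (rotr32_sketch(x, 8) & 0xFF00FF00u) |
          (rotl32_sketch(x, 8) & 0x00FF00FFu);
}

// 64-bit byte swap expressed as two 32-bit swaps with the halves
// exchanged. If the 32-bit swap maps to a native instruction, this
// version benefits from it automatically.
inline std::uint64_t bswap64_sketch(std::uint64_t x)
{
   const std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
   const std::uint32_t lo = static_cast<std::uint32_t>(x);
   return (static_cast<std::uint64_t>(bswap32_sketch(lo)) << 32) |
          bswap32_sketch(hi);
}
```

Building the 64-bit swap on top of the 32-bit one means only the 32-bit routine needs a platform-specific fast path.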
* Make many more headers internal-only.  (lloyd, 2009-12-16, 1 file, -1/+1)
|   Fixes for the amalgamation generator for internal headers.
|   Remove BOTAN_DLL exporting macros from all internal-only headers; the classes/functions there don't need to be exported, and avoiding the PIC/GOT indirection can be a big win.
|   Add missing BOTAN_DLLs where necessary, mostly gfpmath and cvc.
|   For GCC, use -fvisibility=hidden and set BOTAN_DLL to the visibility __attribute__ to export those classes/functions.
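The export scheme described above can be sketched as follows. The macro name EXAMPLE_DLL and both declarations are hypothetical stand-ins, and the sketch assumes the library is compiled with -fvisibility=hidden so that unmarked symbols stay out of the dynamic symbol table.

```cpp
// Hypothetical export macro; on GCC-compatible compilers it marks a
// symbol as visible even when the build uses -fvisibility=hidden.
#if defined(__GNUC__)
  #define EXAMPLE_DLL __attribute__((visibility("default")))
#else
  #define EXAMPLE_DLL
#endif

// Public class: carries the macro, so it is exported from the
// shared library.
class EXAMPLE_DLL Exported_Hash
{
public:
   int output_length() const { return 20; }
};

// Internal-only helper: no macro, so with -fvisibility=hidden it is
// not exported and calls to it can avoid the PIC/GOT indirection.
namespace detail {

inline int rounds_per_block() { return 80; }

}
```

Only declarations that external applications are meant to call need the macro; everything else defaults to hidden.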
* Remove extern decl of no longer used/included SHA-1 SSE2 function  (lloyd, 2009-11-23, 1 file, -2/+0)
|
* Cleanups - remove emails from source files, they should only live in credits.txt and thanks.txt.  (lloyd, 2009-11-10, 1 file, -2/+2)
|   Remove various bits of formatting weirdness.
* Add a new need_isa marker for info.txt that lets a module depend on a particular ISA extension rather than a list of CPUs.  (lloyd, 2009-11-06, 1 file, -13/+2)
|   Much easier to edit and audit, too.
|   Add markers on the AES-NI code and SHA-1/SSE2. Serpent and XTEA don't need it because they are generic and only depend on simd_32, which will silently swap in a scalar version if SSE2/AltiVec isn't enabled (since it turns out that on superscalar processors just doing 4 blocks in parallel can be a win even in GPRs).
|   Add pentium3 to the list of CPUs with rdtsc, was missing. Odd!
* Clean up prep00_15 - same speed on Core2  (lloyd, 2009-10-29, 1 file, -16/+10)
|
* Clean up the SSE2 SHA-1 code quite a bit, make better use of C++ features and also make it stylistically much closer to the standard SHA-1 code.  (lloyd, 2009-10-29, 2 files, -308/+267)
* Small cleanups (remove tab characters, change macros to fit the rest of the code stylistically, etc)  (lloyd, 2009-10-29, 1 file, -123/+121)
* propagate from branch 'net.randombit.botan' (head 8fb69dd1c599ada1008c4cab2a6d502cbcc468e0) to branch 'net.randombit.botan.general-simd' (head c05c9a6d398659891fb8cca170ed514ea7e6476d)  (lloyd, 2009-10-29, 1 file, -1/+14)
|\
| * Rename SSE2 stuff to be generally SIMD since it supports at least SSE2 and Altivec (though Altivec is seemingly slower ATM...)  (lloyd, 2009-10-29, 1 file, -1/+14)
* | Remove the 'realname' attribute on all modules and cc/cpu/os info files.  (lloyd, 2009-10-29, 1 file, -2/+0)
| |   Pretty much useless and unused, except for listing the module names in build.h, and the short versions totally suffice for that.
|/
* Add 'Distributed under...' text to files missing it. Some format cleanups  (lloyd, 2009-10-07, 1 file, -20/+9)
|
* Remove add blocks from hash function info.txt files  (lloyd, 2009-09-29, 1 file, -8/+0)
|
* Make some changes to the SSE2 implementation of SHA-1 for compatibility with Visual C++.  (lloyd, 2009-09-13, 1 file, -62/+46)
* Instead of each SSE2 implementation specifying which compilers + CPUs it works on, have sse2_eng rely on a specific compiler/arch; each sse2 impl depends on the engine anyway, so they will only be loaded if OK.  (lloyd, 2009-08-27, 1 file, -12/+0)
* Correct some errors in the automatically generated dependencies.  (lloyd, 2009-07-16, 1 file, -0/+1)
|
* Add a script that reads the output of print_deps.py and rewrites the info.txt files with the right module dependencies.  (lloyd, 2009-07-15, 1 file, -6/+4)
|   Apply it across the codebase.
* CPU-specific engines are now only loaded if something depends on them, and all CPU-specific implementations now depend on the appropriate engine module.  (lloyd, 2009-07-07, 1 file, -0/+1)
|   The most common problem before with this was that the SSE2 module was built, but the sole SSE2 code (SHA-1) was not (for instance, on an i686). This would cause a compile warning about the unused request object.
|   Preventing unused engines from being built will also (very slightly) speed up the lookup process on most systems.
* Thomas Moschny passed along a request from the Fedora packagers which came up during the Fedora submission review, that each source file include some text about the license.  (lloyd, 2009-03-30, 3 files, -21/+25)
|   One handy Perl script later and each file now has the line
|     Distributed under the terms of the Botan license
|   after the copyright notices.
|   While I was in there modifying every file anyway, I also stripped out the remainder of the block comments (lots of asterisks before and after the text); this is a stylistic thing I picked up when I was first learning C++, but in retrospect it is not a good style, as the structure makes it harder to modify comments (with the result that comments become fewer and shorter and are less likely to be updated, which are not good things).
* Wrap code and struct definitions internal to sha1_sse2_imp.cpp in an anonymous namespace (in particular this should prevent Doxygen from generating documentation about the v4si union declared there).  (lloyd, 2008-11-24, 1 file, -0/+4)
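As an illustration of the pattern, here is a small sketch of file-local helpers wrapped in an anonymous namespace. The union below is a simplified, hypothetical stand-in for v4si (the real one presumably overlays an SSE2 vector type, which is omitted here to keep the example portable).

```cpp
#include <cstddef>
#include <cstdint>

namespace {

// Everything in this namespace has internal linkage: it is invisible
// to other translation units.
union v4si_sketch
{
   std::uint32_t u32[4];
};

std::uint32_t sum_lanes(const v4si_sketch& v)
{
   std::uint32_t sum = 0;
   for(std::size_t i = 0; i != 4; ++i)
      sum += v.u32[i];
   return sum;
}

}

// Only this function is part of the file's external interface.
std::uint32_t sum_of_lanes(std::uint32_t x)
{
   v4si_sketch v;
   for(std::size_t i = 0; i != 4; ++i)
      v.u32[i] = x;
   return sum_lanes(v);
}
```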
* Revert change that added multiblock support to SSE2 SHA-1.  (lloyd, 2008-11-23, 3 files, -206/+183)
|   Was causing a random segfault (always inside an SSE2 intrinsic). Did not investigate much beyond that. Worth looking into since it seemed worth another 1% or so.
* Dean Gaudet's original version of the SHA-1 SSE2 code supported multiple blocks as input (and can overlap computations from one block to another - very nice).  (lloyd, 2008-11-23, 3 files, -183/+206)
|   Reimport that original version and use it.
* I had not anticipated this being really worthwhile, but it turns out to have been so!  (lloyd, 2008-11-23, 2 files, -3/+7)
|   Change MDx_HashFunction::hash to a new compress_n which hashes an arbitrary number of blocks. I had a thought this might reduce a bit of loop overhead, but the results were far better than I anticipated. Speedup across the board of about 2%, and very noticeable (+10%) increases for MD4 and Tiger (probably b/c both of those have so few instructions in each iteration of the compression function).
|
|   Algorithm        Before (MiB/s)   After (MiB/s)
|   SHA-1 (amd64)    211.9            215.6
|   SHA-1 (core)     210.0            213.8
|   SHA-1 (sse2)     295.2            299.9
|   MD4              476.2            528.4
|   MD5              355.2            368.8
|   SHA-256           99.8            103.9
|   SHA-512          151.4            156.8
|   RIPEMD-128       326.9            334.8
|   RIPEMD-160       225.1            229.7
|   Tiger            214.8            240.7
|   Whirlpool         38.4             38.6
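A rough sketch of the compress_n idea: the buffering layer hands all complete blocks to the compression function in one virtual call instead of one call per block. The class names, members, and the toy mixing step are assumptions made for illustration; this is not the actual MDx_HashFunction code and not a real hash.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical buffering base class for block-oriented hashes.
class Block_Hash_Sketch
{
public:
   explicit Block_Hash_Sketch(std::size_t block_size) :
      m_block_size(block_size) {}

   virtual ~Block_Hash_Sketch() {}

   void add_data(const std::uint8_t* input, std::size_t length)
   {
      // First top up any partially filled buffer.
      if(!m_buffer.empty())
      {
         const std::size_t take =
            std::min(length, m_block_size - m_buffer.size());
         m_buffer.insert(m_buffer.end(), input, input + take);
         input += take;
         length -= take;

         if(m_buffer.size() == m_block_size)
         {
            compress_n(m_buffer.data(), 1);
            m_buffer.clear();
         }
      }

      // Then process every remaining full block with a single call.
      const std::size_t full_blocks = length / m_block_size;
      if(full_blocks > 0)
         compress_n(input, full_blocks);

      // Keep the trailing partial block for the next call.
      const std::size_t consumed = full_blocks * m_block_size;
      m_buffer.insert(m_buffer.end(), input + consumed, input + length);
   }

protected:
   // Compress 'blocks' consecutive blocks starting at 'input'.
   virtual void compress_n(const std::uint8_t* input, std::size_t blocks) = 0;

   std::size_t block_size() const { return m_block_size; }

private:
   std::size_t m_block_size;
   std::vector<std::uint8_t> m_buffer;
};

// Toy subclass that records how compress_n is invoked.
class Toy_Hash_Sketch : public Block_Hash_Sketch
{
public:
   Toy_Hash_Sketch() :
      Block_Hash_Sketch(64), calls(0), blocks_seen(0), state(0) {}

   std::size_t calls;
   std::size_t blocks_seen;
   std::uint32_t state;

protected:
   void compress_n(const std::uint8_t* input, std::size_t blocks) override
   {
      ++calls;
      for(std::size_t i = 0; i != blocks; ++i)
      {
         for(std::size_t j = 0; j != block_size(); ++j)
            state = state * 31 + input[j]; // placeholder mixing step
         input += block_size();
         ++blocks_seen;
      }
   }
};
```

With this shape, feeding four full blocks plus a few trailing bytes triggers one compress_n call rather than four, which is where the reduced per-block overhead would come from.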
* Enable SSE2 SHA-1 on Intel Prescott CPUs  (lloyd, 2008-11-17, 1 file, -0/+1)
|
* Add BOTAN_DLL macro to public class definitions that were missing it.  (lloyd, 2008-10-09, 1 file, -1/+1)
|
* Fix copyright notices  (lloyd, 2008-10-09, 1 file, -1/+1)
|
* Fix prototype confusion (harmless but incorrect)  (lloyd, 2008-09-30, 3 files, -5/+5)
|
* Derive x86, x86-64, and SSE2 implementations of SHA-1 directly from SHA_160  (lloyd, 2008-09-29, 3 files, -34/+4)
|
* Make asm implementations distinctly named objects, for instance MD5_IA32, rather than silently replacing the C++ versions.  (lloyd, 2008-09-29, 4 files, -27/+25)
|   Instead they are silently replaced (currently, at least) at the lookup level: we switch on the set of feature macros to choose the best implementation in the current build configuration. So you can have (and benchmark) MD5 and MD5_IA32 directly against each other in the same program with no hassles, but if you ask for "MD5", you'll get maybe an MD5 or maybe MD5_IA32.
|   Also make the canonical asm names (which aren't guarded by C++ namespaces) of the form botan_<algo>_<arch>_<func>, as in botan_sha160_ia32_compress, to avoid namespace collisions.
|   This change has another bonus: in many cases it should be possible to derive the asm specializations directly from the original implementation, saving some code (and of course logically SHA_160_IA32 is a SHA_160, just one with a faster implementation of the compression function, so this seems reasonable anyway).
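The arrangement described above might look roughly like the sketch below. Every name ending in _Sketch or starting with sketch_ is hypothetical, the mixing step is a toy (not SHA-1), the feature macro SKETCH_HAS_SHA1_IA32 is invented for the example, and the free function is written in C++ only as a stand-in for an assembly routine named along the lines of botan_<algo>_<arch>_<func>.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>

// Portable C++ implementation (toy mixing step, not real SHA-1).
class SHA_160_Sketch
{
public:
   virtual ~SHA_160_Sketch() {}

   std::string name() const { return "SHA-160"; }
   virtual std::string provider() const { return "core"; }

   void update(const std::uint8_t* in, std::size_t len) { compress(in, len); }
   std::uint32_t state() const { return m_state; }

protected:
   // The only piece a specialization needs to replace.
   virtual void compress(const std::uint8_t* in, std::size_t len)
   {
      for(std::size_t i = 0; i != len; ++i)
         m_state = m_state * 31 + in[i];
   }

   std::uint32_t m_state = 0;
};

// Stand-in for an assembly routine. It lives outside any C++
// namespace, hence the long prefixed name to avoid collisions.
extern "C" void sketch_sha160_ia32_compress(std::uint32_t* state,
                                            const std::uint8_t* in,
                                            std::size_t len)
{
   for(std::size_t i = 0; i != len; ++i)
      *state = *state * 31 + in[i];
}

// The specialization is a SHA_160_Sketch with a different
// compression function; everything else is inherited.
class SHA_160_IA32_Sketch : public SHA_160_Sketch
{
public:
   std::string provider() const override { return "ia32"; }

protected:
   void compress(const std::uint8_t* in, std::size_t len) override
   {
      sketch_sha160_ia32_compress(&m_state, in, len);
   }
};

// Lookup by algorithm name: the feature macro decides which
// implementation a caller asking for "SHA-160" receives.
std::unique_ptr<SHA_160_Sketch> make_hash(const std::string& name)
{
   if(name == "SHA-160")
   {
#if defined(SKETCH_HAS_SHA1_IA32)
      return std::unique_ptr<SHA_160_Sketch>(new SHA_160_IA32_Sketch);
#else
      return std::unique_ptr<SHA_160_Sketch>(new SHA_160_Sketch);
#endif
   }
   return std::unique_ptr<SHA_160_Sketch>(); // unknown algorithm
}
```

Because both classes are separately named, a benchmark can instantiate each directly and compare them, while ordinary callers go through the lookup and get whichever one the build prefers.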
* propagate from branch 'net.randombit.botan' (head ca7d7fc1ae6b55c5328c9cf1ec1cafd1daadedd4) to branch 'net.randombit.botan.modularized' (head 614263a9742a0c554e4093620147f6e156264d41)  (lloyd, 2008-09-29, 1 file, -0/+1)
* Add info.txt files for asm hash modules  (lloyd, 2008-09-29, 1 file, -0/+2)
|
* Rename all modinfo.txt files to info.txt, since they are all (none) of them modules now.  (lloyd, 2008-09-29, 1 file, -0/+0)
|   In any case there is no distinction so info.txt seems better.
* Make mdx_hash also a module, which most of the hash functions depend on.  (lloyd, 2008-09-28, 4 files, -0/+430)
    Correct the configure program so modules are not autoloaded if their dependencies are not available. (E.g., --no-module=mdx_hash will disable MD4, MD5, SHA-1, etc. rather than cause a compilation failure.)