| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
| |
amalgamation properly, but would happen to work if a previously
written amalgamation was around.
Also make changes allowing using the SIMD optimized versions of SHA-1
and Serpent to be used in the amalgamation.
|
| |
|
| |
|
|
|
|
| |
tests on Nehalem indicate a small but measurable win there (about 3%).
|
|
|
|
| |
alternative methods of getting pieces of the expanded message.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bswap.h); too many external apps rely on loadstor.h existing.
Define 64-bit generic bswap in terms of 32-bit bswap, since it's
not much slower if 32-bit is also generic, and much faster if
it's not. This may be quite helpful on 32-bit x86 in particular.
Change formulation of generic 32-bit bswap. It may be faster or
slower depending on the CPU, especially the latency and throuput
of rotate instructions, but should be faster on an ideally
superscalar processor with rotate instructions (ie, what I expect
future CPUs to look more like).
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes for the amalgamation generator for internal headers.
Remove BOTAN_DLL exporting macros from all internal-only headers;
the classes/functions there don't need to be exported, and
avoiding the PIC/GOT indirection can be a big win.
Add missing BOTAN_DLLs where necessary, mostly gfpmath and cvc
For GCC, use -fvisibility=hidden and set BOTAN_DLL to the
visibility __attribute__ to export those classes/functions.
|
|
|
|
| |
credits.txt and thanks.txt. Remove some various bits of formatting weirdness.
|
| |
|
|
|
|
| |
and also make it stylistically much closer to the standard SHA-1 code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
up during the Fedora submission review, that each source file include some
text about the license. One handy Perl script later and each file now has
the line
Distributed under the terms of the Botan license
after the copyright notices.
While I was in there modifying every file anyway, I also stripped out the
remainder of the block comments (lots of astericks before and after the
text); this is stylistic thing I picked up when I was first learning C++
but in retrospect it is not a good style as the structure makes it harder
to modify comments (with the result that comments become fewer, shorter and
are less likely to be updated, which are not good things).
|
|
|
|
|
| |
a random segfault (always inside an SSE2 intrinsic). Did not investigate
much beyond that. Worth looking into since it seemed worth another 1% or so.
|
|
|
|
|
| |
blocks as input (and can overlap computations from one block to another -
very nice). Reimport that original version and use it.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to have been so! Change MDx_HashFunction::hash to a new compress_n
which hashes an arbitrary number of blocks. I had a thought this might
reduce a bit of loop overhead but the results were far better than I
anticipated. Speedup across the board of about 2%, and very
noticable (+10%) increases for MD4 and Tiger (probably b/c both
of those have so few instructions in each iteration of the
compression function).
Before:
SHA-1:
amd64: 211.9 MiB/s
core: 210.0 MiB/s
sse2: 295.2 MiB/s
MD4: 476.2 MiB/s
MD5: 355.2 MiB/s
SHA-256: 99.8 MiB/s
SHA-512: 151.4 MiB/s
RIPEMD-128: 326.9 MiB/s
RIPEMD-160: 225.1 MiB/s
Tiger: 214.8 MiB/s
Whirlpool: 38.4 MiB/s
After:
SHA-1:
amd64: 215.6 MiB/s
core: 213.8 MiB/s
sse2: 299.9 MiB/s
MD4: 528.4 MiB/s
MD5: 368.8 MiB/s
SHA-256: 103.9 MiB/s
SHA-512: 156.8 MiB/s
RIPEMD-128: 334.8 MiB/s
RIPEMD-160: 229.7 MiB/s
Tiger: 240.7 MiB/s
Whirlpool: 38.6 MiB/s
|
| |
|
| |
|
|
rather than silently replacing the C++ versions. Instead they are silently
replaced (currently, at least) at the lookup level: we switch off the set
of feature macros set to choose the best implementation in the current
build configuration. So you can have (and benchmark) MD5 and MD5_IA32
directly against each other in the same program with no hassles, but if
you ask for "MD5", you'll get maybe an MD5 or maybe MD5_IA32.
Also make the canonical asm names (which aren't guarded by C++ namespaces)
of the form botan_<algo>_<arch>_<func> as in botan_sha160_ia32_compress,
to avoid namespace collisions.
This change has another bonus that it should in many cases be possible to
derive the asm specializations directly from the original implementation,
saving some code (and of course logically SHA_160_IA32 is a SHA_160, just
one with a faster implementation of the compression function, so this seems
reasonable anyway).
|