| Commit message | Author | Age | Files | Lines |
| |
|
| |
| |
parameters are as well. So make them template parameters.
The sole exception was AES, because you could either initialize AES
with a fixed key length, in which case it would only be that specific
key length, or not, in which case it would support any valid AES key
size. This is removed in this checkin; you have to specifically ask for
AES-128, AES-192, or AES-256, depending on which one you want.
This is probably actually a good thing, because every implementation
other than the base one (SSSE3, AES-NI, OpenSSL) did not support
"AES", only the versions with specific fixed key sizes. So forcing
the user to ask for the one they want ensures they get the ones
that are faster and/or safer.
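A minimal sketch of the idea, assuming C++11 and a hypothetical
Fixed_Key_Cipher class (not the actual AES class in the tree): the key
length becomes a template parameter, so each instantiation accepts
exactly one key size and a wrong-sized key fails to compile.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in, not the library's AES class: the key length is a
// template parameter, so each instantiation supports exactly one key size.
template <std::size_t KeyBytes>
class Fixed_Key_Cipher
   {
   public:
      static_assert(KeyBytes == 16 || KeyBytes == 24 || KeyBytes == 32,
                    "AES keys are 128, 192, or 256 bits");

      enum { KEY_LENGTH = KeyBytes };

      // Only a key of exactly KeyBytes bytes type-checks here.
      void set_key(const std::array<std::uint8_t, KeyBytes>& key) { m_key = key; }

      // AES uses 10, 12, or 14 rounds for 128-, 192-, and 256-bit keys.
      static constexpr std::size_t rounds() { return KeyBytes / 4 + 6; }

   private:
      std::array<std::uint8_t, KeyBytes> m_key{};
   };

// The caller has to pick one of these explicitly (names are illustrative).
using AES_128_Sketch = Fixed_Key_Cipher<16>;
using AES_192_Sketch = Fixed_Key_Cipher<24>;
using AES_256_Sketch = Fixed_Key_Cipher<32>;
```

There is deliberately no "any key size" alias in the sketch, which
mirrors the removal of plain "AES" described above.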
|
| |
|
| |
| |
sets the block size statically and also creates an enum with the
size. Use the enum instead of calling block_size() where possible,
since that uses two virtual function calls per block which is quite
unfortunate. The real advantages here as compared to the previous
version which kept the block size as a per-object u32bit:
- The compiler can inline the constant as an immediate operand
(previously it would load the value via an indirection on this)
- Removes 32 bits per object overhead (except in cases with actually
variable block sizes, which are very few and rarely used)
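The pattern can be sketched roughly as follows. All class names here are
hypothetical and only illustrate the shape of the change: a template
base fixes the block size, exposes it as an enum, and still answers the
virtual block_size() query for callers holding a base pointer.

```cpp
#include <cstddef>

// Hypothetical names throughout; a sketch of the pattern only.
class Block_Cipher_Sketch
   {
   public:
      virtual ~Block_Cipher_Sketch() {}
      // Runtime query; costs a virtual call.
      virtual std::size_t block_size() const = 0;
   };

template <std::size_t BS>
class Fixed_Block_Size_Sketch : public Block_Cipher_Sketch
   {
   public:
      enum { BLOCK_SIZE = BS };
      std::size_t block_size() const override { return BS; }
   };

class Toy_Cipher : public Fixed_Block_Size_Sketch<8>
   {
   public:
      // BLOCK_SIZE is a compile-time constant here, so the loop bound can
      // be emitted as an immediate and no per-object size field is needed.
      void xor_block(unsigned char buf[], unsigned char mask) const
         {
         for(std::size_t i = 0; i != BLOCK_SIZE; ++i)
            buf[i] ^= mask;
         }
   };
```

Code inside the cipher uses BLOCK_SIZE; only generic code that knows
nothing about the concrete type needs block_size().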
|
| |
|
| |
|
| |
| |
the initial/default length of the array, update all users to instead
pass the value to the constructor.
This is an old vestigial thing from a class (SecureBuffer) that used
this compile-time constant in order to store the values in an
array. However this was changed way back in 2002 to use the same
allocator hooks as the rest of the containers, so the only advantage
to using the length field was that the initial length was set and
didn't have to be set in the constructor, which was mildly convenient.
However this directly conflicts with the desire to be able to
(eventually) use std::vector with a custom allocator, since of course
vector doesn't support this.
Fortunately almost all of the uses are in classes which have only a
single constructor, so there is little to no duplication by instead
initializing the size in the constructor.
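A rough sketch of what the update looks like for a typical user. The
class name and the sizes are made up for illustration, and std::vector
stands in for the secure container, which is exactly the substitution
this change is meant to allow later.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical user class. std::vector is an assumed stand-in for the
// secure container type; 18 and 1024 are illustrative lengths.
class Toy_Cipher_State
   {
   public:
      // The lengths that used to be template arguments on the member
      // declarations are now passed once, in the single constructor.
      Toy_Cipher_State() : P(18), S(1024) {}

      std::size_t p_size() const { return P.size(); }
      std::size_t s_size() const { return S.size(); }

   private:
      std::vector<std::uint32_t> P;
      std::vector<std::uint32_t> S;
   };
```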
|
| |
| |
harmonising MemoryRegion with std::vector:
The MemoryRegion::clear() function would zeroise the buffer, but keep
the memory allocated and the size unchanged. This is very different
from STL's clear(), which is basically equivalent to what is
called destroy() in MemoryRegion. So to be able to replace MemoryRegion
with a std::vector, we have to rename destroy() to clear() and we have
to expose the current functionality of clear() in some other way, since
vector doesn't support this operation. Do so by adding a global function
named zeroise() which takes a MemoryRegion which is zeroed. Remove clear()
to ensure all callers are updated.
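A minimal sketch of such a free function, written against std::vector
as a stand-in for MemoryRegion (an assumption made so the example is
self-contained). It overwrites the contents and leaves size and
allocation untouched, which is the behaviour the old clear() had.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch only: std::vector stands in for MemoryRegion. Zeroes every
// element but keeps the size and the allocation as they are.
// Note this plain std::fill is not a guaranteed secure wipe; a compiler
// may elide it if the buffer is never read again.
template <typename T>
void zeroise(std::vector<T>& vec)
   {
   std::fill(vec.begin(), vec.end(), T());
   }
```

With this split, zeroise(v) keeps v.size() unchanged while v.clear()
has the usual STL meaning of dropping all elements.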
|
| |
|
| |
| |
rotations in the code. This reduces the number of cache lines
potentially accessed in the first round from 64 to 16 (assuming 64
byte cache lines). On average, about 10 cache lines will actually be
accessed, assuming a uniform distribution of the inputs, so there
definitely is still a timing channel here, just a somewhat smaller
one.
I experimented with using the 256 element table for all rounds but it
reduced performance significantly and I'm not sure if the benefit is
worth the cost or not.
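The "about 10" figure can be reproduced with a short calculation,
assuming 16 independent, uniformly distributed lookups in the first
round into a table spanning 16 cache lines. The lookup count is an
assumption on my part; the message only states the uniform
distribution. The function name is hypothetical.

```cpp
#include <cmath>

// Expected number of distinct cache lines touched by `lookups`
// independent, uniformly distributed lookups into a table that spans
// `lines` cache lines. A given line is missed by every lookup with
// probability (1 - 1/lines)^lookups.
double expected_lines_touched(unsigned lines, unsigned lookups)
   {
   const double miss_one = 1.0 - 1.0 / lines;
   return lines * (1.0 - std::pow(miss_one, lookups));
   }
```

Under those assumptions, expected_lines_touched(16, 16) evaluates to a
little over 10, so which lines stay untouched still depends on the
input, consistent with the remaining timing channel noted above.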
|
| |
| |
This caused Doxygen to think this was markup meant for it, which really
caused some clutter in the namespace page.
|
| |
Add a second template param to SecureVector which specifies the initial
length.
Change all callers to be SecureVector instead of SecureBuffer.
This can go away in C++0x, once compilers implement N2712 ("Non-static
data member initializers"), and we can just write code as
SecureVector<byte> P{18};
instead
|
| |
bswap.h); too many external apps rely on loadstor.h existing.
Define 64-bit generic bswap in terms of 32-bit bswap, since it's
not much slower if 32-bit is also generic, and much faster if
it's not. This may be quite helpful on 32-bit x86 in particular.
Change formulation of generic 32-bit bswap. It may be faster or
slower depending on the CPU, especially the latency and throughput
of rotate instructions, but should be faster on an ideally
superscalar processor with rotate instructions (i.e., what I expect
future CPUs to look more like).
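For illustration, one rotate-based formulation of a generic 32-bit
byte swap, with the 64-bit version built on top of it. The message does
not spell out the exact expression used, so treat this as a sketch of
the approach with hypothetical function names, not the committed code.

```cpp
#include <cstdint>

// Rotates; n must be in the range 1..31.
constexpr std::uint32_t rotl32(std::uint32_t x, unsigned n)
   {
   return (x << n) | (x >> (32 - n));
   }

constexpr std::uint32_t rotr32(std::uint32_t x, unsigned n)
   {
   return (x >> n) | (x << (32 - n));
   }

// Two independent rotates and two masks; the rotates can issue in
// parallel on a superscalar CPU.
constexpr std::uint32_t bswap32_generic(std::uint32_t x)
   {
   return (rotr32(x, 8) & 0xFF00FF00) | (rotl32(x, 8) & 0x00FF00FF);
   }

// 64-bit swap in terms of the 32-bit one: swap each half, then
// exchange the halves.
constexpr std::uint64_t bswap64_generic(std::uint64_t x)
   {
   return (static_cast<std::uint64_t>(
              bswap32_generic(static_cast<std::uint32_t>(x))) << 32) |
          bswap32_generic(static_cast<std::uint32_t>(x >> 32));
   }
```

If the 32-bit swap is replaced by a native instruction, the 64-bit
version picks up the speedup without any change.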
|
| |
Fixes for the amalgamation generator for internal headers.
Remove BOTAN_DLL exporting macros from all internal-only headers;
the classes/functions there don't need to be exported, and
avoiding the PIC/GOT indirection can be a big win.
Add missing BOTAN_DLLs where necessary, mostly gfpmath and cvc.
For GCC, use -fvisibility=hidden and set BOTAN_DLL to the
visibility __attribute__ to export those classes/functions.
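A small sketch of the mechanism, using a hypothetical EXAMPLE_DLL macro
instead of the real BOTAN_DLL definition (which is not shown here).
When the library is built with -fvisibility=hidden, only symbols
carrying the attribute stay visible outside the shared object.

```cpp
// EXAMPLE_DLL is a hypothetical macro name used for illustration.
#if defined(__GNUC__)
  #define EXAMPLE_DLL __attribute__((visibility("default")))
#else
  #define EXAMPLE_DLL
#endif

// Public header: explicitly exported.
class EXAMPLE_DLL Public_Thing
   {
   public:
      int value() const { return 42; }
   };

// Internal-only header: no export macro, so with -fvisibility=hidden
// the symbol is not exported and calls to it inside the library can
// avoid the PIC/GOT indirection.
inline int internal_helper(int x)
   {
   return x + 1;
   }
```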
|
| |
to give a 3-7% speed improvement on Core2 with GCC.
|
| |
Pretty much useless and unused, except for listing the module names in
build.h, and the short versions totally suffice for that.
|
| |
just too fragile and not that useful. Something like Java's checked exceptions
might be nice, but simply killing the process entirely if an unexpected
exception is thrown is not exactly useful for something trying to be robust.
|
| |
the prefetch is called for each block of input, and so a total of
(4096+256)/64 = 68 prefetches are executed for each block. This reduces
performance of iterative modes dramatically.
I'm not sure what the right approach for dealing with this is.
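To make the cost concrete, here is a sketch of a per-cache-line table
prefetch, assuming 64-byte cache lines and GCC/Clang's
__builtin_prefetch. The helper names are hypothetical; the point is
only that touching 4096+256 bytes of tables costs 68 prefetch
instructions every time it runs.

```cpp
#include <cstddef>

// Assumed cache line size in bytes.
constexpr std::size_t CACHE_LINE = 64;

// Number of prefetch instructions needed to cover `table_bytes`.
constexpr std::size_t prefetches_needed(std::size_t table_bytes)
   {
   return (table_bytes + CACHE_LINE - 1) / CACHE_LINE;
   }

// Issue one prefetch hint per cache line of the table.
inline void prefetch_table(const void* table, std::size_t bytes)
   {
   const char* p = static_cast<const char*>(table);
   for(std::size_t i = 0; i < bytes; i += CACHE_LINE)
      {
#if defined(__GNUC__)
      __builtin_prefetch(p + i);
#else
      (void)p; // no portable prefetch; the hint is simply skipped
#endif
      }
   }
```

Running this once per call over many blocks amortises the 68 hints;
running it once per 16-byte block does not.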
|
| |
timing attacks, since once all the TE/SE tables are entirely in cache then
timing attacks against it become somewhat harder. However for this to be
a full defense it would be necessary to ensure the tables were entirely
loaded into cache, which is not guaranteed by the normal SSE prefetch
instructions. (Or prefetch instructions for other CPUs, AFAIK).
Much more importantly, it provides a 10% speedup.
|
| |
| |
enc/dec functions it replaces, these are public interfaces.
Add the first bits of a SSE2 implementation of Serpent. Currently incomplete.
|
| |
decryption. Currently only used for counter mode. Doesn't offer much
advantage as-is (though might help slightly, in terms of cache effects),
but allows for SIMD implementations to process multiple blocks in parallel
when possible. Particularly thinking here of Serpent; TEA/XTEA also seem
promising in this sense, as is Threefish once that is implemented as a
standalone block cipher.
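A sketch of what a multi-block interface of this kind might look like.
The class and the encrypt_n name are assumptions for illustration, and
the "cipher" is a trivial XOR so the example stays self-contained. A
SIMD implementation would override the bulk call to handle several
blocks per iteration.

```cpp
#include <cstddef>
#include <cstdint>

// Toy stand-in for a block cipher; XOR with a fixed byte is NOT
// encryption, it just gives the interface something to do.
class Toy_Block_Cipher
   {
   public:
      enum { BLOCK_SIZE = 8 };

      explicit Toy_Block_Cipher(std::uint8_t mask) : m_mask(mask) {}

      // Hypothetical bulk interface: process `blocks` consecutive
      // blocks from in[] to out[] in a single call.
      void encrypt_n(const std::uint8_t in[], std::uint8_t out[],
                     std::size_t blocks) const
         {
         for(std::size_t i = 0; i != blocks * BLOCK_SIZE; ++i)
            out[i] = static_cast<std::uint8_t>(in[i] ^ m_mask);
         }

      // Single-block call expressed in terms of the bulk one.
      void encrypt(const std::uint8_t in[], std::uint8_t out[]) const
         {
         encrypt_n(in, out, 1);
         }

   private:
      std::uint8_t m_mask;
   };
```

A mode such as counter mode can then generate several counter blocks
up front and hand them to encrypt_n in one call.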
|
| |
up during the Fedora submission review, that each source file include some
text about the license. One handy Perl script later and each file now has
the line
Distributed under the terms of the Botan license
after the copyright notices.
While I was in there modifying every file anyway, I also stripped out the
remainder of the block comments (lots of asterisks before and after the
text); this is a stylistic thing I picked up when I was first learning C++
but in retrospect it is not a good style as the structure makes it harder
to modify comments (with the result that comments become fewer, shorter and
are less likely to be updated, which are not good things).
|
| |
| |
encryption.
|
| |
This seems to have a significant impact on overall speed, now measuring
on my Core2 Q6600:
AES-128: 123.41 MiB/sec
AES-192: 108.28 MiB/sec
AES-256: 95.72 MiB/sec
which is roughly 8-10% faster than before.
|
| |
|
| |
| |
Before:
$ ./check --bench-algo=AES-128,AES-256 --seconds=10
AES-128: 101.99 MiB/sec
AES-256: 78.30 MiB/sec
After:
$ ./check --bench-algo=AES-128,AES-256 --seconds=10
AES-128: 106.51 MiB/sec
AES-256: 84.26 MiB/sec
|
| |
| |
conflicts/collisions
|