| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
at config time.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sets the block size statically and also creates an enum with the
size. Use the enum instead of calling block_size() where possible,
since that uses two virtual function calls per block which is quite
unfortunate. The real advantages here as compared to the previous
version which kept the block size as a per-object u32bit:
- The compiler can inline the constant as an immediate operand
(previously it would load the value via an indirection on this)
- Removes 32 bits per object overhead (except in cases with actually
variable block sizes, which are very few and rarely used)
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
process
|
|
|
|
|
| |
for getting access to the key schedule, instead of giving the key
schedule protected status, which is much harder tu audit.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the implementation rather than the preferred one. Update all
implementations.
Add a new function parallel_bytes() which returns
parallelism() * BLOCK_SIZE * BUILD_TIME_CONSTANT
This is because i noticed all current calls of parallelism() just
multiplied the result by the block size already, so this simplified
that code.
The build time constant is set to 4, which was the previous default
return value of parallelism(). However the SIMD versions returned
2*native paralellism rather than 4*, so this increases the buffer
sizes used for those algorithms.
The constant multiple lives in buildh.in and build.h, and is named
BOTAN_BLOCK_CIPHER_PAR_MULT.
|
|
|
|
|
|
|
|
| |
Default unless specified is now 4.
For SIMD code, use 2x the number of blocks which are processed in
parallel using SIMD by that cipher. It may make sense to increase this to
4x or even more, further experimentation is necessary.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bswap.h); too many external apps rely on loadstor.h existing.
Define 64-bit generic bswap in terms of 32-bit bswap, since it's
not much slower if 32-bit is also generic, and much faster if
it's not. This may be quite helpful on 32-bit x86 in particular.
Change formulation of generic 32-bit bswap. It may be faster or
slower depending on the CPU, especially the latency and throuput
of rotate instructions, but should be faster on an ideally
superscalar processor with rotate instructions (ie, what I expect
future CPUs to look more like).
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes for the amalgamation generator for internal headers.
Remove BOTAN_DLL exporting macros from all internal-only headers;
the classes/functions there don't need to be exported, and
avoiding the PIC/GOT indirection can be a big win.
Add missing BOTAN_DLLs where necessary, mostly gfpmath and cvc
For GCC, use -fvisibility=hidden and set BOTAN_DLL to the
visibility __attribute__ to export those classes/functions.
|
| |
|
| |
|
|
and Altivec (though Altivec is seemingly slower ATM...)
|