| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
the low bytes were equal, then the saturating subtraction result in
that byte would be 0 with the high byte containing a non-zero value.
To deal with this, shift and or together the two values into the low
byte.
Add some new tests which check out the SIMD implementation more
carefully, including values that trigger the problem in the earlier
version.
|
|
|
|
|
|
|
|
|
|
| |
working correctly under Clang - the technique for emulating unsigned
compare relied on signed overflow. The new method does not, and works
under GCC, ICC, and Clang. Even better, the compare takes only 2
instructions instead of 4.
Prevent using any of the asm implementations under Clang on x86-32.
All of them crash under Clang 2.9, unclear why.
|
|
|
|
| |
added to the flags here.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sets the block size statically and also creates an enum with the
size. Use the enum instead of calling block_size() where possible,
since that uses two virtual function calls per block which is quite
unfortunate. The real advantages here as compared to the previous
version which kept the block size as a per-object u32bit:
- The compiler can inline the constant as an immediate operand
(previously it would load the value via an indirection on this)
- Removes 32 bits per object overhead (except in cases with actually
variable block sizes, which are very few and rarely used)
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
process
|
|
|
|
|
| |
for getting access to the key schedule, instead of giving the key
schedule protected status, which is much harder tu audit.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the implementation rather than the preferred one. Update all
implementations.
Add a new function parallel_bytes() which returns
parallelism() * BLOCK_SIZE * BUILD_TIME_CONSTANT
This is because i noticed all current calls of parallelism() just
multiplied the result by the block size already, so this simplified
that code.
The build time constant is set to 4, which was the previous default
return value of parallelism(). However the SIMD versions returned
2*native paralellism rather than 4*, so this increases the buffer
sizes used for those algorithms.
The constant multiple lives in buildh.in and build.h, and is named
BOTAN_BLOCK_CIPHER_PAR_MULT.
|
|
|
|
|
|
|
|
| |
Default unless specified is now 4.
For SIMD code, use 2x the number of blocks which are processed in
parallel using SIMD by that cipher. It may make sense to increase this to
4x or even more, further experimentation is necessary.
|
| |
|
|
faster than the scalar version on a Core2.
|