| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
| |
Doing two blocks at a time exposes more ILP and substantially
improves performance.
Idea from http://jultika.oulu.fi/files/nbnfioulu-201305311409.pdf
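The effect of processing two blocks at once can be sketched as below. This is a toy ARX permutation, not the cipher this commit touches; every name here (`toy_round`, `toy_encrypt`, `toy_encrypt_x2`, `toy_keys`) is hypothetical. Within one block each step depends on the previous one, so interleaving two independent blocks gives the CPU two dependency chains to overlap.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Round_Keys = std::array<uint32_t, 8>;

// Arbitrary fixed round keys, for illustration only.
inline Round_Keys toy_keys()
{
    return Round_Keys{{1, 2, 3, 4, 5, 6, 7, 8}};
}

inline uint32_t rotl32(uint32_t x, unsigned r)  // r must be in 1..31
{
    return (x << r) | (x >> (32 - r));
}

// One round of a toy ARX permutation (assumption: not a real cipher).
// Each step depends on the previous one, forming a serial chain.
inline void toy_round(uint32_t& a, uint32_t& b, uint32_t k)
{
    a += b;
    a = rotl32(a, 7);
    b ^= a;
    b += k;
}

// Baseline: one 64-bit block at a time.
inline uint64_t toy_encrypt(uint64_t block, const Round_Keys& rk)
{
    uint32_t a = static_cast<uint32_t>(block >> 32);
    uint32_t b = static_cast<uint32_t>(block);
    for(size_t r = 0; r != rk.size(); ++r)
        toy_round(a, b, rk[r]);
    return (static_cast<uint64_t>(a) << 32) | b;
}

// Two independent blocks interleaved round by round. The two chains share
// nothing but the round key, so their instructions can execute in parallel.
inline std::array<uint64_t, 2> toy_encrypt_x2(uint64_t block0, uint64_t block1,
                                              const Round_Keys& rk)
{
    uint32_t a0 = static_cast<uint32_t>(block0 >> 32);
    uint32_t b0 = static_cast<uint32_t>(block0);
    uint32_t a1 = static_cast<uint32_t>(block1 >> 32);
    uint32_t b1 = static_cast<uint32_t>(block1);
    for(size_t r = 0; r != rk.size(); ++r)
    {
        toy_round(a0, b0, rk[r]);
        toy_round(a1, b1, rk[r]);
    }
    return std::array<uint64_t, 2>{{(static_cast<uint64_t>(a0) << 32) | b0,
                                    (static_cast<uint64_t>(a1) << 32) | b1}};
}
```

Both paths produce identical outputs; any speedup comes only from instruction scheduling and would need to be measured on the target CPU.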
|
| |
|
|
|
|
| |
Clearly I have a tic for this.
|
|
|
|
| |
Needed for the create calls
|
|
|
|
|
| |
Previously calling update or encrypt without calling set_key first
would result in invalid outputs or else crashing.
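A minimal sketch of this kind of guard, assuming a hypothetical `Toy_Cipher` class with a private `verify_key_set()` helper (neither name is taken from the commit, and the XOR "encryption" is a placeholder, not a real cipher):

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical class for illustration only.
class Toy_Cipher
{
public:
    void set_key(const std::vector<uint8_t>& key)
    {
        if(key.empty())
            throw std::invalid_argument("empty key");
        m_key = key;
    }

    std::vector<uint8_t> encrypt(const std::vector<uint8_t>& in) const
    {
        // Fail loudly instead of producing garbage or indexing an empty key.
        verify_key_set();
        std::vector<uint8_t> out(in.size());
        for(size_t i = 0; i != in.size(); ++i)
            out[i] = in[i] ^ m_key[i % m_key.size()];  // placeholder transform
        return out;
    }

private:
    void verify_key_set() const
    {
        if(m_key.empty())
            throw std::logic_error("key not set");
    }

    std::vector<uint8_t> m_key;
};
```

Without the check, `encrypt` on a fresh object would compute `i % 0`, which is undefined behavior; with it, the caller gets an exception.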
|
| |
|
|
|
|
| |
This ended up allocating 256 KiB!
|
| |
|
|
|
|
|
| |
This improves performance by ~0.5 cycles/byte. Also it ensures that
our cache reading countermeasure works as expected.
|
|
|
|
|
|
|
|
|
| |
Should have significantly better cache characteristics, though it
would be nice to verify this.
It reduces performance somewhat but less than I expected, at least
on Skylake. I need to check this across more platforms to make sure
it won't hurt too badly.
|
|
|
|
|
|
|
|
|
| |
Using a larger table helps quite a bit. Using 4 tables (à la AES T-tables)
didn't seem to help much at all; it's only slightly faster than a single
table with rotations.
Continue to use the 8 bit table in the first and last rounds as a
countermeasure against cache attacks.
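The trade-off between four tables and one table with rotations can be sketched as follows. The table contents are arbitrary and all names (`make_base_table`, `mix_one_table`, `mix_four_tables`, `layouts_agree`) are hypothetical; the point is only that the three extra tables are byte rotations of the first, so a single table plus rotate instructions yields the same value with a quarter of the cache footprint. The 8-bit table used in the first and last rounds is not shown.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Table = std::array<uint32_t, 256>;

inline uint32_t rotr32(uint32_t x, unsigned r)  // r must be in 1..31
{
    return (x >> r) | (x << (32 - r));
}

// Arbitrary toy contents (assumption: not a real cipher table).
inline Table make_base_table()
{
    Table t{};
    for(uint32_t i = 0; i != 256; ++i)
        t[i] = (i * 0x01010101u) ^ (i << 7) ^ 0x9E3779B9u;
    return t;
}

// Precompute a copy of the base table with every entry rotated right by r.
inline Table make_rotated_table(const Table& base, unsigned r)
{
    Table t{};
    for(size_t i = 0; i != 256; ++i)
        t[i] = rotr32(base[i], r);
    return t;
}

// Single 1 KiB table; the rotations are done at lookup time.
inline uint32_t mix_one_table(const Table& T, uint32_t x)
{
    return T[x >> 24] ^
           rotr32(T[(x >> 16) & 0xFF], 8) ^
           rotr32(T[(x >> 8) & 0xFF], 16) ^
           rotr32(T[x & 0xFF], 24);
}

// Four tables (4 KiB total); no rotations, but more cache lines touched.
inline uint32_t mix_four_tables(const Table& T0, const Table& T1,
                                const Table& T2, const Table& T3, uint32_t x)
{
    return T0[x >> 24] ^ T1[(x >> 16) & 0xFF] ^ T2[(x >> 8) & 0xFF] ^ T3[x & 0xFF];
}

// Check that both layouts compute the same word for a given input.
inline bool layouts_agree(uint32_t x)
{
    const Table T0 = make_base_table();
    const Table T1 = make_rotated_table(T0, 8);
    const Table T2 = make_rotated_table(T0, 16);
    const Table T3 = make_rotated_table(T0, 24);
    return mix_one_table(T0, x) == mix_four_tables(T0, T1, T2, T3, x);
}
```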
|
|
|
|
|
| |
Missed by everything but the OCB wide tests because most ciphers
have fixed width and get the override.
|
|
|
|
| |
GCC 7 can actually vectorize this for AVX2
|
|
|
|
| |
From ~5 cpb to ~2.5 cpb on Skylake
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The problem with asm rol/ror is the compiler can't schedule effectively.
But we only need asm in the case when the rotation is variable, so distinguish
the two cases. If a compile time constant, then static_assert that the rotation
is in the correct range and do the straightforward expression knowing the compiler
will probably do the right thing. Otherwise do a tricky expression that both
GCC and Clang happen to recognize. Avoid the reduction case; instead
require that the rotation be in range (this reverts 2b37c13dcf).
Remove the asm rotations (making this branch ill-named), because now both Clang
and GCC will create a roll without any extra help.
Remove the reduction/mask by the word size for the variable case. The compiler
can't optimize it out well, but it's easy to ensure it is valid in the callers,
especially now that the variable input cases are easy to grep for.
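A sketch of the two cases described above, for 32-bit words only. The names `rotl` and `rotl_var` are illustrative, and the variable-rotation expression is one form that compilers commonly recognize as a rotate; it is an assumption that this matches what the commit actually uses.

```cpp
#include <cstddef>
#include <cstdint>

// Compile-time constant rotation: the range is enforced statically, and the
// plain shift/or expression is left for the compiler to turn into a rotate.
template <size_t ROT>
inline uint32_t rotl(uint32_t x)
{
    static_assert(ROT > 0 && ROT < 32, "rotation must be in 1..31");
    return (x << ROT) | (x >> (32 - ROT));
}

// Runtime rotation: no reduction modulo the word size is performed.
// The caller must guarantee rot < 32; the conditional only avoids the
// undefined shift by 32 when rot == 0.
inline uint32_t rotl_var(uint32_t x, size_t rot)
{
    return rot ? ((x << rot) | (x >> (32 - rot))) : x;
}
```

Splitting the API this way also makes every variable rotation easy to find with grep, since those call sites use a distinct function name.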
|
|
|
|
| |
Nothing major but probably good to clean these up.
|
|
|
|
|
| |
Things like -Wconversion and -Wuseless-cast that are noisy and
not on by default.
|
|
|
|
| |
[ci skip]
|
|
|
|
| |
Sonar
|
|
|
|
| |
Found with Sonar
|
|
|
|
|
|
| |
Mostly residue from the old system of splitting impls among subclasses
Found with Sonar
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Done by a perl script which converted all classes to final, followed
by selective reversion where it caused compilation failures.
|
|
|
|
| |
Some help from include-what-you-use
|
|
|
|
| |
[ci skip]
|
|
|
|
|
|
| |
ISO C++ reserves names with double underscores in them
Closes #512
|
| |
|
|
|
|
|
| |
Defined in build.h, all equal to BOTAN_DLL so ties into existing
system for exporting symbols.
|
| |
|
|
|
|
| |
Based on the patch in GH #1146
|
|
|
|
| |
Based on VC2017 output
|
|
|
|
| |
Remove NEON support, replace macros with inlines
|
| |
|
|
|
|
| |
GH #1077
|
|
|
|
|
|
| |
Using _mm_set_epi32 caused 2 distinct (adjacent) loads followed
by an unpack to combine the registers. Have not tested on hardware
to see if this actually improves performance.
|
|
|
|
|
|
| |
Combine several shuffle operations into one. Thanks to jww for the hint.
Probably not noticeably faster on any system.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
It complains it cannot pass the __m128i without loss of alignment.
(Why, I have no idea.)
|
|
|
|
| |
Bit over 2x faster on my desktop
|
|
|
|
| |
256 bit ARX block cipher with hardware support, what's not to love.
|
|
|
|
| |
This work was sponsored by Ribose Inc
|
|
|
|
|
|
| |
Allow an empty nonce to mean "continue using the current cipher state".
GH #864
|
|
|
|
|
|
|
|
|
| |
* fixes for constructs deprecated in C++11 and later (explicit rule of 3/5 or implicit rule of 0, and other violations)
* `default` specifier instead of `{}` in some places (probably all)
* removal of unreachable code (for example `return` after `throw`)
* removal of functions visible only within a compilation unit but never used
* fix for the `throw()` specifier: `BOTAN_NOEXCEPT` is used instead
* removed unneeded semicolons
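A few of the cleanups listed above, sketched on a hypothetical class (`Named_Object` and `checked_index` are illustrative names, not code from this change):

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical class for illustration only.
class Named_Object
{
public:
    explicit Named_Object(std::string name) : m_name(std::move(name)) {}

    // `= default` instead of an empty `{}` body.
    virtual ~Named_Object() = default;

    // Declaring a destructor makes the implicit copy operations deprecated
    // and suppresses the implicit moves, so all four are spelled out.
    Named_Object(const Named_Object&) = default;
    Named_Object& operator=(const Named_Object&) = default;
    Named_Object(Named_Object&&) = default;
    Named_Object& operator=(Named_Object&&) = default;

    // No stray semicolon after the function body.
    const std::string& name() const { return m_name; }

private:
    std::string m_name;
};

inline size_t checked_index(size_t i, size_t n)
{
    if(i >= n)
        throw std::out_of_range("index out of range");
    // No unreachable `return` directly after the `throw` above.
    return i;
}
```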
|
| |
|