| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Previously --disable-sse2/--disable-ssse3 would not work as expected
| |
Closes #2089
| |
Closes #2082
| |
BearSSL is much slower than Botan's builtins, and it is not commonly
included in distributions so doesn't even have the advantage of
ubiquity.
| |
Not sure why this wasn't causing an error in the MSVC CI builds.
| |
New redundant-move and pessimizing-move warnings found some
| |
As that is the proper name of the hash. Add a typedef for compat.
| |
Improves performance by about 10-12%
| |
See #1822
| |
Both about 33% faster on Skylake
| |
It was only needed for one case which is easily hardcoded. Include
rotate.h in all the source files that actually use rotr/rotl but
previously picked it up implicitly via the loadstor.h -> bswap.h ->
rotate.h include chain.
| | |
Allows replacing div/mod by a variable with a shift/mask.
Allows storing just the bit count, which saves a few bytes.
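A minimal sketch of the idea, assuming the block size is always a power of two. The class name `BlockGeometry` and its members are hypothetical and are not taken from the library; the point is only that storing the bit count turns division and modulo by a runtime value into a shift and a mask.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: stores log2(block size) instead of the block size.
class BlockGeometry
   {
   public:
      explicit BlockGeometry(uint8_t block_bits) : m_block_bits(block_bits) {}

      // Block size in bytes, recovered from the bit count
      size_t block_size() const { return static_cast<size_t>(1) << m_block_bits; }

      // Same result as len / block_size(), but a shift
      size_t full_blocks(size_t len) const { return len >> m_block_bits; }

      // Same result as len % block_size(), but a mask
      size_t remainder(size_t len) const { return len & (block_size() - 1); }

   private:
      uint8_t m_block_bits; // one byte instead of a full size_t
   };
```

For a 64-byte block, `BlockGeometry(6)` gives `full_blocks(200) == 3` and `remainder(200) == 8`.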
| |
Prefer using wrappers in mem_utils for this.
The current exception is where memcpy is being used to convert between
two different types, since copy_mem requires that the input and output
pointers have the same type. There should be a new function to
handle the conversion-via-memcpy operation.
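One possible shape for such a function is sketched below. The names `bytes_to_words` and `words_to_bytes` are hypothetical and are not part of mem_utils; the sketch only assumes the word type is trivially copyable, so that copying its object representation with memcpy is well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical wrapper: fill n words of type T from a byte buffer.
// The byte order of the result is that of the host CPU.
template<typename T>
inline void bytes_to_words(T out[], const uint8_t in[], size_t n)
   {
   static_assert(std::is_trivially_copyable<T>::value, "memcpy requires a trivially copyable type");
   std::memcpy(out, in, sizeof(T) * n);
   }

// Hypothetical wrapper: the reverse direction, n words into a byte buffer.
template<typename T>
inline void words_to_bytes(uint8_t out[], const T in[], size_t n)
   {
   static_assert(std::is_trivially_copyable<T>::value, "memcpy requires a trivially copyable type");
   std::memcpy(out, in, sizeof(T) * n);
   }
```

Keeping the two directions as separate named functions makes the remaining type-converting copies easy to find, while same-type copies can keep going through copy_mem.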
| |
Typically not a bottleneck but this shows up in XMSS profiling
| |
Currently just a copy of the baseline compression function, but
compiled with BMI2 flags. On Skylake improves performance by about 40%.
| |
GH #1477
| |
Noticeable speedup for SHAKE, especially with longer output lengths
| |
Put all the statics at beginning followed by member functions.
| |
Inspired by #1433
| |
Lack of these broke single file amalgamation (GH #1386)
| |
Was fixed in 2017 SP1. Same bug hit Crypto++ -
https://github.com/weidai11/cryptopp/issues/527
| |
This breaks how we determine the ISA flags for amalgamation files.
The code for doing that is kind of a hack but I don't want to mess
with it right now, easier to just rename the ISA internally.
| |
Simplifies macro generation
| |
These conflict with the names of temp variables and MSVC gets noisy.
| | |
The algorithm uses 4 tables of precalculated CRC24 values, which lets it
process 32 bits of data in parallel. This trick doubles performance.
Further improvements are possible.
Results (tested with RNP), processing 1 GB of armor data:
```
OLD: rnp --enarmor=msg /tmp/1gb.rnd --output 4.48s user 0.89s system 98% cpu 5.429 total
NEW: rnp --enarmor=msg /tmp/1gb.rnd --output 2.38s user 0.86s system 79% cpu 4.089 total
OLD: rnp --dearmor out.xxx --output out.d 5.58s user 0.65s system 98% cpu 6.338 total
NEW: rnp --dearmor out.xxx --output out.d 3.28s user 0.84s system 96% cpu 4.275 total
```
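The four-table technique described above can be sketched as follows. This is an illustrative slicing-by-4 implementation and not the code from the commit: the namespace and function names are hypothetical, and as a simplifying assumption the 24-bit state is kept left-aligned in a 32-bit word (polynomial and initial value shifted left by 8). The constants 0x864CFB and 0xB704CE are the OpenPGP CRC24 polynomial and initial value from RFC 4880.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

namespace crc24_sketch {

const uint32_t POLY = 0x864CFB00; // OpenPGP polynomial, shifted left by 8
const uint32_t INIT = 0xB704CE00; // OpenPGP initial value, shifted left by 8

typedef std::array<std::array<uint32_t, 256>, 4> Tables;

// T[0][b] is the state after feeding byte b into a zero state;
// T[k][b] is the same byte followed by k zero bytes.
inline Tables make_tables()
   {
   Tables t{};
   for(uint32_t i = 0; i != 256; ++i)
      {
      uint32_t c = i << 24;
      for(size_t bit = 0; bit != 8; ++bit)
         c = (c & 0x80000000) ? ((c << 1) ^ POLY) : (c << 1);
      t[0][i] = c;
      }
   for(size_t k = 1; k != 4; ++k)
      for(size_t i = 0; i != 256; ++i)
         t[k][i] = (t[k-1][i] << 8) ^ t[0][t[k-1][i] >> 24];
   return t;
   }

inline uint32_t crc24(const uint8_t in[], size_t len)
   {
   static const Tables T = make_tables();
   uint32_t crc = INIT;

   // Main loop: fold in 32 bits per iteration using four independent lookups
   while(len >= 4)
      {
      crc ^= (uint32_t(in[0]) << 24) | (uint32_t(in[1]) << 16) |
             (uint32_t(in[2]) << 8) | uint32_t(in[3]);
      crc = T[3][crc >> 24] ^ T[2][(crc >> 16) & 0xFF] ^
            T[1][(crc >> 8) & 0xFF] ^ T[0][crc & 0xFF];
      in += 4;
      len -= 4;
      }

   // Tail: ordinary byte-at-a-time update
   while(len > 0)
      {
      crc = (crc << 8) ^ T[0][(crc >> 24) ^ *in];
      ++in;
      --len;
      }

   return crc >> 8; // drop the alignment shift to get the 24-bit value
   }

// Bit-at-a-time reference following the RFC 4880 sample code, for comparison
inline uint32_t crc24_bitwise(const uint8_t in[], size_t len)
   {
   uint32_t crc = 0xB704CE;
   for(size_t i = 0; i != len; ++i)
      {
      crc ^= uint32_t(in[i]) << 16;
      for(size_t bit = 0; bit != 8; ++bit)
         {
         crc <<= 1;
         if(crc & 0x1000000)
            crc ^= 0x1864CFB;
         }
      }
   return crc & 0xFFFFFF;
   }

}
```

The four lookups in the main loop do not depend on each other, which is what allows a CPU to overlap them; the byte-at-a-time loop has a serial dependency on every byte.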
| |
Needed for the create calls
| |
Nothing major but does improve perf for large buffers from
910 MB/s to 970 MB/s on Skylake.
| |
Reduces stack usage and is a bit faster
| |
The problem with asm rol/ror is the compiler can't schedule effectively.
But we only need asm in the case when the rotation is variable, so distinguish
the two cases. If a compile time constant, then static_assert that the rotation
is in the correct range and do the straightforward expression knowing the compiler
will probably do the right thing. Otherwise do a tricky expression that both
GCC and Clang happen to recognize. Avoid the reduction case; instead
require that the rotation be in range (this reverts 2b37c13dcf).
Remove the asm rotations (making this branch ill-named), because now both Clang
and GCC will create a roll without any extra help.
Remove the reduction/mask by the word size for the variable case. The compiler
can't optimize it out well, but it's easy to ensure it is valid in the callers,
especially now that the variable input cases are easy to grep for.
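The split between the two cases might look roughly like the sketch below. The function names and the exact variable-rotation expression are assumptions and are not copied from the commit; the zero check in the variable case is one common way to avoid an undefined shift by the full word width, and callers are still expected to pass a rotation smaller than the word size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Rotation known at compile time: range is checked by static_assert
template<size_t ROT, typename T>
inline T rotl(T input)
   {
   static_assert(ROT > 0 && ROT < 8 * sizeof(T), "Invalid rotation constant");
   return static_cast<T>((input << ROT) | (input >> (8 * sizeof(T) - ROT)));
   }

template<size_t ROT, typename T>
inline T rotr(T input)
   {
   static_assert(ROT > 0 && ROT < 8 * sizeof(T), "Invalid rotation constant");
   return static_cast<T>((input >> ROT) | (input << (8 * sizeof(T) - ROT)));
   }

// Rotation known only at runtime: no reduction by the word size is done,
// so the caller must guarantee rot < 8 * sizeof(T)
template<typename T>
inline T rotl_var(T input, size_t rot)
   {
   return rot ? static_cast<T>((input << rot) | (input >> (8 * sizeof(T) - rot))) : input;
   }

template<typename T>
inline T rotr_var(T input, size_t rot)
   {
   return rot ? static_cast<T>((input >> rot) | (input << (8 * sizeof(T) - rot))) : input;
   }
```

With distinct names for the variable case, every call site that needs a range check on its rotation amount can be found by searching for `rotl_var` and `rotr_var`.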
|