| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
The numbers in #256 suggest that it does nothing at all for performance.
|
|
|
|
|
| |
Turns out to be a pessimization - removing it improves ECDSA verify
by up to 5% on Skylake.
|
| |
|
| |
|
|
|
|
|
|
|
| |
This improves strong prime generation slightly, as otherwise we perform
two (redundant) Lucas checks on q: first when generating q with weak
probability, and then a second time when doing the strong confirmation
of q if 2*q+1 turns out to be prime.
|
|\ |
|
| |
| |
| |
| |
| | |
Previous version leaked some (minimal) information from the loop
bounds.
|
|/
|
|
|
|
|
|
|
|
|
|
| |
In RSA keygen we have to verify that p-1 and e are coprime, but this
is expensive to compute. So first do a single round of the Miller-Rabin
primality test; only if that passes do we test coprimality. Improves
RSA keygen times notably. All times averaged over many keygens:
  1024-bit   21.74 ms ->  10.78 ms
  2048-bit   94.93 ms ->  62.80 ms
  3072-bit  296.79 ms -> 198.12 ms
  4096-bit  738.07 ms -> 499.10 ms
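The ordering described above can be sketched with toy 32-bit integers. This is only an illustration under stated assumptions, not the library's code: the names `pow_mod`, `miller_rabin_round`, `gcd_u32` and `candidate_ok` are hypothetical, the Miller-Rabin base is fixed rather than random, and at this size the gcd is not actually the expensive step (it is with multiprecision, constant-time arithmetic).

```cpp
#include <cstdint>

// Toy modular exponentiation; 64-bit intermediates avoid overflow.
static uint32_t pow_mod(uint32_t base, uint32_t exp, uint32_t mod) {
   uint64_t result = 1 % mod;
   uint64_t b = base % mod;
   while(exp > 0) {
      if(exp & 1) { result = (result * b) % mod; }
      b = (b * b) % mod;
      exp >>= 1;
   }
   return static_cast<uint32_t>(result);
}

// One Miller-Rabin round with a caller-chosen base (assumption: a real
// implementation would pick the base at random).
static bool miller_rabin_round(uint32_t n, uint32_t base) {
   if(n < 3 || n % 2 == 0) { return n == 2; }
   uint32_t d = n - 1;
   unsigned s = 0;
   while(d % 2 == 0) { d /= 2; ++s; }
   uint64_t x = pow_mod(base, d, n);
   if(x == 1 || x == n - 1) { return true; }
   for(unsigned i = 1; i < s; ++i) {
      x = (x * x) % n;
      if(x == n - 1) { return true; }
   }
   return false;
}

// Plain Euclid; stands in for the costly coprimality test.
static uint32_t gcd_u32(uint32_t a, uint32_t b) {
   while(b != 0) { uint32_t t = a % b; a = b; b = t; }
   return a;
}

// Cheap filter first: most composite candidates are rejected here,
// so the coprimality test only runs on the survivors.
bool candidate_ok(uint32_t p, uint32_t e) {
   if(!miller_rabin_round(p, 2)) { return false; }
   return gcd_u32(p - 1, e) == 1;
}
```

With e = 3, the candidate 11 passes both checks, 7 passes the primality round but fails coprimality (6 and 3 share a factor), and 15 is rejected before the gcd is ever computed.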
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Caused an extra allocation for no reason in some cases.
|
|
|
|
| |
Based on profiling RSA key generation
|
| |
|
|
|
|
|
| |
On its own gives a modest speedup (3-5%) to RSA sign/decrypt, and it
is needed for another more complicated optimization.
|
|
|
|
| |
I think this is a false positive but whatever
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Create BOTAN_DEPRECATED_HEADER so we can warn about this consistently.
Shuffle around the filter headers so all of the concrete filters
are defined in filters.h instead of being spread across many headers.
Document which headers are deprecated as well as a list of headers which
will be made internal-only in a future major release.
|
|
|
|
| |
Fix a few minor issues found thereby
|
| |
|
|
|
|
| |
Add a checker script.
|
| |
|
| |
|
|
|
|
| |
Deprecate some crufty functions. Optimize binary encoding/decoding.
|
|
|
|
| |
No real bugs, but pointed out some odd constructs and duplicated logic
|
|
|
|
| |
Assumed to be 0/1
|
|
|
|
|
| |
Use ct_is_zero instead of more complicated construction, and
avoid duplicated size check/resize - Data::set_word will handle it.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Long ago, when I wrote the Barrett code, I must have missed that
Barrett works for any input < 2^2k where k is the word size of the
modulus. Fixing this has several nice effects: it is faster because it
replaces a multiprecision comparison with a single size_t compare, and
now the branch does not reveal information about the input or modulus,
but only their word lengths, which is not considered sensitive.
Fixing this allows reverting the change made in a57ce5a4fd2 and now
RSA signing is even slightly faster than in 2.8, rather than 30% slower.
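A minimal sketch of the kind of check described above, assuming little-endian vectors of 64-bit words. The helper names `sig_words` and `barrett_fast_path` are hypothetical and this is not the library's code. The point is that the decision depends only on word counts, never on the values of the words.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of significant words, ignoring high zero words
// (assumption: least significant word is stored first).
std::size_t sig_words(const std::vector<uint64_t>& x) {
   std::size_t n = x.size();
   while(n > 0 && x[n - 1] == 0) { --n; }
   return n;
}

// An input that fits in 2*k words, where k is the word length of the
// modulus, is inside the range Barrett reduction can handle. This is a
// single size_t compare instead of a multiprecision comparison.
bool barrett_fast_path(const std::vector<uint64_t>& x,
                       const std::vector<uint64_t>& modulus) {
   return sig_words(x) <= 2 * sig_words(modulus);
}
```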
|
|
|
|
| |
As it would leak if an input was > p^2, or just close to it in size.
|
| |
|
|
|
|
|
|
|
|
| |
Was already done in P-256 but not in P-{192,224,384}.
This is a cache-based side channel which would be good to address. It
seems like it would be very difficult to exploit even with perfect
recovery, but crazier things have worked.
|
|
|
|
|
| |
Previously we unpoisoned the input to high_bit, but this is no
longer required. The output should still be unpoisoned.
|
|
|
|
|
|
|
| |
They get compiled as const-time on x86-64 with GCC but I don't think
this can be totally relied on. But it is anyway an improvement.
It is also faster, because we compute it recursively.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The decoding leaked some information about the delimiter index
due to copying only exactly input_len - delim_idx bytes. I can't
articulate a specific attack that would work here, but it is easy
enough to fix this to run in const time instead, where all bytes
are accessed regardless of the length of the padding.
CT::copy_out is O(n^2) and thus terrible, but in practice it is only
used with RSA decryption, and multiplication is also O(n^2) with the
modulus size, so a few extra cycles here don't matter much.
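To illustrate the idea, here is a sketch of a quadratic-time copy that touches every input byte regardless of the secret offset. It is not the CT::copy_out implementation; the names `ct_eq_mask` and `ct_copy_out_sketch` are hypothetical, and whether a given compiler keeps the code branch-free would have to be checked on the generated assembly.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns 0xFF if a == b, else 0x00, without branching on the values.
static uint8_t ct_eq_mask(uint64_t a, uint64_t b) {
   const uint64_t x = a ^ b;
   // Top bit of (x | -x) is set exactly when x != 0.
   const uint64_t is_zero = (~(x | (0 - x))) >> 63;
   return static_cast<uint8_t>(0 - is_zero);
}

// Output has the same length as the input: out[i] = in[offset + i]
// while that index is in range, and 0 afterwards. Every (i, j) pair is
// visited, so the memory access pattern does not depend on offset.
std::vector<uint8_t> ct_copy_out_sketch(const std::vector<uint8_t>& in,
                                        std::size_t offset) {
   const std::size_t n = in.size();
   std::vector<uint8_t> out(n, 0);
   for(std::size_t i = 0; i != n; ++i) {
      for(std::size_t j = 0; j != n; ++j) {
         out[i] |= static_cast<uint8_t>(in[j] & ct_eq_mask(j, offset + i));
      }
   }
   return out;
}
```

The caller would still need to learn the true output length somehow; this sketch leaves that out and only shows the data movement.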
|
|
|
|
|
| |
We know the lookup table size is some power of 2; unrolling a bit
allows more IPC.
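A sketch of what such an unrolled masked lookup might look like, assuming a table of 32-bit entries whose length is a power of 2 and at least 2. The names `lookup_eq_mask` and `ct_lookup_unrolled` are hypothetical and this is not the code from the commit. Two independent accumulators let the two halves of each iteration proceed in parallel.

```cpp
#include <cstddef>
#include <cstdint>

// Returns all-ones if a == b, else zero, without branching on the values.
static uint32_t lookup_eq_mask(uint64_t a, uint64_t b) {
   const uint64_t x = a ^ b;
   const uint64_t is_zero = (~(x | (0 - x))) >> 63;
   return static_cast<uint32_t>(0 - is_zero);
}

// Reads every table entry and keeps only the one at secret_idx.
// Assumption: n is a power of 2 and n >= 2, so stepping by 2 never
// runs past the end of the table.
uint32_t ct_lookup_unrolled(const uint32_t* table, std::size_t n,
                            std::size_t secret_idx) {
   uint32_t r0 = 0;
   uint32_t r1 = 0;
   for(std::size_t i = 0; i != n; i += 2) {
      r0 |= table[i] & lookup_eq_mask(i, secret_idx);
      r1 |= table[i + 1] & lookup_eq_mask(i + 1, secret_idx);
   }
   return r0 | r1;
}
```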
|
|\ |
|
| |
| |
| |
| |
| | |
Previous EEA leaked information about the low word of the prime,
which is a problem for RSA.
|
| | |
|
|/
|
|
|
|
| |
Instead require the inputs be reduced already. For RSA-CRT use
Barrett which is const time already. For SRP6, inputs were not reduced,
so use the Barrett hook available in DL_Group.
|
|
|
|
| |
This var is not used if we use Baillie-PSW instead
|
|
|
|
| |
This is a tiny thing but it saves over 100K cycles for P-384 ECDSA
|
|\ |
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
They would previously leak, for example, if the requested shift was 0.
However, that should only happen in two situations: very dumb code
explicitly requested a shift of zero (in which case we don't care if
performance is poor, your code is dumb) or a variable shift that just
happens to be zero, in which case the variable may be a secret; for
instance, this can be seen in the GCD computation.
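One way to avoid special-casing a zero shift is sketched below. This uses a split-shift trick and is not necessarily how the commit does it; the name `shl_bits_sketch` is hypothetical, and words are assumed to be 64 bits, stored least significant first, with the shift in the range 0..63. A direct `w >> (64 - shift)` would be undefined for shift == 0, which is what tempts code into an early-exit branch.

```cpp
#include <cstdint>
#include <vector>

// Shifts a multiword integer left by 0..63 bits with the same sequence
// of operations for every shift amount, including zero. Bits shifted
// out of the top word are dropped.
std::vector<uint64_t> shl_bits_sketch(std::vector<uint64_t> x, unsigned shift) {
   uint64_t carry = 0;
   for(auto& w : x) {
      // Equivalent to w >> (64 - shift), but well defined when shift
      // is 0 (the result is then 0).
      const uint64_t next = (w >> 1) >> (63 - shift);
      w = (w << shift) | carry;
      carry = next;
   }
   return x;
}
```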
|
| | |
|
|\ \
| |/
|/| |
|