Commit messages
- Fix a few minor issues found thereby.
- Add a checker script.
- Deprecate some crufty functions. Optimize binary encoding/decoding.
- No real bugs, but this pointed out some odd constructs and duplicated logic.
- Assumed to be 0/1.
- Use ct_is_zero instead of a more complicated construction, and avoid the duplicated size check/resize; Data::set_word will handle it.
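  A minimal sketch of the sort of helper referred to here (illustrative names, not Botan's exact source): ct_is_zero turns "is x zero?" into an all-ones or all-zero mask without branching, which callers can AND with a value to select it.

  ```cpp
  #include <cstdint>

  // All-ones if the top bit of a is set, else zero (T unsigned).
  template <typename T>
  inline T expand_top_bit(T a) {
     return static_cast<T>(0) - (a >> (sizeof(T) * 8 - 1));
  }

  // All-ones if x == 0, else zero: ~x & (x - 1) has its top bit
  // set exactly when x is zero.
  template <typename T>
  inline T ct_is_zero(T x) {
     return expand_top_bit<T>(~x & (x - static_cast<T>(1)));
  }
  ```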
- In the long ago when I wrote the Barrett code I must have missed that Barrett works for any input < 2^2k, where k is the word length of the modulus. Fixing this has several nice effects: it is faster, because it replaces a multiprecision comparison with a single size_t compare, and the branch now reveals no information about the input or modulus, only their word lengths, which are not considered sensitive. Fixing this also allows reverting the change made in a57ce5a4fd2, and RSA signing is now even slightly faster than in 2.8, rather than 30% slower.
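  To see why only a word-count guard is needed, here is Barrett reduction in the single-word case, a sketch with illustrative names (base b = 2^16, k = 1 word; the multiword version is analogous). Any x < b^2 reduces correctly, with two masked corrections and no value-dependent branch:

  ```cpp
  #include <cstdint>

  struct BarrettParams {
     uint16_t m;   // modulus, normalized so 2^15 <= m < 2^16
     uint32_t mu;  // floor(2^32 / m), precomputed once per modulus
  };

  uint16_t barrett_reduce(uint32_t x, const BarrettParams& p) {
     // Quotient estimate q = floor(x * mu / 2^32): at most
     // floor(x / m), and too small by at most 2.
     const uint32_t q =
        static_cast<uint32_t>((static_cast<uint64_t>(x) * p.mu) >> 32);

     uint32_t r = x - q * p.m; // in [0, 3m)

     // Two masked conditional subtractions fix the estimate.
     for(int i = 0; i != 2; ++i) {
        const uint32_t borrow = (r - p.m) >> 31; // 1 iff r < m
        const uint32_t ge_mask = borrow - 1;     // all-ones iff r >= m
        r -= (ge_mask & p.m);
     }

     return static_cast<uint16_t>(r);
  }
  ```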
- As it would leak if an input was > p^2, or just close to it in size.
- Was already done in P-256 but not in P-{192,224,384}. This is a cache-based side channel which would be good to address. It seems like it would be very difficult to exploit even with perfect recovery, but crazier things have worked.
- Previously we unpoisoned the input to high_bit, but this is no longer required. The output should still be unpoisoned, though.
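  For context, a sketch of the ctgrind-style poison/unpoison idea (illustrative names and guard macro; the real code wraps the same valgrind client requests): secrets are marked undefined so valgrind flags any branch or index derived from them, and outputs must be marked defined again before callers branch on them.

  ```cpp
  #include <cstddef>
  #if defined(HAS_VALGRIND) // illustrative guard macro
    #include <valgrind/memcheck.h>
  #endif

  // Mark a buffer as secret: valgrind will then report any jump or
  // memory access whose address depends on these bytes.
  template <typename T>
  void poison(const T* p, size_t n) {
  #if defined(HAS_VALGRIND)
     VALGRIND_MAKE_MEM_UNDEFINED(p, n * sizeof(T));
  #else
     (void)p; (void)n;
  #endif
  }

  // Mark a result as public again; without this, a caller branching
  // on (say) high_bit's return value triggers a spurious report.
  template <typename T>
  void unpoison(const T* p, size_t n) {
  #if defined(HAS_VALGRIND)
     VALGRIND_MAKE_MEM_DEFINED(p, n * sizeof(T));
  #else
     (void)p; (void)n;
  #endif
  }
  ```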
- They get compiled as const-time on x86-64 with GCC, but I don't think this can be totally relied on. It is an improvement in any case. And faster, because we compute it recursively.
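  A sketch of the branch-free binary-search formulation (illustrative; repeats the ct_is_zero helper from the sketch above): each round asks via a mask whether the top half of the remaining bits is nonzero, credits that many bits, and shifts them away.

  ```cpp
  #include <cstddef>
  #include <cstdint>

  template <typename T>
  inline T expand_top_bit(T a) {
     return static_cast<T>(0) - (a >> (sizeof(T) * 8 - 1));
  }

  template <typename T>
  inline T ct_is_zero(T x) {
     return expand_top_bit<T>(~x & (x - static_cast<T>(1)));
  }

  // 1-based index of the highest set bit of n, or 0 if n == 0.
  template <typename T>
  inline size_t high_bit(T n) {
     size_t hb = 0;
     for(size_t s = 8 * sizeof(T) / 2; s > 0; s /= 2) {
        const T top = static_cast<T>(n >> s);
        // z = s if the top half is nonzero, else 0
        const size_t z = s * static_cast<size_t>((~ct_is_zero<T>(top)) & 1);
        hb += z;
        n >>= z;
     }
     return hb + static_cast<size_t>(n); // n is now 0 or 1
  }
  ```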
- The decoding leaked some information about the delimiter index, due to copying only exactly input_len - delim_idx bytes. I can't articulate a specific attack that would work here, but it is easy enough to fix this to run in constant time instead, where all bytes are accessed regardless of the length of the padding. CT::copy_out is O(n^2) and thus terrible, but in practice it is only used with RSA decryption, and multiplication is also O(n^2) in the modulus size, so a few extra cycles here don't matter much.
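  A sketch of the O(n^2) constant-time tail copy (illustrative names, not CT::copy_out's exact signature; handling of the output length is elided): each output byte is built by reading every input byte and masking in the single match, so timing and the memory trace are independent of the secret offset.

  ```cpp
  #include <cstddef>
  #include <cstdint>

  // All-ones byte iff x == y: x ^ y is zero exactly on equality,
  // and ~d & (d - 1) has its top bit set exactly when d == 0.
  inline uint8_t ct_eq_mask(size_t x, size_t y) {
     const size_t d = x ^ y;
     const size_t z = ~d & (d - 1);
     return static_cast<uint8_t>(0) -
            static_cast<uint8_t>(z >> (sizeof(size_t) * 8 - 1));
  }

  // Copy input[offset .. input_len) to output, touching every input
  // byte for every output position (hence O(n^2)).
  void ct_copy_tail(uint8_t* output, size_t output_len,
                    const uint8_t* input, size_t input_len,
                    size_t secret_offset) {
     for(size_t i = 0; i != output_len; ++i) {
        uint8_t b = 0;
        for(size_t j = 0; j != input_len; ++j) {
           b |= static_cast<uint8_t>(ct_eq_mask(j, secret_offset + i) & input[j]);
        }
        output[i] = b;
     }
  }
  ```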
- We know the lookup table size is some power of 2; unrolling a bit allows more IPC.
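  An illustrative sketch of the pattern (assumed shape, not the library's point-table code): a constant-time select reads every row and masks in the wanted one; with the table size a power of two (at least 4 here), unrolling by 4 gives independent mask/OR chains the CPU can overlap.

  ```cpp
  #include <cstddef>
  #include <cstdint>

  // All-ones word iff x == y (same construction as ct_eq_mask,
  // widened to 64 bits).
  inline uint64_t ct_eq_mask64(size_t x, size_t y) {
     const size_t d = x ^ y;
     const size_t z = ~d & (d - 1);
     return static_cast<uint64_t>(0) -
            static_cast<uint64_t>(z >> (sizeof(size_t) * 8 - 1));
  }

  // Select row secret_idx from table_size rows of `words` uint64_t
  // each, reading every row so the cache trace is index-independent.
  // table_size is a power of 2 and >= 4, so unroll by 4.
  void ct_table_select(uint64_t* out, const uint64_t* table,
                       size_t table_size, size_t words, size_t secret_idx) {
     for(size_t w = 0; w != words; ++w)
        out[w] = 0;

     for(size_t i = 0; i != table_size; i += 4) {
        const uint64_t m0 = ct_eq_mask64(i + 0, secret_idx);
        const uint64_t m1 = ct_eq_mask64(i + 1, secret_idx);
        const uint64_t m2 = ct_eq_mask64(i + 2, secret_idx);
        const uint64_t m3 = ct_eq_mask64(i + 3, secret_idx);

        const uint64_t* r0 = &table[(i + 0) * words];
        const uint64_t* r1 = &table[(i + 1) * words];
        const uint64_t* r2 = &table[(i + 2) * words];
        const uint64_t* r3 = &table[(i + 3) * words];

        for(size_t w = 0; w != words; ++w)
           out[w] |= (m0 & r0[w]) | (m1 & r1[w]) | (m2 & r2[w]) | (m3 & r3[w]);
     }
  }
  ```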
- The previous EEA (extended Euclidean algorithm) leaked information about the low word of the prime, which is a problem for RSA.
- Instead require that the inputs already be reduced. For RSA-CRT, use Barrett, which is constant-time already. For SRP6 the inputs were not reduced; use the Barrett hook available in DL_Group.
- This var is not used if we use Baillie-PSW instead.
- This is a tiny thing, but it saves over 100K cycles for P-384 ECDSA.
- They would previously leak, for example, if the requested shift was 0. However, that should only happen in two situations: either very dumb code explicitly requested a shift of zero (in which case we don't care if performance is poor, your code is dumb), or a variable shift just happens to be zero, in which case the variable may be a secret; this can be seen for instance in the GCD computation.
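  A sketch of the masking trick that removes the zero special case (illustrative, not the library's exact routine; x[0] is the least significant word): the carry term w << (64 - s) is what used to force a branch, being undefined at s == 0, so the shift amounts are kept in range and the carry is masked instead.

  ```cpp
  #include <cstddef>
  #include <cstdint>

  // Shift an n-word integer right by bit_shift bits (0 <= bit_shift
  // < 64) with no branch on the possibly-secret shift amount.
  void ct_shift_right(uint64_t* x, size_t n, size_t bit_shift) {
     // all-ones iff bit_shift != 0
     const uint64_t carry_mask =
        static_cast<uint64_t>(0) - static_cast<uint64_t>(bit_shift != 0);
     const size_t s = bit_shift % 64;

     uint64_t carry = 0;
     for(size_t i = n; i != 0; --i) {
        const uint64_t w = x[i - 1];
        x[i - 1] = (w >> s) | carry;
        // bits shifted out of w feed the next (lower) word;
        // forced to zero when s == 0
        carry = carry_mask & (w << ((64 - s) % 64));
     }
  }
  ```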
- Saves 5% for ECDSA.
- If not negative, we don't need to check the size.
- Otherwise we can end up calling the Barrett reducer with an input that is more than the square of the modulus, which will make it fall back to the (slow) constant-time division. This only affected even moduli, and only when the base was larger than the modulus. OSS-Fuzz 11750.
- This is still leaky, but much less than before.
- Unfortunately the Barrett reduction API allows negative inputs.
- This would continually reallocate to larger sizes, which is bad news.
- Originally I wrote it for div-by-word, but that ends up requiring a dword type, which we don't always have. And uint8_t covers the most important cases of n = 10 and n = 58 (whenever I get around to writing base58). We could portably support up to div-by-uint32, but I don't think we need it. Nicely, for n = 10 this is actually faster than the variable-time division.
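  A sketch of the approach (illustrative, shown on a single 64-bit word; the multiprecision version walks the bits the same way): bit-at-a-time restoring division with a masked subtract, so there is no hardware divide and no branch on the data.

  ```cpp
  #include <cstddef>
  #include <cstdint>

  // Constant-time division by a uint8_t divisor y > 0. The remainder
  // stays below 256, so 32-bit arithmetic never overflows and
  // (r - y) >> 31 is exactly the borrow bit.
  void ct_div_u8(uint64_t x, uint8_t y, uint64_t& q_out, uint8_t& r_out) {
     uint64_t q = 0;
     uint32_t r = 0;

     for(size_t i = 0; i != 64; ++i) {
        const size_t b = 63 - i;
        r = (r << 1) | static_cast<uint32_t>((x >> b) & 1);

        const uint32_t borrow = (r - y) >> 31; // 1 iff r < y
        const uint32_t ge_mask = borrow - 1;   // all-ones iff r >= y

        q |= static_cast<uint64_t>(ge_mask & 1) << b;
        r -= (ge_mask & y);
     }

     q_out = q;
     r_out = static_cast<uint8_t>(r);
  }
  ```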
- This is still leaky, but better than nothing.
- It is stupid and slow (~50-100x slower than the variable-time version), but still useful for protecting critical algorithms. Not currently used; waiting for OSS-Fuzz to test it for a while before we commit to it.
- If one of the values had leading zero words, this could end up calling bigint_sub with x_size < y_size. OSS-Fuzz 11664 and 11656.