botan.git

	Commit message	Author	Age	Files	Lines
*	Correct spelling	Jack Lloyd	2018-12-29	1	-0/+1
\|
*	Add OS::read_env_variable	Jack Lloyd	2018-12-29	3	-9/+22
\| \| \| \|	Combines the priv check and the getenv call in one.
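A minimal sketch of what combining the two steps might look like, assuming a POSIX system. The names running_privileged and read_env_variable_sketch are hypothetical and are not taken from the commit; the privilege test shown (real vs. effective IDs) is only one possible check.

```cpp
#include <cstdlib>
#include <string>
#include <unistd.h>

// Hypothetical helper: treat the process as privileged if its real and
// effective user or group IDs differ (for example a setuid binary).
bool running_privileged() {
   return (getuid() != geteuid()) || (getgid() != getegid());
}

// Hypothetical combined call: performs the privilege check and the
// getenv lookup in one place, so callers cannot forget the check.
// Returns true and fills value_out only if the process is unprivileged
// and the variable is set.
bool read_env_variable_sketch(std::string& value_out, const std::string& name) {
   value_out.clear();

   if(running_privileged())
      return false;

   if(const char* val = std::getenv(name.c_str())) {
      value_out = val;
      return true;
   }

   return false;
}
```

A caller would then write something like `std::string v; if(read_env_variable_sketch(v, "SOME_VAR")) { ... }` instead of pairing two separate calls.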
*	Merge GH #1798 Use posix_memalign instead of mmap for page locked pool	Jack Lloyd	2018-12-29	1	-17/+9
\|\
\| *	Use posix_memalign instead of mmap for creating the locking pool	Jack Lloyd	2018-12-28	1	-17/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As described in #602, using mmap with fork causes problems because the mapping remains shared in the child instead of being copy-on-write, so the parent and child stomp on each other's memory. However, we do not really need mmap semantics; we just want a block of page-aligned memory, which posix_memalign can provide instead. posix_memalign was added in POSIX.1-2001 and seems to be implemented by all modern systems. Closes #602.
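A rough sketch of obtaining a page-aligned block with posix_memalign, assuming a POSIX system. The function names allocate_page_aligned and free_page_aligned are hypothetical, and the locking step a real pool would perform on the block is deliberately left out.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <unistd.h>

// Hypothetical helper: returns a zeroed block of `pages` whole pages,
// aligned to the system page size, or nullptr on failure.
void* allocate_page_aligned(std::size_t pages) {
   const long page_size = sysconf(_SC_PAGESIZE);
   if(page_size <= 0 || pages == 0)
      return nullptr;

   const std::size_t alignment = static_cast<std::size_t>(page_size);
   const std::size_t length = pages * alignment;

   void* ptr = nullptr;
   // posix_memalign returns 0 on success and an error number otherwise
   if(posix_memalign(&ptr, alignment, length) != 0)
      return nullptr;

   std::memset(ptr, 0, length);
   return ptr;
}

// Memory from posix_memalign is released with free, not munmap
void free_page_aligned(void* ptr) {
   std::free(ptr);
}
```

Because this is ordinary heap memory rather than a shared mapping, a child created by fork gets its own copy-on-write view of it, which is the property the message above is after.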
* \|	Avoid const-time modulo in DSA verification	Jack Lloyd	2018-12-29	1	-1/+11
\| \| \| \| \| \| \| \| \| \| \| \|	It has a substantial perf hit and is not necessary. It may not really be necessary for signatures either, but leave that as is, with a comment explaining.
* \|	Simplifications in BigInt	Jack Lloyd	2018-12-29	1	-7/+1
\|/ \| \| \| \|	Use ct_is_zero instead of a more complicated construction, and avoid a duplicated size check/resize - Data::set_word will handle it.
*	Make bigint_sub_abs const time	Jack Lloyd	2018-12-27	2	-6/+26
\|
*	Add a test of highly imbalanced RSA key	Jack Lloyd	2018-12-27	1	-0/+15
\|
*	Fix Barrett reduction input bound	Jack Lloyd	2018-12-26	3	-13/+23
\| \| \| \| \| \| \| \| \| \| \| \|	When I wrote the Barrett code long ago, I must have missed that Barrett works for any input < 2^2k, where k is the word size of the modulus. Fixing this has several nice effects: it is faster because it replaces a multiprecision comparison with a single size_t compare, and the branch no longer reveals information about the input or modulus, only their word lengths, which are not considered sensitive. Fixing this allows reverting the change made in a57ce5a4fd2, and now RSA signing is even slightly faster than in 2.8, rather than 30% slower.
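To illustrate the input bound, here is a single-word sketch of textbook Barrett reduction. It is not the library's multiprecision code: it assumes base 2 (so k counts bits rather than words), a modulus below 2^32, and the GCC/Clang `unsigned __int128` extension. All names are hypothetical, and `%` merely stands in for a generic slow-path modulo.

```cpp
#include <cstdint>

typedef unsigned __int128 u128;  // GCC/Clang extension, assumed available

struct BarrettSketch {
   uint64_t m;   // modulus, 2 <= m < 2^32
   unsigned k;   // bit length of m, so 2^(k-1) <= m < 2^k
   uint64_t mu;  // floor(2^(2k) / m), at most 2^(k+1)
};

BarrettSketch barrett_setup(uint32_t m) {
   unsigned k = 0;
   while((static_cast<uint64_t>(m) >> k) != 0)
      ++k;
   const u128 two_2k = static_cast<u128>(1) << (2 * k);
   return BarrettSketch{m, k, static_cast<uint64_t>(two_2k / m)};
}

// Precondition: x < 2^(2k). The quotient estimate q is then at most 2
// below the true quotient, so two conditional subtractions suffice.
uint64_t barrett_reduce(const BarrettSketch& b, uint64_t x) {
   const u128 q = (static_cast<u128>(x >> (b.k - 1)) * b.mu) >> (b.k + 1);
   uint64_t r = x - static_cast<uint64_t>(q) * b.m;
   r -= b.m & (0 - static_cast<uint64_t>(r >= b.m));
   r -= b.m & (0 - static_cast<uint64_t>(r >= b.m));
   return r;
}

// The choice of path depends only on the sizes of x and m (a shift and
// a compare against zero), not on a full comparison of their values.
uint64_t reduce(const BarrettSketch& b, uint64_t x) {
   const bool in_range = (2 * b.k >= 64) || ((x >> (2 * b.k)) == 0);
   return in_range ? barrett_reduce(b, x) : (x % b.m);
}
```

For example, with m = 1000003 the sketch computes k = 20, so every x below 2^40 takes the Barrett path regardless of how x compares to m^2.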
*	Avoid size-based bypass of the comparison in Barrett reduction.	Jack Lloyd	2018-12-24	1	-1/+1
\| \| \| \|	As it would leak if an input was > p^2, or just close to it in size.
*	Avoid conditional branch in Barrett for negative inputs	Jack Lloyd	2018-12-24	1	-4/+27
\|
*	Always use const-time modulo during DSA signing	Jack Lloyd	2018-12-24	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \|	Since we are reducing a mod-p integer down to mod-q this would nearly always use ct_modulo in any case. And, in the case where Barrett did work, it would reveal that g^k mod p was <= qq which would likely be useful for searching for k. This should actually be slightly faster (if anything) since it avoids the unnecessary comparison against qq and jumps directly to ct_modulo.
*	Address a side channel in RSA and SM2	Jack Lloyd	2018-12-24	2	-8/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Barrett will branch to a different (and slower) algorithm if the input is larger than the square of the modulus. This branch can be detected by a side channel. For RSA we need to compute m % p and m % q to get CRT started. Being able to detect if m > qq (assuming q is the smaller prime) allows a binary search on the secret prime. This attack is blocked by input blinding, but still seems dangerous. Unfortunately changing to use the generic const time modulo instead of Barrett introduces a rather severe performance regression in RSA signing. In SM2, reduce k-rx modulo the order before multiplying it with (x-1)^-1. Otherwise the need for slow modulo vs Barrett leaks information about k and/or x.
*	In NIST P-xxx reductions unpoison S before using it	Jack Lloyd	2018-12-24	1	-8/+10
\| \| \| \| \| \| \| \|	Was already done in P-256 but not in P-{192,224,384}. This is a cache-based side channel which would be good to address. It seems like it would be very difficult to exploit even with perfect recovery, but crazier things have worked.
*	Unpoison result of high_bits_free	Jack Lloyd	2018-12-24	1	-0/+1
\| \| \| \| \|	Previously we unpoisoned the input to high_bit, but this is no longer required. The output should still be unpoisoned.
*	Correct read in test fuzzers	Jack Lloyd	2018-12-23	1	-1/+1
\|
*	Add a multi-file input mode for test fuzzers	Jack Lloyd	2018-12-23	3	-24/+105
\| \| \| \| \| \| \| \| \| \|	The test_fuzzers.py script is very slow especially on CI. Add a mode to the test fuzzers where it will accept many files on the command line and test each of them in turn. This is 100s of times faster, as it avoids all overhead from fork/exec. It has the downside that you can't tell which input caused a crash, so retain the old mode with --one-at-a-time option for debugging work.
*	Move coverage before fuzzers in Travis build	Jack Lloyd	2018-12-23	1	-1/+1
\| \| \| \| \|	Coverage is the slowest build, moving it up puts it into the initial tranche of builds so it finishes before the end of the build.
*	In Travis, run OS X first	Jack Lloyd	2018-12-23	1	-1/+1
\| \| \| \| \| \|	It is slower to start up and the overall build ends up waiting for these last 2 builds. By running them at the front of the line they can overlap with other builds.
*	By default just run 20 of the AEAD test vectors through CLI	Jack Lloyd	2018-12-23	1	-6/+11
\| \| \| \| \|	Running them all takes a long time, especially in CI, and doesn't really add much.
*	Increase Travis ccache size	Jack Lloyd	2018-12-23	1	-1/+1
\| \| \| \|	The cache size increases will continue until hit rate improves.
*	Increase Travis git pull depth	Jack Lloyd	2018-12-23	1	-1/+1
\| \| \| \| \| \| \|	Undocumented? side effect of a small git pull depth - if more than N new commits are pushed to master while an earlier build is running, the old build starts failing, as when CI does the pull it does not find the commit it is building within the checked out tree.
*	Another try at silencing Coverity on this	Jack Lloyd	2018-12-23	1	-1/+1
\|
*	Initialize System_Error::m_error_code	Jack Lloyd	2018-12-23	1	-1/+2
\| \| \| \|	Actual bug, flagged by Coverity
*	Avoid double return of unique_ptr	Jack Lloyd	2018-12-23	1	-1/+3
\| \| \| \|	Flagged by Coverity
*	Add --no-store-vc-rev option for use in CI builds	Jack Lloyd	2018-12-23	1	-0/+2
\| \| \| \| \| \| \|	This skips putting the git revision in the build.h header. This value changing every time means we effectively disable ccache's direct mode (which is faster than preprocessor mode) and also prevent any caching of the amalgamation file (since version.cpp expands the macro).
*	Increase Travis ccache to 750M	Jack Lloyd	2018-12-23	1	-1/+1
\| \| \| \|	Even 600M is not sufficient for the coverage build
*	Rename OS::get_processor_timestamp to OS::get_cpu_cycle_counter	Jack Lloyd	2018-12-23	5	-14/+15
\| \| \| \| \|	Using the phrase "timestamp" makes it sound like it has some relation to wall clock time, which it does not.
*	Now Timer does not need to include an internal header	Jack Lloyd	2018-12-23	1	-1/+0
\|
*	De-inline more of Timer	Jack Lloyd	2018-12-23	2	-41/+37
\| \| \| \|	No reason for these to be inlined
*	Make significant_words const time also	Jack Lloyd	2018-12-23	4	-40/+75
\| \| \| \| \| \|	Only used in one place, where const time doesn't matter, but can't hurt. Remove low_bit, can be replaced by ctz.
*	In Timer, grab CPU clock first	Jack Lloyd	2018-12-23	1	-9/+9
\| \| \| \| \| \|	Reading the system timestamp first causes every event to get a few hundred cycles tacked onto it. Only mattered when the thing being tested was very fast.
*	Increase Travis ccache again	Jack Lloyd	2018-12-23	1	-1/+1
\| \| \| \|	Still insufficient for debug builds
*	Remove now incorrect comment	Jack Lloyd	2018-12-22	1	-5/+0
\|
*	Make high_bit and ctz actually const time	Jack Lloyd	2018-12-22	1	-3/+3
\|
*	Promote ct_is_zero and expand_top_bit to bit_ops.h	Jack Lloyd	2018-12-22	2	-10/+21
\|
*	Make ctz and high_bit faster and const-time-ish	Jack Lloyd	2018-12-22	3	-48/+51
\| \| \| \| \| \| \|	They get compiled as const-time on x86-64 with GCC, but I don't think this can be totally relied on. It is an improvement anyway, and faster, because we compute it recursively.
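The recursive, branch-free approach can be sketched for 32-bit values as below. The function names are hypothetical variants of the helpers mentioned in nearby commits (ct_is_zero, expand_top_bit, high_bit, ctz), and the conventions chosen here (high_bit32(0) == 0, ctz32(0) == 32) are assumptions of this sketch. As the message says, whether the compiler keeps it branch-free cannot be fully relied on.

```cpp
#include <cstdint>

// All-ones if the top bit of a is set, otherwise zero
inline uint32_t expand_top_bit32(uint32_t a) {
   return 0 - (a >> 31);
}

// All-ones if x == 0, otherwise zero: ~x & (x - 1) has its top bit set
// only when x is zero
inline uint32_t ct_is_zero32(uint32_t x) {
   return expand_top_bit32(~x & (x - 1));
}

// 1-based index of the highest set bit, or 0 if n == 0. Each step
// halves the window; the loop count does not depend on n.
inline uint32_t high_bit32(uint32_t n) {
   uint32_t hb = 0;
   for(uint32_t s = 16; s > 0; s /= 2) {
      // z is s if the upper part (n >> s) is nonzero, else 0
      const uint32_t z = s * (~ct_is_zero32(n >> s) & 1);
      hb += z;
      n >>= z;
   }
   return hb + n;  // n is now 0 or 1
}

// Number of trailing zero bits; returns 32 for n == 0
inline uint32_t ctz32(uint32_t n) {
   uint32_t lb = ct_is_zero32(n) & 1;
   for(uint32_t s = 16; s > 0; s /= 2) {
      const uint32_t mask = (static_cast<uint32_t>(1) << s) - 1;
      // z is s if the low s bits are all zero, else 0
      const uint32_t z = s * (ct_is_zero32(n & mask) & 1);
      lb += z;
      n >>= z;
   }
   return lb;
}
```

Each function runs five iterations for any input, in contrast to a loop that scans bit by bit until it finds a set bit.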
*	Increase Travis cache size [ci skip]	Jack Lloyd	2018-12-22	1	-2/+2
\| \| \| \| \|	With compression disabled, the cache is too small for builds that use debug info, and causes 100% miss rate.
*	Fix build with PGI [ci skip]	Jack Lloyd	2018-12-22	1	-5/+7
\| \| \| \|	I couldn't get anything to link with PGI, but at least it builds again.
*	Merge GH #1794 Improve const time logic in PKCS1v15 and OAEP decoding	Jack Lloyd	2018-12-21	9	-92/+171
\|\
\| *	Use consistent logic for OAEP and PKCS1v15 decoding	Jack Lloyd	2018-12-21	9	-92/+171
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The decoding leaked some information about the delimiter index due to copying exactly input_len - delim_idx bytes. I can't articulate a specific attack that would work here, but it is easy enough to fix this to run in const time instead, where all bytes are accessed regardless of the length of the padding. CT::copy_out is O(n^2) and thus terrible, but in practice it is only used with RSA decryption, and multiplication is also O(n^2) with the modulus size, so a few extra cycles here don't matter much.
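A minimal sketch of a quadratic copy that touches the same bytes whatever the offset is. The names ct_eq_mask and ct_copy_out_sketch are hypothetical, and this is an illustration of the idea rather than the library's CT::copy_out; in particular, it leaves the final truncation to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// 0xFF if a == b, otherwise 0x00, computed without a branch
inline uint8_t ct_eq_mask(std::size_t a, std::size_t b) {
   const std::size_t x = a ^ b;
   // top bit of (~x & (x - 1)) is set only when x == 0
   const std::size_t is_zero = (~x & (x - 1)) >> (sizeof(std::size_t) * 8 - 1);
   return static_cast<uint8_t>(0 - is_zero);
}

// Shifts input left by `offset` bytes into a zero-filled buffer of the
// same size. For each output position every candidate input byte is
// read and masked, so the access pattern depends only on input.size().
std::vector<uint8_t> ct_copy_out_sketch(const std::vector<uint8_t>& input,
                                        std::size_t offset) {
   const std::size_t n = input.size();
   std::vector<uint8_t> output(n, 0);

   for(std::size_t i = 0; i != n; ++i) {
      for(std::size_t j = i; j != n; ++j) {
         const uint8_t mask = ct_eq_mask(j, offset + i);
         output[i] = static_cast<uint8_t>(output[i] | (input[j] & mask));
      }
   }

   // The first n - offset bytes hold the payload; the rest stay zero
   return output;
}
```

For input {1, 2, 3, 4, 5} and offset 2 this yields {3, 4, 5, 0, 0}, at a cost of roughly n^2/2 masked reads.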
* \|	Avoid including rotate.h in bswap.h	Jack Lloyd	2018-12-21	28	-2/+30
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	It was only needed for one case, which is easily hardcoded. Include rotate.h in all the source files that actually use rotr/rotl but previously picked it up implicitly via the loadstor.h -> bswap.h -> rotate.h include chain.
* \|	Stop compressing Travis ccache	Jack Lloyd	2018-12-21	1	-3/+1
\|/ \| \| \|	Since CPU is main bottleneck to the build, this is likely not helping.
*	Address a couple of Coverity false positives	Jack Lloyd	2018-12-19	4	-7/+62
\| \| \| \|	Add tests for is_power_of_2
*	Avoid using unblinded Montgomery ladder during ECC key generation	Jack Lloyd	2018-12-18	2	-11/+32
\| \| \| \| \| \| \| \| \| \| \|	As doing so means that information about the high bits of the scalar can leak via timing since the loop bound depends on the length of the scalar. An attacker who has such information can perform a more efficient brute force attack (using Pollard's rho) than would be possible otherwise. Found by Ján Jančár (@J08nY) using ECTester (https://github.com/crocs-muni/ECTester). CVE-2018-20187
*	Test how long it takes to precompute base point multiples	Jack Lloyd	2018-12-16	2	-1/+21
\|
*	In PointGFp addition, prevent all_zeros from being shortcircuited	Jack Lloyd	2018-12-14	1	-4/+7
\| \| \| \| \| \|	This doesn't matter much but it causes confusing valgrind output when const-time checking since it distinguishes between the two possible conditional returns.
*	Unroll const_time_lookup by 2	Jack Lloyd	2018-12-14	1	-6/+10
\| \| \| \| \|	We know the lookup table size is some power of 2; unrolling a bit allows more IPC.
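A sketch of a masked table lookup unrolled by 2, using plain 32-bit entries instead of the multiword ECC table entries the commit deals with. The names ct_index_mask and ct_lookup_unrolled are hypothetical, and the table size is assumed to be even (which any power of 2 of at least 2 is).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// All-ones if a == b, otherwise zero, computed without a branch
inline uint32_t ct_index_mask(std::size_t a, std::size_t b) {
   const std::size_t x = a ^ b;
   const std::size_t is_zero = (~x & (x - 1)) >> (sizeof(std::size_t) * 8 - 1);
   return static_cast<uint32_t>(0 - is_zero);
}

// Reads every table entry and keeps only the one at position idx.
// Two independent mask/load/and chains per iteration give the CPU
// more work it can overlap. Assumes table.size() is even.
uint32_t ct_lookup_unrolled(const std::vector<uint32_t>& table, std::size_t idx) {
   uint32_t result = 0;
   for(std::size_t i = 0; i != table.size(); i += 2) {
      const uint32_t m0 = ct_index_mask(i, idx);
      const uint32_t m1 = ct_index_mask(i + 1, idx);
      result |= table[i] & m0;
      result |= table[i + 1] & m1;
   }
   return result;
}
```

An out-of-range idx matches no entry and returns 0 in this sketch.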
*	Simplify the const time lookup in ECC scalar mul	Jack Lloyd	2018-12-14	1	-12/+9
\| \| \| \| \|	Code is easier to understand and it may let the CPU interleave the loads and logical ops better. Slightly faster on my machine.
*	Use a 3-bit comb for ECC base point multiply	Jack Lloyd	2018-12-13	2	-19/+36
\| \| \| \|	Improves ECDSA signing by 15%