Commit message | Author | Age | Files | Lines
* Move bigint_simple_mul into mp_mul.cpp, since that is the only place it was used. Make a variant of bigint_simple_mul, bigint_simple_sqr, for mp_sqr.cpp. | lloyd | 2006-08-19 | 4 | -17/+26
* Fix typo | lloyd | 2006-08-19 | 1 | -1/+1
* Delete trailing whitespace | lloyd | 2006-08-19 | 4 | -5/+5
* Move Montgomery reduction algorithm into mp_asm.cpp. Move the inner-most loop of Montgomery into bigint_mul_add_words, in mp_muladd.cpp. Use bigint_mul_add_words for the inner loop of bigint_simple_multiply. Move the compare/subtract at the end of the Montgomery algorithm into bigint_monty_redc. | lloyd | 2006-08-19 | 8 | -110/+69
* Don't test Skipjack at startup - it's really not that important, and running the test means the algorithm prototype is loaded into memory when it will probably never be used later. | lloyd | 2006-08-19 | 1 | -8/+0
* Remove trailing whitespace | lloyd | 2006-08-19 | 2 | -2/+2
* Align the major jump targets. Remove the comment containing the unoptimized C code. Add copyright notice. | lloyd | 2006-08-19 | 1 | -15/+6
* Add an x86 assembly implementation of bigint_mul_add_words, which is the core loop of bigint_monty_redc. | lloyd | 2006-08-18 | 4 | -3/+134
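For readers unfamiliar with the operation, a multiply-and-add-words loop of the kind described above can be sketched in portable C++ as follows. This is only an illustration: the function name, the signature, and the 32-bit word size are assumptions, not the code from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical portable stand-in for the loop that the commit implements in
// x86 assembly. Computes z[0..n) += x[0..n) * y and returns the carry out of
// the top word. Name, signature, and word size are illustrative assumptions.
std::uint32_t mul_add_words_sketch(std::uint32_t z[], const std::uint32_t x[],
                                   std::size_t n, std::uint32_t y)
{
    assert(n == 0 || (z != nullptr && x != nullptr));

    std::uint32_t carry = 0;
    for(std::size_t i = 0; i != n; ++i)
    {
        // A 32x32-bit product plus two 32-bit addends always fits in 64 bits.
        const std::uint64_t t =
            static_cast<std::uint64_t>(x[i]) * y + z[i] + carry;
        z[i] = static_cast<std::uint32_t>(t);         // low word is stored
        carry = static_cast<std::uint32_t>(t >> 32);  // high word is carried
    }
    return carry;
}
```

On x86 the 64-bit intermediate corresponds to the EDX:EAX pair produced by a single MUL, which is presumably why this loop is an attractive assembly target.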
* Simplify the implementation of bigint_divop | lloyd | 2006-08-18 | 1 | -6/+8
* Move montgomery_reduce to after choose_window_bits for better consistency between the Montgomery and fixed-window exponentiators. | lloyd | 2006-08-17 | 1 | -18/+18
* Create a slightly higher level wrapper around bigint_monty_redc, save a few lines. | lloyd | 2006-08-17 | 1 | -18/+13
* Remove whitespace | lloyd | 2006-08-17 | 1 | -3/+0
* Fix the es_capi module; was not using the new global_config() accessor | lloyd | 2006-08-17 | 1 | -1/+1
* Inline the call to word_add in bigint_monty_redc - the carry in was always zero, so this is both a bit more efficient and more readable. It won't be able to take advantage of asm implementations of word_add, but the benefit from that with a single call per loop is small anyway. | lloyd | 2006-08-17 | 1 | -3/+3
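The simplification that a zero carry-in allows can be illustrated with a small sketch. Both helpers below are hypothetical (names, signatures, and the 32-bit word size are assumptions); they only show why the general add-with-carry collapses to one addition and one comparison when no carry comes in.

```cpp
#include <cassert>
#include <cstdint>

// General add-with-carry on one word (hypothetical helper, 32-bit words assumed).
// *carry is the carry in (0 or 1) and receives the carry out.
inline std::uint32_t word_add_sketch(std::uint32_t x, std::uint32_t y,
                                     std::uint32_t* carry)
{
    assert(carry != nullptr && *carry <= 1);
    std::uint32_t z = x + y;
    const std::uint32_t c1 = (z < x);        // overflow from x + y
    z += *carry;
    const std::uint32_t c2 = (z < *carry);   // overflow from adding the carry in
    *carry = c1 | c2;
    return z;
}

// Inlined form for a carry in of zero: the second addition and the second
// overflow check disappear entirely.
inline std::uint32_t word_add_no_carry_in(std::uint32_t x, std::uint32_t y,
                                          std::uint32_t* carry_out)
{
    assert(carry_out != nullptr);
    const std::uint32_t z = x + y;
    *carry_out = (z < x);
    return z;
}
```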
* Move bigint_monty_redc to its own file; profiling indicates that this single function is using 30+% of the runtime during RSA operations, making it a strong candidate for implementation in assembly. | lloyd | 2006-08-17 | 2 | -33/+49
* Split Montgomery reduction into two functions, the core algorithm linked as C (for replacing by asm later), and another that performs a subtract if needed (inside powm_mnt.cpp). That way an asm version of the Montgomery algorithm won't have to deal with calling other functions. | lloyd | 2006-08-16 | 3 | -6/+15
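The split described above (a self-contained core step plus a separate conditional subtract) can be sketched on a deliberately simplified single-word case. Everything here is an assumption made for illustration: the function names, R = 2^32, and the restriction to an odd modulus below 2^31. The repository's multi-word code will differ.

```cpp
#include <cassert>
#include <cstdint>

// Returns -n^{-1} mod 2^32 for odd n, using Newton iteration.
std::uint32_t monty_nprime(std::uint32_t n)
{
    assert(n % 2 == 1);
    std::uint32_t inv = n;        // n*n == 1 (mod 8), so 3 bits are correct
    for(int i = 0; i != 5; ++i)
        inv *= 2u - n * inv;      // each step doubles the number of correct bits
    return 0u - inv;
}

// Core step: returns (t + m*n) / 2^32, which is below 2n when t < n * 2^32.
// It calls no other functions, so it could be replaced by assembly in isolation.
std::uint32_t monty_redc_core(std::uint64_t t, std::uint32_t n,
                              std::uint32_t nprime)
{
    assert(n % 2 == 1 && n < 0x80000000u);   // keeps t + m*n within 64 bits
    assert(t < (static_cast<std::uint64_t>(n) << 32));
    const std::uint32_t m = static_cast<std::uint32_t>(t) * nprime;
    const std::uint64_t sum = t + static_cast<std::uint64_t>(m) * n;
    return static_cast<std::uint32_t>(sum >> 32);
}

// Wrapper: performs the final "subtract if needed" outside the core.
std::uint32_t monty_reduce(std::uint64_t t, std::uint32_t n,
                           std::uint32_t nprime)
{
    std::uint32_t r = monty_redc_core(t, n, nprime);
    if(r >= n)
        r -= n;
    return r;
}
```

With a value in Montgomery form, `a_m = (a * 2^32) mod n`, calling `monty_reduce(a_m, n, nprime)` recovers `a`; reducing the product of two such values yields their product, still in Montgomery form.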
* Add a distinct loop ending for loop-until-equals-immediate; other loop-ending conditions will be needed later. | lloyd | 2006-08-15 | 5 | -7/+13
* Remove some variables we didn't really need in the key schedule | lloyd | 2006-08-15 | 1 | -6/+4
* Version bump in the configure script | lloyd | 2006-08-15 | 2 | -2/+2
* Change the Serpent linear transforms to use the move-and-shift-3 macro | lloyd | 2006-08-15 | 1 | -4/+2
* Add a specialized shift instruction for 3 that uses LEA to do a shift and move in one instruction. | lloyd | 2006-08-15 | 1 | -0/+1
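To show where a combined "copy and shift left by 3" is useful, here is a portable C++ sketch of the Serpent linear transformation and its inverse. The helper names are assumptions, and the mapping to an x86 instruction of the form `lea dst, [src*8]` is noted only in a comment, since the actual assembly macros are not reproduced in this log.

```cpp
#include <cassert>
#include <cstdint>

inline std::uint32_t rotl32(std::uint32_t v, unsigned r)
{
    assert(r > 0 && r < 32);
    return (v << r) | (v >> (32 - r));
}

inline std::uint32_t rotr32(std::uint32_t v, unsigned r)
{
    assert(r > 0 && r < 32);
    return (v >> r) | (v << (32 - r));
}

// Yields src << 3 without modifying src. On x86 this "move and shift by 3"
// can be a single LEA of the form lea dst, [src*8] (illustrative helper name).
inline std::uint32_t shifted_copy_3(std::uint32_t src) { return src << 3; }

// Serpent linear transformation on four 32-bit words.
void serpent_lt(std::uint32_t& x0, std::uint32_t& x1,
                std::uint32_t& x2, std::uint32_t& x3)
{
    x0 = rotl32(x0, 13);  x2 = rotl32(x2, 3);
    x1 ^= x0 ^ x2;        x3 ^= x2 ^ shifted_copy_3(x0);  // x0 must survive
    x1 = rotl32(x1, 1);   x3 = rotl32(x3, 7);
    x0 ^= x1 ^ x3;        x2 ^= x3 ^ (x1 << 7);
    x0 = rotl32(x0, 5);   x2 = rotl32(x2, 22);
}

// Inverse transformation, undoing the steps above in reverse order.
void serpent_inverse_lt(std::uint32_t& x0, std::uint32_t& x1,
                        std::uint32_t& x2, std::uint32_t& x3)
{
    x2 = rotr32(x2, 22);  x0 = rotr32(x0, 5);
    x2 ^= x3 ^ (x1 << 7); x0 ^= x1 ^ x3;
    x3 = rotr32(x3, 7);   x1 = rotr32(x1, 1);
    x3 ^= x2 ^ shifted_copy_3(x0);  x1 ^= x0 ^ x2;
    x2 = rotr32(x2, 3);   x0 = rotr32(x0, 13);
}
```

The shifted value is needed while the unshifted word is still live, so a plain in-place shift would require an extra register move first; that is the saving the commit message refers to.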
* Drop the asm-specific serpent.h | lloyd | 2006-08-15 | 2 | -34/+0
* Formatting/readability changes | lloyd | 2006-08-15 | 1 | -6/+5
* Replace Serpent's key_xor function with a macro, so the header can be shared between the C++ and assembly versions. | lloyd | 2006-08-15 | 2 | -7/+5
* Remove continuation slashes from the last line of some of the macros | lloyd | 2006-08-15 | 1 | -8/+8
* Reorder the linear transformations for (nominally) better instruction scheduling. | lloyd | 2006-08-15 | 1 | -10/+10
* Have the expansion loop in the key schedule take advantage of free registers to load words we will need in advance. | lloyd | 2006-08-15 | 2 | -12/+17
* Remove unused variable. Collect the external functions into a single extern "C" block. | lloyd | 2006-08-15 | 1 | -5/+7
* Implement the Serpent key schedule in assembly as well, so the C++ versions of the Sboxes can be removed. Add some parens inside the asm macros. | lloyd | 2006-08-15 | 3 | -122/+98
* Remove an unused function | lloyd | 2006-08-15 | 1 | -26/+1
* Implement decryption in the Serpent assembly code | lloyd | 2006-08-15 | 4 | -207/+386
* Add the beginnings of an x86 assembler version of Serpent. Currently only encryption is done in asm, the rest is still in C++. | lloyd | 2006-08-15 | 4 | -0/+621
* Was using sha1_core in the END_FUNCTION calls; doesn't make a difference, since right now END_FUNCTION doesn't use its argument, but it looked strange and might cause problems later. | lloyd | 2006-08-14 | 2 | -2/+2
* Changelog updates [1.5.10] | lloyd | 2006-08-13 | 1 | -2/+8
* Get instruction scheduling decently correct. Now running at 110 Mb/s on my Athlon, which isn't too far behind OpenSSL. | lloyd | 2006-08-13 | 1 | -5/+5
* Load the message words we need in the round before. By going out to the stack to get the address of the message array each time, we can free up a register for the rest of the code inside the rounds. | lloyd | 2006-08-13 | 1 | -54/+133
* Introduce a MSG() macro which returns the desired message word | lloyd | 2006-08-13 | 1 | -9/+13
* Use LEA with the magic constant and A, rather than the magic and the boolean; same trick as in MD5. Roughly a 5% speedup. | lloyd | 2006-08-13 | 1 | -9/+9
* Make the temporary implicit, since we always use ESP inside the round functions. | lloyd | 2006-08-13 | 1 | -47/+49
* Add a (working, optimized) x86 version of MD4 | lloyd | 2006-08-13 | 3 | -2/+182
* Add the memory word and the magic constant using LEA, rather than the boolean function result and the magic constant; the memory word is available sooner, and it seems to produce a major (12%) win. | lloyd | 2006-08-13 | 1 | -24/+24
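The reordering described above relies on 32-bit addition being associative, so the four addends of an MD5 step can be summed in any order. The sketch below contrasts the textbook order with one that first combines the message word and the round constant, as a single LEA could do in assembly. Function names are hypothetical, and only the first-round boolean function F is shown.

```cpp
#include <cassert>
#include <cstdint>

inline std::uint32_t rotl32(std::uint32_t v, unsigned r)
{
    assert(r > 0 && r < 32);
    return (v << r) | (v >> (32 - r));
}

// MD5 round-1 boolean function.
inline std::uint32_t md5_f(std::uint32_t b, std::uint32_t c, std::uint32_t d)
{
    return (b & c) | (~b & d);
}

// Textbook ordering: the boolean result is added first.
std::uint32_t md5_ff_textbook(std::uint32_t a, std::uint32_t b, std::uint32_t c,
                              std::uint32_t d, std::uint32_t msg,
                              std::uint32_t magic, unsigned s)
{
    return b + rotl32(a + md5_f(b, c, d) + msg + magic, s);
}

// Reordered: msg + magic is formed first. Neither input depends on the
// boolean function, so this sum can be computed while F is still in flight.
std::uint32_t md5_ff_reordered(std::uint32_t a, std::uint32_t b, std::uint32_t c,
                               std::uint32_t d, std::uint32_t msg,
                               std::uint32_t magic, unsigned s)
{
    const std::uint32_t early = msg + magic;   // candidate for a single LEA
    return b + rotl32(a + early + md5_f(b, c, d), s);
}
```

Both functions return the same value for every input; only the dependency chain seen by the processor changes, which is where the reported speedup would come from.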
* Forgot the II() macro in the last checkin | lloyd | 2006-08-13 | 1 | -1/+2
* Use the spare register to load the message word, which will potentially help hide some of the memory latency. | lloyd | 2006-08-13 | 1 | -3/+7
* Make the temporary implicit, since we were always passing the same register | lloyd | 2006-08-13 | 1 | -106/+108
* Cleanups, and move the initial memory access to the beginning of each MD5 round in an attempt to hide the latency a bit | lloyd | 2006-08-13 | 2 | -52/+77
* Respect the --seconds command line argument with --bench-algo | lloyd | 2006-08-13 | 2 | -4/+4
* Add an x86 assembly MD5 implementation; works, but needs optimization | lloyd | 2006-08-13 | 3 | -0/+176
* Add a macro for the not instruction | lloyd | 2006-08-13 | 1 | -0/+1
* Minor formatting changes, reorder one instruction | lloyd | 2006-08-13 | 1 | -3/+1
* Add checks for MD4, MD5, and SHA-1 for zero-length inputs | lloyd | 2006-08-13 | 1 | -0/+3