Commit message | Author | Age | Files | Lines | |
---|---|---|---|---|---|
* | The assembly code is only using 81 words of W, but 84 were being allocated. | lloyd | 2006-08-21 | 1 | -2/+2 |
| | |||||
* | Remove a check for GCC in the source; that's what the module compiler | lloyd | 2006-08-21 | 1 | -4/+0 |
| | | | | restrictions are for. | ||||
* | Rename some variables for consistency with the SHA-1 asm code | lloyd | 2006-08-21 | 2 | -14/+16 |
| | |||||
* | Get rid of an unnecessary register copy | lloyd | 2006-08-21 | 1 | -11/+9 |
| | |||||
* | Inside the compression function, store the original stack pointer in the | lloyd | 2006-08-21 | 2 | -28/+38 |
| | | | | | W array, and then use %esp to point to the message words. This gives an extra register for temporary usage. | ||||
* | Let SHA_160::W be resized dynamically; potentially the asm version could | lloyd | 2006-08-21 | 1 | -0/+8 |
| | | | | use a little extra workspace, this makes that simpler to do. | ||||
* | Somewhat ineffectual instruction reorderings in the round functions | lloyd | 2006-08-21 | 1 | -28/+28 |
| | | | | | Use EDX instead of EBP for holding the pointer to the digest array at the end of the function. | ||||
* | Rotate the temporary variable along with the chaining variables; gives | lloyd | 2006-08-21 | 1 | -175/+154 |
| | | | | some further room for optimization. | ||||
* | Declare mp_bits for alg_ia32, since it touches the MPI code | lloyd | 2006-08-20 | 1 | -0/+2 |
| | |||||
* | Fix typo | lloyd | 2006-08-19 | 1 | -1/+1 |
| | |||||
* | Move Montgomery reduction algorithm into mp_asm.cpp | lloyd | 2006-08-19 | 2 | -45/+1 |
| | | | | | | | | | | Move the inner-most loop of Montgomery into bigint_mul_add_words, in mp_muladd.cpp. Use bigint_mul_add_words for the inner loop of bigint_simple_multiply. Move the compare/subtract at the end of the Montgomery algorithm into bigint_monty_redc. | ||||
* | Align the major jump targets | lloyd | 2006-08-19 | 1 | -15/+6 |
| | | | | | | Remove the comment containing the unoptimized C code. Add copyright notice. | ||||
* | Add an x86 assembly implementation of bigint_mul_add_words, which is | lloyd | 2006-08-18 | 4 | -3/+134 |
| | | | | the core loop of bigint_monty_redc. | ||||
* | Fix the es_capi module; was not using the new global_config() accessor | lloyd | 2006-08-17 | 1 | -1/+1 |
| | |||||
* | Add a distinct loop ending for loop-until-equals-immediate; other loop | lloyd | 2006-08-15 | 5 | -7/+13 |
| | | | | ending conditions will be needed later. | ||||
* | Change the Serpent linear transforms to use the move-and-shift-3 macro | lloyd | 2006-08-15 | 1 | -4/+2 |
| | |||||
* | Add a specialized shift instruction for 3 that uses LEA to do a shift and | lloyd | 2006-08-15 | 1 | -0/+1 |
| | | | | move in one instruction. | ||||
* | Drop the asm-specific serpent.h | lloyd | 2006-08-15 | 2 | -34/+0 |
| | |||||
* | Formatting/readability changes | lloyd | 2006-08-15 | 1 | -6/+5 |
| | |||||
* | Remove continuation slashes from the last line of some of the macros | lloyd | 2006-08-15 | 1 | -8/+8 |
| | |||||
* | Reorder the linear transformations for (nominally) better instruction | lloyd | 2006-08-15 | 1 | -10/+10 |
| | | | | scheduling. | ||||
* | Have the expansion loop in the key schedule take advantage of free | lloyd | 2006-08-15 | 2 | -12/+17 |
| | | | | registers to load words we will need in advance. | ||||
* | Remove unused variable | lloyd | 2006-08-15 | 1 | -5/+7 |
| | | | | Collect the external functions into a single extern "C" block | ||||
* | Implement the Serpent key schedule in assembly as well, so the C++ | lloyd | 2006-08-15 | 3 | -122/+98 |
| | | | | | | versions of the Sboxes can be removed. Add some parens inside the asm macros | ||||
* | Remove an unused function | lloyd | 2006-08-15 | 1 | -26/+1 |
| | |||||
* | Implement decryption in the Serpent assembly code | lloyd | 2006-08-15 | 4 | -207/+386 |
| | |||||
* | Add the beginnings of an x86 assembler version of Serpent. Currently only | lloyd | 2006-08-15 | 4 | -0/+621 |
| | | | | encryption is done in asm, the rest is still in C++ | ||||
* | Was using sha1_core in the END_FUNCTION calls; doesn't make a difference, | lloyd | 2006-08-14 | 2 | -2/+2 |
| | | | | | since right now END_FUNCTION doesn't use its argument, but it looked strange and might cause problems later. | ||||
* | Get instruction scheduling decently correct. Now running at 110 Mb/s on | lloyd | 2006-08-13 | 1 | -5/+5 |
| | | | | my Athlon, which isn't too far behind OpenSSL | ||||
* | Load the message words we need in the round before. By going out to the | lloyd | 2006-08-13 | 1 | -54/+133 |
| | | | | | stack to get the address of the message array each time, we can free up a register for the rest of the code inside the rounds. | ||||
* | Introduce a MSG() macro which returns the desired message word | lloyd | 2006-08-13 | 1 | -9/+13 |
| | |||||
* | Use LEA with the magic constant and A, rather than the magic and the | lloyd | 2006-08-13 | 1 | -9/+9 |
| | | | | boolean; same trick as in MD5. Roughly a 5% speedup. | ||||
* | Make the temporary implicit, since we always use ESP inside the round | lloyd | 2006-08-13 | 1 | -47/+49 |
| | | | | functions. | ||||
* | Add a (working, optimized) x86 version of MD4 | lloyd | 2006-08-13 | 3 | -2/+182 |
| | |||||
* | Add the memory word and the magic constant using LEA, rather than the | lloyd | 2006-08-13 | 1 | -24/+24 |
| | | | | | boolean function result and the magic constant; the memory word is available sooner, and it seems to produce a major (12%) win. | ||||
* | Forgot the II() macro in the last checkin | lloyd | 2006-08-13 | 1 | -1/+2 |
| | |||||
* | Use the spare register to load the message word, which will potentially | lloyd | 2006-08-13 | 1 | -3/+7 |
| | | | | help hide some of the memory latency. | ||||
* | Make the temporary implicit, since we were always passing the same register | lloyd | 2006-08-13 | 1 | -106/+108 |
| | |||||
* | Cleanups, and move the initial memory access to the beginning of each | lloyd | 2006-08-13 | 2 | -52/+77 |
| | | | | MD5 round in an attempt to hide the latency a bit | ||||
* | Add an x86 assembly MD5 implementation; works, but needs optimization | lloyd | 2006-08-13 | 3 | -0/+176 |
| | |||||
* | Add a macro for the not instruction | lloyd | 2006-08-13 | 1 | -0/+1 |
| | |||||
* | Minor formatting changes, reorder one instruction | lloyd | 2006-08-13 | 1 | -3/+1 |
| | |||||
* | Clear the W buffer inside the SHA_160::clear() functions | lloyd | 2006-08-13 | 1 | -0/+1 |
| | |||||
* | Remove a block of disabled code that was just for debug purposes | lloyd | 2006-08-13 | 1 | -8/+0 |
| | |||||
* | Clean up the macros, add comment headers, add a couple of helper macros | lloyd | 2006-08-13 | 2 | -28/+63 |
| | | | | | | for spilling/restoring registers. Reorder some instructions for slightly better scheduling across rounds | ||||
* | Drop the AES asm code for now | lloyd | 2006-08-13 | 3 | -192/+0 |
| | |||||
* | Update sha1core.S to match the macro updates in the last checkin. Rename | lloyd | 2006-08-13 | 1 | -63/+63 |
| | | | | some variables for easier reading. | ||||
* | A few macro fixes | lloyd | 2006-08-13 | 1 | -7/+10 |
| | |||||
* | Add stub versions of AES assembler | lloyd | 2006-08-13 | 3 | -0/+193 |
| | |||||
* | Rename sha_x86 module to alg_ia32; there will probably be other algorithms | lloyd | 2006-08-13 | 4 | -0/+0 |
| | | | | going in here (at least eventually, and potentially soon-ish) |