botan.git

	Commit message	Author	Age	Files	Lines
*	Add copyright + license on the new SIMD files	lloyd	2009-10-28	4	-2/+14
\|
*	Document SIMD changes	lloyd	2009-10-28	1	-0/+2
\|
*	propagate from branch 'net.randombit.botan' (head bf629b13dd132b263e76a72b7eca0f7e4ab19aac) to branch 'net.randombit.botan.general-simd' (head f731cff08ff0d04c062742c0c6cfcc18856400ea)	lloyd	2009-10-28	12	-404/+1101
\|\
\| *	Add an AltiVec SIMD_32 implementation. Tested and works for Serpent and XTEA	lloyd	2009-10-28	1	-0/+178
\| \|	on a PowerPC 970 running Gentoo with GCC 4.3.4.
\| \|	Uses a GCC syntax for creating literal values instead of the Motorola syntax [{1,2,3,4} instead of (1,2,3,4)].
\| \|	In tests so far, this is much, much slower than either the standard scalar code or the SIMD-in-scalar-registers code. It looks like, for whatever reason, GCC is refusing to inline the function SIMD_Altivec(__vector unsigned int input) { reg = input; } and calls it with a branch hundreds of times in each function. I don't know if this is the entire reason it's slower, but it definitely can't be helping.
\| \|	The code handles unaligned loads OK but assumes stores are to an aligned address. This will fail drastically some day, and needs to be fixed to either use scalar stores, which (most?) PPCs will handle (if slowly), or batch the loads and stores so we can work across the loads. Considering the code so far loads 4 vectors of data in one go, this would probably be a big win (and also for loads, since instead of doing 8 loads for 4 registers only 5 are needed).
\| *	Define SSE rotate_right in terms of rotate left, and load_be in terms	lloyd	2009-10-28	1	-3/+2
\| \|	of load_le + bswap
\| *	Add XTEA decryption	lloyd	2009-10-26	1	-11/+47
\| \|
\| *	Add subtraction operators to SIMD_32 classes, needed for XTEA decrypt	lloyd	2009-10-26	2	-0/+26
\| \|
\| *	Add a wrapper for a set of SSE2 operations with convenient syntax for 4x32	lloyd	2009-10-26	11	-404/+862
\| \|	operations. Also add a pure scalar code version. Convert Serpent to use this new interface, and add an implementation of XTEA in SIMD.
\| \|	The wrappers plus the scalar version allow SIMD-ish code to work on all platforms. This is often a win due to better ILP being visible to the processor (as with the recent XTEA optimizations). Only real danger is register starvation, mostly an issue on x86 these days. So it may (or may not) be a win to consolidate the standard C++ versions and the SIMD versions together.
\| \|	Future work:
\| \|	- Add AltiVec/VMX version
\| \|	- Maybe also for ARM's NEON extension? Less pressing, I would think.
\| \|	- Convert SHA-1 code to use SIMD_32
\| \|	- Add XTEA SIMD decryption (currently only encrypt)
\| \|	- Change SSE2 engine to SIMD_engine
\| \|	- Modify configure.py to set BOTAN_TARGET_CPU_HAS_[SSE2\|ALTIVEC\|NEON\|XXX] macros
* \|	Add missing log note for 1.9.1 change notes on CTR/OFB change	lloyd	2009-10-28	1	-0/+1
\| \|
* \|	Indent fix	lloyd	2009-10-26	1	-1/+1
\|/
*	Tick version to 1.9.2-dev	lloyd	2009-10-26	3	-4/+6
\|
*	Small cleanups	lloyd	2009-10-26	1	-4/+3
\|
*	Add ; after call to VC++'s __cpuid, not a macro	lloyd	2009-10-25	2	-7/+14
\|
*	Cast the u32bit output array to an int* when calling the VC++ intrinsic,	lloyd	2009-10-25	1	-3/+6
\|	since it passes signed ints for whatever reason.
\|	Ensure CALL_CPUID is always defined (previously, it was left undefined on x86 when compiling with something other than GCC, ICC, or VC++).
*	Update docs for 1.9.1 release 2009-10-23 1.9.1	lloyd	2009-10-23	3	-3/+4
\|
*	Kill stdio include	lloyd	2009-10-23	1	-2/+0
\|
*	Use new load/store ops in xtea x4 code	lloyd	2009-10-23	1	-12/+6
\|
*	Add new store_[l\|b]e variants taking 8 values.	lloyd	2009-10-23	1	-16/+108
\|	Add new load options that are passed a number of variables by reference, setting them all at once. This will allow batching operations (e.g. using SIMD operations to do 128-bit wide bswaps) in future optimizations.
*	Simply unrolling the loop in XTEA and processing 4 blocks worth of data at	lloyd	2009-10-23	1	-0/+70
\|	a time more than doubles performance (from 38 MB/s to 90 MB/s on Core2 Q6600). Could do even better with SIMD, I'm sure, but this is fast and easy, and works everywhere. Probably will hurt on 32-bit x86 from the register pressure.
*	Increase the internal buffer size of the Hex coder/decoder, and put it into	lloyd	2009-10-23	1	-3/+5
\|	a named constant instead of being magic. Move from 64 bytes to 256.
\|	This was necessary to allow Pipe(new Hex_Decoder, filter, ...) to give filter a sufficiently large input block. It would be nicer if the filter itself (in this case, ECB_Decryption, but others apply as well) was smart enough to buffer on its own.
\|	It might also be useful if code could query what parallelism a block cipher provides and modify its actions accordingly.
*	Add TEA and XTEA ECB vectors	lloyd	2009-10-23	1	-0/+650
\|
*	Add test vectors for TEA and XTEA in CTR mode	lloyd	2009-10-23	1	-0/+1242
\|
*	Note removing exception specs. Reorder by interestingness	lloyd	2009-10-22	1	-2/+3
\|
*	Remove all exception specifications. The way these are designed in C++ is	lloyd	2009-10-22	121	-140/+140
\|	just too fragile and not that useful. Something like Java's checked exceptions might be nice, but simply killing the process entirely if an unexpected exception is thrown is not exactly useful for something trying to be robust.
*	Reset version as 1.9.1-dev instead of -rc1	lloyd	2009-10-21	3	-3/+3
\|
*	Enable CPUID on x86 (checking wrong macro name)	lloyd	2009-10-21	1	-1/+1
\|
*	Disable traceback	lloyd	2009-10-21	1	-2/+2
\|
*	Format, add names to params in header	lloyd	2009-10-19	1	-3/+7
\|
*	Document Clang support	lloyd	2009-10-19	1	-1/+1
\|
*	Add theoretical support for Clang/LLVM. Current Gentoo clang ebuild doesn't	lloyd	2009-10-19	1	-0/+46
\|	seem to work with C++ at all so untested.
*	Be more forgiving of names passed with --cpu	lloyd	2009-10-19	1	-6/+9
\|
*	Also enable x86 asm word_add	lloyd	2009-10-15	1	-8/+0
\|
*	Enable x86-64 asm word_add	lloyd	2009-10-15	1	-8/+0
\|
*	merge of '5cfca720d4ca8d1e8f6946c7d9b4a8a6943094d0'	lloyd	2009-10-15	31	-432/+456
\|\	and '8cc9c08544c0f1f1dba7c7a8da51d1657b1c7df8'
\| *	Similar treatment for OFB which is also just a plain stream cipher	lloyd	2009-10-14	7	-100/+148
\| \|
\| *	Convert CTR_BE from a Filter to a StreamCipher. Must wrap in a StreamCipher_Filter to pass it directly to a Pipe now.	lloyd	2009-10-14	11	-217/+231
\| \|
\| *	Cleanups/random changes in the stream cipher code:	lloyd	2009-10-14	14	-111/+73
\| \|	Remove encrypt, decrypt - replace by cipher() and cipher1()
\| \|	Remove seek() - not well supported/tested, I want to redo with a new interface once CTR and OFB modes become stream ciphers.
\| \|	Rename resync to set_iv()
\| \|	Remove StreamCipher::IV_LENGTH and add StreamCipher::valid_iv_length() to allow multiple IV lengths (as for instance Turing allows, as would Salsa20 if XSalsa20 were supported).
\| *	Fix some minor compilation issues in the examples	lloyd	2009-10-14	3	-4/+4
\| \|
* \|	Avoid using word_add() in gfp_element.cpp, actually more complex than necessary,	lloyd	2009-10-15	1	-1/+3
\|/	and was tickling a bug in the asm versions because of the constant 0.
*	Check for cipher_mode() being set; if it is, not an algo_factory algo	lloyd	2009-10-13	1	-0/+4
\|
*	propagate from branch 'net.randombit.botan.1_8' (head c5ae189464f6ef16e3ce73ea7c563412460d76a3) to branch 'net.randombit.botan' (head e2b95b6ad31c7539cf9ac0ebddb1d80bf63b5b21)	lloyd	2009-10-13	303	-5563/+9498
\|\
\| *	Add a couple more Python examples and the very beginning of a manual/reference	lloyd	2009-10-10	3	-0/+143
\| \|	for the Python wrappers.
\| *	Remove redundant function	lloyd	2009-10-09	1	-10/+3
\| \|
\| *	Add PBKDF2 wrapper	lloyd	2009-10-09	1	-0/+17
\| \|
\| *	Reasonably functional RSA support; keygen, import/export, encrypt/decrypt, sign/verify	lloyd	2009-10-09	5	-156/+231
\| \|
\| *	Tick timestamp in building.tex	lloyd	2009-10-09	1	-1/+1
\| \|
\| *	Bump version to 1.9.1-rc1	lloyd	2009-10-09	3	-3/+3
\| \|
\| *	Remove unused arg	lloyd	2009-10-09	1	-3/+2
\| \|
\| *	Fix python install target. Add CryptoBox wrapper plus an example	lloyd	2009-10-09	4	-1/+60
\| \|
\| *	Ignore XS goop	lloyd	2009-10-09	1	-0/+3
\| \|