path: root/src/block/xtea_simd
Commit message    Author    Age    Files    Lines
* Split the SIMD implementations into their own modules and choose one    lloyd    2011-05-24    1    -1/+1
  at config time.
* More size_t    lloyd    2010-10-13    1    -2/+2
* Add a new subclass for BlockCipher BlockCipher_Fixed_Block_Size, which    lloyd    2010-10-13    1    -4/+4
  sets the block size statically and also creates an enum with the size.

  Use the enum instead of calling block_size() where possible, since that
  uses two virtual function calls per block which is quite unfortunate.

  The real advantages here as compared to the previous version which kept
  the block size as a per-object u32bit:

  - The compiler can inline the constant as an immediate operand
    (previously it would load the value via an indirection on this)
  - Removes 32 bits per object overhead (except in cases with actually
    variable block sizes, which are very few and rarely used)
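  A minimal C++ sketch of the pattern this commit describes, not Botan's actual code. The names BlockCipherSketch, FixedBlockSizeSketch, XteaLikeSketch, and bytes_for are hypothetical, and the 8-byte block size is only an example value. It shows a template base class fixing the block size as an enum constant, so code that knows the concrete type can use the constant instead of making a virtual call.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a block cipher base class (not Botan's interface).
class BlockCipherSketch {
public:
    virtual ~BlockCipherSketch() = default;
    // Generic callers still query the size through a virtual call.
    virtual std::size_t block_size() const = 0;
};

// Fixes the block size at compile time and exposes it as an enum constant.
template <std::size_t BS>
class FixedBlockSizeSketch : public BlockCipherSketch {
public:
    enum { BLOCK_SIZE = BS };
    std::size_t block_size() const override { return BS; }
};

// Example cipher with an 8-byte block (illustrative value).
class XteaLikeSketch final : public FixedBlockSizeSketch<8> {
public:
    // Uses the compile-time constant directly; no virtual call is needed here.
    std::size_t bytes_for(std::size_t blocks) const {
        return blocks * static_cast<std::size_t>(BLOCK_SIZE);
    }
};
```

  Because BLOCK_SIZE is a compile-time constant, the compiler is free to fold it into the multiplication, and no per-object field is needed to store the size.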
* s/BLOCK_SIZE/block_size()/    lloyd    2010-10-13    1    -4/+4
* Use size_t rather than u32bit for the blocks argument of encrypt_n    lloyd    2010-10-12    2    -4/+4
* s/u32bit/size_t/ for block cipher parallelism queries    lloyd    2010-10-12    1    -1/+1
* Remove more uses of vector to pointer implicit conversions    lloyd    2010-09-13    1    -2/+6
* Only call the scalar versions if we actually have leftover blocks to    lloyd    2010-06-22    1    -2/+4
  process
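  A rough C++ sketch of the control flow this commit describes, under stated assumptions: PathCounts and encrypt_n_sketch are hypothetical names, and the function only tallies which path would run rather than encrypting anything. The point is the guard around the scalar fallback.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tally of which code path handled the input (illustration only).
struct PathCounts {
    std::size_t simd_calls = 0;    // number of full SIMD batches
    std::size_t scalar_calls = 0;  // number of calls into the scalar fallback
    std::size_t scalar_blocks = 0; // blocks handed to the scalar fallback
};

// blocks: total blocks to process; simd_width: blocks per SIMD batch.
PathCounts encrypt_n_sketch(std::size_t blocks, std::size_t simd_width) {
    PathCounts counts;
    if (simd_width > 0) {
        // Consume as many full batches as possible on the SIMD path.
        while (blocks >= simd_width) {
            ++counts.simd_calls;
            blocks -= simd_width;
        }
    }
    // Call the scalar version only if something is actually left over.
    if (blocks > 0) {
        ++counts.scalar_calls;
        counts.scalar_blocks = blocks;
    }
    return counts;
}
```

  With an input that is an exact multiple of the batch width, the scalar path is never entered.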
* In IDEA, Noekeon, Serpent, XTEA, provide and use ro accessor functions    lloyd    2010-06-21    1    -2/+2
  for getting access to the key schedule, instead of giving the key
  schedule protected status, which is much harder to audit.
* More Doxygen fixes    lloyd    2010-06-15    1    -2/+2
* Change BlockCipher::parallelism() to return the native parallelism of    lloyd    2010-05-25    1    -1/+1
  the implementation rather than the preferred one. Update all
  implementations.

  Add a new function parallel_bytes() which returns

    parallelism() * BLOCK_SIZE * BUILD_TIME_CONSTANT

  This is because I noticed all current calls of parallelism() just
  multiplied the result by the block size already, so this simplified that
  code.

  The build time constant is set to 4, which was the previous default
  return value of parallelism(). However the SIMD versions returned
  2*native parallelism rather than 4*, so this increases the buffer sizes
  used for those algorithms.

  The constant multiple lives in buildh.in and build.h, and is named
  BOTAN_BLOCK_CIPHER_PAR_MULT.
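  A small C++ sketch of the arithmetic in this commit message, not Botan's implementation. CipherSketch and PAR_MULT_SKETCH are hypothetical names; the constant stands in for BOTAN_BLOCK_CIPHER_PAR_MULT, which the message says is set to 4. The block size and parallelism used in the note below are made-up example values.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the build-time constant; the commit message gives its value as 4.
constexpr std::size_t PAR_MULT_SKETCH = 4;

// Hypothetical cipher description (illustration only).
struct CipherSketch {
    std::size_t block_size;         // bytes per block
    std::size_t native_parallelism; // blocks the implementation handles at once

    // Returns the native parallelism, as described after this change.
    std::size_t parallelism() const { return native_parallelism; }

    // parallelism() * BLOCK_SIZE * BUILD_TIME_CONSTANT
    std::size_t parallel_bytes() const {
        return parallelism() * block_size * PAR_MULT_SKETCH;
    }
};
```

  For an assumed cipher with 8-byte blocks and a native parallelism of 8, parallel_bytes() gives 8 * 8 * 4 = 256 bytes. Under the earlier 2x-native rule for SIMD code, the same call site would have computed 16 * 8 = 128 bytes, which is the buffer-size increase the message mentions.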
* Set parallelism defaults.    lloyd    2010-02-25    1    -0/+2
  Default unless specified is now 4. For SIMD code, use 2x the number of
  blocks which are processed in parallel using SIMD by that cipher. It may
  make sense to increase this to 4x or even more; further experimentation
  is necessary.
* Un-internal loadstor.h (and its header deps, rotate.h and    lloyd    2009-12-21    1    -1/+1
  bswap.h); too many external apps rely on loadstor.h existing.

  Define 64-bit generic bswap in terms of 32-bit bswap, since it's not much
  slower if 32-bit is also generic, and much faster if it's not. This may
  be quite helpful on 32-bit x86 in particular.

  Change formulation of generic 32-bit bswap. It may be faster or slower
  depending on the CPU, especially the latency and throughput of rotate
  instructions, but should be faster on an ideally superscalar processor
  with rotate instructions (i.e., what I expect future CPUs to look more
  like).
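  A C++ sketch of a 64-bit byte swap built from a 32-bit one, as this commit describes. The function names are hypothetical, and the rotate-based 32-bit formulation is an assumption: it is one common way to write a generic byte swap with rotates, and the log does not confirm that it matches the formulation the commit introduced.

```cpp
#include <cassert>
#include <cstdint>

// Rotate left by r bits; r must be in 1..31 for this sketch.
inline std::uint32_t rotl32_sketch(std::uint32_t x, unsigned r) {
    return (x << r) | (x >> (32 - r));
}

// Generic 32-bit byte swap using two rotates and two masks (assumed formulation).
inline std::uint32_t bswap32_sketch(std::uint32_t x) {
    return (rotl32_sketch(x, 8) & 0x00FF00FFu) |
           (rotl32_sketch(x, 24) & 0xFF00FF00u);
}

// 64-bit byte swap defined in terms of the 32-bit one:
// swap each half, then exchange the halves.
inline std::uint64_t bswap64_sketch(std::uint64_t x) {
    const std::uint32_t hi = static_cast<std::uint32_t>(x >> 32);
    const std::uint32_t lo = static_cast<std::uint32_t>(x);
    return (static_cast<std::uint64_t>(bswap32_sketch(lo)) << 32) |
           bswap32_sketch(hi);
}
```

  If the 32-bit swap maps to a native instruction on a given target, the 64-bit version inherits that speed, which fits the reasoning given in the message.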
* Make many more headers internal-only.    lloyd    2009-12-16    1    -1/+1
  Fixes for the amalgamation generator for internal headers.

  Remove BOTAN_DLL exporting macros from all internal-only headers; the
  classes/functions there don't need to be exported, and avoiding the
  PIC/GOT indirection can be a big win.

  Add missing BOTAN_DLLs where necessary, mostly gfpmath and cvc.

  For GCC, use -fvisibility=hidden and set BOTAN_DLL to the visibility
  __attribute__ to export those classes/functions.
* Full working amalgamation build, plus internal-only headers concept.    lloyd    2009-12-16    2    -8/+1
* Kill realnames on new modules not in mainline    lloyd    2009-10-29    1    -2/+0
* Rename SSE2 stuff to be generally SIMD since it supports at least SSE2    lloyd    2009-10-29    3    -0/+168
  and Altivec (though Altivec is seemingly slower ATM...)