| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
| |
Also, fix AltiVec detection on Linux and NetBSD for most G4s.
|
|
|
|
| |
detection.
|
| |
|
| |
|
|
|
|
|
|
| |
It would be useful in its own right, since many other things need to do
hashing, but the tr1 dependency kills it right now. Something to revisit in
the C++0x branch, perhaps?
|
|\
| |
| |
| |
| |
| | |
5749645b3dc61c94f9b2980aa7773a3849105a81)
to branch 'net.randombit.botan.buf-op' (head 7c1f7c88bd4d016ff49f098e47ac6032ff43041b)
|
| |\
| | |
| | |
| | |
| | |
| | | |
79ed5b0f9057b2d40335e268fdb9f375837d1d11)
to branch 'net.randombit.botan.buf-op' (head 87160704bdc30b0a4cb19fd4516e20e85dca2869)
|
| | |
| | |
| | |
| | | |
(std::tr1::function).
|
|/ / |
|
|/
|
|
| |
I tend to rewrite often in particular files while debugging things.
|
| |
|
|
|
|
| |
the calendar time without tying to a particular format. From the C++0x branch.
|
| |
|
|
|
|
|
|
|
| |
Add macros for OS support of gmtime_r (Unix) and gmtime_s (Win32) to deal
with thread-unsafety of std::gmtime. Only enable gmtime_r on Linux currently,
but it's probably available pretty much everywhere (specified in pthreads,
originally, AFAICT).
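A minimal sketch of how such macros might be used to avoid the shared static buffer behind std::gmtime. The macro names EXAMPLE_HAS_GMTIME_R and EXAMPLE_HAS_GMTIME_S and the function utc_calendar_time are hypothetical, chosen for illustration; they are not the names this commit adds.

```cpp
#include <ctime>
#include <time.h>

// Hypothetical feature macros (assumption): a real build system would set
// these per target OS instead of guessing from _WIN32.
#if defined(_WIN32)
  #define EXAMPLE_HAS_GMTIME_S
#else
  #define EXAMPLE_HAS_GMTIME_R
#endif

// Convert a time_t to broken-down UTC time without relying on the
// static buffer that std::gmtime returns a pointer to.
inline std::tm utc_calendar_time(std::time_t t)
{
   std::tm out = std::tm();
#if defined(EXAMPLE_HAS_GMTIME_S)
   gmtime_s(&out, &t);       // Win32 variant: destination comes first
#elif defined(EXAMPLE_HAS_GMTIME_R)
   gmtime_r(&t, &out);       // POSIX variant: source comes first
#else
   out = *std::gmtime(&t);   // fallback, not thread safe
#endif
   return out;
}
```

Note the argument order differs between the two functions, which is one reason to hide them behind a single wrapper.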
|
| |
|
| |
|
|
|
|
|
|
| |
or big endian, for large loads always memcpy, then go back and swap as
needed. Otherwise (unknown or mixed endian) just load one at a time as
usual.
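The strategy described above can be sketched roughly as follows, for big-endian 32-bit loads. The names example_swap_bytes32 and example_load_be are hypothetical, and endianness detection via the GCC/Clang __BYTE_ORDER__ macro is an assumption; the actual code presumably uses the library's own build macros.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Plain shift-and-mask byte swap, used only by this sketch.
inline std::uint32_t example_swap_bytes32(std::uint32_t x)
{
   return (x >> 24) | ((x >> 8) & 0x0000FF00u) |
          ((x << 8) & 0x00FF0000u) | (x << 24);
}

// Load `count` big-endian 32-bit words from `in` into `out`.
inline void example_load_be(std::uint32_t out[], const std::uint8_t in[],
                            std::size_t count)
{
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
   // Native order already matches: one memcpy is enough.
   std::memcpy(out, in, 4 * count);
#elif defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
   // Copy everything at once, then go back and swap each word.
   std::memcpy(out, in, 4 * count);
   for(std::size_t i = 0; i != count; ++i)
      out[i] = example_swap_bytes32(out[i]);
#else
   // Unknown or mixed endian: assemble one word at a time.
   for(std::size_t i = 0; i != count; ++i)
      out[i] = (static_cast<std::uint32_t>(in[4*i  ]) << 24) |
               (static_cast<std::uint32_t>(in[4*i+1]) << 16) |
               (static_cast<std::uint32_t>(in[4*i+2]) <<  8) |
               (static_cast<std::uint32_t>(in[4*i+3]));
#endif
}
```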
|
|
|
|
|
| |
but if SSE2 or SSSE3 is available uses SIMD magic to swap four 32-bit values
at once.
|
|
bswap.h); too many external apps rely on loadstor.h existing.
Define 64-bit generic bswap in terms of 32-bit bswap, since it's
not much slower if 32-bit is also generic, and much faster if
it's not. This may be quite helpful on 32-bit x86 in particular.
Change formulation of generic 32-bit bswap. It may be faster or
slower depending on the CPU, especially the latency and throughput
of rotate instructions, but should be faster on an ideally
superscalar processor with rotate instructions (i.e., what I expect
future CPUs to look more like).
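For illustration, here is one rotate-based formulation of a generic 32-bit bswap, with the 64-bit version built on top of it. This is a sketch and not necessarily the exact formulation committed; the function names are hypothetical.

```cpp
#include <cstdint>

// Rotates by r bits; r must be in 1..31 for these definitions.
inline std::uint32_t example_rotl32(std::uint32_t x, unsigned r)
{
   return (x << r) | (x >> (32 - r));
}

inline std::uint32_t example_rotr32(std::uint32_t x, unsigned r)
{
   return (x >> r) | (x << (32 - r));
}

// Two independent rotates and two masks: the rotates have no data
// dependency on each other, so a superscalar CPU can issue them together.
inline std::uint32_t generic_bswap32(std::uint32_t v)
{
   return (example_rotr32(v, 8) & 0xFF00FF00u) |
          (example_rotl32(v, 8) & 0x00FF00FFu);
}

// 64-bit swap in terms of the 32-bit one: swap each half, then
// exchange the halves.
inline std::uint64_t generic_bswap64(std::uint64_t v)
{
   const std::uint32_t hi = static_cast<std::uint32_t>(v >> 32);
   const std::uint32_t lo = static_cast<std::uint32_t>(v);
   return (static_cast<std::uint64_t>(generic_bswap32(lo)) << 32) |
          generic_bswap32(hi);
}
```

If the 32-bit swap is later replaced by a native instruction, the 64-bit version picks up the speedup without any change of its own.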
|
|
|
|
| |
Move most of the engine headers to internal
|
Fixes for the amalgamation generator for internal headers.
Remove BOTAN_DLL exporting macros from all internal-only headers;
the classes/functions there don't need to be exported, and
avoiding the PIC/GOT indirection can be a big win.
Add missing BOTAN_DLLs where necessary, mostly gfpmath and cvc.
For GCC, use -fvisibility=hidden and set BOTAN_DLL to the
visibility __attribute__ to export those classes/functions.
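A rough sketch of the export pattern described above, assuming the library is compiled with -fvisibility=hidden under GCC. EXAMPLE_DLL, Example_Public_Class and example_internal_helper are stand-in names for illustration; the real macro is BOTAN_DLL, whose exact definition is not shown in this log.

```cpp
// With -fvisibility=hidden, every symbol is hidden by default and only
// those explicitly marked "default" are exported from the shared object.
#if defined(__GNUC__)
  #define EXAMPLE_DLL __attribute__((visibility("default")))
#else
  #define EXAMPLE_DLL
#endif

// Public API: marked for export.
class EXAMPLE_DLL Example_Public_Class
{
public:
   int value() const { return 42; }
};

// Internal-only helper: no export macro, so under -fvisibility=hidden it
// stays out of the dynamic symbol table and calls to it need not go
// through the PLT/GOT.
inline int example_internal_helper() { return 1; }
```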
|
|
QueryPerformanceCounter, into an entropy source hres_timer. Its
results, if any, do not count as contributing entropy to the poll.
Convert the other (monotonic/fixed epoch) timers to a single function
get_nanoseconds_clock(), living in time.h, which statically chooses
the 'best' timer type (clock_gettime, gettimeofday, std::clock, in
that order depending on what is available). Add feature test macros
for clock_gettime and gettimeofday.
Remove the Timer class and timer.h. Remove the Timer& argument to the
algorithm benchmark function.
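A minimal sketch of the static timer selection described above. The feature macros EXAMPLE_HAS_CLOCK_GETTIME and EXAMPLE_HAS_GETTIMEOFDAY and the function name example_nanoseconds_clock are hypothetical; the use of CLOCK_REALTIME is also an assumption, since the log does not say which clock id is used.

```cpp
#include <cstdint>
#include <ctime>

#if defined(EXAMPLE_HAS_CLOCK_GETTIME)
  #include <time.h>
#elif defined(EXAMPLE_HAS_GETTIMEOFDAY)
  #include <sys/time.h>
#endif

// Pick the best available timer at compile time, in the order
// clock_gettime, gettimeofday, std::clock.
inline std::uint64_t example_nanoseconds_clock()
{
#if defined(EXAMPLE_HAS_CLOCK_GETTIME)
   struct timespec ts;
   ::clock_gettime(CLOCK_REALTIME, &ts);
   return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000ULL +
          static_cast<std::uint64_t>(ts.tv_nsec);
#elif defined(EXAMPLE_HAS_GETTIMEOFDAY)
   struct timeval tv;
   ::gettimeofday(&tv, 0);
   return static_cast<std::uint64_t>(tv.tv_sec) * 1000000000ULL +
          static_cast<std::uint64_t>(tv.tv_usec) * 1000ULL;
#else
   // Always available, but measures processor time at coarse resolution.
   return static_cast<std::uint64_t>(std::clock()) *
          (1000000000ULL / static_cast<std::uint64_t>(CLOCKS_PER_SEC));
#endif
}
```

Because the choice is made by the preprocessor, callers need no Timer object and pay no virtual call, which fits with removing the Timer& argument from the benchmark function.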
|
|
|
|
| |
system before returning a new instance.
|
|
|
|
| |
build magic, name them asm_macr_ARCH.h. Change all including files accordingly.
|
| |
|
| |
|
|
|
|
| |
which is currently just a stub returning false.
|
|
|
|
|
| |
Rename BOTAN_UNALIGNED_LOADSTOR_OK to BOTAN_UNALIGNED_MEMORY_ACCESS_OK
which is somewhat more clear as to the point.
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
change some of the hash functions to use it as low hanging fruit.
Probably could use further optimization (just unrolls x4 currently), but
merely having it as syntax is good as it allows optimizing many functions
at once (eg using SSE2 to do 4-way byteswaps).
|
|
|
|
|
| |
Document SHA optimizations, AltiVec runtime checking, fixes for cpuid
for both icc and msvc.
|
| |
|
|
|
|
|
| |
returns true if they might plausibly work. AltiVec and SSE2 versions call
into CPUID, scalar version always works.
|
| |
|
|
|
|
|
| |
Relies on mfspr emulation/trapping by the kernel, which works on (at least)
Linux and NetBSD.
|
|
|
|
|
|
| |
for unaligned writes is messy as hell.
If writes are batched this is somewhat easier to deal with (somewhat).
|
| |
|
|\
| |
| |
| |
| |
| | |
8fb69dd1c599ada1008c4cab2a6d502cbcc468e0)
to branch 'net.randombit.botan.general-simd' (head c05c9a6d398659891fb8cca170ed514ea7e6476d)
|
| | |
|
| |
on a PowerPC 970 running Gentoo with GCC 4.3.4
Uses a GCC syntax for creating literal values instead of the Motorola
syntax [{1,2,3,4} instead of (1,2,3,4)].
In tests so far, this is much, much slower than either the standard scalar code,
or using the SIMD-in-scalar-registers code. It looks like for whatever reason
GCC is refusing to inline the function:
SIMD_Altivec(__vector unsigned int input) { reg = input; }
and calls it with a branch hundreds of times in each function. I don't know
if this is the entire reason it's slower, but it definitely can't be helping.
The code handles unaligned loads OK but assumes stores are to an aligned address.
This will fail drastically some day, and needs to be fixed to either use scalar
stores, which (most?) PPCs will handle (if slowly), or batch the loads and
stores so we can work across the loads. Considering the code so far loads 4
vectors of data in one go this would probably be a big win (and also for loads,
since instead of doing 8 loads for 4 registers only 5 are needed).
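The 8-versus-5 load count can be checked with a little arithmetic: an unaligned 16-byte vector straddles two aligned 16-byte blocks, while four consecutive unaligned vectors together touch only five. The sketch below just counts aligned blocks; the function names are hypothetical and no AltiVec code is involved.

```cpp
#include <cstddef>

// Number of aligned 16-byte blocks touched by `bytes` bytes starting at
// byte offset `offset` (bytes must be nonzero).
inline std::size_t example_blocks_touched(std::size_t offset, std::size_t bytes)
{
   return (offset + bytes - 1) / 16 - offset / 16 + 1;
}

// Aligned loads needed when each vector is loaded on its own.
inline std::size_t example_loads_separate(std::size_t offset, std::size_t vectors)
{
   std::size_t total = 0;
   for(std::size_t i = 0; i != vectors; ++i)
      total += example_blocks_touched(offset + 16 * i, 16);
   return total;
}

// Aligned loads needed when the whole run is loaded as one batch, so
// neighbouring vectors share the block they both straddle.
inline std::size_t example_loads_batched(std::size_t offset, std::size_t vectors)
{
   return example_blocks_touched(offset, 16 * vectors);
}
```

For a misaligned start (offset 4, say) and 4 vectors, the separate count is 8 and the batched count is 5, matching the figures in the message.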
|
| |
| |
| |
| | |
of load_le + bswap
|
| | |
|
| |
operations.
Also add a pure scalar code version.
Convert Serpent to use this new interface, and add an implementation of
XTEA in SIMD.
The wrappers plus the scalar version allow SIMD-ish code to work on all
platforms. This is often a win due to better ILP being visible to the
processor (as with the recent XTEA optimizations). Only real danger is
register starvation, mostly an issue on x86 these days. So it may (or may
not) be a win to consolidate the standard C++ versions and the SIMD versions
together.
Future work:
- Add AltiVec/VMX version
- Maybe also for ARM's NEON extension? Less pressing, I would think.
- Convert SHA-1 code to use SIMD_32
- Add XTEA SIMD decryption (currently only encrypt)
- Change SSE2 engine to SIMD_engine
- Modify configure.py to set BOTAN_TARGET_CPU_HAS_[SSE2|ALTIVEC|NEON|XXX] macros
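To illustrate the idea of a pure scalar version behind a SIMD-style interface, here is a small sketch of a 4-lane wrapper over 32-bit words. The class name Example_SIMD_Scalar and its members are hypothetical; the log does not show the real interface.

```cpp
#include <cstddef>
#include <cstdint>

// Four 32-bit lanes held in ordinary scalar storage. Cipher code written
// against this interface could be rebuilt on SSE2 or AltiVec by swapping
// in a class with the same operations.
class Example_SIMD_Scalar
{
public:
   Example_SIMD_Scalar(std::uint32_t a, std::uint32_t b,
                       std::uint32_t c, std::uint32_t d)
   {
      R[0] = a; R[1] = b; R[2] = c; R[3] = d;
   }

   // Lane-wise addition modulo 2^32.
   Example_SIMD_Scalar operator+(const Example_SIMD_Scalar& o) const
   {
      return Example_SIMD_Scalar(R[0] + o.R[0], R[1] + o.R[1],
                                 R[2] + o.R[2], R[3] + o.R[3]);
   }

   // Lane-wise XOR.
   Example_SIMD_Scalar operator^(const Example_SIMD_Scalar& o) const
   {
      return Example_SIMD_Scalar(R[0] ^ o.R[0], R[1] ^ o.R[1],
                                 R[2] ^ o.R[2], R[3] ^ o.R[3]);
   }

   // Rotate every lane left by r bits; r must be in 1..31.
   void rotate_left(unsigned r)
   {
      for(std::size_t i = 0; i != 4; ++i)
         R[i] = (R[i] << r) | (R[i] >> (32 - r));
   }

   std::uint32_t lane(std::size_t i) const { return R[i]; }

private:
   std::uint32_t R[4];
};
```

The four lanes are independent of each other, which is where the extra ILP comes from even without vector hardware; the cost is that four live values per variable can exhaust the register file on x86.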
|
|/
|
|
|
| |
Pretty much useless and unused, except for listing the module names in
build.h and the short versions totally suffice for that.
|
| |
|