There are many areas where Botan is deficient. This file documents
some of the more interesting ones. If you're thinking about working on
something within Botan, one of these areas might be a good place to
start. Questions or comments can go to the development mailing list.

Build System / Porting
--------------------

The new configure script is fairly flexible in terms of build systems
(though a few pieces of code remain tied to make-style syntax). No
doubt many users would appreciate having Botan well-integrated into
their build environment, so patches to configure.pl (and new template
files for misc/config/makefile/) that add support for other build
systems are welcome. The most requested by far is Visual Studio
project files; others that might be of interest are autotools, SCons,
CMake, and jam/bjam.

Testing the configure/build/install steps on as many platforms and
compilers as possible is a huge win for us. Builds on some platforms
(such as the Motorola 680x0 and Hitachi SH machines, IBM's AIX on any
CPU type, and the Haiku operating system, a BeOS R5 clone) have
*never* been attempted; the support is based entirely on documentation
and conjecture, and is very unlikely to work. Support for several
operating systems is completely nonexistent; this class includes VMS,
VxWorks, eCos, MINIX, GNU/Hurd, L4, and Coyotos. Others, like IRIX,
HP-UX, QNX, and Tru64, are tested only very rarely. Similarly, many
commercial compilers are only tested occasionally.

Setting up a buildbot system would be ideal, if access to enough
machines can be arranged (for the x86 and amd64 operating systems, a
single machine running Xen or VMware could suffice). Even one-shot
tests with the latest sources on a variety of machines would be
incredibly useful.

A nice but not essential feature for configure.pl would be adding the
ability to generate any needed or requested package-building scripts,
with support for systems like rpm, portage, dpkg, commercial Unix
package systems, and Windows installer systems.

Splitting the build into distinct static and shared targets (and
static-debug and shared-debug) would make certain things much simpler.
It would also be a performance advantage on many systems, since the
static objects would no longer need to be built as position-independent
code (in particular on x86, where losing %ebx to the PIC pointer is a
huge loss).

Modules to allow use of platform-specific features within Botan can
make life significantly better for users on that platform. Generic
Unix/POSIX support is more or less complete, but there are countless
vendor extensions that might be used in Botan in interesting and
useful ways. Windows has the basics (two entropy source modules, and
modules giving access to mutexes and high resolution timers), but
there are probably a number of interesting extensions one could write,
like making Botan's objects callable by DCOM. Other systems probably
have all kinds of interesting system and library calls we can use.

Self-test / Benchmark System
--------------------

The self-test code is not terrible, but it is significantly sloppier
than the library code it is testing. Reporting should be generalized
and encapsulated, so it can easily be extended to produce test results
as text to the terminal, or HTML with full details, or as an email, or
any of a number of useful formats (which would provide a varying
amount of information about what was tested and what went wrong).
Bonus points for writing a general system that takes in an arbitrary
'template' file and outputs the filled-out report.
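
One way to encapsulate reporting is sketched below. This is only an
illustration of the idea: Test_Result, Reporter, Text_Reporter,
HTML_Reporter, and render_report are hypothetical names invented here,
not part of the existing test code.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result record; these names are illustrative, not Botan's.
struct Test_Result
   {
   std::string algo;    // what was tested, e.g. "SHA-1"
   bool passed;
   std::string detail;  // failure description, empty on success
   };

// The test driver talks only to this interface, so new output formats
// can be added without touching the tests themselves.
class Reporter
   {
   public:
      virtual ~Reporter() {}
      virtual std::string render(const std::vector<Test_Result>& results) const = 0;
   };

// One line per test, suitable for a terminal.
class Text_Reporter : public Reporter
   {
   public:
      std::string render(const std::vector<Test_Result>& results) const
         {
         std::ostringstream out;
         for(std::size_t i = 0; i != results.size(); ++i)
            {
            out << results[i].algo << ": "
                << (results[i].passed ? "ok" : "FAILED");
            if(!results[i].passed)
               out << " (" << results[i].detail << ")";
            out << "\n";
            }
         return out.str();
         }
   };

// One table row per test. HTML escaping is omitted to keep this short.
class HTML_Reporter : public Reporter
   {
   public:
      std::string render(const std::vector<Test_Result>& results) const
         {
         std::ostringstream out;
         out << "<table>\n";
         for(std::size_t i = 0; i != results.size(); ++i)
            out << "<tr><td>" << results[i].algo << "</td><td>"
                << (results[i].passed ? "ok" : "FAILED") << "</td><td>"
                << results[i].detail << "</td></tr>\n";
         out << "</table>\n";
         return out.str();
         }
   };

// Driver-side helper: renders with whichever reporter was selected.
std::string render_report(const Reporter* reporter,
                          const std::vector<Test_Result>& results)
   {
   assert(reporter != 0); // a missing reporter is a programming error
   return reporter->render(results);
   }
```

A template-driven reporter would be one more subclass of the same
interface, which is the main reason for keeping the driver unaware of
the output format.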

Much of the code operates at a very low level of abstraction; this has
made it difficult to add tests that go much beyond the simple
known-answer tests used for the ciphers and hashes.

All of the simple functions (rotate_left, get_byte, etc) should be
tested (a failure in one of these causes many failures later, which
are harder to diagnose).
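
A minimal sketch of such tests follows. The two helpers are local
stand-ins written for this example so that it compiles on its own; the
32-bit-only signatures and the convention that byte 0 is the most
significant byte are assumptions and should be checked against the
library's own definitions before reuse.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for the library helper (assumed semantics: 32-bit rotate).
inline std::uint32_t rotate_left(std::uint32_t x, unsigned rot)
   {
   rot &= 31; // keep the shift counts in range
   return rot ? ((x << rot) | (x >> (32 - rot))) : x;
   }

// Stand-in for the library helper.
// Assumed convention: byte 0 is the most significant byte.
inline std::uint8_t get_byte(std::size_t n, std::uint32_t x)
   {
   return static_cast<std::uint8_t>(x >> (8 * (3 - (n & 3))));
   }

// Known-answer checks for the helpers; aborts on the first failure.
void test_bit_helpers()
   {
   // Rotation must carry the top bit around to the bottom.
   assert(rotate_left(0x80000001u, 1) == 0x00000003u);
   assert(rotate_left(0x12345678u, 8) == 0x34567812u);

   // Edge cases: rotating by zero is the identity.
   assert(rotate_left(0x12345678u, 0) == 0x12345678u);

   // Byte extraction at both ends of the word.
   assert(get_byte(0, 0x12345678u) == 0x12);
   assert(get_byte(3, 0x12345678u) == 0x78);

   // Round trip: reassembling the bytes must give back the word.
   const std::uint32_t w = 0xDEADBEEFu;
   std::uint32_t rebuilt = 0;
   for(std::size_t i = 0; i != 4; ++i)
      rebuilt = (rebuilt << 8) | get_byte(i, w);
   assert(rebuilt == w);
   }
```

Running checks like these before the cipher and hash tests means a
broken helper is reported directly instead of as a cascade of
unrelated-looking failures.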

There are significant codepaths that have no tests written for them,
particularly in the X.509 certificate processing code.

The benchmark code should also have its output formats generalized; it
would be pretty great to have a benchmark run produce a detailed
report as HTML and some gnuplot datasets to generate the images
included from the HTML file.

New Memory Allocator
--------------------

The current pool allocator is serviceable but it can be very wasteful
of memory and could easily be several times faster. Someone who is
interested in algorithms might enjoy working on this.
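
As a starting point for experiments, here is a sketch of one classic
technique, a fixed-size block pool with an intrusive free list. It is
not the current allocator and not a proposed replacement; Fixed_Pool
and its members are hypothetical names, and a real allocator would
also need growth, multiple size classes, and locking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical fixed-size block pool (illustration only).
// Free blocks store the address of the next free block in their own
// first bytes, so the free list costs no extra memory.
class Fixed_Pool
   {
   public:
      Fixed_Pool(std::size_t block_size, std::size_t block_count) :
         m_block_size(round_up(block_size)),
         m_storage(m_block_size * block_count),
         m_free_head(0),
         m_free_count(0)
         {
         // Thread every block onto the free list, last block first,
         // so that the first allocation returns the first block.
         for(std::size_t i = block_count; i-- > 0; )
            push(m_storage.data() + i * m_block_size);
         }

      Fixed_Pool(const Fixed_Pool&) = delete;
      Fixed_Pool& operator=(const Fixed_Pool&) = delete;

      // O(1): pop the head of the free list. Returns null when the
      // pool is exhausted; real code would grow or fall back.
      void* allocate()
         {
         if(!m_free_head)
            return 0;
         unsigned char* block = m_free_head;
         std::memcpy(&m_free_head, block, sizeof(m_free_head));
         --m_free_count;
         return block;
         }

      // O(1): push the block back onto the free list.
      void deallocate(void* p)
         {
         unsigned char* block = static_cast<unsigned char*>(p);
         assert(owns(block));
         push(block);
         }

      std::size_t free_blocks() const { return m_free_count; }

   private:
      // Blocks must be able to hold one pointer and stay pointer-aligned.
      static std::size_t round_up(std::size_t n)
         {
         const std::size_t a = sizeof(unsigned char*);
         return ((n ? n : 1) + a - 1) / a * a;
         }

      void push(unsigned char* block)
         {
         std::memcpy(block, &m_free_head, sizeof(m_free_head));
         m_free_head = block;
         ++m_free_count;
         }

      bool owns(const unsigned char* block) const
         {
         const unsigned char* base = m_storage.data();
         return block >= base && block < base + m_storage.size() &&
                (block - base) % m_block_size == 0;
         }

      std::size_t m_block_size;
      std::vector<unsigned char> m_storage;
      unsigned char* m_free_head;
      std::size_t m_free_count;
   };
```

Both operations are constant time and the only per-block overhead is
the rounding of the block size, which makes a design like this a
useful baseline to benchmark against.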

Documentation
--------------------

This could occupy someone for months. Perhaps even a majority of the
API is undocumented, and while these are the less important pieces (or
at least pieces meant mostly for internal library use), it would be
great to have at least a brief description of each of them, along with
a pointer to the appropriate headers. Text written in either a
tutorial style or as a straight API breakdown could easily be
integrated.

There are many obvious example programs which have yet to be written,
including encrypting a file with a shared passphrase, and securely
salting and hashing a password for storage. Check the mailing list
archives for ideas.

Public Key Engines
--------------------

In addition to the fairly low-level BigInt optimizations that remain
to be done, Botan provides a plugin system that allows different
implementations of entire algorithms (RSA, DSA, etc) to be included,
which can then be used in a completely transparent manner by
application code. As of this writing one hardware public key
accelerator (AEP's SureWare Runner cards) and two software backends
(GNU MP and OpenSSL's BN library) are supported. There are many others
out there, including Apple's vBigNum AltiVec library, Intel's
Performance Primitives library, OpenBSD's /dev/crypto, and hardware
units like the Broadcom BCM582x and Hi/fn 6500.
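
The general shape of such a plugin system can be sketched as follows.
Everything here is hypothetical and simplified for illustration: the
class names are invented, and 64-bit integers stand in for BigInt so
the example is self-contained. It does not show the library's actual
engine interface.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical operation interface: compute base^exp mod m.
class ModExp_Op
   {
   public:
      virtual ~ModExp_Op() {}
      virtual std::uint64_t power_mod(std::uint64_t base, std::uint64_t exp,
                                      std::uint64_t mod) const = 0;
   };

// Hypothetical engine interface. An engine returns null for any
// operation it cannot provide (no hardware present, and so on).
class Engine
   {
   public:
      virtual ~Engine() {}
      virtual const char* name() const = 0;
      virtual std::unique_ptr<ModExp_Op> mod_exp() const { return nullptr; }
   };

// Portable fallback: plain square-and-multiply.
class Square_Multiply : public ModExp_Op
   {
   public:
      std::uint64_t power_mod(std::uint64_t base, std::uint64_t exp,
                              std::uint64_t mod) const override
         {
         assert(mod != 0 && mod <= 0xFFFFFFFFu); // products must fit in 64 bits
         std::uint64_t result = 1 % mod;
         base %= mod;
         while(exp)
            {
            if(exp & 1)
               result = result * base % mod;
            base = base * base % mod;
            exp >>= 1;
            }
         return result;
         }
   };

class Software_Engine : public Engine
   {
   public:
      const char* name() const override { return "software"; }
      std::unique_ptr<ModExp_Op> mod_exp() const override
         { return std::unique_ptr<ModExp_Op>(new Square_Multiply); }
   };

// Stands in for an accelerator whose device is absent: it inherits
// the default mod_exp(), which returns null.
class Absent_Accelerator : public Engine
   {
   public:
      const char* name() const override { return "accelerator"; }
   };

// Engines are asked in registration order; the first one that can
// supply the operation wins. Application code never names an engine.
class Engine_Registry
   {
   public:
      void add(std::unique_ptr<Engine> e) { m_engines.push_back(std::move(e)); }

      std::unique_ptr<ModExp_Op> mod_exp(std::string& provider) const
         {
         for(const auto& e : m_engines)
            {
            std::unique_ptr<ModExp_Op> op = e->mod_exp();
            if(op)
               {
               provider = e->name();
               return op;
               }
            }
         provider.clear();
         return nullptr;
         }

   private:
      std::vector<std::unique_ptr<Engine>> m_engines;
   };
```

Supporting a new accelerator or bignum library then amounts to writing
one more Engine subclass and registering it ahead of the software
fallback.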

BigInt
--------------------

The portable BigInt routines are fairly good, and as of 1.6 we're
using reasonably good algorithms. But well-written assembly can often
speed up public key operations by 50% or more. Some limited x86 and
x86-64 assembly already exists, but implementations for other
architectures (such as Cell's SPUs, PowerPC, SPARCv9, MIPS, and
ARM) could really help, as could further work on the x86 code
(including making use of SSE instructions and VIA's Montgomery
multiplication instruction). The key routines for good performance are
bigint_monty_redc and bigint_simple_sqr; together they make up 30-60%
of the runtime of most public key algorithms.
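
For readers new to the technique, the following is a single-word
sketch of Montgomery reduction with R = 2^32. It is not the code in
bigint_monty_redc, which works on multi-word integers; Monty_Params
and the function names are invented for this example, and the modulus
is restricted to odd values below 2^31 so that every intermediate sum
fits in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Precomputed values for one modulus (hypothetical helper type).
struct Monty_Params
   {
   std::uint32_t n;        // odd modulus, n < 2^31
   std::uint32_t n_prime;  // -n^{-1} mod 2^32
   std::uint32_t r2;       // R^2 mod n, where R = 2^32
   };

Monty_Params monty_setup(std::uint32_t n)
   {
   assert((n & 1) && n < 0x80000000u);

   // Newton iteration for n^{-1} mod 2^32. For odd n, n*n = 1 mod 8,
   // and each step doubles the number of correct low bits.
   std::uint32_t inv = n;
   for(int i = 0; i != 5; ++i)
      inv *= 2u - n * inv;

   const std::uint64_t r = (std::uint64_t(1) << 32) % n;
   Monty_Params p;
   p.n = n;
   p.n_prime = 0u - inv;
   p.r2 = static_cast<std::uint32_t>(r * r % n);
   return p;
   }

// REDC: for t < n * 2^32, returns t * R^{-1} mod n.
std::uint32_t monty_redc(std::uint64_t t, const Monty_Params& p)
   {
   // Choose m so that t + m*n is divisible by 2^32.
   const std::uint32_t m = static_cast<std::uint32_t>(t) * p.n_prime;
   // The division by R is just a shift; no trial division needed.
   const std::uint64_t u = (t + std::uint64_t(m) * p.n) >> 32;
   return static_cast<std::uint32_t>(u >= p.n ? u - p.n : u);
   }

// Multiply two values that are already in Montgomery form.
std::uint32_t monty_mul(std::uint32_t a, std::uint32_t b, const Monty_Params& p)
   {
   return monty_redc(std::uint64_t(a) * b, p);
   }

// Convenience wrapper: (a * b) mod n via Montgomery form.
std::uint32_t monty_mul_mod(std::uint32_t a, std::uint32_t b, std::uint32_t n)
   {
   const Monty_Params p = monty_setup(n);
   const std::uint32_t am = monty_mul(a % n, p.r2, p); // into Montgomery form
   const std::uint32_t bm = monty_mul(b % n, p.r2, p);
   return monty_redc(monty_mul(am, bm, p), p);         // and back out
   }
```

In a real exponentiation the conversion into and out of Montgomery
form happens once, and the loop in between consists entirely of
multiply-and-reduce steps, which is why the reduction and squaring
routines dominate the runtime.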

It is very likely that many of the core algorithms (in src/mp_*) could
be optimized at the C level by anyone who has some knowledge of or
interest in algorithms.

Compression Modules
--------------------

Botan currently supports the bzip2 and zlib compression
formats. Support for gzip and (less importantly) zip would likely be
appreciated by many users. There are also other interesting algorithms
such as LZO (supposedly very fast, which might make it useful in
custom network protocols), and LZW (a compression algorithm patented
by nCipher; they sell hardware implementations).

X.509 Attribute Certificates
--------------------

Most of the low-level processing code needed, like support for the
ASN.1 SIGNED macro and the DER/BER codec, has already been written
and used enough to be well tested and relatively easy to work
with. However, supporting attribute certificates still involves a lot
of careful coding and design work to deal with the semantic issues and
provide a good interface to the user; at this point I don't have the
slightest idea what a useful API for attribute certificates would be
like. RFC 3281 and its references have most of the information you'll
need.