author | lloyd <[email protected]> | 2006-05-18 18:33:19 +0000 |
---|---|---|
committer | lloyd <[email protected]> | 2006-05-18 18:33:19 +0000 |
commit | a2c99d3270eb73ef2db5704fc54356c6b75096f8 (patch) | |
tree | ad3d6c4fcc8dd0f403f8105598943616246fe172 /doc |
Initial checkin 1.5.6
Diffstat (limited to 'doc')
51 files changed, 9566 insertions, 0 deletions
diff --git a/doc/api.tex b/doc/api.tex new file mode 100644 index 000000000..f85c8f5cc --- /dev/null +++ b/doc/api.tex @@ -0,0 +1,3687 @@ +\documentclass{article} + +\setlength{\textwidth}{6.5in} +\setlength{\textheight}{9in} + +\setlength{\headheight}{0in} +\setlength{\topmargin}{0in} +\setlength{\headsep}{0in} + +\setlength{\oddsidemargin}{0in} +\setlength{\evensidemargin}{0in} + +\title{\textbf{Botan API Reference (December 31, 2005)}} +\author{} +\date{} + +\newcommand{\filename}[1]{\texttt{#1}} +\newcommand{\manpage}[2]{\texttt{#1}(#2)} + +\newcommand{\macro}[1]{\texttt{#1}} + +\newcommand{\function}[1]{\textbf{#1}} +\newcommand{\keyword}[1]{\texttt{#1}} +\newcommand{\type}[1]{\texttt{#1}} +\renewcommand{\arg}[1]{\textsl{#1}} +\newcommand{\namespace}[1]{\texttt{#1}} + +\newcommand{\ie}[0]{\emph{i.e.}} +\newcommand{\eg}[0]{\emph{e.g.}} + +\begin{document} + +\maketitle + +\tableofcontents + +\parskip=5pt +\pagebreak + +\section{Introduction} + +Botan is a C++ library which attempts to provide the most common cryptographic +algorithms and operations in an easy to use and portable package. Currently it +runs on a wide variety of systems, using numerous different compilers and on +many different CPU architectures. + +The base library is written in ISO C++, so it can be ported with minimal fuss, +but Botan also supports a modules system, which allows system dependent code +to be compiled into the library for use by application code. + +While you are reading this, you may want to refer to the header files +\filename{base.h} and \filename{pipe.h}. These files contain the classes that +form the basic interface for the library. + +\subsection{Basic Conventions} + +With a few exceptions declarations in the library are contained within the +namespace \namespace{Botan}. Botan declares several typedef'ed types to help +buffer it against changes in machine architecture. 
These types are used +extensively in the interface, and thus it would often be convenient to use +them without the \namespace{Botan} prefix. You can do so by \keyword{using} the +namespace \namespace{Botan\_types} (this way you can use the type names without +the namespace prefix, but the remainder of the library stays out of the global +namespace). The included types are \type{byte} and \type{u32bit}, which are +unsigned integer types. + +The headers for Botan are usually available in the form +\filename{botan/headername.h}. For brevity in this documentation, headers are +always just called \filename{headername.h}, but they should be used as +\filename{botan/headername.h} in your actual code. + +\subsection{Targets} + +Botan's primary targets (system-wise) are 32- and 64-bit systems with at least a +few megabytes of memory. Generally, given the choice between optimizing for +32-bit systems and 64-bit systems, Botan chooses 64-bit, simply on the theory +that where performance really matters (servers), people are using 64-bit +machines. But performance on 32-bit systems is also quite good. + +Today smaller systems, such as handhelds, set-top boxes, and the bigger smart +phones and smart cards, are also capable of using Botan. However, Botan uses a +fairly large amount of code space (multiple megabytes), which could be +prohibitive in some systems. Actual RAM usage is quite small, usually under +64K, though C++ runtime overheads might cause more to be used. + +Botan's design makes it quite easy to remove unused algorithms in such a way +that applications do not need to be recompiled to work, even applications that +use the algorithms in question. They can simply ask Botan if the algorithm +exists, and if Botan says yes, ask the library to give them such an object for +that algorithm. + +\pagebreak + +\subsection{Why Botan?} + +Botan may be the perfect choice for your application. Or it might be a +terribly bad idea. 
This section is basically to make it clear what Botan is +and is not. + +First, let's cover the major strengths. Botan: + +\begin{list}{$\cdot$} + \item Support is (usually) quickly available on the project mailing lists. + Commercial support licenses are available for those that desire them. + + \item Is written in a (fairly) clean object-oriented style, and the usual + API works in terms of reasonably high-level abstractions. + + \item Supports a huge variety of algorithms, including most of the major + public key algorithms and standards (such as IEEE 1363, PKCS, and + X.509v3). + + \item Supports a name-based lookup scheme, so you can get ahold of any + algorithm on the fly. + + \item You can easily extend much of the system at application compile time or + at run time. + + \item Works well with a wide variety of compilers, operating systems, and + CPUs, and more all the time. + + \item Is the only open source crypto library (that I know of) that has + support for memory allocation techniques that prevent an attacker from + reading swap in an attempt to gain access to keys or other secrets. In + fact several different such methods are supported, depending on the + system (two methods for Unix, another for Windows). + + \item Has (optional) support for Zlib and Bzip2 compression/decompression + integrated completely into the system -- it only takes a line or two of + code to add compression to your application. +\end{list} + +\noindent +And the major downsides and deficiencies are: + +\begin{list}{$\cdot$} + \item It's written in C++. If your application isn't, Botan is probably + going to be more pain than it's worth. + + \item Botan doesn't support higher-level protocols and formats like SSL or + OpenPGP. These will eventually be available as separate packages. Of + course you can write it yourself (and I would be happy to help with + that in any way I can). Some work is beginning on TLS and CMS (S/MIME) + support, but it is a ways away still. 
+ + \item Doesn't support elliptic curve algorithms; ECDSA support is planned at + some point, but demand seems quite low. + + \item Doesn't currently support any very high level 'envelope' style + processing - support for this will probably be added once support for + CMS is available, so code using the high level interface will produce + data readable by many other libraries. +\end{list} + +\pagebreak + +\section{Initializing the Library} + +The library needs to have various things done to it in order for it to work +correctly. To make sure this is done properly, you should create a +\type{LibraryInitializer} object at the start of your main() function, before +you start using any part of Botan. The initializer does things like +initializing the memory allocation system, setting up the algorithm lookup +tables, finding out if there is a high resolution timer available to use, and +similar such matters. + +The constructor of this object takes a string which specifies any options. If +more than one is used, they should be separated by a space. The options are +listed here by order of danger (\ie the caution you should have about using the +option), with safest first. + +\noindent +\textbf{Option ``secure\_memory''}: Try to create a more secure allocator type +-- one that either locks allocated memory into RAM, or that memory maps a disk +file that it erases after use. If both are available, it will prefer the memory +mapping mechanism, because locking memory requires privileges on many systems. + +On systems that don't (currently) have any specialized allocators, like +MS Windows, this option is ignored. + +\noindent +\textbf{Option ``config=/path/to/configfile''}: Process the specified +configuration file. Configuration files can specify things like the various +options, new aliases, and new OIDs for algorithms. An example can be found in +\filename{doc/botan.rc}. Currently only one config= argument will be processed, +the rest will be ignored. 
+ +\noindent +\textbf{Option ``thread\_safe''}: The library should use mutexes for guarding +access to shared resources, such as the memory allocation system. If you pass +the ``thread\_safe'' option, and the initializer can't find a useful mutex +module, it will throw an exception. Botan seems to work in threaded programs, +but it hasn't been tested thoroughly, and problems may remain. Note that Botan +is not thread safe at the object level; any objects shared between threads need +explicit locking. + +\noindent +\textbf{Option ``use\_engines''}: Use any available ``engine'' modules to speed +up processing. Currently Botan has support for engines based on the +AEP1000/AEP2000 crypto hardware cards, GNU MP, and OpenSSL's BN +library. Further support for crypto acceleration hardware will be added in +future releases. + +\noindent +\textbf{Option ``fips140''}: This option, in theory, toggles Botan into FIPS +140 mode. Please note that Botan \emph{has not} been FIPS 140 validated at this +time, and that a number of changes will be necessary before such a validation +can occur. Do not use this option. + +\noindent +\textbf{Option ``no\_rng\_seed''}: Don't attempt to seed the global PRNGs at +startup. This is primarily useful when you know that the built-in library +entropy sources will not work, and you are providing your own entropy source(s) +with \function{Global\_RNG::add\_es}. By default Botan will attempt to seed the +PRNGs, and will throw an exception if it fails. This option disables both of +these actions; call \function{Global\_RNG::add\_entropy} or +\function{Global\_RNG::add\_es} to add entropy and/or an entropy source, then +call \function{Global\_RNG::seed} to actually seed the RNG. + +If you do not create a \type{LibraryInitializer} object, pretty much any Botan +operation will fail, because it will be unable to do basic things like allocate +memory or get random bits. Note too, that you should be careful to only create +one such object. 
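To make the option-string format concrete, here is a small standalone sketch of how a space-separated string such as ``secure\_memory thread\_safe config=/path/to/configfile'' could be split into its parts. It is not Botan's parser: \type{ParsedOptions} and \function{parse\_options} are hypothetical names, and the rule that the first config= argument wins is an assumption (the text above only says that one of them is processed).

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical result type for illustration; not part of Botan.
struct ParsedOptions {
    std::vector<std::string> flags;  // e.g. "secure_memory", "thread_safe"
    std::string config_path;         // empty if no config= option was given
    bool has_config;                 // true once a config= option was seen
};

// Split a space-separated option string into plain flags and a config path.
// Assumption: if several config= options appear, only the first is kept.
ParsedOptions parse_options(const std::string& options) {
    ParsedOptions result;
    result.has_config = false;

    const std::string prefix = "config=";
    std::istringstream in(options);
    std::string token;
    while (in >> token) {  // operator>> skips any run of whitespace
        if (token.compare(0, prefix.size(), prefix) == 0) {
            if (!result.has_config) {
                result.config_path = token.substr(prefix.size());
                result.has_config = true;
            }
        } else {
            result.flags.push_back(token);
        }
    }
    return result;
}
```

With the real library, the same string would simply be handed to the \type{LibraryInitializer} constructor described above.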
+ +If you wish, you can use a function-based interface to initialize Botan. The +functions are called \function{initialize} and \function{deinitialize}, and are +in the \namespace{Init} namespace. In fact, the \type{LibraryInitializer} +implementation simply calls these functions. The \function{initialize} function +takes a \type{std::string}, just like \type{LibraryInitializer}'s constructor. +If you choose to use this interface, you should be very careful to make sure +that \function{deinitialize} is always called, even in the case of exceptions, +premature exit or abort, and so on. For this reason using +\type{LibraryInitializer} is preferred, but there are cases where using it is +impossible and an interface using plain functions is the only option. + +\pagebreak + +\section{Gotchas} + +There are a few things to watch out for to prevent problems when using Botan. + +First and primary of these is to \emph{never} allocate any kind of Botan object +globally. The problem is that the constructor for such an object will be called +before the \type{LibraryInitializer} is created, and the constructor will +undoubtedly try to call an object which has not been initialized. If you're +lucky your program will die with an uncaught exception. If you're less lucky, +it will crash from a memory access error. And if you're really unlucky it +\emph{won't} crash, and your program will be in an unknown (but very bad) +state. Use a pointer to an object instead, and initialize it after creating +your \type{LibraryInitializer}. + +The same rule applies for making sure the destructors of all your Botan objects +are called before the \type{LibraryInitializer} is destroyed. This implies you +can't have static variables that are Botan objects inside functions or classes +(since in most C++ runtimes, these objects will be destroyed after main has +returned). This is kind of inelegant, but rarely a real problem in practice. 
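The ordering rules above can be illustrated with a standalone toy model. Nothing in it is Botan code: \function{toy\_initialize}, \function{toy\_deinitialize}, \type{ToyInitializer}, and \type{ToyCipher} are hypothetical stand-ins, and the sketch uses C++11 features for brevity. It shows why an initializer object is safer than paired function calls (its destructor runs even during stack unwinding) and why dependent objects should be created through a pointer only after the initializer exists.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Stand-ins for a library's global setup and teardown (hypothetical).
static bool g_library_ready = false;
void toy_initialize()   { g_library_ready = true; }
void toy_deinitialize() { g_library_ready = false; }

// RAII guard: setup in the constructor, teardown in the destructor, so
// teardown still happens when an exception unwinds the stack.
class ToyInitializer {
public:
    ToyInitializer()  { toy_initialize(); }
    ~ToyInitializer() { toy_deinitialize(); }
    ToyInitializer(const ToyInitializer&) = delete;
    ToyInitializer& operator=(const ToyInitializer&) = delete;
};

// An object whose constructor depends on the library being set up.
class ToyCipher {
public:
    ToyCipher() {
        if (!g_library_ready)
            throw std::logic_error("library used before initialization");
    }
};

// Reports whether a ToyCipher can be constructed at this moment.
bool can_construct_cipher() {
    try {
        ToyCipher probe;
        (void)probe;
        return true;
    } catch (const std::logic_error&) {
        return false;
    }
}

// Correct ordering: create the guard first, then allocate dependent objects.
bool run_with_guard() {
    ToyInitializer init;
    std::unique_ptr<ToyCipher> cipher(new ToyCipher);  // created after init
    return cipher != nullptr;
}  // locals are destroyed in reverse order: cipher first, then init
```

A global \type{ToyCipher} would hit the failing branch, since its constructor would run before anything inside main.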
+ +Never create a Botan memory object (\type{MemoryVector}, \type{SecureVector}, +\type{SecureBuffer}) with a type that is not a basic integer (\type{byte}, +\type{u16bit}, \type{u32bit}, \type{u64bit}). More strongly, if you, as a user +of the library, are creating any memory buffer object that's not a +\type{SecureVector<byte>} or maybe a \type{MemoryVector<byte>}, you're probably +doing something wrong (I suppose there may be exceptions to this rule, but not +many). This is mostly a stylistic point, with an eye toward compatibility with +future versions. + +Don't include headers you don't have to. Past experience with Botan has shown +that headers get renamed fairly regularly as internal design changes are made, +but this need not affect you, if you follow the ``proper procedures''. Using +the lookup interface defined in \filename{lookup.h} and \filename{look\_pk.h} +will save you a great deal of pain in this regard, as it insulates you against +many such changes. + +Use a \function{try}/\function{catch} block inside your \function{main} +function, and catch any \type{std::exception} throws. This is not strictly +required, but if you don't, and Botan throws an exception, your application +will die mysteriously and (probably) without any error message. + +\pagebreak + +\section{The Basic Interface} + +Botan has two different interfaces. The one documented in this section is meant +more for implementing higher-level types (see the section on filters, later in +this manual) than for use by applications. Using it safely requires a solid +knowledge of encryption techniques and best practices, so unless you know, for +example, what CBC mode and nonces are, and why PKCS \#1 padding is important, +you should avoid this interface in favor of something working at a higher level +(such as the CMS interface). + +\subsection{Basic Algorithm Abilities} + +There are a small handful of functions implemented by most of Botan's +algorithm objects. 
Among these are: + +\noindent +\type{std::string} \function{name}(): + +Returns a human-readable string of the name of this algorithm. Examples of +names returned are ``Blowfish'' and ``HMAC(MD5)''. You can turn names back into +algorithm objects using the functions in \filename{lookup.h}. + +\noindent +\type{void} \function{clear}(): + +Clear out the algorithm's internal state. A block cipher object will ``forget'' +its key, a hash function will ``forget'' any data put into it, etc. Basically, +the object will look exactly as it did when you initially allocated it. + +\noindent +\function{clone}(): + +This function is central to Botan's name-based interface. The \function{clone} +has many different return types, such as \type{BlockCipher*} and +\type{HashFunction*}, depending on what kind of object it is called on. Note +that unlike Java's clone, this returns a new object in a ``pristine'' state; +that is, operations done on the initial object before calling \function{clone} +do not affect the initial state of the new clone. + +Cloned objects can (and should) be deallocated with the C++ \texttt{delete} +operator. + +\subsection{Keys and IVs} + +Both symmetric keys and initialization values can simply be considered byte (or +octet) strings. These are represented by the classes \type{SymmetricKey} and +\type{InitializationVector}, which are subclasses of \type{OctetString}. + +Since often it's hard to distinguish between a key and IV, many things (such as +key derivation mechanisms) return \type{OctetString} instead of +\type{SymmetricKey} to allow its use as a key or an IV. + +\noindent +\function{OctetString}(\type{u32bit} \arg{length}): + +This constructor creates a new random key of size \arg{length}. + +\noindent +\function{OctetString}(\type{std::string} \arg{str}): + +The argument \arg{str} is assumed to be a hex string; it is converted to binary +and stored. Whitespace is ignored. 
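As a rough picture of what the hex-string constructor has to do, the following standalone sketch decodes a hex string into bytes while skipping whitespace. It is not Botan's code: \function{hex\_to\_bytes} is a hypothetical helper, and throwing on invalid or odd-length input is an assumption made for the example.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Value of a single hex digit; throws for any other character.
static unsigned hex_value(char c) {
    if (c >= '0' && c <= '9') return static_cast<unsigned>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<unsigned>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<unsigned>(c - 'A' + 10);
    throw std::invalid_argument("not a hex digit");
}

// Decode a hex string to bytes, ignoring whitespace between digits.
// Assumption: an odd number of digits is treated as an error.
std::vector<unsigned char> hex_to_bytes(const std::string& str) {
    std::vector<unsigned char> out;
    bool have_high = false;  // true when the high nibble is already read
    unsigned high = 0;
    for (char c : str) {
        if (std::isspace(static_cast<unsigned char>(c)))
            continue;
        const unsigned v = hex_value(c);
        if (have_high) {
            out.push_back(static_cast<unsigned char>((high << 4) | v));
            have_high = false;
        } else {
            high = v;
            have_high = true;
        }
    }
    if (have_high)
        throw std::invalid_argument("odd number of hex digits");
    return out;
}
```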
+ +\noindent +\function{OctetString}(\type{const byte} \arg{input}[], \type{u32bit} +\arg{length}): + +This constructor simply copies its input. + +\subsection{Symmetrically Keyed Algorithms} + +Block ciphers, stream ciphers, and MACs all handle keys in pretty much the same +way. To make this similarity explicit, all algorithms of those types are +derived from the \type{SymmetricAlgorithm} base class. This type has three +functions: + +\noindent +\type{void} \function{set\_key}(\type{const byte} \arg{key}[], \type{u32bit} +\arg{length}): + +Most algorithms only accept keys of certain lengths. If you attempt to call +\function{set\_key} with a key length that is not supported, the exception +\type{Invalid\_Key\_Length} will be thrown. There is also another version of +\function{set\_key} that takes a \type{SymmetricKey} as an argument. + +\noindent +\type{bool} \function{valid\_keylength}(\type{u32bit} \arg{length}) const: + +This function returns true if a key of the given length will be accepted by +the cipher. + +There are also three constant data members of every \type{SymmetricAlgorithm} +object, which specify exactly what limits there are on keys which that object +can accept: + +MAXIMUM\_KEYLENGTH: The maximum length of a key. Usually, this is at most 32 +(256 bits), even if the algorithm actually supports more. In a few rare cases +larger keys will be supported. + +MINIMUM\_KEYLENGTH: The minimum length of a key. This is at least 1. + +KEYLENGTH\_MULTIPLE: The length of the key must be a multiple of this value. + +In all cases, \function{set\_key} must be called on an object before any data +processing (encryption, decryption, etc) is done by that object. If this is not +done, the results are undefined -- that is to say, Botan reserves the right in +this situation to do anything from printing a nasty, insulting message on the +screen to dumping core. 
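The three key-length constants suggest a simple validity rule. The sketch below is a standalone guess at that rule, not Botan's implementation: \type{KeyLimits} is a hypothetical struct standing in for the per-object constants, and the limits used in the usage note (16 to 32 bytes in steps of 8, as for AES) are illustrative values rather than numbers taken from the library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the three per-algorithm constants.
struct KeyLimits {
    std::size_t minimum;   // plays the role of MINIMUM_KEYLENGTH
    std::size_t maximum;   // plays the role of MAXIMUM_KEYLENGTH
    std::size_t multiple;  // plays the role of KEYLENGTH_MULTIPLE
};

// Assumed rule: the length lies within [minimum, maximum] and is a
// multiple of the given step.
bool valid_keylength(const KeyLimits& limits, std::size_t length) {
    assert(limits.multiple > 0);  // a zero step would divide by zero
    return length >= limits.minimum &&
           length <= limits.maximum &&
           length % limits.multiple == 0;
}
```

With limits of \{16, 32, 8\}, lengths 16, 24, and 32 pass while 20 and 40 are rejected. A caller can run such a check before \function{set\_key} instead of catching \type{Invalid\_Key\_Length}.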
+ +\subsection{Block Ciphers} + +Block ciphers implement the interface \type{BlockCipher}, found in +\filename{base.h}. + +\noindent +\type{void} \function{encrypt}(\type{const byte} \arg{in}[BLOCK\_SIZE], + \type{byte} \arg{out}[BLOCK\_SIZE]) const + +\noindent +\type{void} \function{encrypt}(\type{byte} \arg{block}[BLOCK\_SIZE]) const + +These functions apply the block cipher transformation to \arg{in} and place the +result in \arg{out}, or encrypts \arg{block} in place (\arg{in} may be the same +as \arg{out}). BLOCK\_SIZE is a constant member of each class, which specifies +how much data a block cipher can process at one time. Note that BLOCK\_SIZE is +not a static class member, like the old BLOCKSIZE was. + +\type{BlockCipher}s have similar functions \function{decrypt}, which perform +the inverse operation. + +Block ciphers implement the \type{SymmetricAlgorithm} interface. + +\subsection{Stream Ciphers} + +Stream ciphers are somewhat different from block ciphers, in that encrypting +data results in changing the internal state of the cipher. Also, you may +encrypt any length of data in one go (in byte amounts). + +\noindent +\type{void} \function{encrypt}(\type{const byte} \arg{in}[], \type{byte} +\arg{out}[], \type{u32bit} \arg{length}) + +\noindent +\type{void} \function{encrypt}(\type{byte} \arg{data}[], \type{u32bit} +\arg{length}): + +These functions encrypt the arbitrary length (well, less than 4 gigabyte long) +string \arg{in} and place it into \arg{out}, or encrypts it in place in +\arg{data}. The \function{decrypt} functions look just like +\function{encrypt}. + +Stream ciphers implement the \type{SymmetricAlgorithm} interface. + +Some stream ciphers support random access to any point in their cipher +stream. 
For such ciphers, calling \type{void} \function{seek}(\type{u32bit} +\arg{byte}) will change the cipher's state so that it is as if the cipher had been +keyed as normal, then encrypted \arg{byte} -- 1 bytes of data (so the next byte +in the cipher stream is byte number \arg{byte}). + +\subsection{Hash Functions / Message Authentication Codes} + +Hash functions consume their input without producing any output; the result +only becomes available once all input has been supplied. MACs are very similar, but are +additionally keyed. Both of these are derived from the base class +\type{BufferedComputation}, which has the following functions. + +\noindent +\type{void} \function{update}(\type{const byte} \arg{input}[], \type{u32bit} +\arg{length}) + +\noindent +\type{void} \function{update}(\type{byte} \arg{input}) + +\noindent +\type{void} \function{update}(\type{const std::string \&} \arg{input}) + +Updates the hash/MAC calculation with \arg{input}. + +\noindent +\type{void} \function{final}(\type{byte} \arg{out}[OUTPUT\_LENGTH]) + +\noindent +\type{SecureVector<byte>} \function{final}(): + +Complete the hash/MAC calculation and place the result into \arg{out}. +OUTPUT\_LENGTH is a public constant in each object that gives the length of the +hash in bytes. After you call \function{final}, the hash function is reset to +its initial state, so it may be reused immediately. + +The second method of using \function{final} is to call it with no arguments at all, as +shown in the second prototype. It will return the hash/MAC value in a memory +buffer, which will have size OUTPUT\_LENGTH. + +There is also a pair of functions called \function{process}. They are +essentially a combination of a single \function{update}, and \function{final}. +Both versions return the final value, rather than placing it in an array. Calling +\function{process} with a single byte value isn't available, mostly because it +would rarely be useful. 
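The \function{update} / \function{final} / \function{process} pattern can be demonstrated with a toy accumulator. The class below is a standalone sketch and not a Botan type: \type{ToyDigest} is a hypothetical name, and it computes the 32-bit FNV-1a checksum, which is \emph{not} a cryptographic hash and is used here only because it fits in a few lines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Toy accumulator using 32-bit FNV-1a. Not cryptographic; it only
// illustrates the update/final/process calling pattern.
class ToyDigest {
public:
    ToyDigest() : state_(kOffsetBasis) {}

    // Absorb more input; may be called any number of times.
    void update(const unsigned char* input, std::size_t length) {
        for (std::size_t i = 0; i < length; ++i) {
            state_ ^= input[i];
            state_ *= kPrime;
        }
    }

    void update(const std::string& input) {
        update(reinterpret_cast<const unsigned char*>(input.data()),
               input.size());
    }

    // Return the result and reset, so the object is immediately reusable.
    std::uint32_t final() {
        const std::uint32_t out = state_;
        state_ = kOffsetBasis;
        return out;
    }

    // One-shot helper: a single update followed by final.
    std::uint32_t process(const std::string& input) {
        update(input);
        return final();
    }

private:
    static const std::uint32_t kOffsetBasis = 2166136261u;
    static const std::uint32_t kPrime = 16777619u;
    std::uint32_t state_;
};
```

Feeding ``hello, '' and then ``world'' through \function{update} gives the same value as a single \function{process} call on the whole string, and the object can be reused right after \function{final}.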
+ +A MAC can be viewed (in most cases) as simply a keyed hash function, so classes +which are derived from \type{MessageAuthenticationCode} have \function{update} +and \function{final} functions just like a \type{HashFunction} (and like a +\type{HashFunction}, after \function{final} is called, it can be used to make a +new MAC right away; the key is kept around). + +A MAC has the \type{SymmetricAlgorithm} interface in addition to the +\type{BufferedComputation} interface. + +\pagebreak + +\section{Public Key Cryptography} + +Public key algorithms were added in Botan 0.8.0. The major base classes can be +found in \filename{pubkey.h}. + +\subsection{Creating PK Algorithm Key Objects} + +The library has interfaces for encryption, signatures, etc that do not require +knowing the exact algorithm in use (for example RSA and Rabin-Williams +signatures are handled by the exact same code path). + +One place where we \emph{do} need to know exactly what kind of algorithm is in +use is when we are creating a key (\emph{But}: read the section ``Importing and +Exporting PK Keys'', later in this manual). + +There are (currently) two kinds of public key algorithms in Botan: ones based +on integer factorization (RSA and Rabin-Williams), and ones based on the +discrete logarithm problem (DSA, Diffie-Hellman, Nyberg-Rueppel, and +ElGamal). Since discrete logarithm parameters (primes and generators) can be +shared among many keys, there is the notion of these being a combined type +(called \type{DL\_Group}). + +There are two ways to create a DL private key (such as +\type{DSA\_PrivateKey}). One is to pass in just a \type{DL\_Group} object -- a +new key will automatically be generated. The other involves passing in a group +to use, along with both the public and private values (private value first). + +Since in integer factorization algorithms, the modulus used isn't shared by +other keys, we don't use this notion. 
You can create a new key by passing in a +\type{u32bit} telling how long (in bits) the key should be, or you can copy a +pre-existing key by passing in the appropriate parameters (primes, exponents, +etc). For RSA and Rabin-Williams (the two IF schemes in Botan), the parameters +are all \type{BigInt}s: prime 1, prime 2, encryption exponent, decryption +exponent, modulus. The last two are optional, since they can easily be derived +from the first three. + +\subsubsection{Creating a DL\_Group} + +There are quite a few ways to get a \type{DL\_Group} object. The best is to use +the function \function{get\_dl\_group}, which takes a string naming a group; it +will either return that group, if it knows about it, or throw an +exception. Names it knows about include ``IETF-n'' where n is 768, 1024, 1536, +2048, 3072, or 4096, and ``DSA-n'', where n is 512, 768, or 1024. The IETF +groups are the ones specified for use with IPSec, and the DSA ones are the +default DSA parameters specified by Java's JCE. For DSA and Nyberg-Rueppel, you +should only use the ``DSA-n'' groups, while Diffie-Hellman and ElGamal can use +either type (keep in mind that some applications/standards require DH/ELG to +use DSA-style primes, while others require strong prime groups). + +You can also generate a new random group. This is not recommended, because it is +quite slow, especially for safe primes. + +You can register a new DL group with \function{add\_dl\_group} with a string +naming the group and the \type{DL\_Group}. Future lookups on that name will +return the group. There is no reason to register the group if you do decide to +use a distinct DL group for each key. + +\subsection{Key Checking} + +Most public key algorithms have limitations or restrictions on their +parameters. For example RSA requires an odd exponent, and algorithms based on +the discrete logarithm problem need a generator $> 1$. 
+ +Each low-level public key type has a function named \function{check\_key} which +takes a \type{bool}. This function returns a boolean value that declares +whether or not the key is valid (from an algorithmic standpoint). For example, +it will check to make sure that the prime parameters of a DSA key are, in fact, +prime. It does not have anything to do with the validity of the key for any +particular use, nor does it have anything to do with certificates which link a +key (which, after all, is just some numbers) with a user or other entity. If +\function{check\_key}'s argument is \type{true}, then it does ``strong'' +checking, which includes fairly expensive operations like primality checking. + +Keys are always checked when they are loaded or generated, so typically there +is no reason to use this function directly. However, you can disable or reduce +the checks for particular cases (public keys, loaded private keys, generated +private keys) by setting the right config toggle (see the section on the +configuration subsystem for details). + +\subsection{Getting a PK algorithm object} + +The key types, like \type{RSA\_PrivateKey}, do not implement any kind of +padding or encoding (which is generally necessary for security). To get an +object that does, the easiest thing to do is call the functions found in +\filename{look\_pk.h}. Generally these take a key, followed by a string that +specifies what hashing and encoding method(s) to use. Examples of such strings +are ``EME1(SHA-1)'' for OAEP encryption and ``EMSA4(SHA-1)'' for PSS signatures +(where the message is hashed using SHA-1). + +Here are some basic examples (using an RSA key) to give you a feel for the +possibilities. These examples assume \type{rsakey} is an +\type{RSA\_PrivateKey}, since otherwise we would not be able to create a +decryption or signature object with it (you can create encryption or signature +verification objects with public keys, naturally). 
Remember to delete these +objects when you're done with them. + +\begin{verbatim} + // PKCS #1 v2.0 / IEEE 1363 compatible encryption + PK_Encryptor* rsa_enc1 = get_pk_encryptor(rsakey, "EME1(RIPEMD-160)"); + // PKCS #1 v1.5 compatible encryption + PK_Encryptor* rsa_enc2 = get_pk_encryptor(rsakey, "PKCS1v15"); + + // Raw encryption: no padding, input is directly encrypted by the key + // Don't use this unless you know what you're doing + PK_Encryptor* rsa_enc3 = get_pk_encryptor(rsakey, "Raw"); + + // This object can decrypt things encrypted by rsa_enc1 + PK_Decryptor* rsa_dec1 = get_pk_decryptor(rsakey, "EME1(RIPEMD-160)"); + + // PKCS #1 v1.5 compatible signatures + PK_Signer* rsa_sig = get_pk_signer(rsakey, "EMSA3(MD5)"); + PK_Verifier* rsa_verify = get_pk_verifier(rsakey, "EMSA3(MD5)"); + + // PKCS #1 v2.1 compatible signatures + PK_Signer* rsa_sig2 = get_pk_signer(rsakey, "EMSA4(SHA-1)"); + PK_Verifier* rsa_verify2 = get_pk_verifier(rsakey, "EMSA4(SHA-1)"); + + // Hash input with SHA-1, but don't pad the input in any way; usually + // used with DSA/NR, not RSA + PK_Signer* rsa_sig3 = get_pk_signer(rsakey, "EMSA1(SHA-1)"); +\end{verbatim} + +\subsection{Encryption} + +The \type{PK\_Encryptor} and \type{PK\_Decryptor} classes are the interface for +encryption and decryption, respectively. + +Calling \function{encrypt} with a \type{byte} array and a length parameter will +return the input encrypted with whatever scheme is being used. Calling the +similar \function{decrypt} will perform the inverse operation. You can also do +these operations with \type{SecureVector<byte>}s. In all cases, the output is +returned via a \type{SecureVector<byte>}. + +If you attempt an operation with a larger size than the key can support (this +limit varies based on the algorithm, the key size, and the padding method used +(if any)), an exception will be thrown. Alternately, you can call +\function{maximum\_input\_size}, which will return the maximum size you can +safely encrypt. 
In fact, you can often encrypt an object that is one byte +longer, but only if enough of the high bits of the leading byte are set to +zero. Since this is pretty dicey, it's best to stick with the advertised +maximum. + +Available public key encryption algorithms in Botan are RSA and ElGamal. The +encoding methods are EME1, denoted by ``EME1(HASHNAME)'', PKCS \#1 v1.5, +called ``PKCS1v15'' or ``EME-PKCS1-v1\_5'', and raw encoding (``Raw''). + +For compatibility reasons, PKCS \#1 v1.5 is recommended for use with ElGamal +(most other implementations of ElGamal do not support any other encoding +format). RSA can also be used with PKCS \#1 encoding, but because of various +possible attacks, EME1 is the preferred encoding. EME1 requires the use of a +hash function: unless a competent applied cryptographer tells you otherwise, +you should use SHA-1. + +Don't use ``Raw'' encoding unless you need it for backward compatibility with +old protocols. There are many possible attacks against both ElGamal and RSA +when they are used this way. + +\subsection{Signatures} + +The signature algorithms look quite a bit like the hash functions. You can +repeatedly call \function{update}, giving more and more of a message you wish +to sign, and then call \function{signature}, which will return a signature for +that message. If you want to do it all in one shot, call +\function{sign\_message}, which will just call \function{update} with its +argument and then return whatever \function{signature} returns. + +You can validate a signature by updating the verifier class, and finally seeing +if the value returned from \function{check\_signature} is true (you pass +the supposed signature to the \function{check\_signature} function as a byte +array and a length or as a \type{MemoryRegion<byte>}). 
There is another +function, \function{verify\_message}, which takes a pair of byte array/length +pairs (or a pair of \type{MemoryRegion<byte>} objects), the first of which is +the message, the second being the (supposed) signature. It returns true if the +signature is valid and false otherwise. + +Available public key signature algorithms in Botan are RSA, DSA, +Nyberg-Rueppel, and Rabin-Williams. Signature encoding methods include EMSA1, +EMSA2, EMSA3, EMSA4, and Raw. All of them, except Raw, take a parameter naming +a message digest function to hash the message with. Raw actually signs the +input directly; if the message is too big, the signing operation will fail. Raw +is not useful except in very specialized applications. + +There are various interactions which make certain encoding schemes and signing +algorithms more or less useful. + +EMSA2 is the usual method for encoding Rabin-Williams signatures, so for +compatibility with other implementations you may have to use that. EMSA4 (also +called PSS) also works with Rabin-Williams. EMSA1 and EMSA3 do \emph{not} work +with Rabin-Williams. + +RSA can be used with any of the available encoding methods. EMSA4 is by far the +most secure, but is not (as of now) widely implemented. EMSA3 (also called +``EMSA-PKCS1-v1\_5'') is commonly used with RSA (for example in SSL). EMSA1 +signs the message digest directly, without any extra padding or encoding. This +may be useful, but is not as secure as either EMSA3 or EMSA4. EMSA2 may be used +but is not recommended. + +For DSA and Nyberg-Rueppel, you should use EMSA1. None of the other encoding +methods are particularly useful for these algorithms. + +\subsection{Key Agreement} + +You can get ahold of a \type{PK\_Key\_Agreement\_Scheme} object by calling +\function{get\_pk\_kas} with a key that is of a type that supports key +agreement (such as a Diffie-Hellman key stored in a \type{DH\_PrivateKey} +object), and the name of a key derivation function. 
This can be ``Raw'', +meaning the output of the primitive itself is returned as the key, or +``KDF1(hash)'' or ``KDF2(hash)'' where ``hash'' is any string you happen to +like (hopefully you like strings like ``SHA-1'' or ``RIPEMD-160''), or +``X9.42-PRF(keywrap)'', which uses the PRF specified in ANSI X9.42. It takes +the name or OID of the key wrap algorithm which will be used to encrypt a +content encryption key. + +How key agreement generally works is that you trade public values with some +other party, and then each of you runs a computation with the other's value and +your key (this should return the same result to both parties). This computation +can be called by using \function{derive\_key} with either a byte array/length +pair, or a \type{SecureVector<byte>} that holds the public value of the other +party. The last argument to either call is a number that specifies how long a +key you want. + +Depending on the key derivation function you're using, you may not +\emph{actually} get back a key of that size. In particular, ``Raw'' will return +a number about the size of the Diffie-Hellman modulus, and KDF1 can only return +a key which is the same size as the output of the hash. KDF2, on the other +hand, will always give you a key exactly as long as you request, regardless of +the underlying hash used with it. The key returned is a \type{SymmetricKey}, +ready to pass to a block cipher, MAC, or other symmetric algorithm. + +The public value which should be used can be obtained by calling +\function{public\_data}, which exists for any key that is associated with a +key agreement algorithm. It returns a \type{SecureVector<byte>}. + +``KDF2(SHA-1)'' is by far the preferred algorithm for key derivation in new +applications. The X9.42 algorithm may be useful in some circumstances, but +unless you need X9.42 compatibility, KDF2 is easier to use. + +There is a Diffie-Hellman example included in the distribution, which you may +want to examine. 
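
The exchange of public values described in this section can be illustrated
without any library code. What follows is only a sketch, and it is \emph{not}
Botan code: the prime 23, the generator 5, and the helper names
\function{mod\_pow}, \function{toy\_public\_value}, and
\function{toy\_shared\_secret} are assumptions made up for this illustration.
It shows nothing more than the fact that both parties end up with the same
number. A real application would use \function{get\_pk\_kas} and
\function{derive\_key} with a properly sized group and a key derivation
function such as ``KDF2(SHA-1)''.

\begin{verbatim}
#include <cassert>
#include <cstdint>
#include <iostream>

// Square-and-multiply modular exponentiation. The intermediate products
// cannot overflow here because the toy modulus is far below 2^32.
std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exp, std::uint64_t mod)
   {
   std::uint64_t result = 1 % mod;
   base %= mod;
   while(exp > 0)
      {
      if(exp & 1)
         result = (result * base) % mod;
      base = (base * base) % mod;
      exp >>= 1;
      }
   return result;
   }

// Toy group parameters, hopelessly small; for illustration only.
const std::uint64_t TOY_P = 23;
const std::uint64_t TOY_G = 5;

// The value each party sends to the other.
std::uint64_t toy_public_value(std::uint64_t priv)
   {
   return mod_pow(TOY_G, priv, TOY_P);
   }

// The computation each party runs with the other's public value.
std::uint64_t toy_shared_secret(std::uint64_t other_public, std::uint64_t priv)
   {
   return mod_pow(other_public, priv, TOY_P);
   }

int main()
   {
   const std::uint64_t alice_priv = 6, bob_priv = 15;

   const std::uint64_t alice_pub = toy_public_value(alice_priv);
   const std::uint64_t bob_pub = toy_public_value(bob_priv);

   const std::uint64_t alice_secret = toy_shared_secret(bob_pub, alice_priv);
   const std::uint64_t bob_secret = toy_shared_secret(alice_pub, bob_priv);

   assert(alice_secret == bob_secret);
   std::cout << "shared value: " << alice_secret << "\n";
   return 0;
   }
\end{verbatim}

In terms of the interface described above, the toy public value plays the role
of the result of \function{public\_data}, and the shared value corresponds to
what the ``Raw'' derivation would hand back before any KDF is applied.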
+ +\subsection{Importing and Exporting PK Keys} + +[This section mentions \type{Pipe} and \type{DataSource}, which are not covered +until later in the manual. Please read those sections for more about +\type{Pipe} and \type{DataSource} and their uses.] + +There are many, many different (often conflicting) standards surrounding public +key cryptography. There are, thankfully, only two major standards surrounding +the representation of a public or private key: X.509 (for public keys), and +PKCS \#8 (for private keys). Other crypto libraries, like OpenSSL and B-SAFE, +also support these formats, so you can easily exchange keys with software that +doesn't use Botan. + +In addition to ``plain'' public keys, Botan also supports X.509 certificates. +These are documented in the section ``Certificate Handling'', later in this +manual. + +\subsubsection{Public Keys} + +The interfaces for the two formats are quite similar. Let's look at the +X.509 stuff first: +\begin{verbatim} +namespace X509 { + void encode(const X509_PublicKey& key, Pipe& out, X509_Encoding enc = PEM); + std::string PEM_encode(const X509_PublicKey& out); + + X509_PublicKey* load_key(DataSource& in); + X509_PublicKey* load_key(const std::string& file); + X509_PublicKey* load_key(const SecureVector<byte>& buffer); +} +\end{verbatim} + +Basically, \function{X509::encode} will take an \type{X509\_PublicKey} (as of +now, that's any RSA, DSA, or Diffie-Hellman key) and encode it using +\arg{enc}, which can be either \type{PEM} or \type{RAW\_BER}. Using \type{PEM} +is \emph{highly} recommended for many reasons, including compatibility with +other software, for transmission over 8-bit unclean channels, because it can be +identified by a human without special tools, and because it sometimes allows +more sane behavior of tools that process the data. It will place the encoding +into \arg{out}. 
Remember that if you have just created the \type{Pipe} that you +are passing to \function{X509::encode}, you need to call \function{start\_msg} +first. Particularly with public keys, about 99\% of the time you just want to +PEM encode the key and then write it to a file or something. In this case, it's +probably easier to use \function{X509::PEM\_encode}. This function will simply +return the PEM encoding of the key as a \type{std::string}. + +For loading a public key, the preferred method is one of the variants of +\function{load\_key}. This function will return a newly allocated key based on +the data from whatever source it is using (assuming, of course, the source is +in fact storing a representation of a public key). The encoding used (PEM or +BER) need not be specified; the format will be detected automatically. The key +is allocated with \function{new}, and should be released with \function{delete} +when you are done with it. The first takes a generic \type{DataSource} which +you have to allocate~--~the others are simple wrapper functions that take +either a filename or a memory buffer. + +So what can you do with the return value of \function{load\_key}? On its own, a +\type{X509\_PublicKey} isn't particularly useful; you can't encrypt messages or +verify signatures, or much else. But, using \function{dynamic\_cast}, you can +figure out what kind of operations the key supports. Then, you can cast the key +to the appropriate type and pass it to a higher-level class. For example: + +\begin{verbatim} + /* Might be RSA, might be ElGamal, might be ... 
*/ + X509_PublicKey* key = X509::load_key("pubkey.asc"); + /* You MUST use dynamic_cast to convert, because of virtual bases */ + PK_Encrypting_Key* enc_key = dynamic_cast<PK_Encrypting_Key*>(key); + if(!enc_key) + throw Some_Exception(); + PK_Encryptor* enc = get_pk_encryptor(*enc_key, "EME1(SHA-1)"); + SecureVector<byte> cipher = enc->encrypt(some_message, size_of_message); +\end{verbatim} + +\pagebreak + +\subsubsection{Private Keys} + +There are two different options for private key import/export. The first is a +plaintext version of the private key. This is supported by the following +functions: + +\begin{verbatim} +namespace PKCS8 { + void encode(const PKCS8_PrivateKey& key, Pipe& to, X509_Encoding enc = PEM); + + std::string PEM_encode(const PKCS8_PrivateKey& key); +} +\end{verbatim} + +These functions are basically the same as the X.509 functions described +previously. The only difference is that they take a \type{PKCS8\_PrivateKey} +type (which, again, can be either RSA, DSA, or Diffie-Hellman, but this time +the key must be a private key). In most situations, using these is a bad idea, +because anyone can come along and grab the private key without having to know +any passwords or other secrets. Unless you have very particular security +requirements, always use the versions that encrypt the key based on a +passphrase. For importing, the same functions can be used for encrypted and +unencrypted keys. + +The other way to export a PKCS \#8 key is to first encode it in the same manner +as done above, then encrypt it (using a passphrase and the techniques of PKCS +\#5), and store the whole thing into another structure. This method is +definitely preferred, since otherwise the private key is unprotected. 
The +following functions support this technique: + +\begin{verbatim} +namespace PKCS8 { + void encrypt_key(const PKCS8_PrivateKey& key, Pipe& out, + std::string passphrase, std::string pbe = "", + X509_Encoding enc = PEM); + + std::string PEM_encode(const PKCS8_PrivateKey& key, std::string passphrase, + std::string pbe = ""); +} +\end{verbatim} + +To export an encrypted private key, call \function{PKCS8::encrypt\_key}. The +\arg{key}, \arg{out}, and \arg{enc} arguments are similar in usage to the ones +for \function{PKCS8::encode}. As you might notice, there are two new arguments +for \function{PKCS8::encrypt\_key}, however. The first is a passphrase (which +you presumably got from a user somehow). This will be used to encrypt the key. +The second new argument is \arg{pbe}; this specifies a particular password +based encryption (or PBE) algorithm. + +The \function{PEM\_encode} version shown here is similar to the one that +doesn't take a passphrase. Essentially it encrypts the key (using the default +PBE algorithm), and then returns a C++ string with the PEM encoding of the key. + +If \arg{pbe} is blank, then the default algorithm (controlled by the +``base/default\_pbe'' option) will be used. As shipped, this default is +``PBE-PKCS5v20(SHA-1,TripleDES/CBC)''. This is among the more secure options +of PKCS \#5, and is widely supported among implementations of PKCS \#5 v2.0. It +uses a 168 bit key, which should be more than +sufficient. If you need compatibility with systems that only support PKCS \#5 +v1.5, pass ``PBE-PKCS5v15(MD5,DES/CBC)'' as \arg{pbe}. However, be warned that +this PBE algorithm only has 56 bits of security against brute force attacks. As +of 1.4.5, all three keylengths of AES are also available as options, which can +be used by specifying a PBE algorithm of +``PBE-PKCS5v20(SHA-1,AES-256/CBC)'' (or ``AES-128'' or ``AES-192''). 
Support +for AES is slightly non-standard, and some applications or libraries might not +handle it. It is known that OpenSSL (0.9.7 and later) does handle AES for private +key encryption. + +There may be some strange programs out there that support the v2.0 extensions +to PBES1 but not PBES2; if you need to inter-operate with a program like that, +use ``PBE-PKCS5v15(MD5,RC2/CBC)''. For example, OpenSSL supports this format +(though since it also supports the v2.0 schemes, there is no reason not to just +use TripleDES or AES). This scheme uses a 64 bit key, which, while +significantly better than a 56 bit key, is a bit too small for comfort. + +Last but not least, there are some functions which are basically identical to +\function{X509::load\_key}, which will load, and possibly decrypt, a PKCS \#8 +private key: + +\begin{verbatim} +namespace PKCS8 { + PKCS8_PrivateKey* load_key(DataSource& in, const User_Interface& ui); + PKCS8_PrivateKey* load_key(DataSource& in, std::string passphrase = ""); + + PKCS8_PrivateKey* load_key(const std::string& filename, + const User_Interface& ui); + PKCS8_PrivateKey* load_key(const std::string& filename, + const std::string& passphrase = ""); +} +\end{verbatim} + +The versions that take \type{std::string} \arg{passphrase}s are primarily for +compatibility, but they are useful in limited circumstances. The +\type{User\_Interface} versions are how \function{load\_key} is actually +implemented, and provide for much more flexibility. Essentially, if the +passphrase given to the function is not correct, then an exception is thrown +and that is that. However, if you pass in a UI object instead, then the UI +object can keep asking the user for the passphrase until they get it right (or +until they cancel the action, through the UI interface). A +\type{User\_Interface} has very little to do with talking to users; it's just a +way to glue together Botan and whatever user interface you happen to be +using. 
You can think of it as a user interface interface. The default +\type{User\_Interface} is actually very dumb, and effectively acts just like +the versions taking the \type{std::string}. + +After loading a key, you can use \function{dynamic\_cast} to find out what +operations it supports, and use it appropriately. Remember to \function{delete} +it once you are done with it. + +\subsubsection{Limitations} + +As of now Nyberg-Rueppel and Rabin-Williams keys cannot be imported or +exported, because they have no official ASN.1 OID or definition. ElGamal keys +can (as of Botan 1.3.8) be imported and exported, but the only other +implementation which supports the format is Peter Gutmann's Cryptlib. If you +can help it, stick to RSA and DSA. + +\emph{Note}: Currently NR and RW are given basic ASN.1 key formats (which +mirror DSA and RSA, respectively), which means that, if they are assigned an +OID, they can be imported and exported just as easily as RSA and DSA. You can +assign them an OID by putting a line in a Botan configuration file, calling +\function{OIDS::add\_oid}, or editing \filename{src/policy.cpp}. Be warned that +it is possible that a future version will use a format which is different from +the current one (\ie, a newly standardized format). + +\pagebreak + +\section{Filters and Pipes} + +\subsection{Basic Filter Usage} + +Up until this point, using Botan would be very tedious; to do anything you +would have to bother with putting data into arrays, doing whatever you want +with it, and then sending it someplace. The filter metaphor (defining a series +of operations which take some amount of input, process it, then send it along +to the next filter) works very well in this situation. If you've ever used a +Unix system, the usage of filters in Botan should be very intuitive (and even +if you haven't, don't worry, it's pretty easy). 
For instance, here is how you +encrypt a file with AES in CBC mode with PKCS\#7 padding, then encode it with +Base64 and send it to standard output (we assume that \verb|file| is an open +\type{istream}): + +\begin{verbatim} + SymmetricKey key(32); + InitializationVector iv(16); // or use: block_size_of("AES") + Pipe encryptor(get_cipher("AES/CBC/PKCS7", key, iv, ENCRYPTION), + new Base64_Encoder); + encryptor.start_msg(); + file >> encryptor; + encryptor.end_msg(); // flush buffers, complete computations + std::cout << encryptor; +\end{verbatim} + +\type{Pipe} works in conjunction with the \type{Filter} class (for example, the +\type{CBC\_Encryption} and \type{Base64\_Encoder} types used above are +\type{Filter}s), but you never have to deal with them directly; \type{Pipe} +handles all the required housekeeping. \type{Pipe} is fully documented in the +section titled ``The Pipe API'', which appears later in this section. + +A useful ability of \type{Pipe} is to split up the work up into what are called +``messages''. Messages are blocks of data that are processed in an identical +fashion (\ie, with the same sequence of \type{Filter}s). Messages are delimited +by the \function{start\_msg} and \function{end\_msg} functions, as shown +above. There are two different ways to make use of messages. One is to send +several messages through a \type{Pipe} without changing the \type{Pipe}'s +configuration, so you end up with a sequence of messages; one use of this would +be to send a sequence of identically encrypted UDP packets, for example (note +that the \emph{data} need not be identical; it is just that each is encrypted, +encoded, signed, etc in an identical fashion). Another is to change the filters +that are used in the \type{Pipe} between each message, by adding or removing +\type{Filter}s; functions that let you do this are documented in the Pipe API +section. 
Pipe's full interface definition can be found in \filename{pipe.h}. + +\subsubsection{Fork} + +It's fairly common that you might receive some data and want to perform more +than one operation on it (\eg, encrypt it with DES and calculate the MD5 hash +of the plaintext at the same time). That's where \type{Fork} comes +in. \type{Fork} is a filter that takes input and passes it on to \emph{one or +more} \type{Filter}s which are attached to it. \type{Fork} changes the nature +of the pipe system completely. Instead of being a linked list, it becomes a +tree. + +Before messages were added to Botan, using \type{Fork} was significantly more +complicated, requiring you to keep pointers to \type{Fork} objects you +allocated and sending control information to them when you wanted to read your +output. Now, however, things are much simpler. Each \type{Filter} in the fork +is given its own output buffer, and thus its own message. For example, if you +have previously written two messages into a \type{Pipe}, then you start a new +one with a \type{Fork} which has three paths of \type{Filter}s inside it, you +add three new messages to the \type{Pipe}. The data you put into the +\type{Pipe} is duplicated and sent into each set of \type{Filter}s, and the +eventual output is placed into a dedicated message slot in the \type{Pipe}. + +Messages in the \type{Pipe} are allocated in a depth-first manner. This is only +interesting if you are using more than one \type{Fork} in a single \type{Pipe}. 
+As an example, consider the following: + +\begin{verbatim} + Pipe pipe(new Fork( + new Fork( + new Base64_Encoder, + new Fork( + NULL, + new Base64_Encoder + ) + ), + new Hex_Encoder + ) + ); +\end{verbatim} + +In this case, message 0 will be the output of the first \type{Base64\_Encoder}, +message 1 will be a copy of the input (see below for how \type{Fork} interprets +NULL pointers), message 2 will be the output of the second +\type{Base64\_Encoder}, and message 3 will be the output of the +\type{Hex\_Encoder}. As you can see, this results in message numbers being +allocated in a top to bottom fashion, when looked at on the screen. However, +note that there could be potential for bugs if this is not anticipated. For +example, if your code is passed a \type{Filter}, and you assume it is a +``normal'' one which only uses one message, your message offsets would be +wrong, leading to some confusion during output. + +An alternate method (which is \emph{not} used) would be to give the first +message to the first \type{Base64\_Encoder}, the second to the +\type{Hex\_Encoder}, and then the last two messages to the two \type{Filter}s +in the innermost \type{Fork}. + +The \filename{hasher} and \filename{hasher2} examples show two different ways +of using \type{Pipe} and \type{Fork}. + +There is a very useful trick that you can do with \type{Fork}. Let's say you +had some data that had been encrypted with a block cipher, and then hex +encoded. In addition, a hex encoded MAC of the plaintext had been calculated +and included with the message. You not only want to decrypt the data, you want +to verify the MAC. So the first two filters in the pipe will decode the hex, +and decrypt the raw ciphertext. But now, how are you going to both a) get the +plaintext, and b) calculate the MAC of the plaintext? This is actually very +simple, if a bit obscure. + +What you have to do is, after the filters that do the initial decoding, create +a \type{Fork}. 
For the first argument, pass a null pointer. The fork object +will understand that this means that you don't want to do any more processing +on that line of the fork; you just want the data that was placed in. And then +in the second argument you would pass in a \type{MAC\_Filter} so you could +compute a MAC of the plaintext. An alternative is to define a simple +passthrough/null \type{Filter}, which just calls \function{send} whenever +\arg{write} is called. This is (in the author's opinion) pointless, but there +is nothing stopping you from doing so if desired. + +For an example of this technique, look at the \filename{rsa\_dec} example in +\filename{doc/examples/}. + +Any \type{Filter}s which are attached to the \type{Pipe} after the \type{Fork} +are implicitly attached onto the first branch created by the fork. For example, +let's say you created this \type{Pipe}: + +\begin{verbatim} +Pipe pipe(new Fork(new Hash_Filter("MD5"), new Hash_Filter("SHA-1")), + new Hex_Encoder); +\end{verbatim} + +And then called \function{start\_msg}, inserted some data, then +\function{end\_msg}. Then \arg{pipe} would contain two messages. The first one +(message number 0) would contain the MD5 sum of the input in hex encoded form, +and the other would contain the SHA-1 sum of the input in raw binary. + +\subsubsection{Chain} + +\type{Chain} is about as simple as it gets. \type{Chain} creates a chain of +\type{Filter}s and encapsulates them inside a single filter (itself). This is +primarily useful for passing a sequence of filters into something which is +expecting only a single \type{Filter} (most notably, \type{Fork}). You can call +\type{Chain}'s constructor with up to 4 \type{Filter*}s (they will be added in +order), or with an array of \type{Filter*}s and a \type{u32bit} which tells +\type{Chain} how many \type{Filter*}s are in the array (again, they will be +attached in order). See the section ``A Filter Example'' for an example of +using \type{Chain}. 
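
The depth-first numbering of messages described for nested \type{Fork}s earlier
in this section can be modeled in a few lines of standalone C++. This is a toy
sketch under stated assumptions, not Botan's implementation: the \type{Node}
structure and the helpers \function{leaf}, \function{fork},
\function{number\_messages}, and \function{example\_numbering} are invented
for the illustration. A node without branches stands for an endpoint that
produces one message (a single \type{Filter}, a whole \type{Chain}, or a null
pointer meaning ``copy of the input''), while a node with branches stands for
a \type{Fork}.

\begin{verbatim}
#include <cassert>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Toy stand-in for a filter tree (hypothetical, not a Botan type).
struct Node
   {
   std::string label;
   std::vector<Node> branches;
   };

Node leaf(const std::string& label) { return Node{label, {}}; }
Node fork(const std::vector<Node>& branches) { return Node{"Fork", branches}; }

// Depth-first walk: endpoints receive message numbers in visiting order.
void number_messages(const Node& node, std::vector<std::string>& messages)
   {
   if(node.branches.empty())
      {
      messages.push_back(node.label);
      return;
      }
   for(const Node& branch : node.branches)
      number_messages(branch, messages);
   }

// Mirrors the nested Fork example from the Fork section.
std::vector<std::string> example_numbering()
   {
   const Node tree =
      fork({ fork({ leaf("Base64 #1"),
                    fork({ leaf("copy of input"), leaf("Base64 #2") }) }),
             leaf("Hex") });

   std::vector<std::string> messages;
   number_messages(tree, messages);
   return messages;
   }

int main()
   {
   const std::vector<std::string> messages = example_numbering();
   assert(messages.size() == 4);
   for(std::size_t i = 0; i != messages.size(); ++i)
      std::cout << "message " << i << ": " << messages[i] << "\n";
   return 0;
   }
\end{verbatim}

The order of the entries in the returned vector matches the message numbers
given in the text for that example: the first \type{Base64\_Encoder}, the copy
of the input, the second \type{Base64\_Encoder}, and finally the
\type{Hex\_Encoder}.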
+ +\subsubsection{Data Sources} + +A \type{DataSource} is a simple abstraction for a thing that stores bytes. This +type is used fairly heavily in the areas of the API related to ASN.1 +encoding/decoding. The following types are \type{DataSource}s: \type{Pipe}, +\type{SecureQueue}, and a couple of special purpose ones: +\type{DataSource\_Memory} and \type{DataSource\_Stream}. + +You can create a \type{DataSource\_Memory} with an array of bytes and a length +field. The object will make a copy of the data, so you don't have to worry +about keeping that memory allocated. This is mostly for internal use, but if it +comes in handy, feel free to use it. + +A \type{DataSource\_Stream} is probably more useful than the memory based +one. Its constructors take either a \type{std::istream} or a +\type{std::string}. If it's a stream, the data source will use the +\type{istream} to satisfy read requests (this is particularly useful to use +with \type{std::cin}). If the string version is used, it will attempt to open +up a file with that name and read from it. + +\subsubsection{Data Sinks} + +A \type{DataSink} (in \filename{data\_snk.h}) is a \type{Filter} which takes +arbitrary amounts of input, and produces no output. Generally, this means it's +doing something with the data outside the realm of what +\type{Filter}/\type{Pipe} can handle, for example, writing it to a file (which +is what the \type{DataSink\_Stream} does). There is no need for +\type{DataSink}s which write to a \type{std::string} or memory buffer, because +\type{Pipe} can handle that by itself. + +Here's a quick example of using a \type{DataSink}, which encrypts +\filename{in.txt} and sends the output to \filename{out.txt}. There is +no explicit output operation; the writing of \filename{out.txt} is +implicit. 
+ +\begin{verbatim} + DataSource_Stream in("in.txt"); + Pipe pipe(new CBC_Encryption("Blowfish", "PKCS7", key, iv), + new DataSink_Stream("out.txt")); + pipe.process_msg(in); +\end{verbatim} + +A real advantage of this is that even if ``in.txt'' is large (say, 1 +gigabyte), only as much memory as is needed for internal I/O buffers will actually +be used. A naive use of \type{Pipe} would, in that case, use up about 1 +gigabyte of memory, by storing the full encrypted version of the file in +memory, and then writing it all out at once. + +\subsection{The Pipe API} + +Using \type{Pipe} is supposed to be pretty easy (especially in the common, +simple cases). The usage is generally as follows: Initialize a \type{Pipe} with +the filters you want to use, write some data into it, and then read some +processed data out. + +\subsubsection{Initializing Pipe} + +By default, \type{Pipe} will do nothing at all; any input placed into the +\type{Pipe} will be read back unchanged. Obviously, this has limited utility, +and presumably you want to use one or more \type{Filter}s to somehow process +the data. First, you can choose a set of \type{Filter}s to initialize the +\type{Pipe} with via the constructor. Namely, you can pass it either a set of +up to 4 \type{Filter*}s, or a pre-defined array and a length: + +\begin{verbatim} + Pipe pipe1(new Filter1(/*args*/), new Filter2(/*args*/), + new Filter3(/*args*/), new Filter4(/*args*/)); + Pipe pipe2(new Filter1(/*args*/), new Filter2(/*args*/)); + + Filter* filters[5] = { + new Filter1(/*args*/), new Filter2(/*args*/), new Filter3(/*args*/), + new Filter4(/*args*/), new Filter5(/*args*/) /* more if desired... */ + }; + Pipe pipe3(filters, 5); +\end{verbatim} + +This is by far the most common way to initialize a \type{Pipe}. 
However, +occasionally a more flexible initialization strategy is necessary; this is +supported by 4 member functions: \function{prepend}(\type{Filter*}), +\function{append}(\type{Filter*}), \function{pop}(), and \function{reset}(). +These functions may only be used while the \type{Pipe} in question is not in +use; that is, either before calling \function{start\_msg}, or after +\function{end\_msg} has been called (and no new calls to \function{start\_msg} +have been made yet). + +The function \function{reset}() simply removes all the \type{Filter}s which the +\type{Pipe} is currently using~--~it is reset to an initial, ``empty'' +state. Any data which is being retained by the \type{Pipe} is retained after a +\function{reset}(), and \function{reset}() does not affect the message numbers +(discussed later). + +Calling \function{prepend} and \function{append} will either prepend or append +the passed \type{Filter} object to the list of transformations. For example, if +you \function{prepend} a \type{Filter} implementing encryption, and the +\type{Pipe} already had a \type{Filter} which hex encoded the input, then the +next set of input would be first encrypted, then hex encoded. Alternately, if +you called \function{append}, then the input would first be hex encoded, and +then encrypted (which is not terribly useful in this particular example). + +Finally, calling \function{pop}() will remove the first transformation of the +\type{Pipe}. Say we had called \function{prepend} to put an encryption +\type{Filter} into a \type{Pipe}; calling \function{pop}() would remove this +\type{Filter} and return the \type{Pipe} to its state before we called +\function{prepend}. + +\subsubsection{Giving Data to a Pipe} + +Input to a \type{Pipe} is delimited into messages, which can be read from +independently (\ie, you can read 5 bytes from one message, and then all of +another message, without either read affecting any other messages). 
The +messages are delimited by calls to \function{start\_msg} and +\function{end\_msg}. In between these two calls, you can write data into a +\type{Pipe}, and it will be processed by the \type{Filter}(s) that it +contains. Writes at any other time are invalid, and will result in an +exception. + +As to writing, you can call any of the functions called \function{write}(), +which can take any of: a \type{byte[]}/\type{u32bit} pair, a +\type{SecureVector<byte>}, a \type{std::string}, a \type{DataSource\&}, or a +single \type{byte}. + +Sometimes, you may want to do only a single write per message. In this case, +you can use the \function{process\_msg} series of functions, which start a +message, write their argument into the \type{Pipe}, and then end the +message. In this case you would not make any explicit calls to +\function{start\_msg}/\function{end\_msg}. The version of \function{write} +which takes a single \type{byte} is not supported by \function{process\_msg}, +but all the other variants are. + +\type{Pipe} can also be used with the \verb|>>| operator, and will accept a +\type{std::istream} or (on Unix systems with the \verb|fd_unix| module) a +Unix file descriptor. In either case, the entire contents of the file will be +read into the \type{Pipe}. + +\subsubsection{Getting Output from a Pipe} + +Retrieving the processed data from a \type{Pipe} is a bit more complicated, for +various reasons. In particular, because \type{Pipe} will separate each message +into a separate buffer, you have to be able to retrieve data from each message +independently. Each of \type{Pipe}'s read functions has a final parameter which +specifies what message to read from (as a 32-bit integer). If this parameter is +set to \type{Pipe::DEFAULT\_MESSAGE}, it will read the current default message +(\type{DEFAULT\_MESSAGE} is also the default value of this parameter). 
The +parameter will not be mentioned in further discussion of the reading API, but +it is always there (unless otherwise noted). + +Reading is done with a variety of functions. The most basic are \type{u32bit} +\function{read}(\type{byte} \arg{out}[], \type{u32bit} \arg{len}) and +\type{u32bit} \function{read}(\type{byte\&} \arg{out}). Each reads into +\arg{out} (either up to \arg{len} bytes, or a single byte for the one taking a +\type{byte\&}), and returns the total number of bytes read. There is a variant +of these functions, all named \function{peek}, which performs the same +operations, but does not remove the bytes from the message (reading is a +destructive operation with a \type{Pipe}). + +There are also the functions \type{SecureVector<byte>} \function{read\_all}(), +and \type{std::string} \function{read\_all\_as\_string}(), which return the +entire contents of the message, either as a memory buffer, or a +\type{std::string} (which is generally only useful if the \type{Pipe} has +encoded the message into a text string, such as when a \type{Base64\_Encoder} +is used). + +To determine how many bytes are left in a message, call \type{u32bit} +\function{remaining}() (which can also take an optional message +number). Finally, there are some functions for managing the default message +number: \type{u32bit} \function{default\_msg}() will return the current default +message, \type{u32bit} \function{message\_count}() will return the total number +of messages (0...\function{message\_count}()-1), and +\function{set\_default\_msg}(\type{u32bit} \arg{msgno}) will set a new default +message number (which must be a valid message number for that \type{Pipe}). The +ability to set the default message number is particularly important in the case +of using the file output operations (\verb|<<| with a \type{std::ostream} or +Unix file descriptor), because there is no way to specify it explicitly when +using the output operator. 
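
The difference between \function{read} and \function{peek}, and the fact that
each message has its own buffer, can be shown with a small standalone model.
This is a sketch and \emph{not} how \type{Pipe} is implemented: the class
\type{ToyOutput} and its \function{add\_message} function are hypothetical
names invented here, and its \function{read} and \function{peek} return
strings rather than filling a byte array as the real functions do.

\begin{verbatim}
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <iostream>
#include <string>
#include <vector>

// Toy model of per-message output buffers (hypothetical, not a Botan type).
class ToyOutput
   {
   public:
      // Store a finished message and return its message number.
      std::size_t add_message(const std::string& data)
         {
         buffers.push_back(std::deque<char>(data.begin(), data.end()));
         return buffers.size() - 1;
         }

      // Copy up to len bytes of message msg without consuming them.
      std::string peek(std::size_t len, std::size_t msg) const
         {
         const std::deque<char>& buf = buffers.at(msg);
         const std::size_t n = std::min(len, buf.size());
         return std::string(buf.begin(), buf.begin() + n);
         }

      // Copy up to len bytes of message msg and remove them from it.
      std::string read(std::size_t len, std::size_t msg)
         {
         const std::string out = peek(len, msg);
         std::deque<char>& buf = buffers.at(msg);
         buf.erase(buf.begin(), buf.begin() + out.size());
         return out;
         }

      std::size_t remaining(std::size_t msg) const
         {
         return buffers.at(msg).size();
         }

      std::size_t message_count() const { return buffers.size(); }
   private:
      std::vector<std::deque<char> > buffers;
   };

int main()
   {
   ToyOutput out;
   const std::size_t first = out.add_message("hello world");
   const std::size_t second = out.add_message("second");

   assert(out.peek(5, first) == "hello");  // peek leaves the data in place
   assert(out.remaining(first) == 11);
   assert(out.read(5, first) == "hello");  // read consumes what it returns
   assert(out.remaining(first) == 6);
   assert(out.remaining(second) == 6);     // other messages are unaffected

   std::cout << out.message_count() << " messages held\n";
   return 0;
   }
\end{verbatim}

The model always takes an explicit message number; the real interface
additionally lets you omit it and fall back on the default message, as
described above.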
+ +\pagebreak + +\subsection{A Filter Example} + +Here is some code which takes one or more filenames in \arg{argv} and +calculates the result of several hash functions for each file. The complete +program can be found as \filename{hasher.cpp} in the Botan distribution. For +brevity, most error checking has been removed. + +\begin{verbatim} + string name[3] = { "MD5", "SHA-1", "RIPEMD-160" }; + Botan::Filter* hash[3] = { + new Botan::Chain(new Botan::Hash_Filter(name[0]), + new Botan::Hex_Encoder), + new Botan::Chain(new Botan::Hash_Filter(name[1]), + new Botan::Hex_Encoder), + new Botan::Chain(new Botan::Hash_Filter(name[2]), + new Botan::Hex_Encoder) }; + + Botan::Pipe pipe(new Botan::Fork(hash, 3)); + + for(u32bit j = 1; argv[j] != 0; j++) + { + ifstream file(argv[j]); + pipe.start_msg(); + file >> pipe; + pipe.end_msg(); + file.close(); + for(u32bit k = 0; k != 3; k++) + { + pipe.set_default_msg(3*(j-1)+k); + cout << name[k] << "(" << argv[j] << ") = " << pipe << endl; + } + } +\end{verbatim} + +\pagebreak + +\subsection{Rolling Your Own} + +Well, now that you know how filters work in Botan, you might want to write +your own. Lucky for you, all of the hard work is done by the \type{Filter} base +class, leaving you to handle the details of what your filter is supposed to +do. Remember that if you get confused about any of this, you can always look at +the implementation of Botan's filters to see exactly how everything works. + +There are basically only four functions that a filter need worry about: + +\noindent +\type{void} \function{write}(\type{byte} \arg{input}[], \type{u32bit} +\arg{length}): + +The \function{write} function is what is called when a filter receives input +for it to process. The filter is \emph{not} required to process it right away; +many filters buffer their input before producing any output. A filter will +usually have \function{write} called many times during its lifetime. 
+ +\noindent +\type{void} \function{send}(\type{byte} \arg{output}[], \type{u32bit} +\arg{length}): + +Eventually, a filter will want to produce some output to send along to the next +filter in the pipeline. It does so by calling \function{send} with whatever it +wants to send along to the next filter. There is also a version of +\function{send} taking a single byte argument, as a convenience. + +\noindent +\type{void} \function{start\_msg()}: + +This function is optional. Implement it if your \type{Filter} would like to do +some processing or setup at the start of each message (for an example, see the +Zlib compression module). + +\noindent +\type{void} \function{end\_msg()}: + +Implementing the \function{end\_msg} function is optional. It is called when it +has been requested that filters finish up their computations. Note that they +must \emph{not} deallocate their resources; this should be done by their +destructor. They should simply finish up with whatever computation they have +been working on (for example, a compressing filter would flush the compressor +and \function{send} the final block), and empty any buffers in preparation for +processing a fresh new set of input. It is essentially the inverse of +\function{start\_msg}. + +Additionally, if necessary, filters can define a constructor that takes any +needed arguments, and a destructor to deal with deallocating memory, closing +files, etc. + +There is also a \type{BufferingFilter} class (in \filename{buf\_filt.h}) which +will take a message and split it up into an initial block which can be of any +size (including zero), a sequence of fixed sized blocks of any non-zero size, +and a (possibly zero-sized) final block. This might make a useful base class +for your filters, depending on what you have in mind. + +\pagebreak + +\subsection{Filter Catalog} + +This section contains descriptions of every \type{Filter} included in Botan. 
+Note that modules which provide \type{Filter}s are documented elsewhere -- +these \type{Filter}s are available on any installation of Botan. + +\subsubsection{Keyed Filters} + +A few sections ago, it was mentioned that \type{Pipe} can process multiple +messages, treating each of them exactly the same. Well, that was a bit of a +lie. There are some algorithms (in particular, block ciphers not in ECB mode, +and all stream ciphers) that change their state as data is put through them. + +Naturally, you might well want to reset the keys or (in the case of block +cipher modes) IVs used by such filters, so multiple messages can be processed +using completely different keys, or new IVs, or new keys and IVs, or whatever. +And in fact, even for a MAC or an ECB block cipher, you might well want to +change the key used from message to message. + +Enter \type{Keyed\_Filter}. It's a base class of any filter that is keyed: +block cipher modes, stream ciphers, MACs, whatever. It has two functions, +\function{set\_key} and \function{set\_iv}. Calling \function{set\_key} will, +naturally, set (or reset) the key used by the algorithm. Setting the IV only +makes sense in certain algorithms -- a call to \function{set\_iv} on an object +that doesn't support IVs will be ignored. You \emph{must} call +\function{set\_key} before calling \function{set\_iv}: while not all +\type{Keyed\_Filter} objects require this, you should assume it is required +anytime you are using a \type{Keyed\_Filter}. 
+ +Here's an example: + +\begin{verbatim} + Keyed_Filter *cast, *hmac; + Pipe pipe(new Base64_Decoder, + // Note the assignments to the cast and hmac variables + cast = new CBC_Decryption("CAST-128", "PKCS7", cast_key, iv), + new Fork( + 0, // Read the section 'Fork' to understand this + new Chain( + hmac = new MAC_Filter("HMAC(SHA-1)", mac_key, 12), + new Base64_Encoder + ) + ) + ); + pipe.start_msg(); + [use pipe for a while, decrypt some stuff, derive new keys and IVs] + pipe.end_msg(); + + cast->set_key(cast_key2); + cast->set_iv(iv2); + hmac->set_key(mac_key2); + + pipe.start_msg(); + [use pipe for some other things] + pipe.end_msg(); +\end{verbatim} + +There are some requirements for using \type{Keyed\_Filter} which you must +follow. If you call \function{set\_key} or \function{set\_iv} on a filter which +is owned by a \type{Pipe}, you must do so while the \type{Pipe} is +``unlocked''. This refers to the times when no messages are being processed by +\type{Pipe} -- either before \type{Pipe}'s \function{start\_msg} is called, or +after \function{end\_msg} is called (and no new call to \function{start\_msg} +has happened yet). Doing otherwise will result in undefined behavior, probably +silently getting invalid output. + +And remember: if you're resetting both values, reset the key \emph{first}. + +\pagebreak + +\subsubsection{Cipher Filters} + +Getting ahold of a \type{Filter} implementing a cipher is very easy. Simply +make sure you're including the header \filename{lookup.h}, and call +\function{get\_cipher}. Generally you will pass the return value directly into +a \type{Pipe}. 
There are actually a couple of different functions, which do pretty +much the same thing: + +\function{get\_cipher}(\type{std::string} \arg{cipher\_spec}, + \type{SymmetricKey} \arg{key}, + \type{InitializationVector} \arg{iv}, + \type{Cipher\_Dir} \arg{dir}); + +\function{get\_cipher}(\type{std::string} \arg{cipher\_spec}, + \type{SymmetricKey} \arg{key}, + \type{Cipher\_Dir} \arg{dir}); + +The version that doesn't take an IV is useful for things that don't use them, +like block ciphers in ECB mode, or most stream ciphers. If you specify a +\arg{cipher\_spec} that does want an IV, and you use the version that doesn't +take one, an exception will be thrown. The \arg{dir} argument can be either +\type{ENCRYPTION} or \type{DECRYPTION}. In a few cases, like most (but not all) +stream ciphers, these are equivalent, but even then it provides a way of +showing the ``intent'' of the operation to readers of your code. + +The \arg{cipher\_spec} is a string that specifies what cipher is to be +used. The general syntax for \arg{cipher\_spec} is ``STREAM\_CIPHER'', +``BLOCK\_CIPHER/MODE'', or ``BLOCK\_CIPHER/MODE/PADDING''. In the case of +stream ciphers, no mode is necessary, so just the name is sufficient. A block +cipher requires a mode of some sort, which can be ``ECB'', ``CBC'', ``CFB(n)'', +``OFB'', ``CTR-BE'', or ``EAX(n)''. The argument to CFB mode is how many bits +of feedback should be used. If you just use ``CFB'' with no argument, it will +default to using a feedback equal to the block size of the cipher. EAX mode +also takes an optional bit argument, which tells EAX how large a tag size to +use~--~generally this is the block size of the cipher, which is the +default if you don't specify any argument. + +In the case of the ECB and CBC modes, a padding method can also be +specified. If it is not supplied, ECB defaults to not padding, and CBC defaults +to using PKCS \#5/\#7 compatible padding. 
The padding methods currently +available are ``NoPadding'', ``PKCS7'', ``OneAndZeros'', and ``CTS''. CTS +padding is currently only available for CBC mode, but the others can also be +used in ECB mode. + +Some example \arg{cipher\_spec} arguments are: ``DES/CFB(32)'', +``TripleDES/OFB'', ``Blowfish/CBC/CTS'', ``SAFER-SK(10)/CBC/OneAndZeros'', +``AES/EAX'', ``ARC4''. + +``CTR-BE'' refers to counter mode where the counter is incremented as if it +were a big-endian encoded integer. This is compatible with most other +implementations, but it is possible some will use the incompatible +little-endian convention. This version would be denoted as ``CTR-LE'' if it were +supported. + +``EAX'' is a new cipher mode designed by Wagner, Rogaway, and Bellare. It is an +authenticated cipher mode (that is, no separate authentication is needed), has +provable security, and is free from patent entanglements. It runs about half as +fast as most of the other cipher modes (like CBC, OFB, or CTR), which is not +bad considering you don't need to use an authentication code. + +\subsubsection{Hashes and MACs} + +Hash functions and MACs don't need anything special when it comes to +filters. Both just take their input and produce no output until +\function{end\_msg()} is called, at which time they complete the hash or MAC +and send that as output. + +These \type{Filter}s take a string naming the type to be used. If for some +reason you name something that doesn't exist, an exception will be thrown. + +\noindent +\function{Hash\_Filter}(\type{std::string} \arg{hash}, + \type{u32bit} \arg{outlength}): + +This type hashes its input with \arg{hash}. When \function{end\_msg} is called +on the owning \type{Pipe}, the hash is completed and the digest is sent on to +the next thing in the pipe. The argument \arg{outlength} specifies how much of +the output of the hash will be passed along to the next filter when +\function{end\_msg} is called. By default, it will pass the entire hash. 
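The behaviour just described (input is absorbed by \function{write}, nothing comes out until \function{end\_msg}, and at most \arg{outlength} bytes are passed along) can be pictured with a small standalone toy. This is only a sketch and does not use Botan: the class \type{ToyHashFilter}, its \function{last\_output} accessor (standing in for \function{send}), and the FNV-1a checksum used in place of a real hash function are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef unsigned char byte;

// FNV-1a 64-bit parameters; a non-cryptographic checksum used here only as
// a stand-in for a real hash function.
const std::uint64_t FNV_OFFSET = 0xcbf29ce484222325ULL;
const std::uint64_t FNV_PRIME = 0x100000001b3ULL;

// Toy model of a hash-style filter (NOT Botan's Hash_Filter): input is
// absorbed by write(), and nothing is produced until end_msg().
class ToyHashFilter
   {
   public:
      // outlength: how many digest bytes to pass along (capped at 8 here)
      explicit ToyHashFilter(std::size_t outlength = 8) :
         outlen(outlength < 8 ? outlength : 8), state(FNV_OFFSET) {}

      // may be called many times per message; produces no output
      void write(const byte input[], std::size_t length)
         {
         assert(input != 0 || length == 0);
         for(std::size_t j = 0; j != length; j++)
            {
            state ^= input[j];
            state *= FNV_PRIME;
            }
         }

      // finish the message: emit the truncated digest and reset
      void end_msg()
         {
         output.clear();
         // most significant byte first, truncated to outlen bytes
         for(std::size_t j = 0; j != outlen; j++)
            output.push_back(static_cast<byte>(state >> (8 * (7 - j))));
         state = FNV_OFFSET; // ready for a fresh message
         }

      // stands in for send(): the bytes a real filter would pass along
      const std::vector<byte>& last_output() const { return output; }

   private:
      std::size_t outlen;
      std::uint64_t state;
      std::vector<byte> output;
   };
```

Constructing \type{ToyHashFilter}(4), calling \function{write} a few times and then \function{end\_msg} leaves four bytes in \function{last\_output}; because the state is reset at the end of each message, feeding the same input again gives the same four bytes.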
+ +Examples of names for \function{Hash\_Filter} are ``SHA-1'' and ``Whirlpool''. + +\noindent +\function{MAC\_Filter}(\type{std::string} \arg{mac}, + \type{const SymmetricKey\&} \arg{key}, + \type{u32bit} \arg{outlength}): + +The constructor for a \type{MAC\_Filter} takes a key, used in calculating the +MAC, and a length parameter, which has semantics exactly the same as the one +passed to \type{Hash\_Filter}'s constructor. + +Examples for \arg{mac} are ``HMAC(SHA-1)'', ``MD5-MAC'', and the exceptionally +long, strange, and probably useless name +``CMAC(Lion(Tiger(20,3),MARK-4,1024))''. + +\subsubsection{PK Filters} + +There are four classes in this category, \type{PK\_Encryptor\_Filter}, +\type{PK\_Decryptor\_Filter}, \type{PK\_Signer\_Filter}, and +\type{PK\_Verifier\_Filter}. Each takes a pointer to an object of the +appropriate type (\type{PK\_Encryptor}, \type{PK\_Decryptor}, etc) which is +deleted by the destructor. These classes are found in \filename{pk\_filts.h}. + +Three of these, for encryption, decryption, and signing, are pretty much +identical conceptually. Each of them buffers its input until the end of the +message is marked with a call to the \function{end\_msg} function. Then they +encrypt, decrypt, or sign their input and send the output (the ciphertext, the +plaintext, or the signature) into the next filter. + +Signature verification works a little differently, because it needs to know +what the signature is in order to check it. You can either pass this in along +with the constructor, or call the function \function{set\_signature} -- with +this second method, you need to keep a pointer to the filter around so you can +send it this command. In either case, after \function{end\_msg} is called, it +will try to verify the signature (if the signature has not been set by either +method, an exception will be thrown here). 
It will then send a single byte onto +the next filter -- a 1 or a 0, which specifies whether the signature verified +or not (respectively). + +For more information about PK algorithms (including creating the appropriate +objects to pass to the constructors), read the section ``Public Key +Cryptography'' in this manual. + +\subsubsection{Encoders} + +Often you want your data to be in some form of text (for sending over channels +which aren't 8-bit clean, printing it, etc). The filters \type{Hex\_Encoder} +and \type{Base64\_Encoder} will convert arbitrary binary data into hex or +base64 formats. Not surprisingly, you can use \type{Hex\_Decoder} and +\type{Base64\_Decoder} to convert it back into its original form. + +Both of the encoders can take a few options about how the data should be +formatted (all of which have defaults). The first is a \type{bool} which simply +says if the encoder should insert line breaks. This defaults to +false. Line breaks don't matter either way to the decoder, but they make the +output a bit more appealing to the human eye, and a few transport mechanisms +(notably some email systems) limit the maximum line length. + +The second encoder option is an integer specifying how long such lines will be +(obviously this will be ignored if line-breaking isn't being used). The default +tends to be in the range of 60-80 characters, but is not specified exactly. If +you want a specific value, set it. Otherwise the default should be fine. + +Lastly, \type{Hex\_Encoder} takes an argument of type \type{Case}, which can be +\type{Uppercase} or \type{Lowercase} (default is \type{Uppercase}). This +specifies what case the characters A-F should be output as. The base64 encoder +has no such option, because it uses both upper and lower case letters for its +output. + +The decoders both take a single option, which tells the object how it should +behave in the case of invalid input. 
The enum (called \type{Decoder\_Checking}) +can take on any of three values: \type{NONE}, \type{IGNORE\_WS}, and +\type{FULL\_CHECK}. With \type{NONE} (the default, for compatibility with +previous releases), invalid input (for example, a ``z'' character in supposedly +hex input) will simply be ignored. With \type{IGNORE\_WS}, whitespace will be +ignored by the decoder, but receiving other non-valid data will raise an +exception. Finally, \type{FULL\_CHECK} will raise an exception for \emph{any} +characters not in the encoded character set, including whitespace. + +You can find the declarations for these types in \filename{hex.h} and +\filename{base64.h}. + +\pagebreak + +\section{Certificate Handling} + +A certificate is essentially a binding between some identifying information of +a person or other entity (called a \emph{subject}) and a public key. This +binding is asserted by a signature on the certificate, which is placed there by +some authority (the \emph{issuer}) which at least claims that it knows the +subject that is named in the certificate really ``owns'' the private key +corresponding to the public key in the certificate. + +The major certificate format in use today is X.509v3, designed by ISO and +further hacked on by dozens (hundreds?) of other organizations. + +When working with certificates, the main class to remember is +\type{X509\_Certificate}. You can read an object of this type, but you can't +create one on the fly; a CA object is necessary for actually making a new +certificate. So for the most part, you only have to worry about reading them +in, verifying the signatures, and getting the bits of data in them (most +commonly the public key, and the information about the user of that key). An +X.509v3 certificate can contain a literally infinite number of items related to +all kinds of things. Botan doesn't support a lot of them, simply because nobody +uses them and they're an impossible mess to work with. 
This section only +documents the most commonly used of the ones that are supported; for the +rest, read \filename{x509cert.h} and \filename{asn1\_obj.h} (which has the +definitions of various common ASN.1 constructs used in X.509). + +\subsection{So what's in an X.509 certificate?} + +Obviously, you want to be able to get the public key. This is achieved by +calling the member function \function{subject\_public\_key}, which will return +an \type{X509\_PublicKey*}. As to what to do with this, read about +\function{load\_key} in the section ``Importing and Exporting PK Keys''. In the +general case, this could be any kind of public key, though 99\% of the time it +will be an RSA key. However, Diffie-Hellman and DSA keys are also supported, so +be careful about how you treat this. It is also a wise idea to examine the +value returned by \function{constraints}, to see what uses the public key is +approved for. + +The second major piece of information you'll want is the name/email/etc of the +person to whom this certificate is assigned. Here is where things get a little +nasty. X.509v3 has two (well, mostly just two $\ldots$) different places where +you can stick information about the user: the \emph{subject} field, and in an +extension called \emph{subjectAlternativeName}. The \emph{subject} field is +supposed to only include the following information: country, organization +(possibly), an organizational sub-unit name (possibly), and a so-called common +name. The common name is usually the name of the person, or it could be a title +associated with a position of some sort in the organization. It may also +include fields for state/province and locality. What exactly a locality is, +nobody knows, but it's usually given as a city name. + +Botan doesn't currently support any of the Unicode variants used in ASN.1 +(UTF-8, UCS-2, and UCS-4), any of which could be used for the fields in the +DN. 
This could be problematic, particularly in Asia and other areas where +non-ASCII characters are needed for most names. The UTF-8 and UCS-2 string +types \emph{are} accepted (in fact, UTF-8 is used when encoding much of the +time), but if any of the characters included in the string are not in ISO +8859-1 (\ie 0 \ldots 255), an exception will get thrown. Currently the +\type{ASN1\_String} type holds its data as ISO 8859-1 internally (regardless +of local character set); this would have to be changed to hold UCS-2 or UCS-4 +in order to support Unicode (also, many interfaces in the X.509 code would have +to accept or return a \type{std::wstring} instead of a \type{std::string}). + +Like the distinguished names, subject alternative names can contain a lot of +things that Botan will flat out ignore (most of which you would never actually +want to use). However, there are three very useful pieces of information which +this extension might hold: an email address (``[email protected]''), a DNS name +(``somehost.site2.com''), or a URI (``http://www.site3.com''). + +So, how to get the information? Simply call \function{subject\_info} with the +name of the piece of information you want, and it will return a +\type{std::string} which is either empty (signifying that the certificate +doesn't have this information), or has the information requested. There are +several names for each possible item, but the most easily readable ones are: +``Name'', ``Country'', ``Organization'', ``Organizational Unit'', ``Locality'', +``State'', ``RFC822'', ``URI'', and ``DNS''. These values are returned as a +\type{std::string}. + +You can also get information about the issuer of the certificate in the same +way, using \function{issuer\_info}. + +\subsubsection{X.509v3 Extensions} + +X.509v3 specifies a large number of possible extensions. Botan supports some, +but by no means all of them. 
This section lists which ones are supported, and +notes areas where there may be problems with the handling. You have to be +pretty familiar with X.509 in order to understand what this is talking about. + +\begin{list}{$\cdot$} + \item Key Usage and Extended Key Usage: No problems known. + + \item Basic Constraints: No problems known. The default for a v1/v2 + certificate is to assume it's a CA if and only if the option + ``x509/default\_to\_ca'' is set. A v3 certificate is marked as a CA if + (and only if) the basic constraints extension is present and set for a + CA cert. + + \item Subject Alternative Names: Only the ``rfc822Name'', ``dNSName'', and + ``uniformResourceIdentifier'' fields will be stored; all others are + ignored. + + \item Issuer Alternative Names: Same restrictions as the Subject Alternative + Names extension. New certificates generated by Botan never include the + issuer alternative name. + + \item Authority Key Identifier: Only the version using KeyIdentifier is + supported. If the GeneralNames version is used and the extension is + critical, an exception is thrown. If both the KeyIdentifier and + GeneralNames versions are present, then the KeyIdentifier will be + used, and the GeneralNames ignored. + + \item Subject Key Identifier: No problems known. +\end{list} + +\subsubsection{Revocation Lists} + +It will occasionally happen that a certificate must be revoked before its +expiration date. Examples of this happening include the private key being +compromised, or the user to which it has been assigned leaving an +organization. Certificate revocation lists are an answer to this problem +(though online certificate validation techniques are starting to become +somewhat more popular). Essentially, every once in a while the CA will release +a CRL, listing all certificates which have been revoked. Also included are +various pieces of information like what time a particular certificate was +revoked, and for what reason. 
In most systems, it is wise to support some form +of certificate revocation, and CRLs handle this fairly easily. + +For most users, processing a CRL is quite easy. All you have to do is call the +constructor, which will take a filename (or a \type{DataSource\&}). The CRLs +can either be in raw BER/DER, or in PEM format; the constructor will figure out +which format without any extra information. For example: + +\begin{verbatim} + X509_CRL crl1("crl1.der"); + + DataSource_Stream in("crl2.pem"); + X509_CRL crl2(in); +\end{verbatim} + +After that, pass the \type{X509\_CRL} object to a \type{X509\_Store} object +with \type{X509\_Code} \function{add\_crl}(\type{X509\_CRL}), and all future +verifications will take into account the certificates listed, assuming +\function{add\_crl} returns \type{VERIFIED}. If it doesn't return +\type{VERIFIED}, then the return value is an error code signifying that the CRL +could not be processed due to some problem (which could range from the issuing +certificate not being found, to the CRL having some format problem). For more +about the \type{X509\_Store} API, read the section later in this chapter. + +\pagebreak + +\subsection{Reading Certificates} + +\type{X509\_Certificate} has two constructors, each of which takes a source of +data: one takes a filename to read, the other a \type{DataSource\&}. + +\subsection{Storing and Using Certificates} + +If you read a certificate, you probably want to verify the signature on +it. However, consider that to do so, we may have to verify the signature on the +certificate that we used to verify the first certificate, and on and on until +we hit the top of the certificate tree somewhere. It would be a mighty huge pain +to have to handle all of that manually in every application, so there is +something that does it for you: \type{X509\_Store}. + +This is a pretty easy thing to use. The basic operations are: put certificates +and CRLs into it, search for certificates, and attempt to verify +certificates. 
That's about it. In the future, there will be support for online +retrieval of certificates and CRLs (\eg with the HTTP cert-store interface +currently under consideration by PKIX). + +\subsubsection{Adding Certificates} + +You can add new certificates to a certificate store using any of these +functions: + +\function{add\_cert}(\type{const X509\_Certificate\&} \arg{cert}, + \type{bool} \arg{trusted} \type{= false}) + +\function{add\_certs}(\type{DataSource\&} \arg{source}) + +\function{add\_trusted\_certs}(\type{DataSource\&} \arg{source}) + +The versions that take a \type{DataSource\&} will add all of the certificates +that they can find in that source. + +All of them add the cert(s) to the store. The 'trusted' certificates are the +ones which you have some reason to trust are genuine. For example, say your +application is working with certificates which are owned by employees of some +company, and all of their certificates are signed by the company CA, whose +certificate is in turn signed by a commercial root CA. What you would then do +is include the certificate of the commercial CA with your application, and read +it in as a trusted certificate. From there, you could verify the company CA's +certificate, and then use that to verify the end user's certificates. Only +self-signed certificates may be considered trusted. + +\subsubsection{Adding CRLs} + +\type{X509\_Code} \function{add\_crl}(\type{const X509\_CRL\&} \arg{crl}); + +This will process the CRL and mark the revoked certificates. This will also +work if a revoked certificate is added to the store sometime after the CRL is +processed. The function can return an error code (listed later), or will return +\type{VERIFIED} if everything completed successfully. + +\subsubsection{Storing Certificates} + +You can output a set of certificates by calling \function{PEM\_encode}, which +will return a \type{std::string} containing each of the certificates in the +store, PEM encoded and concatenated. 
This simple format can easily be read by +both Botan and other libraries/applications. + +\pagebreak + +\subsubsection{Searching for Certificates} + +You can find certificates in the store with a series of functions contained +in the \namespace{X509\_Store\_Search} namespace: + +\begin{verbatim} +namespace X509_Store_Search { +std::vector<X509_Certificate> by_email(const X509_Store& store, + const std::string& email_addr); +std::vector<X509_Certificate> by_name(const X509_Store& store, + const std::string& name); +std::vector<X509_Certificate> by_dns(const X509_Store&, + const std::string& dns_name); +} +\end{verbatim} + +These functions will return a (possibly empty) vector of certificates from +\arg{store} matching your search criteria. The email address and DNS name +searches are case-insensitive but are sensitive to extra whitespace and so +on. The name search will do case-insensitive substring matching, so, for +example, calling \function{X509\_Store\_Search::by\_name}(\arg{your\_store}, +``dob'') will return certificates for ``J.R. 'Bob' Dobbs'' and +``H. Dobbertin'', assuming both of those certificates are in \arg{your\_store}. + +You could then display the results to a user, and allow them to select the +appropriate one. Searching using an email address as the key is usually more +effective than the name, since email addresses are rarely shared. + +\subsubsection{Certificate Stores} + +An object of type \type{Certificate\_Store} is a generalized interface to an +external source for certificates (and CRLs). Examples of such a store would be +one that looked up the certificates in a SQL database, or by contacting a CGI +script running on an HTTP server. There are currently three mechanisms for +looking up a certificate, and one for retrieving CRLs. By default, most of +these mechanisms will simply return an empty \type{std::vector} of +\type{X509\_Certificate}. 
This storage mechanism is \emph{only} queried when +doing certificate validation: it allows you to distribute only the root key +with an application, and let some online method handle getting all the other +certificates that are needed to validate an end entity certificate. In +particular, the search routines will not attempt to access the external +database. + +The three certificate lookup methods are \function{by\_SKID} (Subject Key +Identifier), \function{by\_name} (the CommonName DN entry), and +\function{by\_email} (stored in either the distinguished name, or in a +subjectAlternativeName extension). The name and email versions take a +\type{std::string}, while the SKID version takes a \type{SecureVector<byte>} +containing the subject key identifier in raw binary. You can choose not to +implement \function{by\_name} or \function{by\_email}, but \function{by\_SKID} +is mandatory to implement, and, currently, is the only version which is used by +\type{X509\_Store}. + +Finally, there is a method for finding CRLs, called \function{get\_crls\_for}, +which takes an \type{X509\_Certificate} object, and returns a +\type{std::vector} of \type{X509\_CRL}. While generally there will be only one +CRL, the use of the vector makes it easy to return no CRLs (\eg, if the +certificate store doesn't support retrieving them), or return multiple ones +(for example, if the certificate store can't determine precisely which key was +used to sign the certificate). Implementing the function is optional, and by +default will return no CRLs. If it is available, it will be used by +\type{X509\_CRL}. + +As for actually using such a store, you have to tell \type{X509\_Store} about +it, by calling the \type{X509\_Store} member function + +\function{add\_new\_certstore}(\type{Certificate\_Store}* \arg{new\_store}) + +The argument, \arg{new\_store}, will be deleted by \type{X509\_Store}'s +destructor, so make sure to allocate it with \function{new}. 
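As a rough picture of the shape such a store takes (one mandatory lookup by subject key identifier, optional lookups whose defaults find nothing), here is a standalone toy. It is a sketch only and does not use Botan's headers: \type{ToyCert}, \type{ToyCertStore} and \type{InMemoryStore} are hypothetical names, and \type{std::vector<byte>} stands in for \type{SecureVector<byte>}.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef unsigned char byte;

// Hypothetical stand-in for X509_Certificate; only the fields the lookups use
struct ToyCert
   {
   std::string name;
   std::string email;
   std::vector<byte> skid; // subject key identifier, raw binary
   };

// Toy interface modelled on the description of Certificate_Store above
class ToyCertStore
   {
   public:
      // the one lookup that every store must implement
      virtual std::vector<ToyCert>
         by_SKID(const std::vector<byte>& skid) const = 0;

      // optional lookups: the defaults simply find nothing
      virtual std::vector<ToyCert> by_name(const std::string&) const
         { return std::vector<ToyCert>(); }
      virtual std::vector<ToyCert> by_email(const std::string&) const
         { return std::vector<ToyCert>(); }

      virtual ~ToyCertStore() {}
   };

// A store backed by an in-memory list, overriding only by_SKID
class InMemoryStore : public ToyCertStore
   {
   public:
      void add(const ToyCert& cert) { certs.push_back(cert); }

      std::vector<ToyCert> by_SKID(const std::vector<byte>& skid) const
         {
         std::vector<ToyCert> found;
         for(std::size_t j = 0; j != certs.size(); j++)
            if(certs[j].skid == skid)
               found.push_back(certs[j]);
         return found;
         }
   private:
      std::vector<ToyCert> certs;
   };
```

A store written this way answers \function{by\_SKID} queries from its own data, while \function{by\_name} and \function{by\_email} fall back to the empty defaults; a real implementation would instead query a database or a network service.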
+ +\pagebreak + +\subsubsection{Verifying Certificates} + +There is a single function in \type{X509\_Store} related to verifying a +certificate: + +\type{X509\_Code} +\function{validate\_cert}(\type{const X509\_Certificate\&} \arg{cert}, + \type{Cert\_Usage} \arg{usage} = \type{ANY}) + +To sum things up simply, it returns \type{VERIFIED} if the certificate can +safely be considered valid for the usage(s) described by \arg{usage}, and an +error code if it is not. Naturally, things are a bit more complicated than +that. The enum \type{Cert\_Usage} is defined inside the \type{X509\_Store} +class; it (currently) can take on any of the values \type{ANY} (any usage is +OK), \type{TLS\_SERVER} (for SSL/TLS server authentication), \type{TLS\_CLIENT} +(for SSL/TLS client authentication), \type{CODE\_SIGNING}, +\type{EMAIL\_PROTECTION} (email encryption, usually this means S/MIME), +\type{TIME\_STAMPING} (in theory any time stamp application, usually IETF +PKIX's Time Stamp Protocol), or \type{CRL\_SIGNING}. Note that Microsoft's code +signing system, certainly the most widely used, uses a completely different +(and basically undocumented) method for marking certificates for code signing. + +First, how does it know if a certificate is valid? Basically, a certificate is +valid if both of the following hold: a) the signature in the certificate can be +verified using the public key in the issuer's certificate, and b) the issuer's +certificate is a valid CA certificate. Note that this definition is +recursive. We get out of this by ``bottoming out'' when we reach a certificate +that we consider trusted. In general this will either be a commercial root CA, +or an organization or application specific CA. + +There are actually a few other restrictions (validity periods, key usage +restrictions, etc), but the above summarizes the major points of the validation +algorithm. 
In theory, Botan implements the certificate path validation +algorithm given in RFC 2459, but in practice it does not (yet), because we +don't support the X.509v3 policy or name constraint extensions. + +Possible values for \arg{usage} are \type{TLS\_SERVER}, \type{TLS\_CLIENT}, +\type{CODE\_SIGNING}, \type{EMAIL\_PROTECTION}, \type{CRL\_SIGNING}, +\type{TIME\_STAMPING}, and \type{ANY}. The default \type{ANY} does not mean +valid for any use; it means ``is valid for some usage''. This is generally +fine, and in fact requiring that a random certificate support a particular +usage will likely result in a lot of failures, unless your application is very +careful to always issue certificates with the proper extensions, and you never +use certificates generated by other apps. + +Return values for \function{validate\_cert} (and \function{add\_crl}) include: + +\begin{list}{$\cdot$} + \item VERIFIED: The certificate is valid for the specified use. + \item INVALID\_USAGE: The certificate cannot be used for the specified use. + + \item CANNOT\_ESTABLISH\_TRUST: The root certificate was not marked as + trusted. + \item CERT\_CHAIN\_TOO\_LONG: The certificate chain exceeded the length + allowed by a basicConstraints extension. + \item SIGNATURE\_ERROR: An invalid signature was found. + \item POLICY\_ERROR: Some problem with the certificate policies was found. + + \item CERT\_FORMAT\_ERROR: Some format problem was found in a certificate. + \item CERT\_ISSUER\_NOT\_FOUND: The issuer of a certificate could not be + found. + \item CERT\_NOT\_YET\_VALID: The certificate is not yet valid. + \item CERT\_HAS\_EXPIRED: The certificate has expired. + \item CERT\_IS\_REVOKED: The certificate has been revoked. + + \item CRL\_FORMAT\_ERROR: Some format problem was found in a CRL. + \item CRL\_ISSUER\_NOT\_FOUND: The issuer of a CRL could not be found. + \item CRL\_NOT\_YET\_VALID: The CRL is not yet valid. + \item CRL\_HAS\_EXPIRED: The CRL has expired. 
+ + \item CA\_CERT\_CANNOT\_SIGN: The CA certificate found does not + contain a public key that allows signature verification. + \item CA\_CERT\_NOT\_FOR\_CERT\_ISSUER: The CA cert found is not allowed to + issue certificates. + \item CA\_CERT\_NOT\_FOR\_CRL\_ISSUER: The CA cert found is not allowed to + issue CRLs. + + \item UNKNOWN\_X509\_ERROR: Some other error occurred. + +\end{list} + +\subsection{Certificate Authorities} + +Setting up a CA for X.509 certificates is actually probably the easiest thing +to do related to X.509. A CA is represented by the type \type{X509\_CA}, which +can be found in \filename{x509\_ca.h}. A CA always needs its own certificate, +which can either be a self-signed certificate (see below on how to create one) +or one issued by another CA (see the section on PKCS \#10 requests). Creating +a CA object is done by the following constructor: + +\begin{verbatim} + X509_CA(const X509_Certificate& cert, const PKCS8_PrivateKey& key); +\end{verbatim} + +The private key is the private key corresponding to the public key in the +CA's certificate. + +Generally, requests for new certificates are supplied to a CA in the form of +PKCS \#10 certificate requests (called a \type{PKCS10\_Request} object in +Botan). These are decoded in a similar manner to +certificates/CRLs/etc. Generally, a request is vetted by humans (who somehow +verify that the name in the request corresponds to the name of the person who +requested it), and then signed by a CA key, generating a new certificate. + +\begin{verbatim} + X509_Certificate sign_request(const PKCS10_Request&) const; +\end{verbatim} + +\subsubsection{Generating CRLs} + +As mentioned previously, the ability to process CRLs is highly important in +many PKI systems. In fact, according to strict X.509 rules, you must not +validate any certificate if the appropriate CRLs are not available (though +hardly any systems are that strict). In any case, a CA should have a valid CRL +available at all times. 
+ +Of course, you might be wondering what to do if no certificates have been +revoked. In fact, CRLs can be issued without any actually revoked certificates; +the list of certs will simply be empty. To generate a new, empty CRL, just +call \type{X509\_CRL} +\function{X509\_CA::new\_crl}(\type{u32bit}~\arg{seconds}~=~0). +If \arg{seconds} is the default 0, then the normal +default CRL next update time (the value of the ``x509/crl/next\_update'' option) will +be used. If not, then \arg{seconds} specifies how long (in seconds) it will be +until the CRL's next update time (after this time, most clients will reject the +CRL as too old). + +On the other hand, you may have issued a CRL before. In that case, you will +want to issue a new CRL which contains all previously revoked +certificates along with any new ones. This is done by calling the +\type{X509\_CA} member function +\function{update\_crl}(\type{X509\_CRL}~\arg{old\_crl}, +\type{std::vector<CRL\_Entry>}~\arg{new\_revoked}, +\type{u32bit}~\arg{seconds}~=~0), where \arg{old\_crl} is the last CRL this +CA issued, and \arg{new\_revoked} is a list of any newly revoked certificates. +The function returns a new \type{X509\_CRL} to make available to clients. The +semantics of the \arg{seconds} argument are the same as for \function{new\_crl}. + +The \type{CRL\_Entry} type is a structure which contains, at a minimum, the +serial number of the revoked certificate. As serial numbers are never repeated, +the pairing of an issuer and a serial number should uniquely identify any +certificate. In this case, we represent the serial number as a +\type{SecureVector<byte>} called \arg{serial}. There are two additional +(optional) values: an enumeration called \type{CRL\_Code} which specifies the +reason for revocation (\arg{reason}), and an object which represents the time +that the certificate became invalid (if this information is known).
+ +If you wish to remove an old entry from the CRL, insert a new entry for the +same cert, with a \arg{reason} code of \type{DELETE\_CRL\_ENTRY}. For example, +if a revoked certificate has expired 'normally', there is no reason to continue +to explicitly revoke it, since clients will reject the cert as expired in any +case. + +\pagebreak + +\subsubsection{Self-Signed Certificates} + +Generating a new self-signed certificate can often be useful, for example when +setting up a new root CA, or for use in email applications. In this case, +the solution is summed up simply as: + +\begin{verbatim} +namespace X509 { + X509_Certificate create_self_signed_cert(const X509_Cert_Options& opts, + const PKCS8_PrivateKey& key); +} +\end{verbatim} + +Here \arg{key} is the private key you wish to use (the public key, +used in the certificate itself, is extracted from the private key), and +\arg{opts} is a structure which has various bits of information which will be +used in creating the certificate (this structure, and its use, is discussed +below). This function is found in the header \filename{x509self.h}. There is an +example of using this function in the \filename{self\_sig} example. + +\subsubsection{Creating PKCS \#10 Requests} + +Also in \filename{x509self.h}, there is a function for generating new PKCS \#10 +certificate requests. + +\begin{verbatim} +namespace X509 { + PKCS10_Request create_cert_req(const X509_Cert_Options&, + const PKCS8_PrivateKey&); +} +\end{verbatim} + +This function acts quite similarly to \function{create\_self\_signed\_cert}, +except it instead returns a PKCS \#10 certificate request. After creating it, +one would typically transmit it to a CA, who signs it and returns a freshly +minted X.509 certificate. There is an example of using this function in the +\filename{pkcs10} example. + +\subsubsection{Certificate Options} + +So what is this \type{X509\_Cert\_Options} thing we've been passing around?
+Basically, it's a bunch of information which will end up being stored into the +certificate. This information comes in 3 major flavors: information about the +subject (CA or end-user), the validity period of the certificate, and +restrictions on the usage of the certificate. + +First and foremost are a number of \type{std::string} members, which contain +various bits of information about the user: \arg{common\_name}, +\arg{serial\_number}, \arg{country}, \arg{organization}, \arg{org\_unit}, +\arg{locality}, \arg{state}, \arg{email}, \arg{dns\_name}, and \arg{uri}. As +many of these as possible should be filled in (especially an email address), +though the only required ones are \arg{common\_name} and \arg{country}. + +There is another value which is only useful when creating a PKCS \#10 request, +which is called \arg{challenge}. This is a challenge password, which you can +later use to request certificate revocation (\emph{if} the CA supports doing +revocations in this manner). + +Then there is the validity period; these are set with \function{not\_before} +and \function{not\_after}. Each of these functions takes a +\type{std::string}, which specifies when the certificate should start being +valid and when it should stop being valid, respectively. If you don't set the starting +validity period, it will automatically choose the current time. If you don't +set the ending time, it will choose the starting time plus a default time +period. The arguments to these functions specify the time in the following +format: ``2002/11/27 1:50:14''. The time is in 24 hour format, and the date is +encoded as year/month/day. The date must be specified, but you can omit the +time or trailing parts of it, for example ``2002/11/27 1:50'' or +``2002/11/27''. + +Lastly, you can set constraints on a key. The one you're most likely to want +to use is to create (or request) a CA certificate, which can be done by calling +the member function \function{CA\_key}.
This should only be used when needed. + +Other constraints can be set by calling the member functions +\function{add\_constraints} and \function{add\_ex\_constraints}. The first takes +a \type{Key\_Constraints} value, and replaces any previously set value. If no +value is set, then the certificate key is marked as being valid for any usage. +You can set it to any of the following (for more than one usage, OR them +together): \type{DIGITAL\_SIGNATURE}, \type{NON\_REPUDIATION}, +\type{KEY\_ENCIPHERMENT}, \type{DATA\_ENCIPHERMENT}, \type{KEY\_AGREEMENT}, +\type{KEY\_CERT\_SIGN}, \type{CRL\_SIGN}, \type{ENCIPHER\_ONLY}, +\type{DECIPHER\_ONLY}. Many of these have quite special semantics, so you +should either consult the appropriate standards document (such as RFC 3280), or +simply not call \function{add\_constraints}, in which case the appropriate +values will be chosen for you. + +The second function, \function{add\_ex\_constraints}, allows you to specify an +OID which has some meaning with regards to restricting the key to particular +usages. You can, if you wish, specify any OID you like, but there are a set of +standard ones which other applications will be able to understand. These are +the ones specified by the PKIX standard, and are named ``PKIX.ServerAuth'' (for +TLS server authentication), ``PKIX.ClientAuth'' (for TLS client +authentication), ``PKIX.CodeSigning'', ``PKIX.EmailProtection'' (most likely +for use with S/MIME), ``PKIX.IPsecUser'', ``PKIX.IPsecTunnel'', +``PKIX.IPsecEndSystem'', and ``PKIX.TimeStamping''. You can call +\function{add\_ex\_constraints} any number of times~--~each new OID will be +added to the list to include in the certificate. + +\pagebreak + +\section{CMS} + +The Cryptographic Message Syntax (CMS) is an IETF standardized format for +message encryption and signatures. It is based on PKCS \#7, but has been +extended to allow compression, authentication, and password based encryption. 
+Some simple uses of CMS will inter-operate with PKCS \#7 implementations, but +most uses will cause incompatibilities. + +CMS is based on the idea of layering. At the lowest level is a data type (the +actual message), which is encapsulated in another layer, for example one that +provides encryption or adds a signature. This layer can in turn be encapsulated +in another layer, and so on as often as you like. + +\emph{Note that CMS is not available in the current distribution. You can +download an alpha version separately from the website.} + +\subsection{Encoding} + +The CMS encoder included in Botan does not allow you to use the full range of +options available; for example, when signing, you can only sign with one key at +a time (this particular restriction may be changed in later versions). However, +you can do repeated signature operations, signing the previously signed +data. Semantically, this is not quite the same (since the second and later +signatures sign the signatures that came before it, as well as the data), but +practically speaking it's the same thing. + +WRITEME + +\subsection{Decoding} + +WRITEME + +\pagebreak + +\section{Random Number Generators} + +The random number generators provided in Botan are meant for creating keys, +IVs, padding, nonces, and anything else which requires 'random' data. It is +important to remember that the output of these classes will vary, even if they +are supplied with exactly the same seed (\ie, two \type{Randpool} objects with +similar initial states will not produce the same output, because the value of +high resolution timers is added to the state at various points). + +To ensure good quality output, a PRNG needs to be seeded with truly random data +(such as that produced by a hardware RNG). Typically, you will use an +\type{EntropySource} (see below). 
To add entropy to a PRNG, you can use +\type{void} \function{add\_entropy}(\type{const byte} \arg{data}[], +\type{u32bit} \arg{length}) or (better) use the \type{EntropySource} +interface. + +Once a PRNG has been initialized, you can get a single byte of random data by +calling \type{byte} \function{random()}, or get a large block by calling +\type{void} \function{randomize}(\type{byte} \arg{data}[], \type{u32bit} +\arg{length}), which will put random bytes into each member of the array from +indexes 0 $\ldots$ \arg{length} -- 1. + +You can avoid all the problems inherent in seeding the PRNG by using the +globally shared PRNG, described later in this section. + +\subsection{Entropy Estimation} + +The PRNG algorithms included in Botan have various sanity checks included. In +particular, they try to make sure that a reasonable amount of entropy has been +input into them before they will output any randomness. If this condition is +not met, they will throw a \type{PRNG\_Unseeded} exception. While generally a +library shouldn't be making policy decisions for applications, it seems +generally preferable for the application to fail than for it to generate +insecure keys. + +On Windows and Unix systems, the available entropy source modules can provide +more than enough entropy to seed the PRNGs sufficiently. However, if these +entropy sources aren't compiled into the library, the application will have to +handle seeding on its own. + +\pagebreak + +\subsection{The Global PRNG} + +Botan maintains a global PRNG (actually, a pair of them) that is used +internally for things like generating secret keys and salts. These PRNGs are +automatically seeded by the \type{LibraryInitializer}. Most of the time, you +won't need to access them directly because the library handles the common cases +where randomness is needed for you, but you might want to for a complicated +application (or when implementing things at a low level). + +To use it, include \filename{rng.h}.
You can't get a pointer to the actual +global PRNG object, because it is guarded with a mutex for thread safety, so +the interface basically defines a set of entry points into the object. All of +them are in the namespace \namespace{Global\_RNG}, which is inside the +\namespace{Botan} namespace. So you might call them as +\texttt{Botan::Global\_RNG::function}, or if you have a \keyword{using} +declaration to include Botan objects into the global namespace, just +\texttt{Global\_RNG::function}. + +There are six functions, four for adding entropy and two for getting +randomness out. + +\vskip 5pt +\noindent +\type{void} \function{Global\_RNG::randomize}(\type{byte} \arg{buf[]}, + \type{u32bit} \arg{size}) + +Get \arg{size} random bytes from the global PRNG and put them into +\arg{buf}. + +\vskip 5pt +\noindent +\type{byte} \function{Global\_RNG::random}(): + +Return a single random byte. + +\vskip 5pt +\noindent +\type{void} \function{Global\_RNG::add\_entropy}(\type{const byte} \arg{buf}[], + \type{u32bit} \arg{size}): + +Add the contents of \arg{buf}, which is of size \arg{size}, into the global +PRNG's internal state. The contents of the buffer cannot be recovered from the +PRNG output or internal state, and the PRNGs included in Botan are specifically +designed to be safe even if fed large amounts of data chosen by an attacker +trying to weaken the PRNG. So feel free to include things like data you +received over a socket (if you're writing a network application), passwords, +log data, etc. + +\vskip 5pt +\noindent +\type{void} \function{Global\_RNG::add\_entropy}(\type{EntropySource\&} + \arg{es}, \type{bool} \arg{slow\_poll}): + +Poll \arg{es} for entropy. If \arg{slow\_poll} is true, then do a slow poll, +otherwise do a fast poll.
+ +\vskip 5pt +\noindent +\type{u32bit} \function{Global\_RNG::seed} +(\type{bool} \arg{slow\_poll} = \arg{true}, + \type{u32bit} \arg{bits\_to\_get} = 256) + +Seed the global PRNG, using either a fast or a slow poll (slow by default), until it +gets at least \arg{bits\_to\_get} bits of entropy. However, if little entropy +is available on the system, it's entirely possible it will retrieve less than +that (particularly if a fast poll is being done). This function will return an +estimate for how many bits were gathered by the seeding process. + +If you pass 0 for \arg{bits\_to\_get}, then a poll will be run from all +available entropy sources. Normally, the function exits early once enough entropy +has been collected from the first few sources. Polling every source is especially useful if you don't +trust \filename{/dev/urandom} to be safe for some reason. + +If you've got a long running server process, it's a good idea to create a +thread that just calls this function every once in a while, sleeping the rest +of the time. Make sure to cancel it before you shut down the library, though; +otherwise it will try to get memory from the now-nonexistent allocators, fail, and +throw an exception (or crash). An alternate method might be to call it after +servicing a particular number of clients. + +\vskip 5pt +\noindent +\type{u32bit} \function{Global\_RNG::add\_es} +(\type{EntropySource*} \arg{source}, \type{bool} \arg{last} = \arg{true}) + +Normally the library generates a list of entropy sources for +\function{Global\_RNG::seed} to call at initialization time. With this function +you can add new entropy sources which will be queried. If \arg{last} is true, +then the entropy source is put at the end of the list of currently used entropy +sources. If you'd like to be sure that your source is always called, set +\arg{last} to \arg{false}, in which case it will be placed at the start of the +list. + +\subsection{Randpool} + +\type{Randpool} is the primary PRNG within Botan.
In recent versions all uses +of it have been wrapped by an implementation of the X9.31 PRNG (see below). If +for some reason you should have cause to create a PRNG instead of using the +``global'' one owned by the library, it would be wise to consider the same on +the grounds of general caution; while \type{Randpool} is designed with known +attacks and PRNG weaknesses in mind, it is not a standard/official PRNG. The +remainder of this section is a (fairly technical, though high-level) description +of the algorithms used in this PRNG. Unless you have a specific interest in +this subject, the rest of this section might prove somewhat uninteresting. + +\type{Randpool} has an internal state called pool, which is 512 bytes +long. This is where entropy is mixed into and extracted from. There is also a +small output buffer (called buffer), which holds the data which has already +been generated but has just not been output yet. + +It is based around a MAC and a block cipher (which are currently HMAC(SHA-256) +and AES-256). Where a specific size is mentioned, it should be taken as a +multiple of the cipher's block size. For example, if a 256-bit block cipher +were used instead of AES, all of the sizes internally would double. Every time +some new output is needed, we compute the MAC of a counter and a high +resolution timer. The resulting MAC is XORed into the output buffer (wrapping +as needed), and the output buffer is then encrypted with AES, producing 16 +bytes of output. + +After 8 blocks (or 128 bytes) have been produced, we mix the pool. To do this, +we first rekey both the MAC and the cipher; the new MAC key is the MAC of the +current pool under the old MAC key, while the new cipher key is the MAC of the +current pool under the just-chosen MAC key. We then encrypt the entire pool in +CBC mode, using the current (unused) output buffer as the IV. We then generate +a new output buffer, using the mechanism described in the previous paragraph.
+ +To add randomness to the PRNG, we compute the MAC of the input and XOR the +output into the start of the pool. Then we remix the pool and produce a new +output buffer. The initial MAC operation should make it very hard for chosen +inputs to harm the security of \type{Randpool}, and as HMAC should be able to +hold roughly 256 bits of state, it is unlikely that we are wasting much input +entropy (or, if we are, it doesn't matter, because we have a very abundant +supply). + +\subsection{ANSI X9.31} + +\type{ANSI\_X931\_PRNG} is the standard issue X9.31 Appendix A.2.4 PRNG, though +using AES-256 instead of 3DES as the block cipher. This PRNG implementation has +been checked against official X9.31 test vectors. + +Internally, the PRNG holds a pointer to another PRNG (typically Randpool). This +internal PRNG generates the key and seed used by the X9.31 algorithm, as well +as the date/time vectors. Each time an X9.31 PRNG object receives entropy, it +simply passes it along to the PRNG it is holding, and then pulls out some random +bits to generate a new key and seed. This PRNG considers itself seeded as soon +as the internal PRNG is seeded. + +As of version 1.4.7, the X9.31 PRNG is by default used for all random number +generation. + +\subsection{Entropy Sources} + +An \type{EntropySource} is an abstract representation of some method of gathering +``real'' entropy. This tends to be very system dependent. The \emph{only} way +you should use an \type{EntropySource} is to pass it to a PRNG that will +extract entropy from it -- never use the output directly for any kind of key or +nonce generation! + +\type{EntropySource} has a pair of functions for getting entropy from some +external source, called \function{fast\_poll} and \function{slow\_poll}. These +pass a buffer of bytes to be written; the functions then return how many bytes +of entropy were actually gathered.
\type{EntropySource}s are usually used to +seed the global PRNG using the functions found in the \namespace{Global\_RNG} +namespace. + +Note for writers of \type{EntropySource}s: it isn't necessary to use any kind +of cryptographic hash on your output. The data produced by an EntropySource is +only used by an application after it has been hashed by the +\type{RandomNumberGenerator} which asked for the entropy, and thus any hashing +you do will be wasteful of both CPU cycles and possibly entropy. + +\pagebreak + +\section{User Interfaces} + +Botan has recently changed some infrastructure to better accommodate more +complex user interfaces, in particular ones which are based on event +loops. Primary among these was the fact that when doing something like loading +a PKCS \#8 encoded private key, a passphrase might be needed, but then again it +might not (a PKCS \#8 key doesn't have to be encrypted). Asking for a +passphrase to decrypt an unencrypted key is rather pointless. Not only that, +but the way to handle the user typing the wrong passphrase was complicated, +undocumented, and inefficient. + +So now Botan has an object called \type{UI}, which provides a simple interface +for the aspects of user interaction the library has to be concerned +with. Currently, this means getting a passphrase from the user, and that's it +(\type{UI} will probably be extended in the future to support other operations +as they are needed). The base \type{UI} class is very stupid, because the +library can't directly assume anything about the environment that it's running +under (for example, if there will be someone sitting at the terminal, if the +application is even \emph{attached} to a terminal, and so on). But since you +can subclass \type{UI} to use whatever method happens to be appropriate for +your application, this isn't a big deal. 
+ +There is (currently) a single function that can be overridden by subclasses of +\type{UI} (the \type{std::string} arguments are actually \type{const +std::string\&}, but shown as simply \type{std::string} to keep the line from +wrapping): + +\noindent +\type{std::string} \function{get\_passphrase}(\type{std::string} \arg{what}, + \type{std::string} \arg{source}, + \type{UI\_Result\&} \arg{result}) const; + +The \arg{what} argument specifies what the passphrase is needed for (for +example, PKCS \#8 key loading passes \arg{what} as ``PKCS \#8 private +key''). This lets you provide the user with some indication of \emph{why} your +application is asking for a passphrase; feel free to pass the string through +\function{gettext(3)} or moral equivalent for i18n purposes. Similarly, +\arg{source} specifies where the data in question came from, if available (for +example, a file name). If the source is not available for whatever reason, then +\arg{source} will be an empty string; be sure to account for this possibility +when writing a \type{UI} subclass. + +The function returns the passphrase as the return value, and a status code in +\arg{result} (either \type{OK} or \type{CANCEL\_ACTION}). If +\type{CANCEL\_ACTION} is returned in \arg{result}, then the return value will +be ignored, and the caller will take whatever action is necessary (typically, +throwing an exception stating that the passphrase couldn't be determined). In +the specific case of PKCS \#8 key decryption, a \type{Decoding\_Error} +exception will be thrown; your UI should assume this can happen, and provide +appropriate error handling (such as putting up a dialog box informing the user +of the situation, and canceling the operation in progress). + +There is an example \type{UI} which uses GTK+ available on the web site. The +\type{GTK\_UI} code is cleanly separated from the rest of the example, so if +you happen to be using GTK+, you can copy (and/or adapt) that code for your +application. 
If you write a \type{UI} object for another windowing system +(Win32, Qt, wxWindows, FOX, etc), and would like to make it available to users +in general (ideally under a permissive license such as public domain or +MIT/BSD), feel free to send in a copy. + +\subsection{Pulses} + +If you call a function in the library that turns out to take a long time (such +as generating a 4096-bit prime), your pretty GUI will block up while the +library does something, because the event loop is not being run. Not only does +this look bad, it prevents the user from doing something else while the library +works. The way around this is to register a pulse function, using +\function{UI::set\_pulse}(\type{pulse\_func} \arg{f}, \type{void*} \arg{opaque} += 0). During long running operations, the library will call +\arg{f}(\type{Pulse\_Type} \arg{type}, \arg{opaque}), where the \type{enum} +\arg{type} provides mildly useful information about the operation in progress +(for a full list of the defined \type{Pulse\_Type} values, see +\filename{ui.h}). The type code allows you to give simple feedback, as +GnuPG does during key generation (printing various characters as the prime +generation process proceeds, such as '-' for prime test failed, '+' for prime +test worked, and so on). The optional \arg{opaque} value allows you to pass +data back to your pulse function without making it a global variable. + +Generally the thing to do inside the pulse function is to run the GUI's event +loop, for example with GTK+: + +\begin{verbatim} + while(gtk_events_pending()) + gtk_main_iteration(); +\end{verbatim} + +which will flush out the event queue and make your GUI seem nice and +responsive. For a particularly long-running operation (one that takes more than +a second or two), you will probably want to put up a progress bar.
While you +can update it directly from the pulse function, be warned that the pulse +function is called at irregular intervals, so your progress bar's movement +might seem choppy if you update it directly from the pulse. It may be a better +move to instead set up a timer (preferably through the GUI framework) that runs +every fixed timeslice, and updates the bar when the timer goes off. As long as +the pulse function is called often enough (which it should be), simply running the +event loop and letting the timer function do the updates will work fine. + +\pagebreak + +\section{Policy Configuration} + +While Botan is performing operations on behalf of an application, there are +times when a policy decision is needed. For example, when generating +an X.509v3 certificate, should we include the key usage extension? Should it be +marked as a critical extension, or is non-critical OK? And so on and so +forth. It is not proper for a library to make these kinds of decisions for an +application; after all, different applications might have different needs (not +to mention the same application running at different sites). So, whenever it is +sane to do so, the library will read from an internal table to find out what it +should do when a policy decision is needed. + +Right now, the option table is populated by some fixed, reasonable values at +startup. These options can then be changed by the application, either +hard-coded into the source code as an application policy, or reading them from +a file (or options screen or whatever) and setting them as the user desires +(possibly placing application-policy limits on the range they can take). + +The library natively supports a simple format which is easy to parse and easy +for humans to read and write. If you're at all familiar with Windows .INI files +or OpenSSL's configs, it should be pretty easy to use.
It's entirely possible +that you want to instead use an XML config (or whatever), but you'll have to +write your own parser for this (\filename{src/inifile.cpp} will provide some +ideas on what it is supposed to do). + +There are basically five different things stored in the options table: strings, +lists of strings, numbers, booleans, and times (\emph{not} dates; times are things like ``1 +hour'', ``15 minutes'', etc), though they are all represented by strings when +they are provided to the library. + +\subsection{Option Types} + +Strings are simply strings~--~no strings attached (sorry). A list is a +collection of strings, separated by a ':' character (no escaping is available, +so you can't actually have a ':' character in a list item). + +A number (more precisely, a non-negative integer less than $2^{32}$) is +specified as a string of decimal digits~--~no special formatters (such as a +``0x'' prefix) are supported. However, you can do simple arithmetic ('+' and +'*'), and they do commute correctly. There is no explicit grouping (\ie, with +parentheses), but generally a simple expression is all that's needed for this +sort of thing. + +A boolean can take on the values true and false, which can be represented by +``true'' (and ``1'') or ``false'' (and ``0'') respectively. Unlike C, a value +of (say) ``7'' is not a boolean; it will be flagged as an error at runtime when +the library attempts to read it. Finally, a time is essentially +``\texttt{<integer>[s|m|h|d|y]}'', where integer is the magnitude and the +suffix (if present) provides a scaling value. For example ``5d'' represents 5 +days, and ``60'', ``60s'', and ``1m'' all represent 60 seconds. If no suffix is +provided, the scale defaults to seconds. + +\subsection{Setting and Getting Options} + +The header \filename{botan/conf.h} has the interface for setting policy +options. All of the functions are declared inside of the \namespace{Config} +namespace; there is 1 for setting options, and 5 for getting the values of +them.
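The list, number, and time formats described under Option Types can be illustrated with a small standalone sketch. This is not Botan's parser and does not use the \namespace{Config} interface: the helper names \function{split}, \function{parse\_digits}, \function{parse\_number}, and \function{parse\_time} are hypothetical. It assumes that '*' binds tighter than '+', that a year is 365 days, and that no whitespace is allowed; none of these details is stated by the library documentation.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Largest value an option number may take (it must be less than 2^32).
const std::uint64_t MAX_U32 = 0xFFFFFFFFu;

// Split s at every occurrence of sep. There is no escaping, which matches
// the ':'-separated list format. (Hypothetical helper, not a Botan function.)
std::vector<std::string> split(const std::string& s, char sep)
{
   std::vector<std::string> parts;
   std::size_t start = 0;
   for(;;)
   {
      const std::size_t end = s.find(sep, start);
      if(end == std::string::npos)
      {
         parts.push_back(s.substr(start));
         return parts;
      }
      parts.push_back(s.substr(start, end - start));
      start = end + 1;
   }
}

// Parse a non-empty run of decimal digits; no "0x" or other prefixes.
std::uint64_t parse_digits(const std::string& s)
{
   if(s.empty())
      throw std::invalid_argument("empty number");
   std::uint64_t value = 0;
   for(char c : s)
   {
      if(!std::isdigit(static_cast<unsigned char>(c)))
         throw std::invalid_argument("not a decimal number: " + s);
      value = value * 10 + static_cast<std::uint64_t>(c - '0');
      if(value > MAX_U32)
         throw std::out_of_range("number does not fit in 32 bits");
   }
   return value;
}

// Evaluate an expression such as "64*1024" or "1+2*3" as a sum of products.
// Assumption: '*' takes precedence over '+'; there is no grouping.
std::uint32_t parse_number(const std::string& expr)
{
   std::uint64_t sum = 0;
   for(const std::string& term : split(expr, '+'))
   {
      std::uint64_t product = 1;
      for(const std::string& factor : split(term, '*'))
      {
         product *= parse_digits(factor);
         if(product > MAX_U32)
            throw std::out_of_range("product does not fit in 32 bits");
      }
      sum += product;
      if(sum > MAX_U32)
         throw std::out_of_range("sum does not fit in 32 bits");
   }
   return static_cast<std::uint32_t>(sum);
}

// Convert "<integer>[s|m|h|d|y]" to seconds; no suffix means seconds.
// Assumption: one year is treated as 365 days.
std::uint32_t parse_time(const std::string& s)
{
   if(s.empty())
      throw std::invalid_argument("empty time");
   std::uint64_t scale = 1;
   std::string digits = s;
   const char suffix = s.back();
   if(!std::isdigit(static_cast<unsigned char>(suffix)))
   {
      switch(suffix)
      {
         case 's': scale = 1; break;
         case 'm': scale = 60; break;
         case 'h': scale = 60 * 60; break;
         case 'd': scale = 24 * 60 * 60; break;
         case 'y': scale = 365 * 24 * 60 * 60; break;
         default: throw std::invalid_argument("unknown time suffix");
      }
      digits = s.substr(0, s.size() - 1);
   }
   const std::uint64_t seconds = parse_digits(digits) * scale;
   if(seconds > MAX_U32)
      throw std::out_of_range("time does not fit in 32 bits");
   return static_cast<std::uint32_t>(seconds);
}
```

With these helpers, \texttt{parse\_number("64*1024")} evaluates the kind of expression used as a default for ``base/memory\_chunk'', and \texttt{parse\_time("60")}, \texttt{parse\_time("60s")}, and \texttt{parse\_time("1m")} all yield the same number of seconds.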
+ +To add (or set) an option, call \function{add}(\type{std::string} \arg{option}, +\type{std::string} \arg{value}), which sets the value of \arg{option} to +\arg{value}. + +There are 5 functions to retrieve the values of options, one for each of the +types: + +\type{std::string} \function{get\_string}(\type{std::string} \arg{option}) + +\type{std::vector<std::string>} \function{get\_list}(\type{std::string} +\arg{option}) + +\type{u32bit} \function{get\_u32bit}(\type{std::string} \arg{option}) + +\type{u32bit} \function{get\_time}(\type{std::string} \arg{option}) + +\type{bool} \function{get\_bool}(\type{std::string} \arg{option}) + +The only one that might be confusing is \function{get\_time}, which returns the +time in seconds. + +As to defaults: strings default to the empty string, lists to an empty list, +integers default to 0, times default to no time (0 seconds), and booleans will +throw an exception if no value has been set. + +\subsection{Available Options} + +Generally, the defaults are chosen to provide a good level of security and +sense for typical applications. Currently, most of the options are for the +X.509 handling, since that's the place where most freedom is given to +implementations. Options are organized in a hierarchical fashion, with a +separating character of '/'. All options beginning with ``app/'' are reserved +for use by applications. + +\newcommand{\confopt}[4]{ + \textbf{``#1''}, (\textbf{#2}, default \textbf{#3}): #4. +} + +\begin{list}{$\cdot$} + \item \confopt{base/memory\_chunk}{integer}{``64*1024''}{how large a + block of memory to allocate at once} + + \item \confopt{base/default\_pbe}{string} + {``PBE-PKCS5v20(SHA-1,TripleDES/CBC)''}{ + The default algorithm for encrypting PKCS \#8 private keys} + + \item \confopt{base/pkcs8\_tries}{integer}{3}{how many times + \function{PKCS8::load\_key} will ask a \type{UI} object for a + passphrase to decrypt the key before it gives up.
If set to 0, it will + continue to query the \type{UI} object until the object indicates to + cancel the action.} + + \item \confopt{pk/blinder\_size}{integer}{64}{how long (in bits) the + blinding factor will be when doing private-key PK operations; if set to + zero then blinding is not performed} + + \item \confopt{pk/test/public}{string}{``basic''}{How much testing to + perform on imported public keys; can be ``basic'' or ``all''} + + \item \confopt{pk/test/private}{string}{``basic''}{How much testing to + perform on imported private keys; can be ``basic'' or ``all''} + + \item \confopt{pk/test/private\_gen}{string}{``all''}{How much testing to + perform on generated private keys; can be ``basic'' or ``all''} + + \item \confopt{pem/search}{integer}{``4*1024''}{how large an area (in bytes) + to search for PEM signatures in the heuristic that decides if data is + PEM encoded, or raw BER data} + + \item \confopt{pem/forgive}{integer}{``8''}{how many characters that + 'look like' a PEM header will be forgiven, \ie, how many characters must match + before we decide it really is the PEM header, and any bad characters + imply a malformed header} + + \item \confopt{pem/width}{integer}{``64''}{how long each PEM line will be + encoded as; it should not be smaller than 50 or greater than 80} + + \item \confopt{rng/min\_entropy}{integer}{384}{how many bits of entropy must + be collected before the PRNG is considered seeded} + + \item \confopt{rng/es\_files}{list}{``/dev/urandom:/dev/random''}{what paths + to attempt reads from for entropy, typically in-kernel devices} + + \item \confopt{rng/egd\_path}{list}{``/var/run/egd-pool:/dev/egd-pool''}{ + what paths to attempt to use as an EGD socket} + + \item \confopt{rng/ms\_capi\_prov\_type}{list}{``INTEL\_SEC:RSA\_FULL''}{ + what providers the CAPI entropy source should attempt to use, in order} + + \item \confopt{rng/unix\_path}{list}{``/usr/ucb:/usr/etc:/etc''}{extra path + fields to use when executing programs to gather entropy} + +
\item \confopt{x509/validity\_slack}{time}{``24h''}{how much slack to + allow when checking time validity on X.509 certificates} + + \item \confopt{x509/v1\_assume\_ca}{boolean}{false}{if true, then v1/v2 + X.509 certificates are considered CA certificates by default. If not + true, then no v1/v2 certificate is considered valid for CA use} + + \item \confopt{x509/cache\_verify\_results}{time}{``30m''}{how long + to cache certificate verification results in a \type{X509\_Store}. Set + it to 0 if you don't want to cache the results, though this will cause + a lot of unnecessary overhead} + + \item \confopt{x509/ca/allow\_ca}{boolean}{``false''}{whether a CA + will allow new certificates to be marked for CA usage} + + \item \confopt{x509/ca/basic\_constraints}{string}{``always''}{can be either + ``always'' or ``ca\_only''; if ``always'' then the basic constraints + extension is included in new user certs as well as new CA certs} + + \item \confopt{x509/ca/default\_expire}{time}{``1y''}{how long, by + default, a newly generated certificate is valid for} + + \item \confopt{x509/ca/signing\_offset}{time}{``30s''}{when generating a + PKCS \#10 certificate request, it will be marked as becoming valid + this much time before the current time; helps protect against slightly + off clocks} + + \item \confopt{x509/ca/rsa\_hash}{string}{``SHA-1''}{what hash to use + with an RSA key (SHA-1 is always used with DSA)} + + \item \confopt{x509/ca/str\_type}{string}{``latin1''}{what encoding to use + by default (can be ``latin1'' or ``utf8'')} + + \item \confopt{x509/crl/unknown\_critical}{string}{``ignore''}{what + to do when a CRL with an unknown critical extension is + processed. Options are ``ignore'' and ``throw''. 
For X.509v4 + compliance, use ``ignore''; for PKIX compliance, use ``throw''} + + \item \confopt{x509/crl/next\_update}{time}{``7d''}{new CRLs are marked as + expiring in this much time} +\end{list} + +Here, in a separate list, are the options which control which extensions are +included in a newly generated X.509v3 certificate, and if they should be marked +as critical extensions or not. Each one begins with ``x509/exts/'' (\ie, what is +referred to as ``basic\_constraints'' below is actually +``x509/exts/basic\_constraints''), and can take on a value of ``yes'', ``no'', +``noncritical'', or ``critical''. A value of ``no'' means the extension is not +included under any circumstances. A value of ``yes'' or ``noncritical'' (they +have the same meaning) means that the extension is included in the certificate +if there is some data to populate it with, and that the extension should be +marked as non-critical. Finally, ``critical'' means that the extension should +be marked as a critical extension. Unless otherwise noted, the option will +default to ``yes'': including the extension if data is available to fill it in, +and marking it as a non-critical extension. + +A word about X.509v3 extensions: each extension can be marked either critical +or non-critical. A non-critical extension may be ignored by a compliant X.509v3 +implementation (though for the common extensions, it is fairly rare for an +implementation to actually do so). On the other hand, a critical extension +forces an all-or-nothing situation: if an implementation can't handle an +extension marked critical, it is required to reject the certificate outright. + +For the full meaning of the extensions, it will probably be helpful to read an +authoritative X.509 reference, such as RFC 2459 or ISO's X.509 v3/v4 documents. +The default options here were chosen to comply with the IETF PKIX X.509v3 +profile, which is probably the most commonly supported X.509 profile, at least +in the United States. 
+ +\begin{list}{$\cdot$} + \item ``basic\_constraints'' (default ``critical''): Controls the use of the + Basic Constraints extension, which marks whether a certificate is a CA + or not. Changing this is \emph{not} recommended, as this should always + be a critical extension (doing otherwise violates most if not all + X.509v3 profiles). + + \item ``subject\_key\_id'': Controls the use of the subject key identifier. + Not many implementations make use of this extension, but it is not + harmful, and it is recommended it be included in all new certificates. + + \item ``authority\_key\_id'': See comments on ``subject\_key\_id''. + + \item ``subject\_alternative\_name'': Contains various pieces of information + that don't fit into the standard certificate name, like email + addresses and URIs. Very commonly used. + + \item ``issuer\_alternative\_name'': Like ``subject\_alternative\_name'', + but not used nearly as often. + + \item ``key\_usage'' (default ``critical''): Marks what uses this + certificate is valid for. + + \item ``extended\_key\_usage'': Similar to ``key\_usage'', but more general + and much less commonly used. +\end{list} + +\pagebreak + +\subsection{Configuration Files} + +Botan has a number of options, which can be configured by calling the +appropriate functions, documented earlier in this section. But this is somewhat +inconvenient for the users of applications which use Botan. So Botan also +supports reading options from a file which looks rather like Windows .INI files +or OpenSSL configurations. You can find an example config (which simply matches +the compiled-in defaults) in \filename{doc/botan.rc}. + +Each set of options is part of a 'section', for example, ``base'', ``rng'', or +``x509''. These names are essentially arbitrary, and are (in theory) chosen on +the basis of what the options pertain to. 
To set the option +``x509/ca/default\_expire'' (which tells \type{X509\_CA} how long newly minted +X.509 certificates should be valid for), you could use either of the following +methods: + +\begin{verbatim} +[x509/ca] # section is x509/ca +default_expire = 1y # x509/ca + default_expire -> x509/ca/default_expire + +# same as above +[x509] # section is x509 +# other x509/ options in here... +ca/default_expire = 1y # x509 + ca/default_expire -> x509/ca/default_expire +\end{verbatim} + +There are also two special sections, ``oids'' and ``aliases''. The aliases +section is easier to understand, and probably more useful for the average user. +By adding a new line in an aliases section, \verb|alias = officialname|, you can +create a new way to reference a particular algorithm (in those cases when you +ask for an algorithm object with a string specifying its type). For example, if +the line \verb|MyAlgo = Blowfish| was included in an aliases section, then one +could do this: + +\begin{verbatim} +Pipe pipe(get_cipher("MyAlgo/CBC/PKCS7", key, iv, ENCRYPTION)); +\end{verbatim} + +and get a Blowfish CBC encryptor. Initially this was implemented due to the +number of algorithms with multiple names (such as ``SHA1'', ``SHA-1'', and +``SHA-160''), but might also be useful in other, more interesting, contexts. + +The OIDs section gives a mapping between ASN.1 OIDs and the algorithms or +objects they represent, in the form \verb|name = oid|, where oid is the usual +decimal-dotted representation. For readability and ease of extension in +configuration files, a simple variable interpolation scheme is also +available. Consider the following: + +\begin{verbatim} +[oids] +ISO_MEMBER = 1.2 +US_BODY = ISO_MEMBER.840 # US_BODY = 1.2.840 +RSA_DSI = US_BODY.113549 # RSA_DSI = 1.2.840.113549 +\end{verbatim} + +This only works when the variable name is at the start of the string; since the +primary reason for its inclusion is for use with OIDs, this is acceptable. 
In some +cases, adding a new OID in is sufficient for code to work with new algorithms +(though not always). For example, by setting the proper OIDs, you can make it +possible to import, export, create, and process X.509 certificates that use +Rabin-Williams. + +\subsubsection{Syntax} + +Each line is either a comment, blank, a section name, or a name/value pair +separated by a '='. Comments start with the '\#' character and continue to the +end of line. The reader allows escaping, so if you wanted to include an actual +\# sign you could use \verb|\#|, or include it in a string ('\#' or ``\#''). A +section name is specified by \verb|[somename]|; a section name must have at +least one character, and a section must appear before any name/value pairs. A +name must be alphanumeric, but a value can contain spaces or other strange +things (you must either enclose the argument in quotes or escape each space +with a backslash). An example showing some of the trickier parts of how input +is interpreted follows (but the reader is cautioned that relying on this +behavior is not a good idea): + +\begin{verbatim} +[examples] +foo1 = a b c # stored as abc (not quoted, ws removed) +foo2 = 'a b c' # stored as a b c (quoted, keep ws) +foo3 = "a b c" # stored as a b c (quoted, keep ws) +tricky = "Jack \"I like pie\" Lloyd" # stored as Jack "I like pie" Lloyd +simpler = "Jack 'I like pie' Lloyd" # no escapes needed + +hashmark = "#" # set to a hash +hashmark2 = \# # also set to a hash + +[oids] +RW = 1.2.3.4.5.6 # Now RW keys can be imported/exported! +NR = 1.2.3.4.5.7 # Now NR can be imported/exported too. + # Note these OIDs are *not* allocated for RW/NR, in fact I have no idea who + # owns that section of the OID space, but it's certainly not me. 
Someone will + # have to allocate OIDs for RW/NR before this is 'legal' + +some_thing = 1.2.3 # some OID +another_thing = some_thing.4.5 # another_thing = 1.2.3.4.5 +\end{verbatim} + +\pagebreak + +\section{Miscellaneous} + +This section has documentation for anything that just didn't fit into any of +the major categories. Many of them (Timers, Allocators) will rarely be used in +actual application code, but others, like the S2K algorithms, have a wide +degree of applicability. + +\subsection{S2K Algorithms} + +There are various procedures (usually fairly ad-hoc) for turning a passphrase +into a (mostly) arbitrary length key for a symmetric cipher. A general +interface for such algorithms is presented in \filename{s2k.h}. The main +function is \function{derive\_key}, which takes a passphrase, and the desired +length of the output key, and returns a key of that length, deterministically +produced from the passphrase. If an algorithm can't produce a key of that size, +it will throw an exception (most notably, PKCS \#5's PBKDF1 can only produce +strings between 1 and $n$ bytes, where $n$ is the output size of the underlying +hash function). + +Most such algorithms allow the use of a ``salt'', which provides some extra +randomness and helps against dictionary attacks on the passphrase. Simply call +\function{change\_salt} (there are variations of it for most of the ways you +might wish to specify a salt, check the header for details) with a block of +random data. You can also have the class generate a new salt for you with +\function{new\_random\_salt}; the salt that was generated can be retrieved with +\function{current\_salt}. + +Additionally some algorithms allow you to set some sort of iteration count, +which will make the algorithm take longer to compute the final key (reducing +the speed of brute-force attacks of various kinds). This can be changed with +the \function{set\_iterations} function. Most standards recommend an iteration +count of at least 1000. 
``PBKDF2(SHA-1)'', with an 8-byte salt and an iteration +count of 2048, is recommended for new applications. Currently defined S2K +algorithms are ``PBKDF1(digest)'', ``PBKDF2(digest)'', and +``OpenPGP-S2K(digest)''; you can retrieve any of these using +\function{get\_s2k}, found in \filename{lookup.h}. As of this writing, +``PBKDF2(SHA-1)'' with 10000 iterations and an 8-byte salt is recommended for new +applications. + +\subsubsection{OpenPGP S2K} + +There are some oddities about OpenPGP's S2K algorithms which are documented +here. For one thing, it uses the iteration count in a strange manner; instead +of specifying how many times to iterate the hash, it tells how many +\emph{bytes} should be hashed in total (including the salt). So the exact +iteration count will depend on the size of the salt (which is fixed at 8 bytes +by the OpenPGP standard, though the implementation will allow any salt size) +and the size of the passphrase. + +To get what OpenPGP calls ``Simple S2K'', set iterations to 0 (the default for +OpenPGP S2K), and do not specify a salt. To get ``Salted S2K'', again leave the +iteration count at 0, but give an 8-byte salt. ``Salted and Iterated S2K'' +requires an 8-byte salt and some iteration count (this should be significantly +larger than the size of the longest passphrase that might reasonably be used; +somewhere from 1024 to 65536 would probably be about right). Using both a +reasonably sized salt and a large iteration count is highly recommended to +hinder password guessing attacks. + +\subsection{Checksums} + +Checksums are very similar to hash functions, and in fact share the same +interface. But there are some significant differences, the major ones being +that the output size is very small (usually in the range of 2 to 4 bytes), and +that they are not cryptographically secure. But for their intended purpose +(error checking), they perform very well. Some examples of checksums included +in Botan are the Adler32 and CRC32 checksums. 
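To make the contrast with hash functions concrete, here is a minimal standalone sketch of Adler-32 as it is specified in RFC 1950. It is not Botan's implementation and does not use Botan's interface; the function name \function{adler32} is chosen only for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Adler-32 per RFC 1950: two running sums, each reduced modulo 65521
// (the largest prime below 2^16), packed into one 32-bit value.
// Illustration only; this is not Botan's Adler32 class.
std::uint32_t adler32(const std::string& data)
{
    const std::uint32_t MOD = 65521;
    std::uint32_t a = 1; // sum of all bytes, starting from 1
    std::uint32_t b = 0; // sum of all intermediate values of a

    for(std::size_t i = 0; i != data.size(); ++i)
    {
        a = (a + static_cast<unsigned char>(data[i])) % MOD;
        b = (b + a) % MOD;
    }
    return (b << 16) | a;
}
```

The whole state is two 16-bit sums, which is why the output is only 4 bytes and why it is easy to construct two inputs with the same checksum; that is fine for catching transmission errors, but useless against a deliberate attacker.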
+ +\subsection{Exceptions} + +Sooner or later, something is going to go wrong. Botan's behavior when +something unusual occurs, like most C++ software, is to throw an exception. +Exceptions in Botan are derived from the \type{Exception} class. You can see +most of the major varieties of exceptions used in Botan by looking at +\filename{exceptn.h}. The only function you really need to concern yourself +with is \type{const char*} \function{what()}. This will return an error message +relevant to the error that occurred. For example: + +\begin{verbatim} +try { + // various Botan operations + } +catch(Botan::Exception& e) + { + cout << "Botan exception caught: " << e.what() << endl; + // error handling, or just abort + } +\end{verbatim} + +Botan's exceptions are derived from \type{std::exception}, so you don't need +to explicitly check for Botan exceptions if you're already catching the ISO +standard ones. + +\subsection{Threads and Mutexes} + +Botan includes a mutex system, which is used internally to lock some shared +data structures which must be kept shared for efficiency reasons (mostly, these +are in the allocation systems~--~handing out 1000 separate allocators hurts +performance and makes caching memory blocks useless). This system is supported +by the \texttt{mux\_pthr} module, implementing the \type{Mutex} interface for +systems that have POSIX threads. + +If your application is using threads, you \emph{must} add the option +``thread\_safe'' to the options string when you create the +\type{LibraryInitializer} object. If you specify this option and no mutex type +is available, an exception is thrown, since otherwise you would probably be +facing a nasty crash. + +There are a few functions that shouldn't be called from threads. If you want to +use them, you'll have to either do locking in your own code, or only call them +from a single thread (presumably the main thread, which initialized the +library, but that isn't required). 
It is assumed that most of them are called +at most once, and then the application runs. Thread-unsafe functions in Botan +include: + +\begin{verbatim} + add_engine(Engine*) + startup_engines() + shutdown_engines() + set_mutex_type(Mutex*) + set_timer_type(Timer*) + setup_global_rng(RandomNumberGenerator*, RandomNumberGenerator*) + destroy_global_rng() +\end{verbatim} + +This list is \emph{not} complete. As you can see, most of them are used only at +startup/shutdown; the functions/objects you would tend to use regularly in an +application should be thread safe at the object level. + +\subsection{Secure Memory} + +A major concern with mixing modern multiuser OSes and cryptographic code is +that at any time a process's memory (including secret keys) could be swapped to +disk, where it can later be read by an attacker. Botan stores almost everything +(and especially anything sensitive) in memory buffers which a) clear out their +contents when their destructors are called, and b) have easy plugins for +various memory locking functions, such as the \function{mlock}(2) call on many +Unix systems. + +Two of the allocation methods used (``malloc'' and ``mmap'') don't require any +extra privileges on Unix, but locking memory does. At startup, each allocator +type will attempt to allocate a few blocks (typically totaling 128k), so if you +want, you can run your application \texttt{setuid} \texttt{root}, and then drop +privileges immediately after creating your \type{LibraryInitializer}. If you +end up using more than what's been allocated, some of your sensitive data might +end up being swappable, but that beats running as \texttt{root} all the +time. Note also that, at least on Linux, you can use a kernel module to +give your process extra privileges (such as the ability to call +\function{mlock}) without being root. For example, check out my Capability +Override LSM (\texttt{http://www.randombit.net/projects/cap\_over/}), which +makes this pretty easy to do. 
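The two ideas described above (clearing on destruction, and locking pages so they cannot be swapped) can be sketched in a few lines of standalone code. This is only an illustration of the technique on a POSIX system, not how Botan's allocators are written; \function{wipe} and \type{LockedKey} are hypothetical names that do not exist in the library.

```cpp
#include <cstddef>
#include <sys/mman.h>

// Overwrite a buffer with zeros. Writing through a volatile pointer
// discourages the compiler from discarding the stores as dead writes.
void wipe(unsigned char* buf, std::size_t len)
{
    volatile unsigned char* p = buf;
    for(std::size_t i = 0; i != len; ++i)
        p[i] = 0;
}

// Hypothetical fixed-size key holder (not a Botan class).
class LockedKey
{
public:
    static const std::size_t SIZE = 32;

    // mlock may fail without sufficient privileges or resource limits;
    // the buffer is still usable then, just not protected from swapping.
    LockedKey() : locked(mlock(key, SIZE) == 0) { wipe(key, SIZE); }

    ~LockedKey()
    {
        wipe(key, SIZE); // clear before the memory is released
        if(locked)
            munlock(key, SIZE);
    }

    unsigned char* data() { return key; }
    bool is_locked() const { return locked; }

private:
    LockedKey(const LockedKey&);            // copying is disabled
    LockedKey& operator=(const LockedKey&);

    unsigned char key[SIZE];
    bool locked;
};
```

Whether the lock succeeds depends on the privileges and limits of the process, which is exactly the trade-off discussed above. Botan's own secure memory containers take care of both steps for you.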
+ +Botan's secure memory classes should also be used within your own code for +storing sensitive data. They are only meant for primitive data types (int, +long, etc.): if you want a container of higher level Botan objects, you can +just use a \verb|std::vector|, since these objects know how to clear themselves +when they are destroyed. You cannot, however, have a \verb|std::vector| (or any +other container) of \type{Pipe}s or \type{Filter}s, because these types have +pointers to other \type{Filter}s, and implementing copy constructors for these +types would be both hard and quite expensive (vectors of pointers to such +objects are fine, though). + +These types are not described in any great detail: for more information, +consult the definitive sources~--~the header files \filename{secmem.h} and +\filename{allocate.h}. + +\type{SecureBuffer} is a simple array type, whose size is specified at compile +time. It will automatically convert to a pointer of the appropriate type, and +has a number of useful functions, including \function{clear()}, and +\type{u32bit} \function{size()}, which returns the length of the array. It is a +template that takes as parameters a type, and a constant integer which is how +long the array is (for example: \verb|SecureBuffer<byte, 8> key;|). + +\type{SecureVector} is a variable length array. Its size can be increased or +decreased as needed, and it has a wide variety of functions useful for copying +data into its buffer. Like \type{SecureBuffer}, it implements \function{clear} +and \function{size}. + +\subsection{Allocators} + +The containers described above get their memory from allocators. As a user of +the library, you can add new allocator methods at run time for use by +containers, including the ones used internally by the library. The interface to +this is in \filename{allocate.h}. Basically how it works is that code needing +an allocator uses \function{get\_allocator}, which returns a pointer to an +allocator. 
This pointer should not be freed: the caller does not own the +allocator (it is shared among multiple users, and locks itself as needed). It +is possible to call \function{get\_allocator} with a specific name to request a +particular type of allocator, otherwise, a default allocator type is returned. + +At start time, the only allocator known is a \type{Default\_Allocator}, which +just allocates memory using \function{malloc}, and \function{memset}s it to 0 +when the memory is released. It is known by the name ``malloc''. If you ask for +another type of allocator (``locking'' and ``mmap'' are currently used), and it +is not available, some other allocator will be returned. + +You can add in a new allocator type using \function{add\_allocator\_type}. This +function takes a string and a pointer to an allocator. The string gives this +allocator type a name to which it can be referred when one is requesting it +with \function{get\_allocator}. If an error occurs (such as the name being +already registered), this function returns false. It will return true if the +allocator was successfully registered. If you ask it to, +\type{LibraryInitializer} will do this for you. + +Finally, you can set the default allocator type that will be returned using +the policy setting ``default\_alloc'' to the name of any previously registered +allocator. + +\subsection{Timers} + +Botan includes a pair of functions, \function{system\_time} and +\function{system\_clock}, which are used extensively in some areas of the code +(especially in the random number generators). These functions by default use +\function{std::time} and \function{std::clock}, but often you can do better +with system-dependent functions, especially for the system clock (for example, +returning the microseconds value from \function{gettimeofday}, or the +nanoseconds value from the POSIX.1b \function{clock\_gettime}, is far +superior). Modules for this exist for several systems. 
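As a rough sketch of what such a system-dependent clock source looks like, the two POSIX calls mentioned above can be wrapped as follows. The function names \function{clock\_micros} and \function{clock\_nanos} are made up for this example and are not part of Botan; the timer modules shipped with the library may be written quite differently.

```cpp
#include <cstdint>
#include <sys/time.h>
#include <time.h>

// Hypothetical helper: microseconds since the epoch, via gettimeofday.
std::uint64_t clock_micros()
{
    struct timeval tv;
    gettimeofday(&tv, 0);
    return static_cast<std::uint64_t>(tv.tv_sec) * 1000000u +
           static_cast<std::uint64_t>(tv.tv_usec);
}

// Hypothetical helper: nanoseconds since the epoch, via the POSIX.1b
// clock_gettime call with the CLOCK_REALTIME clock.
std::uint64_t clock_nanos()
{
    struct timespec ts;
    clock_gettime(CLOCK_REALTIME, &ts);
    return static_cast<std::uint64_t>(ts.tv_sec) * 1000000000u +
           static_cast<std::uint64_t>(ts.tv_nsec);
}
```

The low-order digits of either value change far faster than the one-second resolution of \function{std::time}, which is what makes them more useful as input to a random number generator.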
+ +You can register a new timer method with \function{set\_timer\_type}. For +example, if the \texttt{timer\_unix} module is available, one could call +\function{set\_timer\_type}(new \type{Unix\_Timer}), in which case +\function{system\_clock} will return a more ``interesting'' value based on the +return of the \function{gettimeofday} function call. This is done automatically +by the \type{LibraryInitializer} object. + +\pagebreak + +\section{Botan's Modules} + +Botan comes with a variety of modules which can be compiled into the system. +These will not be available on all installations of the library, but you can +check for their availability based on whether or not certain macros are +defined. + +\subsection{Pipe I/O for Unix File Descriptors} + +This is a fairly minor feature, but it comes in handy sometimes. In all +installations of the library, Botan's \type{Pipe} object overloads the +\keyword{<<} and \keyword{>>} operators for C++ iostream objects, which is +usually more than sufficient for doing I/O. + +However, there are cases where the iostream hierarchy does not map well to +local 'file types', so there is also the ability to do I/O directly with Unix +file descriptors. This is most useful when you want to read from or write to +something like a TCP or Unix-domain socket, or a pipe, since for simple file +access it's usually easier to just use C++'s file streams. + +If \macro{BOTAN\_EXT\_PIPE\_UNIXFD\_IO} is defined, then you can use the +overloaded I/O operators with Unix file descriptors. For an example of this, +check out the \filename{hash\_fd} example, included in the Botan distribution. + +\subsection{Entropy Sources} + +All of these are used by the \function{Global\_RNG::seed} function if they are +available. Since this function is called by the \type{LibraryInitializer} class +when it is created, it is fairly rare that you will need to deal with any of +these classes directly. 
Even in the case of a long-running server that needs to +renew its entropy pool, it is easier to simply call +\function{Global\_RNG::seed} (see the section entitled ``The Global PRNG'' for +more details). + +\noindent +\type{EGD\_EntropySource}: Query an EGD socket. If the macro +\macro{BOTAN\_EXT\_ENTROPY\_SRC\_EGD} is defined, it can be found in +\filename{es\_egd.h}. The constructor takes a \type{std::vector<std::string>} +that specifies the paths to search for an EGD socket. + +\noindent +\type{Unix\_EntropySource}: This entropy source executes programs common on +Unix systems (such as \filename{uptime}, \filename{vmstat}, and \filename{df}) +and adds their output to a buffer. It's quite slow due to process overhead, and +(roughly) 1 bit of real entropy is in each byte that is output. It is declared +in \filename{es\_unix.h}, if \macro{BOTAN\_EXT\_ENTROPY\_SRC\_UNIX} is +defined. If you don't have \filename{/dev/urandom} \emph{or} EGD, this is +probably the thing to use. For a long-running process on Unix, keep an object +of this type around and run fast polls every few minutes. + +\noindent +\type{FTW\_EntropySource}: Walk through a filesystem (the root to start +searching is passed as a string to the constructor), reading files. This tends +to only be useful on things like \filename{/proc} which have a great deal of +variability over time, and even then there is only a small amount of entropy +gathered: about 1 bit of entropy for every 16 bits of output (and many hundreds +of bits are read in order to get those 16 bits). It is declared in +\filename{es\_ftw.h}, if \macro{BOTAN\_EXT\_ENTROPY\_SRC\_FTW} is defined. Only +use this as a last resort. I don't really trust it, and neither should you. + +\noindent +\type{Win32\_CAPI\_EntropySource}: This routine gathers entropy from a Win32 +CAPI module. It takes an optional \type{std::string} which will specify what +type of CAPI provider to use. 
Generally the CAPI RNG is always the same +software-based PRNG, but there are a few which may use a hardware RNG. By +default it will use the first provider listed in the option +``rng/ms\_capi\_prov\_type'' which is available on the machine (currently the +providers ``RSA\_FULL'', ``INTEL\_SEC'', ``FORTEZZA'', and ``RNG'' are +recognized). + +\noindent +\type{BeOS\_EntropySource}: Query system statistics using various BeOS-specific +APIs. + +\noindent +\type{Pthread\_EntropySource}: Attempt to gather entropy based on jitter +between a number of threads competing for a single mutex. This entropy source +is \emph{very} slow, and highly questionable in terms of security. However, it +provides a worst-case fallback on systems which don't have Unix-like features, +but do support POSIX threads. This module is currently unavailable due to +problems on some systems. + +\subsection{Compressors} + +There are two compression algorithms supported by Botan, Zlib and Bzip2 (Gzip +and Zip encoding will be supported in future releases). Only lossless +compression algorithms are currently supported by Botan, because they tend to +be the most useful for cryptography. However, it is very reasonable to consider +supporting something like GSM speech encoding (which is lossy), for use in +encrypted voice applications. + +You should always compress \emph{before} you encrypt, because encryption seeks +to hide the redundancy that compression is supposed to try to find and remove. + +\subsubsection{Bzip2} + +To test for Bzip2, check to see if \macro{BOTAN\_EXT\_COMPRESSOR\_BZIP2} is +defined. If so, you can include \filename{bzip2.h}, which will declare a pair +of \type{Filter} objects: \type{Bzip2\_Compression} and +\type{Bzip2\_Decompression}. + +You should be prepared to take an exception when using the decompressing +filter, for if the input is not valid Bzip2 data, that is what you will +receive. 
You can pass the desired level of compression to +\type{Bzip2\_Compression}'s constructor as an integer between 1 and 9, 1 +meaning worst compression, and 9 meaning the best. The default is to use 9, +since smaller values take the same amount of time and just use a little less +memory. + +The Bzip2 module was contributed by Peter J. Jones. + +\subsubsection{Zlib} + +Zlib compression works pretty much like Bzip2 compression. The only differences +in this case are that the macro is \macro{BOTAN\_EXT\_COMPRESSOR\_ZLIB}, and the +header you need to include is called \filename{botan/zlib.h} (remember that you +shouldn't just \verb|#include <zlib.h>|, or you'll get the regular zlib API, +which is not what you want). The Botan classes for Zlib +compression/decompression are called \type{Zlib\_Compression} and +\type{Zlib\_Decompression}. + +As with Bzip2, a \type{Zlib\_Decompression} object will throw an exception if +invalid (in the sense of not being in the Zlib format) data is passed into it. + +In the case of zlib's algorithm, a lower compression level will be faster than +a very high one. For this reason, the Zlib compressor will +default to using a compression level of 6. This tends to give a good trade-off +between time spent and compression achieved. There are several factors you +need to consider in order to decide if you should use a higher compression +level: + +\begin{list}{$\cdot$} + \item Better security: the less redundancy in the source text, the harder it + is to attack your ciphertext. This is not too much of a concern, + because with decent algorithms using sufficiently long keys, it doesn't + really matter \emph{that} much (but it certainly can't hurt). + + \item Decreasing returns. Some simple experiments by the author showed + minimal decreases in the size between level 6 and level 9 compression + with large (1 to 3 megabyte) files. There was some difference, but it + wasn't that much. + + \item CPU time. 
Level 9 zlib compression is often two to four times as slow + as level 6 compression. This can make a substantial difference in the + overall runtime of a program. +\end{list} + +While the zlib compression library uses the same compression algorithm as the +gzip and zip programs, the format is different. The zlib format is defined in +RFC 1950. + +\pagebreak + +\section{BigInt} + +\type{BigInt} is Botan's implementation of a multiple-precision +integer. Thanks to C++'s operator overloading features, using \type{BigInt} is +often quite similar to using a native integer type. The number of functions +related to \type{BigInt} is quite large. You can find most of them in +\filename{bigint.h} and \filename{numthry.h}. + +Due to the sheer number of functions involved, only a few, which a regular user +of the library might have to deal with, are mentioned here. Fully documenting +the MPI library would take a significant while, so if you need to use it now, +the best way to learn is to look at the headers. + +Probably the most important are the encoding/decoding functions, which +transform the normal representation of a \type{BigInt} into some other form, +such as a decimal string. The most useful of these functions are + +\type{SecureVector<byte>} \function{BigInt::encode}(\type{BigInt}, +\type{Encoding}) + +\noindent +and + +\type{BigInt} \function{BigInt::decode}(\type{SecureVector<byte>}, +\type{Encoding}) + +\type{Encoding} is an enum which has values \type{Binary}, \type{Octal}, +\type{Decimal}, and \type{Hexadecimal}. The parameter will default to +\type{Binary}. These functions are static member functions, so they would be +called like this: + +\begin{verbatim} + BigInt n1; // some number + SecureVector<byte> n1_encoded = BigInt::encode(n1); + BigInt n2 = BigInt::decode(n1_encoded); + // now n1 == n2 +\end{verbatim} + +There are also C++-style I/O operators defined for use with \type{BigInt}. 
The +input operator understands negative numbers, hexadecimal numbers (marked with a +leading ``0x''), and octal numbers (marked with a leading '0'). The '-' must +come before the ``0x'' or '0' marker. The output operator will never adorn the +output; for example, when printing a hexadecimal number, there will not be a +leading ``0x'' (though a leading '-' will be printed if the number is +negative). If you want such things, you'll have to do them yourself. + +\type{BigInt} has constructors that can create a \type{BigInt} from an unsigned +integer or a string. You can also decode a \type{byte}[] / length pair into a +\type{BigInt}. There are several other \type{BigInt} constructors, which I would +seriously recommend you avoid, as they are only intended for use internally by +the library, and may arbitrarily change, or be removed, in a future release. + +An essentially random sampling of \type{BigInt} related functions: + +\type{u32bit} \function{BigInt::bytes}(): Returns the size of this \type{BigInt} +in bytes. + +\type{BigInt} \function{random\_prime}(\type{u32bit} \arg{b}): Returns a prime +number \arg{b} bits long. + +\type{BigInt} \function{gcd}(\type{BigInt} \arg{x}, \type{BigInt} \arg{y}): +Returns the greatest common divisor of \arg{x} and \arg{y}. Uses the binary +GCD algorithm. + +\type{bool} \function{is\_prime}(\type{BigInt} \arg{x}): Returns true if +\arg{x} is a (probable) prime number. Uses the Miller-Rabin probabilistic +primality test with fixed bases. For higher assurance, use +\function{verify\_prime}, which uses more rounds and randomized 48-bit bases. + +\subsection{Efficiency Hints} + +If you can, always use expressions of the form \verb|a += b| over +\verb|a = a + b|. The difference can be \emph{very} substantial, because the +first form prevents at least one needless memory allocation, and possibly as +many as three. + +If you're doing repeated modular exponentiations with the same modulus, create +a \type{BarrettReducer} ahead of time. 
If the exponent or base is a constant, +use the classes in \filename{mod\_exp.h}. This stuff is all handled for you by +the normal high-level interfaces, of course. + +\subsection{A Warning} + +Don't ever even consider using the low-level MPI functions (those that begin +with \texttt{bigint\_}). These are completely internal to the library, and make +arbitrarily strange and undocumented assumptions about their inputs, and don't +check to see if they are actually true, on the assumption that only the library +itself calls them, and that the library knows what the assumptions are. The +interfaces for these functions can change completely without notice. These +functions aren't visible without effort on your part specifically to that end, +so you will get no sympathy if you decide to use any of them. + +\pagebreak + +\section{Removing Algorithms} + +You may well want to remove some of Botan's algorithms in order to fit it into +a memory-constrained system, where you're counting the kilobytes. For the most +part, this is trivial to do, and Botan's interface makes it easy for +applications to test for the presence of an algorithm at runtime, so a +well-behaved application can work without any need for porting on such a +version of Botan. + +In some versions of 1.3.x, you can use the 'minimal' module, which removes +a large amount of Botan, including most ciphers and hashes (except AES, DES/3DES, +SHA-1, HMAC, RSA, DSA, and Diffie-Hellman), DLIES, EAX and CTS modes, and a few +other odds and ends. You can check for this being the case by seeing if +\macro{BOTAN\_EXT\_MINIMAL} is defined, though for the most part it's better to +use the lookup interface (since you have no way of knowing what exactly the +minimal module might remove from release to release, and certainly not if the +shared object you're linking to has a particular algorithm). 
This module was +removed just before 1.4.0, as there is a better way to handle all of this in +the new engine code, which is aware of things outside public key algorithms. + +Removing things like the PK signature encoding schemes (EMSA2, EMSA3...) is +somewhat more complicated and not documented here (though it is actually quite +simple if you know how to do it -- the minimal module shows how). This tutorial +(of sorts) will go through the steps required to compile a version of Botan +without the Blowfish block cipher (which has been included since the first +release of Botan, in the spring of 2001). + +The first step is to remove the files \filename{include/blowfish.h}, +\filename{src/blowfish.cpp}, and \filename{src/blfs\_tab.cpp}, which actually +implement the algorithm. Then minor editing of \filename{src/algolist.cpp} is +required. First, remove the line that includes the Blowfish header +\filename{botan/blowfish.h}. Then look in \function{get\_block\_cipher} for the +code that adds a Blowfish block cipher object to the internal lookup table, and +remove it. Run the configure script, and then \textbf{make} the library. Tada! +Done. + +So how does an application test for such a situation? The first way is to simply +try to pass the name ``Blowfish'' to the constructor of \type{CBC\_Encryption} or +other Botan \type{Filter}, and catch the resulting exception. This is not +particularly flexible, though. If an application wants to check on the status +of Botan's support for a particular algorithm, it can call some status +functions found in \filename{lookup.h}, called \function{have\_block\_cipher}, +\function{have\_stream\_cipher}, \function{have\_hash}, and +\function{have\_mac}, passing in the name of the desired algorithm. If Botan +knows about it, the function will return true. 
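As a rough illustration of how an application might use such a status function, the sketch below picks the first available name from a preference list. It is not part of Botan: `pick_algorithm` and its parameters are hypothetical names, and the lookup is passed in as a predicate so that the example compiles on its own. In a real program the predicate would presumably wrap a call such as \function{have\_block\_cipher} from \filename{lookup.h}.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical helper (not a Botan function): return the first name in
// `preferred` that the availability predicate accepts, or `fallback` if
// none of them is available.
std::string pick_algorithm(const std::vector<std::string>& preferred,
                           const std::function<bool(const std::string&)>& have,
                           const std::string& fallback)
{
    for (const std::string& name : preferred)
    {
        if (have(name))
            return name;   // first available preference wins
    }
    return fallback;       // caller supplies a name it can rely on
}
```

The caller chooses the fallback, so it should be a name the application can count on being present in every build it targets.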
+ +There are a handful of algorithms which are considered ``sacred'', in that an +application can always expect that they exist, and a distributor or other +end-user should not remove them without considering the possibly serious +consequences. At this time, these are: AES, DES, TripleDES, SHA-1, and HMAC. +This allows a workable fallback strategy for applications. + +One other useful application of this is to remove patented algorithms, for +example if Botan were to be included as part of a commercial Linux +distribution. + +For the most part, applications don't have to really worry about this, simply +because the cases where this will be required are fairly rare. Checking for the +availability of patented algorithms like RC5 and IDEA before using them might +be a good idea, though. + +Another advantage of this is that an application can be written to take +advantage of an algorithm which is not currently part of Botan. If it's not +available, one can simply fall back on another algorithm, and when/if it is +added to Botan, the application will start using it automagically. + +\pagebreak + +\section{Writing Modules} + +It's a lot simpler to write modules for Botan than it is to write code in the +core library, for several reasons. First, a module can rely on external +libraries and services beyond the base ISO C++ libraries, and also machine +dependent features (assembler, anyone?). Also, the code can be added at +configuration time on the user's end with very little effort (\ie the code can +be distributed separately and without depending on patching anything). + +Creating a module is quite simple. First, there must be a subdirectory in the +\filename{modules} directory for it. The name of the module is the same as the +name of this directory. Inside this directory, there needs to be a file, with +exactly the same name as the directory (that's so the configuration script +knows where to look). 
This file contains directives it uses to specify what +this module does, what systems it runs on, and so on. Comments start with a +\verb|#| character and continue to end of line. + +Recognized directives include: + +\newcommand{\directive}[2]{ + \vskip 4pt + \noindent + \texttt{#1}: #2 +} + +\directive{realname <name>}{Specify that the 'real world' name of this module + is \texttt{<name>}.} + +\directive{note <note>}{Add a note that will be seen by the end-user at +configure time.} + +\directive{require\_version <version>}{Require at configure time that the +version of Botan in use be at least \texttt{<version>}. If not, the module will +be ignored. Note that this directive is ignored prior to 1.4.3.} + +\directive{define <macro>}{Define \macro{BOTAN\_EXT\_<macro>} in + \filename{config.h}. This may only be used if the module creates + user-visible changes. There is a set of conventions that should be followed + in deciding what to call this macro (where xxx denotes some descriptive and + distinguishing characteristic of the thing implemented, such as + \macro{ALLOC\_MLOCK} or \macro{MUTEX\_PTHREAD}): + +\begin{itemize} +\item Allocator: \macro{ALLOC\_xxx} +\item Compressors: \macro{COMPRESSOR\_xxx} +\item EntropySource: \macro{ENTROPY\_SRC\_xxx} +\item Engines: \macro{ENGINE\_xxx} +\item Mutex: \macro{MUTEX\_xxx} +\item Timer: \macro{TIMER\_xxx} +\end{itemize} +} + +\directive{<lib> / </lib>}{This specifies any extra libraries to be linked in. +It is a mapping from OS to library name, for example \texttt{linux -> rt}, +which means that on Linux librt should be linked in. You can also use ``all'' +to force the library to be linked in on all systems.} + +\directive{add\_file <file>}{Tell the configuration script to add the file + given into the source tree. 
This file must exist in the module directory.} + +\directive{ignore\_file <file>}{Tell the configuration script to ignore the + file given in the main source tree.} + +\directive{replace\_file <file>}{Tell the configuration script to ignore the + file given in the main source tree, and instead use the one in the module's + directory.} + +\directive{local\_only <file>}{Mark this header file as being for the build + only; it will not be installed. This is useful for headers used internally + that are not exposed to the client.} + +\vskip 10pt +\noindent +Additionally, the module file can contain blocks, delimited by +the following pairs: + +\texttt{<os> / </os>}, \texttt{<arch> / </arch>}, \texttt{<cc> / </cc>} + +\noindent +For example, putting ``alpha'' and ``ia64'' in a \texttt{<arch>} block will +make the configuration script only allow the module to be compiled on those +architectures. Not having a block means any value is acceptable. + +\section{Writing BigInt assembly modules} + +Write me... + +\pagebreak + +\section{Compliance with Standards} + +Botan is/should be compatible with many cryptographic standards, including the +following: + +\newcommand{\standard}[2]{ + \vskip 4pt + * #1: \textbf{#2} +} + +\standard{RSA}{PKCS \#1 v2.1, ANSI X9.31} + +\standard{DSA}{ANSI X9.30, FIPS 186-2} + +\standard{Diffie-Hellman}{ANSI X9.42, PKCS \#3} + +\standard{Certificates}{ITU X.509, RFC 3280/3281 (PKIX), PKCS \#9 v2.0, +PKCS \#10} + +\standard{Private Key Formats}{PKCS \#5 v2.0, PKCS \#8} + +\standard{DES/DES-EDE}{FIPS 46-3, ANSI X3.92, ANSI X3.106} + +\standard{SHA-1}{FIPS 180-2} + +\standard{HMAC}{ANSI X9.71, FIPS 198} + +\standard{ANSI X9.19 MAC}{ANSI X9.9, ANSI X9.19} + +\vskip 8pt +\noindent +There is also support for the very general standards of \textbf{IEEE 1363-2000} +and \textbf{1363a}. Most of the contents of such are included in the standards +mentioned above, in various forms (usually with extra restrictions which 1363 +does not impose). 
+ +\pagebreak + +\section{Recommended Algorithms} + +This section is by no means the last word on selecting which algorithms to use. +However, Botan includes a sometimes bewildering array of possible algorithms, +and unless you're familiar with the latest developments in the field, it can be +hard to know what is secure and what is not. The following attributes of the +algorithms were evaluated when making this list: security, standardization, +patent status, support by other implementations, and efficiency (in roughly +that order). + +It is intended as a set of simple guidelines for developers, and nothing more. +It's entirely possible that there are algorithms in Botan that will turn out to +be more secure than the ones listed, but the algorithms listed here are +(currently) thought to be safe. + +\begin{list}{$\cdot$} + \item Block ciphers: TripleDES or AES in CBC mode with ``PKCS7'' padding. + \item + + \item Stream Ciphers: Use any of the recommended block ciphers in CTR mode. + + \item Hash functions: SHA-1, SHA-256, SHA-512 + + \item MACs: HMAC with any recommended hash function + + \item Public Key Encryption: RSA with ``EME1(SHA-1)'' + + \item Public Key Signatures: RSA with EMSA4 and any recommended hash, or DSA + with ``EMSA1(SHA-1)'' + + \item Key Agreement: Diffie-Hellman, with ``KDF2(SHA-1)'' +\end{list} + +\pagebreak + +\section{Algorithms Listing} + +Botan includes a very sizable number of cryptographic algorithms. In nearly all +cases, you never need to know the header file or type name to use +them. However, you do need to know what string (or strings) are used to +identify that algorithm. Generally, these names conform to those set out by +SCAN (Standard Cryptographic Algorithm Naming), which is a document which +specifies how strings are mapped onto algorithm objects, which is useful for a +wide variety of crypto APIs (SCAN is oriented towards Java, but Botan and +several other non-Java libraries also make at least some use of it). 
For full +details, read the SCAN document, which can be found at +\textbf{http://www.users.zetnet.co.uk/hopwood/crypto/scan/} + +Many of these algorithms can take options (such as the number of rounds in a +block cipher, the output size of a hash function, etc). These are shown in the +following list; all of them default to reasonable values (unless otherwise +marked). There are algorithm-specific limits on most of them. When you see +something like ``HASH'' or ``BLOCK'', that means you should insert the name of +some algorithm of that type. There are no defaults for those options. + +A few very obscure algorithms are skipped; if you need one of them, you'll know +it, and you can look in the appropriate header to see what that class's +\function{name} function returns (the names tend to match those in SCAN, if it's +defined there). + +\begin{list}{$\cdot$} + \item ROUNDS: The number of rounds in a block cipher. + \item + \item OUTSZ: The output size of a hash function or MAC + \item PASS: The number of passes in a hash function (more passes generally + means more security). 
+\end{list} + +\vskip .05in +\noindent +\textbf{Block Ciphers:} ``AES'', ``Blowfish'', ``CAST-128'', ``CAST-256'', +``DES'', ``DESX'', ``TripleDES'', ``GOST'', ``IDEA'', ``MARS'', +``MISTY1(ROUNDS)'', ``RC2'', ``RC5(ROUNDS)'', ``RC6'', ``SAFER-SK(ROUNDS)'', +``SEED'', ``Serpent'', ``Skipjack'', ``SQUARE'', ``TEA'', ``Twofish'', ``XTEA'' + +\noindent +\textbf{Stream Ciphers:} ``ARC4'', ``MARK4'', ``Turing'', ``WiderWake4+1-BE'' + +\noindent +\textbf{Hash Functions:} ``FORK-256'', ``HAS-160'', ``MD2'', ``MD4'', ``MD5'', +``RIPEMD-128'', ``RIPEMD-160'', ``SHA-160'', ``SHA-256'', ``SHA-384'', +``SHA-512'', ``Tiger(OUTSZ,PASS)'', ``Whirlpool'' + +\noindent +\textbf{MACs:} ``HMAC(HASH)'', ``CMAC(BLOCK)'', ``X9.19-MAC'' + +\pagebreak + +\section{More Information} + +\subsection{Support} + +Questions or problems you have with Botan can be directed to the development +mailing list (currently called \texttt{opencl-devel}). Joining this list is +highly recommended if you're going to be using Botan, since often advance +notice of upcoming changes is sent there. ``Philosophical'' bug reports, +announcements of programs using Botan, and basically anything else having to do +with Botan are also welcome. + +\subsection{Compatibility} + +Generally, cryptographic algorithms are well standardized, and thus +compatibility between implementations is relatively simple (of course, not all +algorithms are supported by all implementations). But there are a few +algorithms which are poorly specified, and these should be avoided if you wish +your data to be processed in the same way by another implementation (including +future versions of Botan). + +The block cipher GOST has a particularly poor specification: there are no +standard Sboxes, and the specification does not give test vectors even for +sample boxes, which leads to issues of endian conventions, etc. 
+ +If you wish maximum portability between different implementations of an +algorithm, it's best to stick to strongly defined and well standardized +algorithms, TripleDES, AES, HMAC, and SHA-1 all being good examples. + +\subsection{Patents} + +Some of the algorithms implemented by Botan may be covered by patents in some +locations. Algorithms known to have patent claims on them in the United States +and which are not available in a license-free/royalty-free manner include: +IDEA, MISTY1, RC5, RC6, and Nyberg-Rueppel. + +You must not assume that, just because an algorithm is not listed here, it is +not encumbered by patents. If you have any concerns about the patent status of +any algorithm you are considering using in an application, please discuss it +with your attorney. + +\subsection{Further Reading and Information} + +It's a very good idea if you have some knowledge of cryptography prior to +trying to use this stuff. You really should read one or more of these books +before seriously using the library (note that the Handbook of Applied +Cryptography is available online, and I highly recommend you read it): + +\setlength{\parskip}{5pt} + +\noindent +\textit{Handbook of Applied Cryptography}, Alfred J. Menezes, +Paul C. Van Oorschot, and Scott A. Vanstone; CRC Press + +\noindent +\textit{Security Engineering -- A Guide to Building Dependable Distributed +Systems}, Ross Anderson; Wiley + +\noindent +\textit{Cryptography: Theory and Practice}, Douglas R. Stinson; CRC Press + +\noindent +\textit{Applied Cryptography, 2nd Ed.}, Bruce Schneier; Wiley + +\noindent +Once you've got the basics down, these are good things to at least take a look +at: IEEE 1363 and 1363a, SCAN, NESSIE, PKCS \#1 v2.1, the security related FIPS +documents, and the CFRG RFCs. + +\pagebreak + +\subsection{Contact Information} + +A PGP key with a fingerprint of +\verb|621D AF64 11E1 851C 4CF9 A2E1 6211 EBF1 EFBA DFBC| is used to sign all +Botan releases. 
This key can be found in the file \filename{doc/pgpkeys.asc}; +PGP keys for the developers are also stored there. + +Another key, with fingerprint +\verb|33E3 9768 1D13 E7B4 1A01 BBCE A63F 2CBD FA02 FBCC|, was used for signing +releases up to version 1.4.2. This key has been retired. + +\vskip 5pt \noindent +Main email contact: \verb|[email protected]| + +\vskip 5pt \noindent +Web Site: \verb|http://botan.randombit.net| + +\pagebreak + +\section{License} + +Copyright \copyright 2000-2006, The Botan Project + +This work is licensed under the Creative Commons Attribution-ShareAlike 2.5 +License. To view a copy of this license, visit +http://creativecommons.org/licenses/by-sa/2.5/ or send a letter to Creative +Commons, 543 Howard Street, 5th Floor, San Francisco, California, 94105, USA. + +\end{document} diff --git a/doc/botan.rc b/doc/botan.rc new file mode 100644 index 000000000..aaa1b3f91 --- /dev/null +++ b/doc/botan.rc @@ -0,0 +1,225 @@ +# Botan configuration (v1.4.2) + +# This config, as shipped, matches the library defaults, but is much easier to +# tweak than recompiling everything. You can use it as a base for your own +# configurations. Read section 10.4 "Configuration Files" in the API doc for +# more information. 
+ +[base] +memory_chunk = 32*1024 # size of the chunk of memory allocated at once +default_pbe = PBE-PKCS5v20(SHA-1,TripleDES/CBC) +pkcs8_tries = 3 + +[pk] +blinder_size = 64 +test/public = basic +test/private = basic +test/private_gen = all + +[pem] +search = 4*1024 +forgive = 8 +width = 64 + +[rng] +# LibraryInitializer will try to acquire at least this many bits of entropy +min_entropy = 384 +es_files = /dev/urandom:/dev/random # path for random devices +egd_path = /var/run/egd-pool:/dev/egd-pool # path to search for an EGD socket +ms_capi_prov_type = INTEL_SEC:RSA_FULL # preferred MS CryptoAPI providers +unix_path = /usr/ucb:/usr/etc:/etc + +[x509] +validity_slack = 24h # how much wiggle room is given when checking validity +v1_assume_ca = false # should v1/v2 certificates be considered CA certs? +cache_verify_results = 30m # how long to cache verification results + +[x509/ca] +allow_ca = false # should PKCS #10 requests be able to ask to be a CA? + # should basic_constraints be included in all certs, including end-user? +basic_constraints = always +default_expire = 1y # default expire time for new certs +signing_offset = 30s # offset the PKCS #10 validity times by this amount +rsa_hash = SHA-1 # what hash to use when using RSA to sign new certs +str_type = latin1 # default string encoding (latin1 or utf8) + +[x509/crl] +# can be 'ignore' or 'throw': ignore matches X.509-2000 behavior, throw is PKIX +unknown_critical = ignore + +# When generating a new CRL, this is the default next update time. 
Can also be +# set in the call to X509_CA::update_crl/X509_CA::new_crl as the last arg +next_update = 7d + +[x509/exts] +# Each of these can be one of: +# - critical: Extension is marked as critical, if we have the info for it +# - yes or noncritical: Extension is included if needed, but not critical +# - no: Extension is not included, even if the information is available +basic_constraints = critical +subject_key_id = yes +authority_key_id = yes +subject_alternative_name = yes +issuer_alternative_name = yes +key_usage = critical +extended_key_usage = yes +crl_number = yes + +[aliases] +Rijndael = AES +3DES = TripleDES +DES-EDE = TripleDES +CAST5 = CAST-128 +3-Way = ThreeWay +SHARK = SHARK-E +SEAL = SEAL-3.0-BE +SHA1 = SHA-160 +SHA-1 = SHA-160 # Don't change or remove this +MARK-4 = ARC4(256) + +OpenPGP.Cipher.1 = IDEA +OpenPGP.Cipher.2 = TripleDES +OpenPGP.Cipher.3 = CAST-128 +OpenPGP.Cipher.4 = Blowfish +OpenPGP.Cipher.5 = SAFER-SK(13) +OpenPGP.Cipher.7 = AES-128 +OpenPGP.Cipher.8 = AES-192 +OpenPGP.Cipher.9 = AES-256 +OpenPGP.Cipher.10 = Twofish + +OpenPGP.Digest.1 = MD5 +OpenPGP.Digest.2 = SHA-1 +OpenPGP.Digest.3 = RIPEMD-160 +OpenPGP.Digest.5 = MD2 +OpenPGP.Digest.6 = Tiger(24,3) +OpenPGP.Digest.7 = HAVAL(20,5) +OpenPGP.Digest.8 = SHA-256 + +TLS.Digest.0 = Parallel(MD5,SHA-1) + +EME-PKCS1-v1_5 = PKCS1v15 +OAEP-MGF1 = EME1 +EME-OAEP = EME1 +X9.31 = EMSA2 +EMSA-PKCS1-v1_5 = EMSA3 +PSS-MGF1 = EMSA4 +EMSA-PSS = EMSA4 + +[oids] +ISO_MEMBER = 1.2 +US_BODY = ISO_MEMBER.840 +X500 = 2.5 + +RSA_DSI = US_BODY.113549 +ANSI_X957 = US_BODY.10040 +ANSI_X942 = US_BODY.10046 +NIST_ALGO = 2.16.840.1.101.3.4 +PKIX_USAGE = 1.3.6.1.5.5.7.3 +GNU_PROJECT = 1.3.6.1.4.1.11591 +OIW_ALGO = 1.3.14.3.2 +DN_ATTR = X500.4 +X509_KU = X500.29 + +PKCS = RSA_DSI.1 +PKCS1 = PKCS.1 +PKCS5 = PKCS.5 +PKCS7 = PKCS.7 +PKCS9 = PKCS.9 + +DES/CBC = OIW_ALGO.7 +TripleDES/CBC = RSA_DSI.3.7 +RC2/CBC = RSA_DSI.3.2 +CAST-128/CBC = US_BODY.113533.7.66.10 +AES-128/CBC = NIST_ALGO.1.2 +AES-192/CBC = 
NIST_ALGO.1.22 +AES-256/CBC = NIST_ALGO.1.42 + +MD5 = RSA_DSI.2.5 +SHA-160 = OIW_ALGO.26 +Tiger(24,3) = GNU_PROJECT.12.2 + +KeyWrap.TripleDES = PKCS9.16.3.6 +KeyWrap.RC2 = PKCS9.16.3.7 +KeyWrap.CAST-128 = US_BODY.113533.7.66.15 +KeyWrap.AES-128 = NIST_ALGO.1.5 +KeyWrap.AES-192 = NIST_ALGO.1.25 +KeyWrap.AES-256 = NIST_ALGO.1.45 + +Compression.Zlib = PKCS9.16.3.8 + +RSA = PKCS1.1 +RSA = X500.8.1.1 +DSA = ANSI_X957.4.1 +DH = ANSI_X942.2.1 + +DSA/EMSA1(SHA-160)/DER = ANSI_X957.4.3 +DSA/EMSA1(SHA-160) = ANSI_X957.4.3 +RSA/EMSA3(MD2) = PKCS1.2 +RSA/EMSA3(MD5) = PKCS1.4 +RSA/EMSA3(SHA-160) = PKCS1.5 +RSA/EMSA3(SHA-256) = PKCS1.11 +RSA/EMSA3(SHA-384) = PKCS1.12 +RSA/EMSA3(SHA-512) = PKCS1.13 +RSA/EMSA3(RIPEMD-160) = 1.3.36.3.3.1.2 + +PBE-PKCS5v15(MD2,DES/CBC) = PKCS5.1 +PBE-PKCS5v15(MD2,RC2/CBC) = PKCS5.4 +PBE-PKCS5v15(MD5,DES/CBC) = PKCS5.3 +PBE-PKCS5v15(MD5,RC2/CBC) = PKCS5.6 +PBE-PKCS5v15(SHA-160,DES/CBC) = PKCS5.10 +PBE-PKCS5v15(SHA-160,RC2/CBC) = PKCS5.11 +PBE-PKCS5v20 = PKCS5.13 +PKCS5.PBKDF2 = PKCS5.12 + +CMS.DataContent = PKCS7.1 +CMS.SignedData = PKCS7.2 +CMS.EnvelopedData = PKCS7.3 +CMS.DigestedData = PKCS7.5 +CMS.EncryptedData = PKCS7.6 +CMS.AuthenticatedData = PKCS9.16.1.2 +CMS.CompressedData = PKCS9.16.1.9 + +PKCS9.EmailAddress = PKCS9.1 +PKCS9.UnstructuredName = PKCS9.2 +PKCS9.ContentType = PKCS9.3 +PKCS9.MessageDigest = PKCS9.4 +PKCS9.ChallengePassword = PKCS9.7 +PKCS9.ExtensionRequest = PKCS9.14 + +X520.CommonName = DN_ATTR.3 +X520.Surname = DN_ATTR.4 +X520.SerialNumber = DN_ATTR.5 +X520.Country = DN_ATTR.6 +X520.Locality = DN_ATTR.7 +X520.State = DN_ATTR.8 +X520.Organization = DN_ATTR.10 +X520.OrganizationalUnit = DN_ATTR.11 +X520.Title = DN_ATTR.12 +X520.GivenName = DN_ATTR.42 +X520.Initials = DN_ATTR.43 +X520.GenerationalQualifier = DN_ATTR.44 +X520.DNQualifier = DN_ATTR.46 +X520.Pseudonym = DN_ATTR.65 + +X509v3.SubjectKeyIdentifier = X509_KU.14 +X509v3.KeyUsage = X509_KU.15 +X509v3.SubjectAlternativeName = X509_KU.17 +X509v3.IssuerAlternativeName = 
X509_KU.18 +X509v3.BasicConstraints = X509_KU.19 +X509v3.CRLNumber = X509_KU.20 +X509v3.ReasonCode = X509_KU.21 +X509v3.HoldInstructionCode = X509_KU.23 +X509v3.InvalidityDate = X509_KU.24 +X509v3.CertificatePolicies = X509_KU.32 +X509v3.AuthorityKeyIdentifier = X509_KU.35 +X509v3.PolicyConstraints = X509_KU.36 +X509v3.ExtendedKeyUsage = X509_KU.37 + +PKIX.ServerAuth = PKIX_USAGE.1 +PKIX.ClientAuth = PKIX_USAGE.2 +PKIX.CodeSigning = PKIX_USAGE.3 +PKIX.EmailProtection = PKIX_USAGE.4 +PKIX.IPsecEndSystem = PKIX_USAGE.5 +PKIX.IPsecTunnel = PKIX_USAGE.6 +PKIX.IPsecUser = PKIX_USAGE.7 +PKIX.TimeStamping = PKIX_USAGE.8 diff --git a/doc/bugs.txt b/doc/bugs.txt new file mode 100644 index 000000000..c2f2b15b1 --- /dev/null +++ b/doc/bugs.txt @@ -0,0 +1,50 @@ +This document lists some of the known bugs: + +* Due to compatibility issues with RedHat's gcc 2.96 compiler, std::vector + accesses are done using [] instead of at(). This is generally OK (we are only + *supposed* to read valid indexes), but at times bugs have caused it so that + an invalid index is accessed, which is either not noticed until later, or + causes a SIGSEGV on the spot. + + Other GCC related problems: we can't use std::is* or std::to* because GCC pre + 3.0 (including RedHat's 2.96) barfs on it. And std::string is missing the + clear() function (this is not a real problem, just an annoyance). + + There are various other problems related to linkage, std::multimap handling, + and so forth that I won't bother getting into here. + + Currently there are a fairly large number of workarounds for GCC 2.x. Botan + 1.4.x will be the last stable series that supports this compiler; all said + workarounds will be removed in 1.5.0 (and stable trees past 1.4.x). As of now + (June 2004) almost all operating systems that have used 2.95.x or 2.96 in the + past (most Linux distros, MacOS X 10.1, NetBSD 1.6, FreeBSD 4, QNX 6.2) have + been replaced by newer versions that use a more modern GCC. 
+ +* In some of the nastier cases (like indefinite length constructions), the BER + decoder will use small integer multiples of the total size of the object. In the + usual case of something being a few tens/100s of kilobytes at most, this is + not a big deal, but reading in a 50 meg BER file could be very costly in some + cases. + +* Several places assume ASCII characters. However as of 1.3.5 this is (or + should be) confined to charset.cpp; everything else should be character set + independent. The version as shipped assumes ASCII/Latin-1, and modifications + would be needed to use it on EBCDIC systems. + + I suspect there are other cases I'm not thinking of, and it's likely that + EBCDIC will need a significant amount of debugging (and thought) to work + correctly. + +* T61 strings are treated like ISO 8859-1 strings, even though this is + technically incorrect. Since everyone else does this too, it shouldn't be a + problem. + +* X509_Time::X509_Time(u64bit) is not thread safe (it uses gmtime). + Unfortunately we can't really rely on gmtime_r being available. The window + of contention is minimal (we copy the return value into a buffer allocated on + the stack) but it is there. If you're very paranoid about this, edit the + get_tm function in asn1_tm.cpp to use gmtime_r. + +* There are various low-level threading issues, but mostly in places where a + threaded program should not be touching things (set_mutex_type, for example). + The list of such unsafe functions should be documented. 
diff --git a/doc/building.tex b/doc/building.tex new file mode 100644 index 000000000..392b92d5c --- /dev/null +++ b/doc/building.tex @@ -0,0 +1,379 @@ +\documentclass{article} + +\setlength{\textwidth}{6.5in} +\setlength{\textheight}{9in} + +\setlength{\headheight}{0in} +\setlength{\topmargin}{0in} +\setlength{\headsep}{0in} + +\setlength{\oddsidemargin}{0in} +\setlength{\evensidemargin}{0in} + +\title{\textbf{Botan Build Guide}} +\author{Jack Lloyd \\ + \texttt{[email protected]}} +\date{} + +\newcommand{\filename}[1]{\texttt{#1}} +\newcommand{\module}[1]{\texttt{#1}} + +\newcommand{\type}[1]{\texttt{#1}} +\newcommand{\function}[1]{\textbf{#1}} +\newcommand{\macro}[1]{\texttt{#1}} + +\begin{document} + +\maketitle + +\tableofcontents + +\parskip=5pt +\pagebreak + +\section{Introduction} + +This document describes how to build Botan on Unix/POSIX and MS Windows +systems. The POSIX oriented descriptions should apply to most common Unix +systems today (including MacOS X), along with POSIX-ish systems like BeOS, QNX, +and Plan 9. Currently, systems other than Windows and POSIX (for example, VMS, +MacOS 9, and OS/390) are not supported by the build system, primarily due to +lack of access. Please contact the maintainer if you would like to build Botan +on such a system. + +\section{For the Impatient} + +\begin{verbatim} +$ ./configure.pl +$ make +$ make install +\end{verbatim} + +Or \verb|nmake|, if you're compiling on Windows with Visual C++. The +autoconfiguration abilities of \filename{configure.pl} were only recently +added, so they may break if you run it on something unusual. In addition, you +are certain to get more features, and possibly better optimization, by +explicitly specifying how you want the library configured. How to do this is +detailed below. + +\section{Building the Library} + +The first step is to run \filename{configure.pl}, which is a Perl script that +creates various directories, config files, and a Makefile for building +everything. 
It is run as \verb|./configure.pl CC-OS-CPU <extra args>|. The +script requires at least Perl 5.005, and preferably 5.6 or higher. + +The tuple CC-OS-CPU specifies what system Botan is being built for, in terms of +the C++ compiler, the operating system, and the CPU model. For example, to use +GNU C++ on a FreeBSD box that has an Alpha EV6 CPU, one would use +``gcc-freebsd-alphaev6'', and for Visual C++ on Windows with a Pentium II, +``msvc-windows-pentium2''. To get the list of values for \verb|CC|, \verb|OS|, +and \verb|CPU| that \filename{configure.pl} supports, run it with the +``\verb|--help|'' option. + +You can put basically anything reasonable for CPU: the script knows about a +large number of different architectures, their sub-models, and common aliases +for them. The script does not display all the possibilities in its help +message because there are simply too many entries (if you're curious about what +exactly is available, you can look at the \verb|%ARCH|, \verb|%ARCH_ALIAS|, and +\verb|%SUBMODEL_ALIAS| hashes at the start of the script). You should only +select the 64-bit version of a CPU (like ``sparc64'' or ``mips64'') if your +operating system knows how to handle 64-bit object code -- a 32-bit kernel on a +64-bit CPU will generally not like 64-bit code. For example, +gcc-solaris-sparc64 will not work unless you're running a 64-bit Solaris kernel +(for 32-bit Solaris running on an UltraSPARC system, you want +gcc-solaris-sparc32-v9). You may or may not have to install 64-bit versions of +libc and related system libraries as well. + +The script also knows about the various extension modules available. You can +enable one or more with the option ``\verb|--modules=MOD|'', where \verb|MOD| +is some name that identifies the extension (or a comma separated list of +them). Modules provide additional capabilities which require the use of non +portable APIs. + +Not all OSes or CPUs have specific support in \filename{configure.pl}. 
If the +CPU architecture of your system isn't supported by \filename{configure.pl}, use +'generic'. This setting disables machine-specific optimization +flags. Similarly, setting OS to 'generic' disables things which depend greatly +on OS support (specifically, shared libraries). + +However, it's impossible to guess which options to give to a system compiler. +Thus, if you want to compile Botan with a compiler which +\filename{configure.pl} does not support, the script will have to be updated. +Preferably, mail the man pages (or similar documentation) for the C and C++ +compilers and the system linker to the author, or download the Botan-config +package from the Botan web site, and do it yourself. Modifying +\filename{configure.pl} on its own is useless aside from one-off hacks, +because the script is auto-generated by \emph{another} Perl script, which reads +a little mini-language that tells it all about the systems in question. + +The script tries to guess what kind of makefile to generate, and it almost +always guesses correctly (basically, Visual C++ uses NMAKE with Windows +commands, and everything else uses POSIX make with POSIX commands). Just in +case, you can override it with \verb|--make-style=somestyle|. The styles Botan +currently knows about are 'unix' (normal Unix makefiles), and 'nmake', the make +variant commonly used by Windows compilers. + +\pagebreak + +\subsection{POSIX / Unix} + +The basic build procedure on Unix and Unix-like systems is: + +\begin{verbatim} + $ ./configure.pl CC-OS-CPU --module-set=[unix|beos] --modules=<other mods> + $ make + # You may need to set your LD_LIBRARY_PATH or equivalent for ./check to run + $ make check # optional, but a good idea + $ make install +\end{verbatim} + +The 'unix' module set should work on most POSIX/Unix systems out there +(including MacOS X), while the 'beos' module is specific to BeOS. 
While the two +sets share a number of modules, some normal Unix ones don't work on BeOS (in +particular, BeOS doesn't have a working \function{mmap} function), and BeOS has +a few extras just for it. The library will pick a default module set for you +based on the value of OS, so there is rarely a reason to specify that. + +The \verb|make install| target has a default directory in which it will install +Botan (on everything that's a real Unix, it's \verb|/usr/local|). You can +override this by using the \texttt{--prefix} argument to +\filename{configure.pl}, like so: + +\verb|./configure.pl --prefix=/opt <other arguments>| + +On Unix, the makefile has to decide who should own the files once they are +installed. By default, it uses \texttt{root:root}, but on some systems (for +example, MacOS X), there is no \texttt{root} group. Also, if you don't have +root access on the system you will want them to be installed owned by something +other than root (like yourself). You can override the defaults at install time +by setting the \texttt{OWNER} and \texttt{GROUP} variables from the command +line. + +\verb|make OWNER=lloyd GROUP=users install| + +On some systems shared libraries might not be immediately visible to the +runtime linker. For example, on Linux you may have to edit +\filename{/etc/ld.so.conf} and run \texttt{ldconfig} (as root) in order for new +shared libraries to be picked up by the linker. An alternative is to set your +\texttt{LD\_LIBRARY\_PATH} shell variable to include the directory that the +Botan libraries were installed into. + +\subsection{MS Windows} + +The situation is not much different here. We'll assume you're using Visual C++ +(for Cygwin, the Unix instructions are probably more relevant). You need to +have a copy of Perl installed, and have both Perl and Visual C++ in your path. 
+ +\begin{verbatim} + > perl configure.pl msvc-windows-<CPU> --module-set=win32 + > nmake + > nmake check # optional, but recommended +\end{verbatim} + +By default, the configure script will include the 'win32' module set for you. +This includes a pair of entropy sources for use on Windows; at some point in +the future it will also add support for high-resolution timers, mutexes for +thread safety, and other useful things. + +For Win95 pre OSR2, the \verb|es_capi| module will not work, because CryptoAPI +didn't exist. All versions of NT4 lack the ToolHelp32 interface, which is how +\verb|es_win32| does its slow polls, so a version of the library built with +that module will not load under NT4. Later systems (98/ME/2000/XP) support both +methods, so this shouldn't be much of an issue. + +Unfortunately, there currently isn't an install script usable on +Windows. Basically all you have to do is copy the newly created +\filename{libbotan.lib} to someplace where you can find it later (say, +\verb|C:\Botan\|). Then copy the entire \verb|include\botan| directory, which +was constructed when you built the library, into the same directory. + +When building your applications, all you have to do is tell the compiler to +look for both include files and library files in \verb|C:\Botan|, and it will +find both. + +\pagebreak + +\subsection{Configuration Parameters} + +There are some configuration parameters which you may want to tweak before +building the library. These can be found in \filename{config.h}. This file is +overwritten every time the configure script is run (and does not exist until +after you run the script for the first time). + +Also included in \filename{config.h} are macros which are defined if one or +more extensions are available. All of them begin with \verb|BOTAN_EXT_|. For +example, if \verb|BOTAN_EXT_COMPRESSOR_BZIP2| is defined, then an application +using Botan can include \filename{<botan/bzip2.h>} and use the Bzip2 filters.
+ +\macro{BOTAN\_MP\_WORD\_BITS}: This macro controls the size of the words used +for calculations with the MPI implementation in Botan. You can choose 8, 16, +32, or 64, with 32 being the default. You can use 8, 16, or 32 bit words on +any CPU, but the value should be set to the same size as the CPU's registers +for best performance. You can only use 64-bit words if the \module{mp\_asm64} +module is used; this offers vastly improved performance of public key +algorithms on certain 64-bit CPUs - this is set by default if the module is +used. Unless you are building for an 8 or 16-bit CPU, this probably isn't worth +messing with. + +\macro{BOTAN\_VECTOR\_OVER\_ALLOCATE}: The memory container +\type{SecureVector} will over-allocate requests by this amount (in +elements). In several areas of the library, we grow a vector fairly often. By +over-allocating by a small amount, we don't have to do allocations as often +(which is good, because the allocators can be quite slow). If you \emph{really} +want to reduce memory usage, set it to 0. Otherwise, the default should be +perfectly fine. + +\macro{BOTAN\_DEFAULT\_BUFFER\_SIZE}: This constant is used as the size of +buffers throughout Botan. A good rule of thumb would be to use the page size of +your machine. The default should be fine for most, if not all, purposes. + +\macro{BOTAN\_GZIP\_OS\_CODE}: The OS code is included in the Gzip header when +compressing. The default is 255, which means 'Unknown'. You can look in RFC +1952 for the full list; the most common are Windows (0) and Unix (3). There is +also a Macintosh (7), but it probably makes more sense to use the Unix code on +OS X. This is only used if the \texttt{comp\_zlib} module (which includes a +gzip compressor) is built. + +\pagebreak + +\section{Modules} + +There are a fairly large number of modules included with Botan. Some of these +are extremely useful, while others are only necessary in very unusual +circumstances.
The modules included with this release are: + +\newcommand{\mod}[2]{\textbf{#1}: #2} + +\begin{list}{$\cdot$} + \item \mod{alloc\_mmap}{Allocates memory using memory mappings of temporary + files. This means that if the OS swaps all or part of the application, + the sensitive data will be swapped to where we can later clean it, + rather than somewhere in the swap partition.} + + \item \mod{comp\_bzip2}{Enables an application to perform bzip2 compression + and decompression using the library. Available on any system that has + bzip2.} + + \item \mod{comp\_zlib}{Enables an application to perform zlib compression and + decompression using the library. Available on any system that has + zlib.} + + \item \mod{eng\_aep}{An engine that uses any available AEP accelerator card + to speed up PK operations. You have to have the AEP drivers installed + for this to link correctly, but you don't have to have a card + installed - it will automatically be enabled if a card is detected at + run time.} + + \item \mod{eng\_gmp}{An engine that uses GNU MP to speed up PK operations. + GNU MP 4.1 or later is required.} + + \item \mod{eng\_ossl}{An engine that uses OpenSSL's BN library to speed up PK + operations. OpenSSL 0.9.7 or later is required.} + + \item \mod{es\_beos}{An entropy source that uses BeOS-specific APIs to gather + (hopefully unpredictable) data from the system.} + + \item \mod{es\_capi}{An entropy source that uses the Win32 CryptoAPI function + \texttt{CryptGenRandom} to gather entropy. Supported on NT4, Win95 + OSR2, and all later Windows systems.} + + \item \mod{es\_egd}{An entropy source that accesses EGD (the entropy + gathering daemon). Common on Unix systems that don't have + \texttt{/dev/random}.} + + \item \mod{es\_ftw}{Gather entropy by reading files from a particular file + tree. 
Usually used with \texttt{/proc}; most other file trees don't + have sufficient variability over time to be useful.} + + \item \mod{es\_unix}{Gather entropy by running various Unix programs, like + \texttt{arp} and \texttt{vmstat}, and reading their output in the + hopes that at least some of it will be unpredictable to an attacker.} + + \item \mod{es\_win32}{Gather entropy by walking through various pieces of + information about processes running on the system. Does not run on + NT4, but should run on all other Win32 systems.} + + \item \mod{fd\_unix}{Let the users of \texttt{Pipe} perform I/O with Unix + file descriptors in addition to \texttt{iostream} objects.} + + \item \mod{ml\_unix}{Add hooks for locking memory into RAM. Usually requires + the application to run as \texttt{root} to actually work, but if the + application is not allowed to call \texttt{mlock}, no harm results.} + + \item \mod{mp\_asm64}{Use inline assembly to access the multiply instruction + available on some 64-bit CPUs. This module only runs on Alpha, AMD64, + IA-64, MIPS64, and PowerPC-64. Typically PKI operations are several + times as fast with this module than without.} + + \item \mod{mux\_pthr}{Add support for using \texttt{pthread} mutexes to + lock internal data structures. Important if you are using threads + with the library.} + + \item \mod{mux\_qt}{Add support for using Qt mutexes to lock internal data + structures.} + + \item \mod{tm\_hard}{Use the contents of the CPU cycle counter when + generating random bits to further randomize the results. 
Works on x86 + (Pentium and up), Alpha, and SPARCv9.} + + \item \mod{tm\_posix}{Use the POSIX realtime clock as a high-resolution + timer.} + + \item \mod{tm\_unix}{Use the traditional Unix \texttt{gettimeofday} as a high + resolution timer.} + + \item \mod{tm\_win32}{Use Win32's \texttt{GetSystemTimeAsFileTime} as a high + resolution timer.} + +\end{list} + +\pagebreak + +\section{Building Applications} + +\subsection{Unix} + +Botan usually links in several different system libraries (such as +\texttt{librt} and \texttt{libz}), depending on which modules are configured at +compile time. In many environments, particularly ones using static libraries, +an application has to link against the same libraries as Botan for the linking +step to succeed. But how does it figure out what libraries it \emph{is} linked +against? + +The answer is to ask the \filename{botan-config} script. This basically solves +the same problem all the other \filename{*-config} scripts solve, and in +basically the same manner. At some point in the future, a transition to +\filename{pkg-config} will be made (as it's less work, and has more features), +but right now it doesn't exist on most Unix systems, while a plain Bourne shell +script will run fine on anything. + +There are 4 options: + +\texttt{--prefix[=DIR]}: If no argument, print the prefix where Botan is +installed (such as \filename{/opt} or \filename{/usr/local}). If an argument is +specified, other options given with the same command will execute as if Botan +was actually installed at \filename{DIR} and not where it really is; or at least +where \filename{botan-config} thinks it really is. I should mention that it + +\texttt{--version}: Print the Botan version number. + +\texttt{--cflags}: Print options that should be passed to the compiler whenever +a C++ file is compiled. Typically this is used for setting include paths. + +\texttt{--libs}: Print options for which libraries to link to (this includes +\texttt{-lbotan}).
+ +Your \filename{Makefile} can run \filename{botan-config} and get the options +necessary for getting your application to compile and link, regardless of +whatever crazy libraries Botan might be linked against. + +\subsection{MS Windows} + +No special help exists for building applications on Windows. However, given +that typically Windows software is distributed as binaries, this is less of a +problem - only the developer needs to worry about it. As long as they can +remember where they installed Botan, they just have to set the appropriate +flags in their Makefile/project file. + +\end{document} diff --git a/doc/credits.txt b/doc/credits.txt new file mode 100644 index 000000000..13e0e7374 --- /dev/null +++ b/doc/credits.txt @@ -0,0 +1,41 @@ + This is the credits file of people that have contributed to Botan. It uses + the same format as the Linux credits file. Please keep it sorted by last + name. + + The fields are: + N - name + E - email + W - web URL + P - PGP keyid/fingerprint + D - Description + S - Snail-mail +---------- + +N: Matthew Gregan +D: Binary file I/O support, allocator fixes + +N: Hany Greiss +D: Windows porting + +N: Matt Johnston +D: Allocator fixes and optimizations, decompressor fixes + +N: Peter J. Jones +D: Bzip2 compression module +S: Colorado, USA + +N: Justin Karneges +D: The Qt-related modules + +N: Jack Lloyd +W: http://www.randombit.net/ +P: 3F69 2E64 6D92 3BBE E7AE 9258 5C0F 96E8 4EC1 6D6B +D: Original author +S: Washington DC, USA + +N: Luca Piccarreta +D: MS Windows mutex module, x86/amd64 assembler +S: Italy diff --git a/doc/deprecated.txt b/doc/deprecated.txt new file mode 100644 index 000000000..d4f0b1d52 --- /dev/null +++ b/doc/deprecated.txt @@ -0,0 +1,6 @@ +The following functions/classes/headers/macros/names are deprecated, and will +be removed in a later release. If there is a suggested replacement API, it is +mentioned here. 
+ +Deprecated APIs / Classes +---------------------------------------------------- diff --git a/doc/examples/Makefile b/doc/examples/Makefile new file mode 100644 index 000000000..351d802da --- /dev/null +++ b/doc/examples/Makefile @@ -0,0 +1,124 @@ +# Assumes Botan was compiled with GCC + +BOTAN_DIR = ../.. + +CXX = g++ +WARNINGS = -ansi -W -Wall +#CXX = icc +#WARNINGS = -w1 + +INCLUDES = `$(BOTAN_DIR)/botan-config --cflags` +LIBS = `$(BOTAN_DIR)/botan-config --libs` +FLAGS = $(INCLUDES) $(WARNINGS) -I$(BOTAN_DIR)/build/include -L$(BOTAN_DIR) + +X509_EX = ca pkcs10 self_sig x509info asn1 +RSA_EX = rsa_kgen rsa_enc rsa_dec +DSA_EX = dsa_kgen dsa_sign dsa_ver +DH_EX = dh +HASH_EX = hash hash_fd hasher hasher2 stack +MISC_EX = base base64 bzip encrypt decrypt fips140 xor_ciph + +PROGS = $(X509_EX) $(RSA_EX) $(DSA_EX) $(DH_EX) $(HASH_EX) $(MISC_EX) + +STRIP = true + +all: $(PROGS) + +clean: + @rm -f $(PROGS) + +asn1: asn1.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +base: base.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +base64: base64.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +bzip: bzip.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +ca: ca.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +decrypt: decrypt.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +dh: dh.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +dsa_kgen: dsa_kgen.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +dsa_sign: dsa_sign.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +dsa_ver: dsa_ver.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +encrypt: encrypt.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +fips140: fips140.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +hash: hash.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +hash_fd: hash_fd.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +hasher: hasher.cpp + $(CXX) $(FLAGS) $? 
$(LIBS) -o $@ + @$(STRIP) $@ + +hasher2: hasher2.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +pkcs10: pkcs10.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +rsa_dec: rsa_dec.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +rsa_enc: rsa_enc.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +rsa_kgen: rsa_kgen.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +self_sig: self_sig.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +stack: stack.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +x509info: x509info.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ + +xor_ciph: xor_ciph.cpp + $(CXX) $(FLAGS) $? $(LIBS) -o $@ + @$(STRIP) $@ diff --git a/doc/examples/asn1.cpp b/doc/examples/asn1.cpp new file mode 100644 index 000000000..91fb995e1 --- /dev/null +++ b/doc/examples/asn1.cpp @@ -0,0 +1,294 @@ +/* + A simple ASN.1 parser, similar to 'dumpasn1' or 'openssl asn1parse', though + without some of the bells and whistles of those. Primarily used for testing + the BER decoder. The output format is modeled loosely on 'asn1parse -i' + + The output is actually less precise than the other decoders named, because + the underlying BER_Decoder hides quite a bit from userspace, such as the use + of indefinite length encodings (and the EOC markers). At some point it will + also hide the constructed string types from the user, but right now you'll + see them as-is. + + Written by Jack Lloyd, November 9-10, 2003 + - Nov 22: Updated to new BER_Object format (tag -> class_tag/type_tag) + - Nov 25: Much improved BIT STRING output + Can deal with non-constructed taggings + Can produce UTF-8 output + + This file is in the public domain. +*/ + +/*******************************************************************/ + +// Set this if your terminal understands UTF-8; otherwise output is in Latin-1 +#define UTF8_TERMINAL 1 + +/* + What level the outermost layer of stuff is at.
Probably 0 or 1; asn1parse + uses 0 as the outermost, while 1 makes more sense to me. 2+ doesn't make + much sense at all. +*/ +#define INITIAL_LEVEL 0 + +/*******************************************************************/ + +#include <botan/botan.h> +#include <botan/ber_dec.h> +#include <botan/asn1_obj.h> +#include <botan/oids.h> +#include <botan/pem.h> +#include <botan/charset.h> +using namespace Botan; + +#include <stdio.h> +#include <ctype.h> + +void decode(BER_Decoder&, u32bit); +void emit(const std::string&, u32bit, u32bit, const std::string& = ""); +std::string type_name(ASN1_Tag); + +int main(int argc, char* argv[]) + { + if(argc != 2) + { + printf("Usage: %s <file>\n", argv[0]); + return 1; + } + + try { + LibraryInitializer init; + + DataSource_Stream in(argv[1]); + + if(!PEM_Code::matches(in)) + { + BER_Decoder decoder(in); + decode(decoder, INITIAL_LEVEL); + } + else + { + std::string label; // ignored + BER_Decoder decoder(PEM_Code::decode(in, label)); + decode(decoder, INITIAL_LEVEL); + } + + } + catch(std::exception& e) + { + printf("%s\n", e.what()); + return 1; + } + return 0; + } + +void decode(BER_Decoder& decoder, u32bit level) + { + BER_Object obj = decoder.get_next_object(); + + while(obj.type_tag != NO_OBJECT) + { + const ASN1_Tag type_tag = obj.type_tag; + const ASN1_Tag class_tag = obj.class_tag; + const u32bit length = obj.value.size(); + + /* hack to insert the tag+length back in front of the stuff now + that we've gotten the type info */ + DER_Encoder encoder; + encoder.add_object(type_tag, class_tag, obj.value, obj.value.size()); + SecureVector<byte> bits = encoder.get_contents(); + + BER_Decoder data(bits); + + if(class_tag & CONSTRUCTED) + { + BER_Decoder cons_info(obj.value); + if(type_tag == SEQUENCE) + { + emit("SEQUENCE", level, length); + decode(cons_info, level+1); + } + else if(type_tag == SET) + { + emit("SET", level, length); + decode(cons_info, level+1); + } + else + { + std::string name; + + if((class_tag & APPLICATION) 
|| (class_tag & CONTEXT_SPECIFIC) || + (class_tag & PRIVATE)) + name = "cons [" + to_string(type_tag) + "]"; + else + name = type_name(type_tag) + " (cons)"; + + emit(name, level, length); + decode(cons_info, level+1); + } + } + else if(class_tag == APPLICATION || class_tag == CONTEXT_SPECIFIC || + class_tag == PRIVATE) + { + bool not_text = false; + + for(u32bit j = 0; j != bits.size(); j++) + if(!isgraph(bits[j]) && !isspace(bits[j])) + not_text = true; + + Pipe pipe(((not_text) ? new Hex_Encoder : 0)); + pipe.process_msg(bits); + emit("[" + to_string(type_tag) + "]", level, length, + pipe.read_all_as_string()); + } + else if(type_tag == OBJECT_ID) + { + OID oid; + BER::decode(data, oid); + emit(type_name(type_tag), level, length, OIDS::lookup(oid)); + } + else if(type_tag == INTEGER) + { + BigInt number; + BER::decode(data, number); + + SecureVector<byte> rep; + + /* If it's small, it's probably a number, not a hash */ + if(number.bits() <= 16) + rep = BigInt::encode(number, BigInt::Decimal); + else + rep = BigInt::encode(number, BigInt::Hexadecimal); + + std::string str; + for(u32bit j = 0; j != rep.size(); j++) + str += (char)rep[j]; + + emit(type_name(type_tag), level, length, str); + } + else if(type_tag == BOOLEAN) + { + bool boolean; + BER::decode(data, boolean); + emit(type_name(type_tag), + level, length, (boolean ? "true" : "false")); + } + else if(type_tag == NULL_TAG) + { + emit(type_name(type_tag), level, length); + } + else if(type_tag == OCTET_STRING) + { + SecureVector<byte> bits; + BER::decode(data, bits, type_tag); + bool not_text = false; + + for(u32bit j = 0; j != bits.size(); j++) + if(!isgraph(bits[j]) && !isspace(bits[j])) + not_text = true; + + Pipe pipe(((not_text) ? 
new Hex_Encoder : 0)); + pipe.process_msg(bits); + emit(type_name(type_tag), level, length, pipe.read_all_as_string()); + } + else if(type_tag == BIT_STRING) + { + SecureVector<byte> bits; + BER::decode(data, bits, type_tag); + + std::vector<bool> bit_set; + + for(u32bit j = 0; j != bits.size(); j++) + for(u32bit k = 0; k != 8; k++) + bit_set.push_back((bool)((bits[bits.size()-j-1] >> (7-k)) & 1)); + + std::string bit_str; + for(u32bit j = 0; j != bit_set.size(); j++) + { + bool the_bit = bit_set[bit_set.size()-j-1]; + + if(!the_bit && bit_str.size() == 0) + continue; + bit_str += (the_bit ? "1" : "0"); + } + + emit(type_name(type_tag), level, length, bit_str); + } + else if(type_tag == PRINTABLE_STRING || + type_tag == NUMERIC_STRING || + type_tag == IA5_STRING || + type_tag == T61_STRING || + type_tag == VISIBLE_STRING || + type_tag == UTF8_STRING || + type_tag == BMP_STRING) + { + ASN1_String str; + BER::decode(data, str); + if(UTF8_TERMINAL) + emit(type_name(type_tag), level, length, iso2utf(str.iso_8859())); + else + emit(type_name(type_tag), level, length, str.iso_8859()); + } + else if(type_tag == UTC_TIME || type_tag == GENERALIZED_TIME) + { + X509_Time time; + BER::decode(data, time); + emit(type_name(type_tag), level, length, time.readable_string()); + } + else + fprintf(stderr, "Unknown tag: class=%02X, type=%02X\n", + class_tag, type_tag); + + obj = decoder.get_next_object(); + } + } + +void emit(const std::string& type, u32bit level, u32bit length, + const std::string& value) + { + const u32bit LIMIT = 128; + const u32bit BIN_LIMIT = 64; + + int written = 0; + written += printf(" d=%2d, l=%4d: ", level, length); + for(u32bit j = INITIAL_LEVEL; j != level; j++) + written += printf(" "); + written += printf("%s ", type.c_str()); + + bool should_skip = false; + if(value.length() > LIMIT) should_skip = true; + if((type == "OCTET STRING" || type == "BIT STRING") && + value.length() > BIN_LIMIT) + should_skip = true; + + if(value != "" && !should_skip) + { + 
if(written % 2 == 0) printf(" "); + while(written < 50) written += printf(" "); + printf(":%s\n", value.c_str()); + } + else + printf("\n"); + } + +std::string type_name(ASN1_Tag type) + { + if(type == PRINTABLE_STRING) return "PRINTABLE STRING"; + if(type == NUMERIC_STRING) return "NUMERIC STRING"; + if(type == IA5_STRING) return "IA5 STRING"; + if(type == T61_STRING) return "T61 STRING"; + if(type == UTF8_STRING) return "UTF8 STRING"; + if(type == VISIBLE_STRING) return "VISIBLE STRING"; + if(type == BMP_STRING) return "BMP STRING"; + + if(type == UTC_TIME) return "UTC TIME"; + if(type == GENERALIZED_TIME) return "GENERALIZED TIME"; + + if(type == OCTET_STRING) return "OCTET STRING"; + if(type == BIT_STRING) return "BIT STRING"; + + if(type == INTEGER) return "INTEGER"; + if(type == NULL_TAG) return "NULL"; + if(type == OBJECT_ID) return "OBJECT"; + if(type == BOOLEAN) return "BOOLEAN"; + return "(UNKNOWN)"; + } diff --git a/doc/examples/base.cpp b/doc/examples/base.cpp new file mode 100644 index 000000000..5a1328ccf --- /dev/null +++ b/doc/examples/base.cpp @@ -0,0 +1,30 @@ +/* + A simple template for Botan applications, showing startup, etc +*/ +#include <botan/botan.h> +using namespace Botan; + +/* This is how you can do compile-time version checking */ +/* +#if BOTAN_VERSION_CODE < BOTAN_VERSION_CODE_FOR(1,3,9) + #error Your Botan installation is too old; upgrade to 1.3.9 or later +#endif +*/ + +#include <iostream> + +int main() + { + try { + /* Put it inside the try block so exceptions at startup/shutdown will + get caught. 
+ */ + LibraryInitializer init; + } + catch(std::exception& e) + { + std::cout << e.what() << std::endl; + return 1; + } + return 0; + } diff --git a/doc/examples/base64.cpp b/doc/examples/base64.cpp new file mode 100644 index 000000000..c1260b8f4 --- /dev/null +++ b/doc/examples/base64.cpp @@ -0,0 +1,81 @@ +/* +A Botan example application which emulates a poorly written version of +"uuencode -m" + +Written by Jack Lloyd ([email protected]), in maybe an hour scattered +over 2000/2001 + +This file is in the public domain +*/ +#include <fstream> +#include <iostream> +#include <string> +#include <vector> +#include <cstring> +#include <botan/botan.h> + +int main(int argc, char* argv[]) + { + if(argc < 2) + { + std::cout << "Usage: " << argv[0] << " [-w] [-c n] [-e|-d] files...\n" + " -e : Encode input to base64 strings (default) \n" + " -d : Decode base64 input\n" + " -w : Wrap lines\n" + " -c n: Wrap lines at column n, default 78\n"; + return 1; + } + + Botan::LibraryInitializer init; + + int column = 78; + bool wrap = false; + bool encoding = true; + std::vector<std::string> files; + + for(int j = 1; argv[j] != 0; j++) + { + std::string this_arg = argv[j]; + + if(this_arg == "-w") + wrap = true; + else if(this_arg == "-e"); + else if(this_arg == "-d") + encoding = false; + else if(this_arg == "-c") + { + if(argv[j+1]) + { column = atoi(argv[j+1]); j++; } + else + { + std::cout << "No argument for -c option" << std::endl; + return 1; + } + } + else files.push_back(argv[j]); + } + + for(unsigned int j = 0; j != files.size(); j++) + { + std::istream* stream; + if(files[j] == "-") stream = &std::cin; + else stream = new std::ifstream(files[j].c_str()); + + if(!*stream) + { + std::cout << "ERROR, couldn't open " << files[j] << std::endl; + continue; + } + + Botan::Pipe pipe((encoding) ?
+ ((Botan::Filter*)new Botan::Base64_Encoder(wrap, column)) : + ((Botan::Filter*)new Botan::Base64_Decoder)); + pipe.start_msg(); + *stream >> pipe; + pipe.end_msg(); + pipe.set_default_msg(j); + std::cout << pipe; + if(files[j] != "-") delete stream; + } + return 0; + } diff --git a/doc/examples/bzip.cpp b/doc/examples/bzip.cpp new file mode 100644 index 000000000..46ac8abce --- /dev/null +++ b/doc/examples/bzip.cpp @@ -0,0 +1,102 @@ +/* +A Botan example application which emulates a poorly written version of bzip2 + +Written by Jack Lloyd ([email protected]), Jun 9, 2001 + +This file is in the public domain +*/ +#include <string> +#include <cstring> +#include <vector> +#include <fstream> +#include <iostream> +#include <botan/botan.h> + +#if defined(BOTAN_EXT_COMPRESSOR_BZIP2) + #include <botan/bzip2.h> +#else + #error "You didn't compile the bzip module into Botan" +#endif + +const std::string SUFFIX = ".bz2"; + +int main(int argc, char* argv[]) + { + if(argc < 2) + { + std::cout << "Usage: " << argv[0] + << " [-s] [-d] [-1...9] <filenames>" << std::endl; + return 1; + } + + Botan::LibraryInitializer init; + + std::vector<std::string> files; + bool decompress = false, small = false; + int level = 9; + + for(int j = 1; argv[j] != 0; j++) + { + if(std::strcmp(argv[j], "-d") == 0) { decompress = true; continue; } + if(std::strcmp(argv[j], "-s") == 0) { small = true; continue; } + if(std::strcmp(argv[j], "-1") == 0) { level = 1; continue; } + if(std::strcmp(argv[j], "-2") == 0) { level = 2; continue; } + if(std::strcmp(argv[j], "-3") == 0) { level = 3; continue; } + if(std::strcmp(argv[j], "-4") == 0) { level = 4; continue; } + if(std::strcmp(argv[j], "-5") == 0) { level = 5; continue; } + if(std::strcmp(argv[j], "-6") == 0) { level = 6; continue; } + if(std::strcmp(argv[j], "-7") == 0) { level = 7; continue; } + if(std::strcmp(argv[j], "-8") == 0) { level = 8; continue; } + if(std::strcmp(argv[j], "-9") == 0) { level = 9; continue; } + files.push_back(argv[j]); + }
+ + try { + + Botan::Filter* bzip; + if(decompress) + bzip = new Botan::Bzip_Decompression(small); + else + bzip = new Botan::Bzip_Compression(level); + + Botan::Pipe pipe(bzip); + + for(unsigned int j = 0; j != files.size(); j++) + { + std::string infile = files[j], outfile = files[j]; + if(!decompress) + outfile = outfile += SUFFIX; + else + outfile = outfile.replace(outfile.find(SUFFIX), + SUFFIX.length(), ""); + + std::ifstream in(infile.c_str()); + std::ofstream out(outfile.c_str()); + if(!in) + { + std::cout << "ERROR: could not read " << infile << std::endl; + continue; + } + if(!out) + { + std::cout << "ERROR: could not write " << outfile << std::endl; + continue; + } + + pipe.start_msg(); + in >> pipe; + pipe.end_msg(); + pipe.set_default_msg(j); + out << pipe; + + in.close(); + out.close(); + } + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + return 1; + } + return 0; + } diff --git a/doc/examples/ca.cpp b/doc/examples/ca.cpp new file mode 100644 index 000000000..fbc637bbc --- /dev/null +++ b/doc/examples/ca.cpp @@ -0,0 +1,65 @@ +/* + Implement the functionality of a simple CA: read in a CA certificate, + the associated private key, and a PKCS #10 certificate request. Sign the + request and print out the new certificate. + + File names are hardcoded for simplicity. + cacert.pem: The CA's certificate (perhaps created by self_sig) + caprivate.pem: The CA's private key + req.pem: The user's PKCS #10 certificate request + + Written by Jack Lloyd, May 19, 2003 + + This file is in the public domain. 
+*/ + +#include <botan/botan.h> +#include <botan/x509_ca.h> +using namespace Botan; + +#include <iostream> + +#define DOUCH_BAG CESSATION_OF_OPERATION + +int main(int argc, char* argv[]) + { + if(argc != 2) + { + std::cout << "Usage: " << argv[0] << " passphrase" << std::endl; + return 1; + } + + try { + LibraryInitializer init; + + // set up our CA + X509_Certificate ca_cert("cacert.pem"); + std::auto_ptr<PKCS8_PrivateKey> privkey( + PKCS8::load_key("caprivate.pem", argv[1]) + ); + X509_CA ca(ca_cert, *privkey); + + // got a request + PKCS10_Request req("req.pem"); + + // presumably attempt to verify the req for sanity/accuracy here, but + // as Verisign, etc have shown, that's not a must. :) + + // now sign it + X509_Certificate new_cert = ca.sign_request(req); + + // send the new cert back to the requestor + std::cout << new_cert.PEM_encode(); + + std::vector<CRL_Entry> revoked_certs; + revoked_certs.push_back(CRL_Entry(new_cert, DOUCH_BAG)); + X509_CRL crl = ca.update_crl(ca.new_crl(), revoked_certs); + std::cout << crl.PEM_encode(); + } + catch(std::exception& e) + { + std::cout << e.what() << std::endl; + return 1; + } + return 0; + } diff --git a/doc/examples/decrypt.cpp b/doc/examples/decrypt.cpp new file mode 100644 index 000000000..84490cb1b --- /dev/null +++ b/doc/examples/decrypt.cpp @@ -0,0 +1,158 @@ +/* +Decrypt files encrypted with the 'encrypt' example application. + +I'm being lazy and writing the output to stdout rather than stripping off the +".enc" suffix and writing it there. So all diagnostics go to stderr so there is +no confusion. 
+ +Written by Jack Lloyd ([email protected]) on August 5, 2002 + +This file is in the public domain +*/ +#include <fstream> +#include <iostream> +#include <string> +#include <vector> +#include <cstring> + +#include <botan/botan.h> + +#if defined(BOTAN_EXT_COMPRESSOR_ZLIB) + #include <botan/zlib.h> +#else + #error "You didn't compile the zlib module into Botan" +#endif + +using namespace Botan; + +SecureVector<byte> b64_decode(const std::string&); + +int main(int argc, char* argv[]) + { + if(argc < 2) + { + std::cout << "Usage: " << argv[0] << " [-p passphrase] file\n" + << " -p : Use this passphrase to decrypt\n"; + return 1; + } + + std::string filename, passphrase; + + for(int j = 1; argv[j] != 0; j++) + { + if(std::strcmp(argv[j], "-p") == 0) + { + if(argv[j+1]) + { + passphrase = argv[j+1]; + j++; + } + else + { + std::cout << "No argument for -p option" << std::endl; + return 1; + } + } + else + { + if(filename != "") + { + std::cout << "You can only specify one file at a time\n"; + return 1; + } + filename = argv[j]; + } + } + + if(passphrase == "") + { + std::cout << "You have to specify a passphrase!" 
<< std::endl; + return 1; + } + + std::ifstream in(filename.c_str()); + if(!in) + { + std::cout << "ERROR: couldn't open " << filename << std::endl; + return 1; + } + + std::string algo; + + try { + + LibraryInitializer init; + + std::string header, salt_str, mac_str; + std::getline(in, header); + std::getline(in, algo); + std::getline(in, salt_str); + std::getline(in, mac_str); + + if(header != "-------- ENCRYPTED FILE --------") + { + std::cout << "ERROR: File is missing the usual header" << std::endl; + return 1; + } + + if(!have_block_cipher(algo)) + { + std::cout << "Don't know about the block cipher \"" << algo << "\"\n"; + return 1; + } + + const u32bit key_len = max_keylength_of(algo); + const u32bit iv_len = block_size_of(algo); + + std::auto_ptr<S2K> s2k(get_s2k("PBKDF2(SHA-1)")); + s2k->set_iterations(8192); + s2k->change_salt(b64_decode(salt_str)); + + SymmetricKey bc_key = s2k->derive_key(key_len, "BLK" + passphrase); + InitializationVector iv = s2k->derive_key(iv_len, "IVL" + passphrase); + SymmetricKey mac_key = s2k->derive_key(16, "MAC" + passphrase); + + Pipe pipe(new Base64_Decoder, + get_cipher(algo + "/CBC", bc_key, iv, DECRYPTION), + new Zlib_Decompression, + new Fork( + 0, + new Chain(new MAC_Filter("HMAC(SHA-1)", mac_key), + new Base64_Encoder) + ) + ); + + pipe.start_msg(); + in >> pipe; + pipe.end_msg(); + + std::string our_mac = pipe.read_all_as_string(1); + if(our_mac != mac_str) + std::cout << "WARNING: MAC in message failed to verify\n"; + + std::cout << pipe.read_all_as_string(0); + } + catch(Algorithm_Not_Found) + { + std::cout << "Don't know about the block cipher \"" << algo << "\"\n"; + return 1; + } + catch(Decoding_Error) + { + std::cout << "Bad passphrase or corrupt file\n"; + return 1; + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + return 1; + } + return 0; + } + +SecureVector<byte> b64_decode(const std::string& in) + { + Pipe pipe(new Base64_Decoder); + pipe.process_msg(in); 
+ return pipe.read_all(); + } diff --git a/doc/examples/dh.cpp b/doc/examples/dh.cpp new file mode 100644 index 000000000..8dcc93f88 --- /dev/null +++ b/doc/examples/dh.cpp @@ -0,0 +1,54 @@ +/* + A simple DH example + + Written by Jack Lloyd ([email protected]), on December 24, 2003 + + This file is in the public domain +*/ +#include <botan/botan.h> +#include <botan/dh.h> +using namespace Botan; + +#include <iostream> + +int main() + { + try { + LibraryInitializer init; + + // Alice creates a DH key and sends (the public part) to Bob + DH_PrivateKey private_a(DL_Group("modp/ietf/1024")); + DH_PublicKey public_a = private_a; // Bob gets this + + // Bob creates a key with a matching group + DH_PrivateKey private_b(public_a.get_domain()); + + // Bob sends the key back to Alice + DH_PublicKey public_b = private_b; // Alice gets this + + // Both of them create a key using their private key and the other's + // public key + SymmetricKey alice_key = private_a.derive_key(public_b); + SymmetricKey bob_key = private_b.derive_key(public_a); + + if(alice_key == bob_key) + { + std::cout << "The two keys matched, everything worked\n"; + std::cout << "The shared key was: " << alice_key.as_string() << "\n"; + } + else + { + std::cout << "The two keys didn't match!\n"; + std::cout << "Alice's key was: " << alice_key.as_string() << "\n"; + std::cout << "Bob's key was: " << bob_key.as_string() << "\n"; + } + + // Now Alice and Bob hash the key and use it for something + } + catch(std::exception& e) + { + std::cout << e.what() << std::endl; + return 1; + } + return 0; + } diff --git a/doc/examples/dsa_kgen.cpp b/doc/examples/dsa_kgen.cpp new file mode 100644 index 000000000..eb3a00393 --- /dev/null +++ b/doc/examples/dsa_kgen.cpp @@ -0,0 +1,57 @@ +/* +Generate a 1024 bit DSA key and put it into a file. The public key format is +that specified by X.509, while the private key format is PKCS #8. + +The domain parameters are the ones specified as the Java default DSA +parameters. 
There is nothing special about these, it's just the only 1024-bit +DSA parameter set that's included in Botan at the time of this writing. The +application always reads/writes all of the domain parameters to/from the file, +so a new set could be used without any problems. We could generate a new set +for each key, or read a set of DSA params from a file and use those, but they +mostly seem like needless complications. + +Written by Jack Lloyd ([email protected]), August 5, 2002 + Updated to use X.509 and PKCS #8 formats, October 21, 2002 + +This file is in the public domain +*/ + +#include <iostream> +#include <fstream> +#include <string> +#include <botan/botan.h> +#include <botan/dsa.h> +using namespace Botan; + +int main(int argc, char* argv[]) + { + if(argc != 2) + { + std::cout << "Usage: " << argv[0] << " passphrase" << std::endl; + return 1; + } + + std::string passphrase(argv[1]); + + std::ofstream priv("dsapriv.pem"); + std::ofstream pub("dsapub.pem"); + if(!priv || !pub) + { + std::cout << "Couldn't write output files" << std::endl; + return 1; + } + + try { + LibraryInitializer init; + + DSA_PrivateKey key(DL_Group("dsa/jce/1024")); + + pub << X509::PEM_encode(key); + priv << PKCS8::PEM_encode(key, passphrase); + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + } + return 0; + } diff --git a/doc/examples/dsa_sign.cpp b/doc/examples/dsa_sign.cpp new file mode 100644 index 000000000..26f9e9dac --- /dev/null +++ b/doc/examples/dsa_sign.cpp @@ -0,0 +1,79 @@ +/* +Decrypt an encrypted DSA private key. Then use that key to sign a message. 
+ +Written by Jack Lloyd ([email protected]), August 5, 2002 + Updated to use X.509 and PKCS #8 format keys, October 21, 2002 + +This file is in the public domain +*/ + +#include <iostream> +#include <iomanip> +#include <fstream> +#include <string> +#include <memory> + +#include <botan/botan.h> +#include <botan/look_pk.h> +#include <botan/dsa.h> +using namespace Botan; + +const std::string SUFFIX = ".sig"; + +int main(int argc, char* argv[]) + { + if(argc != 4) + { + std::cout << "Usage: " << argv[0] << " keyfile messagefile passphrase" + << std::endl; + return 1; + } + + try { + std::string passphrase(argv[3]); + + std::ifstream message(argv[2]); + if(!message) + { + std::cout << "Couldn't read the message file." << std::endl; + return 1; + } + + std::string outfile = argv[2] + SUFFIX; + std::ofstream sigfile(outfile.c_str()); + if(!sigfile) + { + std::cout << "Couldn't write the signature to " + << outfile << std::endl; + return 1; + } + + LibraryInitializer init; + + std::auto_ptr<PKCS8_PrivateKey> key( + PKCS8::load_key(argv[1], passphrase) + ); + + DSA_PrivateKey* dsakey = dynamic_cast<DSA_PrivateKey*>(key.get()); + + if(!dsakey) + { + std::cout << "The loaded key is not a DSA key!\n"; + return 1; + } + + Pipe pipe(new PK_Signer_Filter(get_pk_signer(*dsakey, "EMSA1(SHA-1)")), + new Base64_Encoder); + + pipe.start_msg(); + message >> pipe; + pipe.end_msg(); + + sigfile << pipe.read_all_as_string() << std::endl; + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + } + return 0; + } diff --git a/doc/examples/dsa_ver.cpp b/doc/examples/dsa_ver.cpp new file mode 100644 index 000000000..fb6eb7079 --- /dev/null +++ b/doc/examples/dsa_ver.cpp @@ -0,0 +1,94 @@ +/* +Grab a DSA public key from the file given as an argument, grab a signature +from another file, and verify the message (which, surprise, is also in a file). + +The signature format isn't particularly standard, but it's not bad.
It's simply +the IEEE 1363 signature format, encoded into base64 with a trailing newline + +Written by Jack Lloyd ([email protected]), August 5, 2002 + Updated to use X.509 format keys, October 21, 2002 + +This file is in the public domain +*/ + +#include <iostream> +#include <iomanip> +#include <fstream> +#include <cstdlib> +#include <string> + +#include <botan/botan.h> +#include <botan/look_pk.h> +#include <botan/dsa.h> +using namespace Botan; + +SecureVector<byte> b64_decode(const std::string& in) + { + Pipe pipe(new Base64_Decoder); + pipe.process_msg(in); + return pipe.read_all(); + } + +int main(int argc, char* argv[]) + { + if(argc != 4) + { + std::cout << "Usage: " << argv[0] + << " keyfile messagefile sigfile" << std::endl; + return 1; + } + + std::ifstream message(argv[2]); + if(!message) + { + std::cout << "Couldn't read the message file." << std::endl; + return 1; + } + + std::ifstream sigfile(argv[3]); + if(!sigfile) + { + std::cout << "Couldn't read the signature file." << std::endl; + return 1; + } + + try { + std::string sigstr; + getline(sigfile, sigstr); + + LibraryInitializer init; + + std::auto_ptr<X509_PublicKey> key(X509::load_key(argv[1])); + DSA_PublicKey* dsakey = dynamic_cast<DSA_PublicKey*>(key.get()); + if(!dsakey) + { + std::cout << "The loaded key is not a DSA key!\n"; + return 1; + } + + SecureVector<byte> sig = b64_decode(sigstr); + + Pipe pipe(new PK_Verifier_Filter( + get_pk_verifier(*dsakey, "EMSA1(SHA-1)"), sig + ) + ); + + pipe.start_msg(); + message >> pipe; + pipe.end_msg(); + + byte result = 0; + pipe.read(result); + + if(result) + std::cout << "Signature verified\n"; + else + std::cout << "Signature did NOT verify\n"; + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + return 1; + } + return 0; + } diff --git a/doc/examples/encrypt.cpp b/doc/examples/encrypt.cpp new file mode 100644 index 000000000..c2cf2c5ba --- /dev/null +++ b/doc/examples/encrypt.cpp @@ -0,0 +1,175 @@ +/* 
+Encrypt a file using a block cipher in CBC mode. Compresses the plaintext +with Zlib, MACs with HMAC(SHA-1). Stores the block cipher used in the file, +so you don't have to specify it when decrypting. + +What a real application would do (and what this example should do), is test for +the presence of the Zlib module, and use it only if it's available. Then add +some marker to the stream so the other side knows whether or not the plaintext +was compressed. Bonus points for supporting multiple compression schemes. + +Another flaw is that it stores the entire ciphertext in memory, so if the file +you're encrypting is 1 GB... you better have a lot of RAM. + +Based on the base64 example, of all things + +Written by Jack Lloyd ([email protected]) on August 5, 2002 + +This file is in the public domain +*/ +#include <fstream> +#include <iostream> +#include <string> +#include <vector> +#include <cstring> + +#include <botan/botan.h> + +#if defined(BOTAN_EXT_COMPRESSOR_ZLIB) + #include <botan/zlib.h> +#else + #error "You didn't compile the zlib module into Botan" +#endif + +using namespace Botan; + +std::string b64_encode(const SecureVector<byte>&); + +int main(int argc, char* argv[]) + { + if(argc < 2) + { + std::cout << "Usage: " << argv[0] << " [-c algo] -p passphrase file\n" + " -p : Use this passphrase to encrypt\n" + " -c : Encrypt with block cipher 'algo' (default 3DES)\n"; + return 1; + } + + std::string algo = "TripleDES"; + std::string filename, passphrase; + + // Holy hell, argument processing is a PITA + for(int j = 1; argv[j] != 0; j++) + { + if(std::strcmp(argv[j], "-c") == 0) + { + if(argv[j+1]) + { + algo = argv[j+1]; + j++; + } + else + { + std::cout << "No argument for -c option" << std::endl; + return 1; + } + } + else if(std::strcmp(argv[j], "-p") == 0) + { + if(argv[j+1]) + { + passphrase = argv[j+1]; + j++; + } + else + { + std::cout << "No argument for -p option" << std::endl; + return 1; + } + } + else + { + if(filename != "") + { + std::cout << "You 
can only specify one file at a time\n"; + return 1; + } + filename = argv[j]; + } + } + + if(passphrase == "") + { + std::cout << "You have to specify a passphrase!" << std::endl; + return 1; + } + + std::ifstream in(filename.c_str()); + if(!in) + { + std::cout << "ERROR: couldn't open " << filename << std::endl; + return 1; + } + + std::string outfile = filename + ".enc"; + std::ofstream out(outfile.c_str()); + if(!out) + { + std::cout << "ERROR: couldn't open " << outfile << std::endl; + return 1; + } + + try { + + LibraryInitializer init; + + if(!have_block_cipher(algo)) + { + std::cout << "Don't know about the block cipher \"" << algo << "\"\n"; + return 1; + } + + const u32bit key_len = max_keylength_of(algo); + const u32bit iv_len = block_size_of(algo); + + std::auto_ptr<S2K> s2k(get_s2k("PBKDF2(SHA-1)")); + s2k->set_iterations(8192); + s2k->new_random_salt(8); + + SymmetricKey bc_key = s2k->derive_key(key_len, "BLK" + passphrase); + InitializationVector iv = s2k->derive_key(iv_len, "IVL" + passphrase); + SymmetricKey mac_key = s2k->derive_key(16, "MAC" + passphrase); + + // Just to be all fancy we even write a (simple) header. 
+ out << "-------- ENCRYPTED FILE --------" << std::endl; + out << algo << std::endl; + out << b64_encode(s2k->current_salt()) << std::endl; + + Pipe pipe(new Fork( + new Chain(new MAC_Filter("HMAC(SHA-1)", mac_key), + new Base64_Encoder + ), + new Chain(new Zlib_Compression, + get_cipher(algo + "/CBC", bc_key, iv, ENCRYPTION), + new Base64_Encoder(true) + ) + ) + ); + + pipe.start_msg(); + in >> pipe; + pipe.end_msg(); + + out << pipe.read_all_as_string(0) << std::endl; + out << pipe.read_all_as_string(1); + + } + catch(Algorithm_Not_Found) + { + std::cout << "Don't know about the block cipher \"" << algo << "\"\n"; + return 1; + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + return 1; + } + return 0; + } + +std::string b64_encode(const SecureVector<byte>& in) + { + Pipe pipe(new Base64_Encoder); + pipe.process_msg(in); + return pipe.read_all_as_string(); + } diff --git a/doc/examples/fips140.cpp b/doc/examples/fips140.cpp new file mode 100644 index 000000000..46e0db4b0 --- /dev/null +++ b/doc/examples/fips140.cpp @@ -0,0 +1,59 @@ +/* + A minimal FIPS-140 application. + + Written by Jack Lloyd ([email protected]), on December 16-19, 2003 + + This file is in the public domain +*/ + +#include <botan/botan.h> +#include <botan/fips140.h> +using namespace Botan; + +#include <iostream> +#include <fstream> + +int main(int, char* argv[]) + { + const std::string EDC_SUFFIX = ".edc"; + + try { + LibraryInitializer init; /* automatically does startup self tests */ + + // you can also do self tests on demand, like this: + if(!FIPS140::passes_self_tests()) + throw Self_Test_Failure("FIPS-140 startup tests"); + + /* + Here, we just check argv[0] and assume that it works. You can use + various extremely nonportable APIs on some Unices (dladdr, to name one) + to find out the real name (I presume there are similarly hairy ways of + doing it on Windows). 
We then assume the EDC (Error Detection Code, aka + a hash) is stored in argv[0].edc + + Remember: argv[0] can be easily spoofed. Don't trust it for real. + + You can also do various nasty things and find out the path of the + shared library you are linked with, and check that hash. + */ + std::string exe_path = argv[0]; + std::string edc_path = exe_path + EDC_SUFFIX; + std::ifstream edc_file(edc_path.c_str()); + std::string edc; + std::getline(edc_file, edc); + + std::cout << "Our EDC is " << edc << std::endl; + + bool good = FIPS140::good_edc(exe_path, edc); + + if(good) + std::cout << "Our EDC matches" << std::endl; + else + std::cout << "Our EDC is bad" << std::endl; + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + } + return 0; + } diff --git a/doc/examples/hash.cpp b/doc/examples/hash.cpp new file mode 100644 index 000000000..a97cd6082 --- /dev/null +++ b/doc/examples/hash.cpp @@ -0,0 +1,64 @@ +/* +Prints the message digest of files, using an arbitrary hash function +chosen by the user. This is less flexible than I might like, for example: + ./hash sha1 some_file [or md5 or sha-1 or ripemd160 or ...] +will not work, cause the name lookup is case-sensitive. Oh well... 
+ +Written by Jack Lloyd ([email protected]), on August 4, 2002 + - December 16, 2003: "Fixed" to accept "sha1" or "md5" as a hash name + +This file is in the public domain +*/ + +#include <iostream> +#include <fstream> +#include <botan/botan.h> + +int main(int argc, char* argv[]) + { + if(argc < 3) + { + std::cout << "Usage: " << argv[0] << " digest <filenames>" << std::endl; + return 1; + } + + Botan::LibraryInitializer init; + + std::string hash = argv[1]; + /* a couple of special cases, kind of a crock */ + if(hash == "sha1") hash = "SHA-1"; + if(hash == "md5") hash = "MD5"; + + try { + if(!Botan::have_hash(hash)) + { + std::cout << "Unknown hash \"" << argv[1] << "\"" << std::endl; + return 1; + } + + Botan::Pipe pipe(new Botan::Hash_Filter(hash), + new Botan::Hex_Encoder); + + int skipped = 0; + for(int j = 2; argv[j] != 0; j++) + { + std::ifstream file(argv[j]); + if(!file) + { + std::cout << "ERROR: could not open " << argv[j] << std::endl; + skipped++; + continue; + } + pipe.start_msg(); + file >> pipe; + pipe.end_msg(); + pipe.set_default_msg(j-2-skipped); + std::cout << pipe << " " << argv[j] << std::endl; + } + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + } + return 0; + } diff --git a/doc/examples/hash_fd.cpp b/doc/examples/hash_fd.cpp new file mode 100644 index 000000000..19d744287 --- /dev/null +++ b/doc/examples/hash_fd.cpp @@ -0,0 +1,70 @@ +/* +Written by Jack Lloyd ([email protected]), on Prickle-Prickle, +the 10th of Bureaucracy, 3167. + +This file is in the public domain + +This is just like the normal hash application, but uses the Unix I/O system +calls instead of C++ iostreams. Previously, this version was much faster and +smaller, but GCC 3.1's libstdc++ seems to have been improved enough that the +difference is now fairly minimal. + +Nicely enough, doing the change required changing only about 3 lines of code. + +Note that this requires you to be on a machine running some sort of Unix. 
Well, +I guess any POSIX.1 compliant OS (in theory). +*/ + +#include <iostream> +#include <botan/botan.h> + +#if !defined(BOTAN_EXT_PIPE_UNIXFD_IO) + #error "You didn't compile the pipe_unixfd module into Botan" +#endif + +#include <fcntl.h> +#include <unistd.h> + +int main(int argc, char* argv[]) + { + if(argc < 3) + { + std::cout << "Usage: " << argv[0] << " digest <filenames>" << std::endl; + return 1; + } + + Botan::LibraryInitializer init; + + try { + Botan::Pipe pipe(new Botan::Hash_Filter(argv[1]), + new Botan::Hex_Encoder); + + int skipped = 0; + for(int j = 2; argv[j] != 0; j++) + { + int file = open(argv[j], O_RDONLY); + if(file == -1) + { + std::cout << "ERROR: could not open " << argv[j] << std::endl; + skipped++; + continue; + } + pipe.start_msg(); + file >> pipe; + pipe.end_msg(); + close(file); + pipe.set_default_msg(j-2-skipped); + std::cout << pipe << " " << argv[j] << std::endl; + } + } + catch(Botan::Algorithm_Not_Found) + { + std::cout << "Don't know about the hash function \"" << argv[1] << "\"" + << std::endl; + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + } + return 0; + } diff --git a/doc/examples/hasher.cpp b/doc/examples/hasher.cpp new file mode 100644 index 000000000..5ba982fc0 --- /dev/null +++ b/doc/examples/hasher.cpp @@ -0,0 +1,58 @@ +/* +A Botan example application which emulates a +poorly written version of "gpg --print-md" + +Written by Jack Lloyd ([email protected]), quite a while ago (as of June +2001) + +This file is in the public domain +*/ +#include <fstream> +#include <iostream> +#include <string> +#include <botan/botan.h> + +int main(int argc, char* argv[]) + { + if(argc < 2) + { + std::cout << "Usage: " << argv[0] << " <filenames>" << std::endl; + return 1; + } + + Botan::LibraryInitializer init; + + const int COUNT = 3; + std::string name[COUNT] = { "MD5", "SHA-1", "RIPEMD-160" }; + + for(int j = 1; argv[j] != 0; j++) + { + Botan::Filter* hash[COUNT] = { + new 
Botan::Chain(new Botan::Hash_Filter(name[0]), + new Botan::Hex_Encoder), + new Botan::Chain(new Botan::Hash_Filter(name[1]), + new Botan::Hex_Encoder), + new Botan::Chain(new Botan::Hash_Filter(name[2]), + new Botan::Hex_Encoder) + }; + + Botan::Pipe pipe(new Botan::Fork(hash, COUNT)); + + std::ifstream file(argv[j]); + if(!file) + { + std::cout << "ERROR: could not open " << argv[j] << std::endl; + continue; + } + pipe.start_msg(); + file >> pipe; + pipe.end_msg(); + file.close(); + for(int k = 0; k != COUNT; k++) + { + pipe.set_default_msg(k); + std::cout << name[k] << "(" << argv[j] << ") = " << pipe << std::endl; + } + } + return 0; + } diff --git a/doc/examples/hasher2.cpp b/doc/examples/hasher2.cpp new file mode 100644 index 000000000..12d3c853d --- /dev/null +++ b/doc/examples/hasher2.cpp @@ -0,0 +1,72 @@ +/* +Identical to hasher.cpp, but uses Pipe in a different way. + +Note this tends to be much less efficient than hasher.cpp, because it does +three passes over the file. For a small file, it doesn't really matter. But for +a large file, or for something you can't re-read easily (socket, stdin, ...) +this is a bad idea. 
+ +Written by Jack Lloyd ([email protected]), Feb 8 2001 + +This file is in the public domain +*/ +#include <fstream> +#include <iostream> +#include <string> +#include <botan/botan.h> + +int main(int argc, char* argv[]) + { + if(argc < 2) + { + std::cout << "Usage: " << argv[0] << " <filenames>" << std::endl; + return 1; + } + + Botan::LibraryInitializer init; + + const int COUNT = 3; + std::string name[COUNT] = { "MD5", "SHA-1", "RIPEMD-160" }; + + Botan::Pipe pipe; + + int skipped = 0; + for(int j = 1; argv[j] != 0; j++) + { + Botan::Filter* hash[COUNT] = { + new Botan::Hash_Filter(name[0]), + new Botan::Hash_Filter(name[1]), + new Botan::Hash_Filter(name[2]), + }; + + std::ifstream file(argv[j]); + if(!file) + { + std::cout << "ERROR: could not open " << argv[j] << std::endl; + skipped++; + continue; + } + for(int k = 0; k != COUNT; k++) + { + pipe.reset(); + pipe.append(hash[k]); + pipe.append(new Botan::Hex_Encoder); + pipe.start_msg(); + + // trickiness: the >> op reads until EOF, but seekg won't work + // unless we're in the "good" state (which EOF is not). + file.clear(); + file.seekg(0, std::ios::beg); + file >> pipe; + pipe.end_msg(); + } + file.close(); + for(int k = 0; k != COUNT; k++) + { + std::string out = pipe.read_all_as_string(COUNT*(j-1-skipped) + k); + std::cout << name[k] << "(" << argv[j] << ") = " << out << std::endl; + } + } + + return 0; + } diff --git a/doc/examples/pkcs10.cpp b/doc/examples/pkcs10.cpp new file mode 100644 index 000000000..4639353df --- /dev/null +++ b/doc/examples/pkcs10.cpp @@ -0,0 +1,68 @@ +/* +Generate a 1024 bit RSA key, and then create a PKCS #10 certificate request for +that key. The private key will be stored as an encrypted PKCS #8 object, and +stored in another file. 
+ +Written by Jack Lloyd ([email protected]), April 7, 2003 + +This file is in the public domain +*/ +#include <botan/init.h> +#include <botan/x509self.h> +#include <botan/rsa.h> +#include <botan/dsa.h> +using namespace Botan; + +#include <iostream> +#include <fstream> + +int main(int argc, char* argv[]) + { + if(argc != 6) + { + std::cout << "Usage: " << argv[0] << + " passphrase name country_code organization email" << std::endl; + return 1; + } + + try { + LibraryInitializer init; + + RSA_PrivateKey priv_key(1024); + // If you want a DSA key instead of RSA, comment out the above line and + // uncomment this one: + //DSA_PrivateKey priv_key(DL_Group("dsa/jce/1024")); + + std::ofstream key_file("private.pem"); + key_file << PKCS8::PEM_encode(priv_key, argv[1]); + + X509_Cert_Options opts; + + opts.common_name = argv[2]; + opts.country = argv[3]; + opts.organization = argv[4]; + opts.email = argv[5]; + + /* Some hard-coded options, just to give you an idea of what's there */ + opts.challenge = "a fixed challenge passphrase"; + opts.locality = "Baltimore"; + opts.state = "MD"; + opts.org_unit = "Testing"; + opts.add_ex_constraint("PKIX.ClientAuth"); + opts.add_ex_constraint("PKIX.IPsecUser"); + opts.add_ex_constraint("PKIX.EmailProtection"); + + opts.xmpp = "[email protected]"; + + PKCS10_Request req = X509::create_cert_req(opts, priv_key); + + std::ofstream req_file("req.pem"); + req_file << req.PEM_encode(); + } + catch(std::exception& e) + { + std::cout << e.what() << std::endl; + return 1; + } + return 0; + } diff --git a/doc/examples/readme.txt b/doc/examples/readme.txt new file mode 100644 index 000000000..48686db71 --- /dev/null +++ b/doc/examples/readme.txt @@ -0,0 +1,77 @@ +This directory contains some simple example applications for the Botan crypto +library. If you want to see something a bit more complicated, check out the +stuff in the checks/ directory. 
Both it and the files in this directory are in +the public domain, and you may do with them however you please. + +The makefile assumes that you built the library with g++; you'll have to change +it if this assumption proves incorrect. + +Some of these examples will not build on all configurations of the library, +particularly 'bzip', 'encrypt', 'decrypt', and 'hash_fd', as they require +various extensions. + +The examples are fairly small (50-150 lines). And that's with argument +processing, I/O, error checking, etc (which counts for 40% or more of most of +them). This is partially to make them easy to understand, and partially because +I'm lazy. For the most part, the examples cover the stuff a 'regular' +application might need. + +Feel free to contribute new examples. You too can gain fame and fortune by +writing example apps for obscure libraries! + +The examples are: + +* X.509 examples +-------- +ca: A (very) simple CA application + +x509info: Prints some information about an X.509 certificate + +pkcs10: Generates a PKCS #10 certificate request for a 1024 bit RSA key + +self_sig: Generates a self-signed X.509v3 certificate with a 1024 bit RSA key +-------- + +* RSA examples (also uses X.509, PKCS #8, block ciphers, MACs, S2K algorithms) +-------- +rsa_kgen: Generate an RSA key, encrypt the private key with a passphrase, + output the keys to a pair of files +rsa_enc: Take a public key (generated by rsa_kgen) and encrypt a file + using CAST-128, MAC it with HMAC(SHA-1) +rsa_dec: Decrypt a file encrypted by rsa_enc + +* DSA examples (also uses X.509, PKCS #8) +-------- +dsa_kgen: Generates a DSA key, encrypts the private key with a passphrase + and stores it in PKCS #8 format. +dsa_sign: Produce a DSA signature for a file. Uses SHA-1 +dsa_ver: Verify a message signed with dsa_sign + +* Encryption examples +-------- +encrypt: Encrypt a file in CBC mode with a block cipher of your choice. Adds + a MAC for authentication, and compresses the plaintext with Zlib. 
+ +decrypt: Decrypt the result of 'encrypt' + +xor_ciph: Shows how to add a new algorithm from application code + +* Hash function examples (also shows different methods of using Pipe) +-------- +hash: Print digests of files, using any chosen hash function + +hash_fd: Same as hash, except that it uses Unix file I/O. Requires the + pipe_unixfd extension + +hasher: Print MD5, SHA-1, and RIPEMD-160 digests of files + +hasher2: Same as hasher, just shows an alternate method + +stack: A demonstration of some more advanced Pipe functionality. Prints + MD5 hashes + +* Misc examples +-------- +base64: Simple base64 encoding/decoding tool + +bzip: Bzip2 compression/decompression. diff --git a/doc/examples/rsa_dec.cpp b/doc/examples/rsa_dec.cpp new file mode 100644 index 000000000..d50f6781a --- /dev/null +++ b/doc/examples/rsa_dec.cpp @@ -0,0 +1,122 @@ +/* +Decrypt an encrypted RSA private key. Then use that key to decrypt a +message. This program can decrypt messages generated by rsa_enc, and uses the +same key format as that generated by rsa_kgen. 
+ +Written by Jack Lloyd ([email protected]), June 3-5, 2002 + +This file is in the public domain +*/ + +#include <iostream> +#include <fstream> +#include <string> + +#include <botan/botan.h> +#include <botan/look_pk.h> // for get_kdf +#include <botan/rsa.h> +using namespace Botan; + +SecureVector<byte> b64_decode(const std::string&); +SymmetricKey derive_key(const std::string&, const SymmetricKey&, u32bit); + +const std::string SUFFIX = ".enc"; + +int main(int argc, char* argv[]) + { + if(argc != 4) + { + std::cout << "Usage: " << argv[0] << " keyfile messagefile passphrase" + << std::endl; + return 1; + } + + try { + + LibraryInitializer init; + + std::auto_ptr<PKCS8_PrivateKey> key(PKCS8::load_key(argv[1], argv[3])); + RSA_PrivateKey* rsakey = dynamic_cast<RSA_PrivateKey*>(key.get()); + if(!rsakey) + { + std::cout << "The loaded key is not a RSA key!\n"; + return 1; + } + + std::ifstream message(argv[2]); + if(!message) + { + std::cout << "Couldn't read the message file." << std::endl; + return 1; + } + + std::string outfile(argv[2]); + outfile = outfile.replace(outfile.find(SUFFIX), SUFFIX.length(), ""); + + std::ofstream plaintext(outfile.c_str()); + if(!plaintext) + { + std::cout << "Couldn't write the plaintext to " + << outfile << std::endl; + return 1; + } + + std::string enc_masterkey_str; + std::getline(message, enc_masterkey_str); + std::string mac_str; + std::getline(message, mac_str); + + SecureVector<byte> enc_masterkey = b64_decode(enc_masterkey_str); + + std::auto_ptr<PK_Decryptor> decryptor(get_pk_decryptor(*rsakey, + "EME1(SHA-1)")); + SecureVector<byte> masterkey = decryptor->decrypt(enc_masterkey); + + SymmetricKey cast_key = derive_key("CAST", masterkey, 16); + InitializationVector iv = derive_key("IV", masterkey, 8); + SymmetricKey mac_key = derive_key("MAC", masterkey, 16); + + Pipe pipe(new Base64_Decoder, + get_cipher("CAST-128/CBC/PKCS7", cast_key, iv, DECRYPTION), + new Fork( + 0, + new Chain( + new MAC_Filter("HMAC(SHA-1)", mac_key, 
12), + new Base64_Encoder + ) + ) + ); + + pipe.start_msg(); + message >> pipe; + pipe.end_msg(); + + std::string our_mac = pipe.read_all_as_string(1); + + if(our_mac != mac_str) + std::cout << "WARNING: MAC in message failed to verify\n"; + + plaintext << pipe.read_all_as_string(0); + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + return 1; + } + return 0; + } + +SecureVector<byte> b64_decode(const std::string& in) + { + Pipe pipe(new Base64_Decoder); + pipe.process_msg(in); + return pipe.read_all(); + } + +SymmetricKey derive_key(const std::string& param, + const SymmetricKey& masterkey, + u32bit outputlength) + { + std::auto_ptr<KDF> kdf(get_kdf("KDF2(SHA-1)")); + return kdf->derive_key(outputlength, masterkey.bits_of(), param); + } diff --git a/doc/examples/rsa_enc.cpp b/doc/examples/rsa_enc.cpp new file mode 100644 index 000000000..49b62989c --- /dev/null +++ b/doc/examples/rsa_enc.cpp @@ -0,0 +1,149 @@ +/* + Grab an RSA public key from the file given as an argument, grab a message + from another file, and encrypt the message. + + Algorithms used: + RSA with EME1(SHA-1) padding to encrypt the master key + CAST-128 in CBC mode with PKCS#7 padding to encrypt the message. + HMAC with SHA-1 is used to authenticate the message + + The keys+IV used are derived from the master key (the thing that's encrypted + with RSA) using KDF2(SHA-1). The 3 outputs of KDF2 are parameterized by P, + where P is "CAST", "IV" or "MAC", in order to make each key/IV unique. + + The format is: + 1) First line is the master key, encrypted with the recipient's public key + using EME1(SHA-1), and then base64 encoded. + 2) Second line is the first 96 bits (12 bytes) of the HMAC(SHA-1) of + the _plaintext_ + 3) Following lines are base64 encoded ciphertext (CAST-128 as described), + each broken after ~72 characters. 
+ +Written by Jack Lloyd ([email protected]), June 3, 2002 + Updated to use KDF2, September 8, 2002 + Updated to read X.509 keys, October 21, 2002 + +This file is in the public domain +*/ + +#include <iostream> +#include <fstream> +#include <string> +#include <memory> + +#include <botan/botan.h> +#include <botan/look_pk.h> +#include <botan/rsa.h> +using namespace Botan; + +std::string b64_encode(const SecureVector<byte>&); +SymmetricKey derive_key(const std::string&, const SymmetricKey&, u32bit); + +int main(int argc, char* argv[]) + { + if(argc != 3) + { + std::cout << "Usage: " << argv[0] << " keyfile messagefile" << std::endl; + return 1; + } + + std::ifstream message(argv[2]); + if(!message) + { + std::cout << "Couldn't read the message file." << std::endl; + return 1; + } + + std::string output_name(argv[2]); + output_name += ".enc"; + std::ofstream ciphertext(output_name.c_str()); + if(!ciphertext) + { + std::cout << "Couldn't write the ciphertext to " << output_name + << std::endl; + return 1; + } + + try { + + LibraryInitializer init; + + std::auto_ptr<X509_PublicKey> key(X509::load_key(argv[1])); + RSA_PublicKey* rsakey = dynamic_cast<RSA_PublicKey*>(key.get()); + if(!rsakey) + { + std::cout << "The loaded key is not a RSA key!\n"; + return 1; + } + + std::auto_ptr<PK_Encryptor> encryptor(get_pk_encryptor(*rsakey, + "EME1(SHA-1)")); + + /* Generate the master key (the other keys are derived from this) + + Basically, make the key as large as can be encrypted by this key, up + to a limit of 256 bits. For 512 bit keys, the master key will be >160 + bits. A >600 bit key will use the full 256 bit master key. + + In theory, this is not enough, because we derive 16+16+8=40 bytes of + secrets (if you include the IV) using the master key, so they are not + statistically independent. Practically speaking I don't think this is + a problem. 
+ */ + SymmetricKey masterkey(std::min(32U, encryptor->maximum_input_size())); + + SymmetricKey cast_key = derive_key("CAST", masterkey, 16); + SymmetricKey mac_key = derive_key("MAC", masterkey, 16); + SymmetricKey iv = derive_key("IV", masterkey, 8); + + SecureVector<byte> encrypted_key = + encryptor->encrypt(masterkey.bits_of()); + + ciphertext << b64_encode(encrypted_key) << std::endl; + + Pipe pipe(new Fork( + new Chain( + get_cipher("CAST-128/CBC/PKCS7", cast_key, iv, + ENCRYPTION), + new Base64_Encoder(true) // true == do linebreaking + ), + new Chain( + new MAC_Filter("HMAC(SHA-1)", mac_key, 12), + new Base64_Encoder + ) + ) + ); + + pipe.start_msg(); + message >> pipe; + pipe.end_msg(); + + /* Write the MAC as the second line. That way we can pull it off right + from the start, and feed the rest of the file right into a pipe on the + decrypting end. + */ + + ciphertext << pipe.read_all_as_string(1) << std::endl; + ciphertext << pipe.read_all_as_string(0); + } + catch(std::exception& e) + { + std::cout << "Exception: " << e.what() << std::endl; + } + return 0; + } + +std::string b64_encode(const SecureVector<byte>& in) + { + Pipe pipe(new Base64_Encoder); + pipe.process_msg(in); + return pipe.read_all_as_string(); + } + +SymmetricKey derive_key(const std::string& param, + const SymmetricKey& masterkey, + u32bit outputlength) + { + std::auto_ptr<KDF> kdf(get_kdf("KDF2(SHA-1)")); + return kdf->derive_key(outputlength, masterkey.bits_of(), param); + } diff --git a/doc/examples/rsa_kgen.cpp b/doc/examples/rsa_kgen.cpp new file mode 100644 index 000000000..f2601101e --- /dev/null +++ b/doc/examples/rsa_kgen.cpp @@ -0,0 +1,55 @@ +/* +Generate an RSA key of a specified bitlength, and put it into a pair of key +files. One is the public key in X.509 format (PEM encoded), the private key is +in PKCS #8 format (also PEM encoded). 
+ +Written by Jack Lloyd ([email protected]), June 2-3, 2002 + Updated to use X.509 and PKCS #8 on October 21, 2002 + +This file is in the public domain +*/ + +#include <iostream> +#include <fstream> +#include <string> +#include <botan/botan.h> +#include <botan/rsa.h> +using namespace Botan; + +int main(int argc, char* argv[]) + { + if(argc != 3) + { + std::cout << "Usage: " << argv[0] << " bitsize passphrase" << std::endl; + return 1; + } + + u32bit bits = std::atoi(argv[1]); + if(bits < 512 || bits > 4096) + { + std::cout << "Invalid argument for bitsize" << std::endl; + return 1; + } + + std::string passphrase(argv[2]); + + std::ofstream pub("rsapub.pem"); + std::ofstream priv("rsapriv.pem"); + if(!priv || !pub) + { + std::cout << "Couldn't write output files" << std::endl; + return 1; + } + + try { + LibraryInitializer init; + RSA_PrivateKey key(bits); + pub << X509::PEM_encode(key); + priv << PKCS8::PEM_encode(key, passphrase); + } + catch(std::exception& e) + { + std::cout << "Exception caught: " << e.what() << std::endl; + } + return 0; + } diff --git a/doc/examples/self_sig.cpp b/doc/examples/self_sig.cpp new file mode 100644 index 000000000..c117ff3aa --- /dev/null +++ b/doc/examples/self_sig.cpp @@ -0,0 +1,76 @@ +/* +Generate a 1024 bit RSA key, and then create a self-signed X.509v3 certificate +with that key. If the do_CA variable is set to true, then it will be marked for +CA use, otherwise it will get extensions appropriate for use with a client +certificate. The private key is stored as an encrypted PKCS #8 object in +another file. 
+ +Written by Jack Lloyd ([email protected]), April 7, 2003 + +This file is in the public domain +*/ +#include <botan/botan.h> +#include <botan/x509self.h> +#include <botan/rsa.h> +#include <botan/dsa.h> +using namespace Botan; + +#include <iostream> +#include <fstream> + +int main(int argc, char* argv[]) + { + if(argc != 7) + { + std::cout << "Usage: " << argv[0] + << " passphrase [CA|user] name country_code organization email" + << std::endl; + return 1; + } + + LibraryInitializer init; + + std::string CA_flag = argv[2]; + bool do_CA = false; + + if(CA_flag == "CA") do_CA = true; + else if(CA_flag == "user") do_CA = false; + else + { + std::cout << "Bad flag for CA/user switch: " << CA_flag << std::endl; + return 1; + } + + try { + RSA_PrivateKey key(1024); + //DSA_PrivateKey key(DL_Group("dsa/jce/1024")); + + std::ofstream priv_key("private.pem"); + priv_key << PKCS8::PEM_encode(key, argv[1]); + + X509_Cert_Options opts; + + opts.common_name = argv[3]; + opts.country = argv[4]; + opts.organization = argv[5]; + opts.email = argv[6]; + /* Fill in other values of opts here */ + + //opts.xmpp = "[email protected]"; + + if(do_CA) + opts.CA_key(); + + X509_Certificate cert = X509::create_self_signed_cert(opts, key); + + std::ofstream cert_file("cert.pem"); + cert_file << cert.PEM_encode(); + } + catch(std::exception& e) + { + std::cout << "Exception: " << e.what() << std::endl; + return 1; + } + + return 0; + } diff --git a/doc/examples/stack.cpp b/doc/examples/stack.cpp new file mode 100644 index 000000000..1522b05f5 --- /dev/null +++ b/doc/examples/stack.cpp @@ -0,0 +1,86 @@ +/* +A Botan example application showing how to use the pop and prepend functions +of Pipe. Based on the md5 example. Its output should always be identical to +that of the md5 example.
+ +Written by Jack Lloyd ([email protected]), Feb 3, 2002 + +This file is in the public domain +*/ + +#include <iostream> +#include <fstream> +#include <botan/botan.h> + +int main(int argc, char* argv[]) + { + if(argc < 2) + { + std::cout << "Usage: " << argv[0] << " <filenames>" << std::endl; + return 1; + } + + Botan::LibraryInitializer init; + + // this is a pretty vacuous example, but it's useful as a test + Botan::Pipe pipe; + + // CPS == Current Pipe Status, ie what Filters are set up + + pipe.prepend(new Botan::Hash_Filter("MD5")); + // CPS: MD5 + + pipe.prepend(new Botan::Hash_Filter("RIPEMD-160")); + // CPS: RIPEMD-160 | MD5 + + pipe.prepend(new Botan::Chain( + new Botan::Hash_Filter("RIPEMD-160"), + new Botan::Hash_Filter("RIPEMD-160"))); + // CPS: (RIPEMD-160 | RIPEMD-160) | RIPEMD-160 | MD5 + + pipe.pop(); // will pop everything inside the Chain as well as Chain itself + // CPS: RIPEMD-160 | MD5 + + pipe.pop(); // will get rid of the RIPEMD-160 Hash_Filter + // CPS: MD5 + + pipe.prepend(new Botan::Hash_Filter("SHA-1")); + // CPS: SHA-1 | MD5 + + pipe.append(new Botan::Hex_Encoder); + // CPS: SHA-1 | MD5 | Hex_Encoder + + pipe.prepend(new Botan::Hash_Filter("SHA-1")); + // CPS: SHA-1 | SHA-1 | MD5 | Hex_Encoder + + pipe.pop(); // Get rid of the Hash_Filter(SHA-1) + pipe.pop(); // Get rid of the other Hash_Filter(SHA-1) + // CPS: MD5 | Hex_Encoder + // The Hex_Encoder is safe because it is at the end of the Pipe, + // and pop() pulls off the Filter that is at the start. 
+ + pipe.prepend(new Botan::Hash_Filter("RIPEMD-160")); + // CPS: RIPEMD-160 | MD5 | Hex_Encoder + + pipe.pop(); // Get rid of that last prepended Hash_Filter(RIPEMD-160) + // CPS: MD5 | Hex_Encoder + + int skipped = 0; + for(int j = 1; argv[j] != 0; j++) + { + std::ifstream file(argv[j]); + if(!file) + { + std::cout << "ERROR: could not open " << argv[j] << std::endl; + skipped++; + continue; + } + pipe.start_msg(); + file >> pipe; + pipe.end_msg(); + file.close(); + pipe.set_default_msg(j-1-skipped); + std::cout << pipe << " " << argv[j] << std::endl; + } + return 0; + } diff --git a/doc/examples/x509info.cpp b/doc/examples/x509info.cpp new file mode 100644 index 000000000..cbb9c0a11 --- /dev/null +++ b/doc/examples/x509info.cpp @@ -0,0 +1,142 @@ +/* + Read an X.509 certificate, and print various things about it + + Written by Jack Lloyd, March 23 2003 + - October 31, 2003: Prints the public key + - November 1, 2003: Removed the -d flag; it can tell automatically now + + This file is in the public domain +*/ +#include <botan/botan.h> +#include <botan/x509cert.h> +#include <botan/oids.h> +using namespace Botan; + +#include <iostream> + +std::string to_hex(const SecureVector<byte>& bin) + { + Pipe pipe(new Hex_Encoder); + pipe.process_msg(bin); + if(pipe.remaining()) + return pipe.read_all_as_string(); + else + return "(none)"; + } + +void do_print(const std::string& what, const std::string& val) + { + if(val == "") + return; + + std::cout << " " << what << ": " << val << std::endl; + } + +void do_subject(const X509_Certificate& cert, const std::string& what) + { + do_print(what, cert.subject_info(what)); + } + +void do_issuer(const X509_Certificate& cert, const std::string& what) + { + do_print(what, cert.issuer_info(what)); + } + +int main(int argc, char* argv[]) + { + if(argc != 2) + { + std::cout << "Usage: " << argv[0] << " <x509cert>\n"; + return 1; + } + + try { + LibraryInitializer init; + + X509_Certificate cert(argv[1]); + + std::cout << "Version: " << 
cert.x509_version() << std::endl; + + std::cout << "Subject" << std::endl; + do_subject(cert, "Name"); + do_subject(cert, "Email"); + do_subject(cert, "Organization"); + do_subject(cert, "Organizational Unit"); + do_subject(cert, "Locality"); + do_subject(cert, "State"); + do_subject(cert, "Country"); + do_subject(cert, "PKIX.XMPPAddr"); + + std::cout << "Issuer" << std::endl; + do_issuer(cert, "Name"); + do_issuer(cert, "Email"); + do_issuer(cert, "Organization"); + do_issuer(cert, "Organizational Unit"); + do_issuer(cert, "Locality"); + do_issuer(cert, "State"); + do_issuer(cert, "Country"); + + std::cout << "Validity" << std::endl; + + std::cout << " Not before: " << cert.start_time() << std::endl; + std::cout << " Not after: " << cert.end_time() << std::endl; + + std::cout << "Constraints" << std::endl; + Key_Constraints constraints = cert.constraints(); + if(constraints == NO_CONSTRAINTS) + std::cout << "No constraints" << std::endl; + else + { + if(constraints & DIGITAL_SIGNATURE) + std::cout << " Digital Signature\n"; + if(constraints & NON_REPUDIATION) + std::cout << " Non-Repuidation\n"; + if(constraints & KEY_ENCIPHERMENT) + std::cout << " Key Encipherment\n"; + if(constraints & DATA_ENCIPHERMENT) + std::cout << " Data Encipherment\n"; + if(constraints & KEY_AGREEMENT) + std::cout << " Key Agreement\n"; + if(constraints & KEY_CERT_SIGN) + std::cout << " Cert Sign\n"; + if(constraints & CRL_SIGN) + std::cout << " CRL Sign\n"; + } + + std::vector<OID> policies = cert.policies(); + if(policies.size()) + { + std::cout << "Policies: " << std::endl; + for(u32bit j = 0; j != policies.size(); j++) + std::cout << " " << OIDS::lookup(policies[j]) << std::endl; + } + + std::vector<OID> ex_constraints = cert.ex_constraints(); + if(ex_constraints.size()) + { + std::cout << "Extended Constraints: " << std::endl; + for(u32bit j = 0; j != ex_constraints.size(); j++) + std::cout << " " << OIDS::lookup(ex_constraints[j]) << std::endl; + } + + std::cout << "Signature 
algorithm: " << + OIDS::lookup(cert.signature_algorithm().oid) << std::endl; + + std::cout << "Serial: " + << to_hex(cert.serial_number()) << std::endl; + std::cout << "Authority keyid: " + << to_hex(cert.authority_key_id()) << std::endl; + std::cout << "Subject keyid: " + << to_hex(cert.subject_key_id()) << std::endl; + + X509_PublicKey* pubkey = cert.subject_public_key(); + std::cout << "Public Key:\n" << X509::PEM_encode(*pubkey); + delete pubkey; + } + catch(std::exception& e) + { + std::cout << e.what() << std::endl; + return 1; + } + return 0; + } diff --git a/doc/examples/xor_ciph.cpp b/doc/examples/xor_ciph.cpp new file mode 100644 index 000000000..b57fbfc4d --- /dev/null +++ b/doc/examples/xor_ciph.cpp @@ -0,0 +1,95 @@ +/* + An implementation of the highly secure (not) XOR cipher. AKA, how to write + and use your own cipher object. DO NOT make up your own ciphers. Please. + + Written by Jack Lloyd ([email protected]) on Feb 17, 2004 + + This file is in the public domain +*/ +#include <botan/base.h> +#include <botan/init.h> +using namespace Botan; + +class XOR_Cipher : public StreamCipher + { + public: + // what we want to call this cipher + std::string name() const { return "XOR"; } + // return a new object of this type + StreamCipher* clone() const { return new XOR_Cipher; } + // StreamCipher() can take a number of args, which are: + // min keylen, max keylen, keylength mod, iv size + // In this case we just pass min keylen, which means the + // only keysize we support is 1 byte, and don't use an IV. 
+ XOR_Cipher() : StreamCipher(1) { mask = 0; } + private: + void cipher(const byte[], byte[], u32bit); + void key(const byte[], u32bit); + byte mask; + }; + +void XOR_Cipher::cipher(const byte in[], byte out[], u32bit length) + { + for(u32bit j = 0; j != length; j++) + out[j] = in[j] ^ mask; + } + +void XOR_Cipher::key(const byte key[], u32bit) + { + /* We know length == 1 because it is checked in set_key against the + constraints we passed to StreamCipher's constructor. In this case, + we said: "All keys are of length 1 byte and no other length". + + An exercise for the reader would be to extend this to support + keys of arbitrary length (anywhere in the range 1-32 bytes). + */ + mask = key[0]; + } + +#include <fstream> +#include <iostream> +#include <string> +#include <vector> +#include <cstring> + +#include <botan/look_add.h> +#include <botan/lookup.h> +#include <botan/filters.h> + +int main() + { + LibraryInitializer init; + + add_algorithm(new XOR_Cipher); // make it available to use + add_alias("Vernam", "XOR"); // make Vernam an alias for XOR + + SymmetricKey key("42"); // a key of length 1, value hex 42 == dec 66 + + /* + Since stream ciphers are typically additive, the encryption and + decryption ops are the same, so this isn't terribly interesting. + + If this were a block cipher you would have to add a cipher mode and + padding method, such as "/CBC/PKCS7". + */ + Pipe enc(get_cipher("XOR", key, ENCRYPTION), new Hex_Encoder); + Pipe dec(new Hex_Decoder, get_cipher("Vernam", key, DECRYPTION)); + + // I think the pigeons are actually asleep at midnight...
+ std::string secret = "The pigeon flys at midnight."; + + std::cout << "The secret message is '" << secret << "'" << std::endl; + + enc.process_msg(secret); + std::string cipher = enc.read_all_as_string(); + + std::cout << "The encrypted secret message is " << cipher << std::endl; + + dec.process_msg(cipher); + secret = dec.read_all_as_string(); + + std::cout << "The decrypted secret message is '" + << secret << "'" << std::endl; + + return 0; + } diff --git a/doc/fips140.tex b/doc/fips140.tex new file mode 100644 index 000000000..8b2004508 --- /dev/null +++ b/doc/fips140.tex @@ -0,0 +1,156 @@ +\documentclass{article} + +\setlength{\textwidth}{6.5in} +\setlength{\textheight}{9in} + +\setlength{\headheight}{0in} +\setlength{\topmargin}{0in} +\setlength{\headsep}{0in} + +\setlength{\oddsidemargin}{0in} +\setlength{\evensidemargin}{0in} + +\title{\textbf{Botan FIPS 140-2 Security Policy}} +\author{Jack Lloyd \\ + \texttt{[email protected]}} +\date{} + +\newcommand{\filename}[1]{\texttt{#1}} +\newcommand{\module}[1]{\texttt{#1}} + +\newcommand{\type}[1]{\texttt{#1}} +\newcommand{\function}[1]{\textbf{#1}} +\newcommand{\macro}[1]{\texttt{#1}} + +\begin{document} + +\maketitle + +\tableofcontents + +\parskip=5pt +%\baselineskip=15pt + +\pagebreak + +\section{Introduction} + +\emph{Note that this is a draft, and almost certainly does not comply with what +FIPS 140-2 wants (also it's incomplete). In any case, there is no way for me to +afford paying the validation lab, so this is all theoretical.} + +\emph{I would welcome comments from people who are familiar with the FIPS 140 +process. I am currently basing this off a few dozen other security policies and +the FIPS itself.} + +\subsection{Purpose} + +This document is a security policy for the Botan C++ crypto library for use in +a FIPS 140-2 Level 1 validation process. It describes how to configure and use +the library to comply with the requirements of FIPS 140-2. 
+ +This document is non-proprietary, and may be freely reproduced and distributed +in unmodified form. + +\subsection{Product Description} + +The Botan C++ crypto library (hereafter ``Botan'' or ``the library'') is an +open source C++ class library providing a general-purpose interface to a wide +variety of cryptographic algorithms and formats (such as X.509v3 and PKCS +\#10). It runs on most Win32 and POSIX-like systems, including Windows +NT/2000/XP, MacOS X, Linux, Solaris, FreeBSD, and QNX. However, only versions +running on \emph{(goal:)} Windows XP, Linux, and Solaris have been validated by +FIPS 140-2 at this time. + +\subsection{Algorithms} + +The library contains the following FIPS Approved algorithms: RSA, DSA, DES, +TripleDES, Skipjack, AES, SHA-1, HMAC, the X9.19 DES MAC, and the FIPS 186-2 +SHA-1 RNG. Other (non-Approved) algorithms, such as MD5 and Diffie-Hellman, are +also included. + +\section{Initialization} + +Certain tests are only performed if the flag ``fips140'' is passed as part of +the initialization process to the library (the argument to +\type{LibraryInitializer} or \function{Init::initialize}). Known answer tests +and key generation self-checks for RSA and DSA are always performed, regardless +of this setting. This flag must be passed by any application which desires +using the FIPS 140 mode of operation. + +\section{Roles and Services} + +Botan supports two roles, the User and the Crypto Officer. Authentication is +not performed by the module; all authentication is implicitly done by the +operating system. + +\subsection{User Role} + +The user has the ability to access the services of the module. This role is +implicitly selected whenever the module's services are accessed. + +\subsection{Crypto Officer Role} + +The crypto officer has all of the powers of the user, and in addition has the +power to install and uninstall the module and to configure the operating +system. 
This role is implicitly selected whenever these actions are performed. + +\section{Key Management} + +\subsection{Key Import/Export} + +Symmetric keys can be imported and exported in unencrypted, encrypted, +or split-knowledge form, as the application desires. Private keys for +asymmetric algorithms can be imported and exported as either encrypted or +unencrypted PKCS \#8 structures. The library natively supports PKCS \#5 +encryption with TripleDES for encrypting private keys. + +\subsection{Key Storage} + +In no case does the library itself import or export keys from/to an external +storage device; all such operations are done explicitly by the application. It +is the responsibility of the operator to ensure that any such operations comply +with the requirements of FIPS 140-2 Level 1. + +\subsection{Key Generation} + +Keys for symmetric algorithms (such as DES, AES, and HMAC) are generated with an +Approved RNG: a random byte string of the appropriate size is generated and +used as the key. + +DSA keys are generated as specified in FIPS 186-2 (or not?). RSA keys are +generated as specified in ANSI X9.31 (\emph{I think...}). Diffie-Hellman keys +are generated in a manner compatible with ANSI X9.42. All newly created DSA and +RSA keys are checked with a pairwise consistency test before being returned to +the caller. A pairwise consistency check can be performed on any RSA, DSA, or +Diffie-Hellman key by calling the \function{check\_key} member function with +an argument of \type{true}. + +\subsection{Key Establishment} + +Botan supports using RSA or Diffie-Hellman to establish keys. RSA can be used +with PKCS \#1 v1.5 or OAEP padding. None of these methods are FIPS Approved, +but Annex D of FIPS 140-2 allows for their use until such time as a FIPS +Approved asymmetric key establishment method is established.
+ +\subsection{Key Protection / Zeroization} + +Keys are protected against external access by the operating system's memory and +process protection mechanisms. If the library is used by multiple processes at +once, the OS virtual memory mechanisms ensure that each process will have its +own data space (and thus, keys are not shared among multiple processes). + +All keys and other sensitive materials are zeroed in memory before being +released to the system. + +On Windows systems the \function{VirtualLock} system call is used to notify the +operating system that the memory containing potentially sensitive keying +material must not be swapped to disk, preventing an attacker from applying disk +forensics techniques to recover data. + +On Unix systems, Botan allocates memory from file-backed memory mappings, which +are thoroughly erased when the memory is freed. + +\section{References} + +\end{document} diff --git a/doc/license.txt b/doc/license.txt new file mode 100644 index 000000000..28fb5f587 --- /dev/null +++ b/doc/license.txt @@ -0,0 +1,23 @@ +Copyright (C) 1999-2006 The Botan Project. All rights reserved. + +Redistribution and use in source and binary forms, for any use, with or without +modification, is permitted provided that the following conditions are met: + +1. Redistributions of source code must retain the above copyright notice, this +list of conditions, and the following disclaimer. + +2. Redistributions in binary form must reproduce the above copyright notice, +this list of conditions, and the following disclaimer in the documentation +and/or other materials provided with the distribution. + +THIS SOFTWARE IS PROVIDED BY THE AUTHOR(S) "AS IS" AND ANY EXPRESS OR IMPLIED +WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF +MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ARE DISCLAIMED.
+ +IN NO EVENT SHALL THE AUTHOR(S) OR CONTRIBUTOR(S) BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, +BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF +LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE +OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF +ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. diff --git a/doc/log.txt b/doc/log.txt new file mode 100644 index 000000000..0505b6da2 --- /dev/null +++ b/doc/log.txt @@ -0,0 +1,87 @@ + +* 1.5.6, March 1, 2006 + - The low-level DER/BER coding system was redesigned and rewritten + - Portions of the certificate code were cleaned up internally + - Use macros to substantially clean up the GCC assembly code + - Added 32-bit x86 assembly for Visual C++ (by Luca Piccarreta) + - Avoid a couple of spurious warnings under Visual C++ + - Some slight cleanups in X509_PublicKey::key_id + +* 1.5.5, February 4, 2006 + - Fixed a potential infinite loop in the memory pool code (Matt Johnston) + - Made Pooling_Allocator::Memory_Block an actual class of sorts + - Some small optimizations to the division and modulo computations + - Cleaned up the implementation of some of the BigInt operators + - Reduced use of dynamic memory allocation in low-level BigInt functions + - A few simplifications in the Randpool mixing function + - Removed power(), as it was not particularly useful (or fast) + - Fixed some annoying bugs in the benchmark code + - Added a real credits file + +* 1.5.4, January 29, 2006 + - Integrated x86 and amd64 assembly code, contributed by Luca Piccarreta + - Fixed a memory access off-by-one in the Karatsuba code + - Changed Pooling_Allocator's free list search to a log(N) algorithm + - Merged ModularReducer with its only subclass, Barrett_Reducer + - Fixed sign-handling bugs in some of the division and modulo 
code + - Renamed the module description files to modinfo.txt + - Further cleanups in the initialization code + - Removed BigInt::add and BigInt::sub + - Merged all the division-related functions into just divide() + - Modified the <mp_asmi.h> functions to allow for better optimizations + - Made the number of bits polled from an EntropySource user configurable + - Avoid including <algorithm> in <botan/secmem.h> + - Fixed some build problems with Sun Forte + - Removed some dead code from bigint_modop + - Fix the definition of same_mem + +* 1.5.3, January 24, 2006 + - Many optimizations in the low-level multiple precision integer code + - Added hooks for assembly implementations of the MPI code + - Support for the X.509 issuer alternative name extension in new certs + - Fixed a bug in the decompression modules; found and patched by Matt Johnston + - New Windows mutex module (mux_win32), by Luca Piccarreta + - Changed the Windows timer module to use QueryPerformanceCounter + - mem_pool.cpp was using std::set iterators instead of std::multiset ones + - Fixed a bug in X509_CA preventing users from disabling particular extensions + - Fixed the mp_asm64 module, which was entirely broken in 1.5.2 + - Fixed some module build problems on FreeBSD and Tru64 + +* 1.5.2, January 15, 2006 + - Fixed an off-by-one memory read in MISTY1::key() + - Fixed a nasty memory leak in Output_Buffers::retire() + - Reimplemented the memory allocator for scratch + - Improved memory caching in Montgomery exponentiation + - Optimizations for multiple precision addition and subtraction + - Fixed a build problem in the hardware timer module on 64-bit PowerPC + - Changed default Karatsuba cutoff to 12 words (was 14) + - Removed MemoryRegion::bits(), which was unused and incorrect + - Changed maximum HMAC keylength to 1024 bits + - Various minor Makefile and build system changes + - Avoid using std::min in <secmem.h> to bypass Windows libc macro pollution + - Switched checks/clock.cpp back to using 
clock() by default + - Enabled the symmetric algorithm tests, which were accidentally off in 1.5.1 + - Removed the Default_Mutex's unused clone() member function + +* 1.5.1, January 8, 2006 + - Implemented Montgomery exponentiation + - Implemented generalized Karatsuba multiplication and squaring + - Implemented Comba squaring for 4, 6, and 8 word inputs + - Added new Modular_Exponentiator and Power_Mod classes + - Removed FixedBase_Exp and FixedExponent_Exp + - Fixed a performance regression in get_allocator introduced in 1.5.0 + - Engines can now offer S2K algorithms and block cipher padding methods + - Merged the remaining global 'algolist' code into Default_Engine + - The low-level MPI code is linked as C again + - Replaced BigInt's get_nibble with the more general get_substring + - Some documentation updates + +* 1.5.0, January 1, 2006 + - Moved all global/shared library state into a single object + - Mutex objects are created through mutex factories instead of a global + - Removed ::get_mutex(), ::initialize_mutex(), and Mutex::clone() + - Removed the RNG_Quality enum entirely + - There is now only a single global-use PRNG + - Removed the no_aliases and no_oids options for LibraryInitializer + - Removed the deprecated algorithms SEAL, ISAAC, and HAVAL + - Change es_ftw to use unbuffered I/O diff --git a/doc/misc/indent.el b/doc/misc/indent.el new file mode 100644 index 000000000..9811bf848 --- /dev/null +++ b/doc/misc/indent.el @@ -0,0 +1,57 @@ +; This Emacs Lisp code defines the indentation style used in Botan. It doesn't +; get everything perfectly correct, but it's pretty close. Copy this code into +; your .emacs file, or use M-x eval-buffer. Make sure to also set +; indent-tabs-mode to nil so spaces are inserted instead. + +; This style is basically Whitesmiths style with 3 space indents (the Emacs +; "whitesmith" style seems more like a weird Whitesmiths/Allman mutant style).
+ +; To activate using this style, open the file you want to edit and run this: +; M-x c-set-style <RET> and then enter "botan". Alternately, put something +; like this in your .emacs file to make it the default style: + +; (add-hook 'c++-mode-common-hook +; (function (lambda() +; (c-add-style "botan" botan t)))) + +(setq botan '( + (c-basic-offset . 3) + (c-comment-only-line-offset . 0) + (c-offsets-alist + (c . 0) + (comment-intro . 0) + + (statement-block-intro . 0) + (statement-cont . +) + + (substatement . +) + (substatement-open . +) + + (block-open . +) + (block-close . 0) + + (defun-open . +) + (defun-close . 0) + (defun-block-intro . 0) + (func-decl-cont . +) + + (class-open . +) + (class-close . +) + (inclass . +) + (access-label . -) + (inline-open . +) + (inline-close . 0) + + (extern-lang-open . 0) + (extern-lang-close . 0) + (inextern-lang . 0) + + (statement-case-open +) + + (namespace-open . 0) + (namespace-close . 0) + (innamespace . 0) + + (label . 0) + ) +)) diff --git a/doc/misc/internals.tex b/doc/misc/internals.tex new file mode 100644 index 000000000..81eca146a --- /dev/null +++ b/doc/misc/internals.tex @@ -0,0 +1,153 @@ +\documentclass{article} + +\setlength{\textwidth}{6.75in} % 1 inch side margins +\setlength{\textheight}{9in} % ~1 inch top and bottom margins + +\setlength{\headheight}{0in} +\setlength{\topmargin}{0in} +\setlength{\headsep}{0in} + +\setlength{\oddsidemargin}{0in} +\setlength{\evensidemargin}{0in} + +\title{Botan Internals} +\author{Jack Lloyd ([email protected])} +\date{July 30, 2002} + +\newcommand{\filename}[1]{\texttt{#1}} +\newcommand{\manpage}[2]{\texttt{#1}(#2)} + +\newcommand{\function}[1]{\textbf{#1}} +\newcommand{\type}[1]{\texttt{#1}} +\renewcommand{\arg}[1]{\textsl{#1}} + +\begin{document} + +\maketitle + +\tableofcontents + +\parskip=5pt + +\section{Introduction} + +This document is intended to document some of the trickier and/or more +complicated parts of Botan. 
This is not going to be terribly useful if you +just want to use the library, but for people wishing to understand how it +works, or contribute new code to it, it will hopefully prove helpful. + +I've realized that a lot of things Botan does internally are pretty hard to +understand, and that a lot of things are only inside my head, which is a bad +place for them to be (things tend to get lost in there, not to mention the +possibility that I'll get hit by a truck next week). + +This document is currently very incomplete. I'll be working on it as I have +time. + +\pagebreak + +\section{Filter} + +Need something here. + +\section{Pipe} + +Pipe is, conceptually, a tree structure of Filter objects. There is a single +unique top, and an arbitrary number of leaves (which are SecureQueue objects). +SecureQueue is a simple Filter that buffers its input. + +Writing into the pipe writes into the top of the tree. The filter at the top +of the tree writes its output into the next Filter, and so on until eventually +data trickles down into the bottommost Filters, where the data is stored for +later retrieval. + +When a new message is started, Pipe searches through the tree of Filters and +finds places where the \arg{next} field of the Filter is NULL. This implies +that it was the lowest layer of the Filter tree that the user added. It then +adds SecureQueue objects onto these Filters. These queues are also stored in a +\type{std::vector} called \arg{messages}. This is how the Pipe knows how to +read from them later without doing a tree traversal every time. + +Pipe will, if asked, destroy the existing tree structure, in order to create a +new one. However, the queue objects are not deleted, because Pipe might need to +read from them later. + +An optimization in future versions will involve deleting empty queues that we +``know'' can't be written to, and then replacing their entry in \arg{messages} +with NULL.
On reading, Pipe will know that this means that the queue is empty, +and act as if such a queue was really there. This is relatively minor, because +in recent versions an empty queue only takes up a few dozen bytes (previous to +0.8.4 or so, an empty queue still took up 4 kilobytes of memory). + +\section{Library Initialization} + +A lot of messy corner cases. + +\section{Lookup Mechanism} + +Most objects know their name, and they know how to create a new copy of +themselves. We build mapping tables that map from an algorithm name into a +single instance of that algorithm. The tables themselves can be found in +\filename{src/lookup.cpp}. + +There are a set of functions named \function{add\_algorithm} that can be used +to populate the tables. We get something out of the table with +\function{retrieve\_x}, where x is the name of a type (\texttt{block\_cipher}, +\texttt{hash}, etc). This returns a const pointer to the single unique instance +of the algorithm that the lookup tables know about. If it doesn't know about +it, it falls back on calling a function called +\function{try\_to\_get\_x}. These functions live in +\filename{src/algolist.cpp}. They are mostly used to handle algorithms which +need (or at least can have) arguments passed to them, like \type{HMAC} and +\type{SAFER\_SK}. It will return NULL if it can't find the algorithm at all. + +When it's asked for an algorithm it doesn't know about (ie, isn't in the +mapping tables), the retrieval functions will ask the try-to-get functions if +\emph{they} know about it. If they do, then the object returned will be stored +into the table for later retrieval. + +The functions \function{get\_x} call the retrieval functions. If we get back +NULL, an exception is thrown. Otherwise it will call the \function{clone} +method to get a new copy of the algorithm, which it returns. 
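The lookup scheme described above (a table holding one prototype instance per algorithm name, a fallback for parameterized names, and cloning on retrieval) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in \filename{src/lookup.cpp}: the names \type{Algorithm}, \type{Toy\_Hash}, \type{Param\_Algo} and \type{Lookup\_Table} are hypothetical, and the real library keeps separate tables and \function{retrieve\_x} functions for each algorithm type.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the library's algorithm base classes.
class Algorithm
   {
   public:
      virtual ~Algorithm() {}
      virtual std::string name() const = 0;   // objects know their name
      virtual Algorithm* clone() const = 0;   // and how to copy themselves
   };

// A plain algorithm that is registered explicitly.
class Toy_Hash : public Algorithm
   {
   public:
      std::string name() const { return "Toy"; }
      Algorithm* clone() const { return new Toy_Hash; }
   };

// An algorithm that takes an argument, e.g. "Param(42)" (made up here).
class Param_Algo : public Algorithm
   {
   public:
      explicit Param_Algo(const std::string& a) : arg(a) {}
      std::string name() const { return "Param(" + arg + ")"; }
      Algorithm* clone() const { return new Param_Algo(arg); }
   private:
      std::string arg;
   };

class Lookup_Table
   {
   public:
      // Takes ownership of algo and stores it as the single prototype.
      void add_algorithm(Algorithm* algo)
         {
         assert(algo != 0);
         table[algo->name()].reset(algo);
         }

      // Returns the prototype, or a null pointer if the name is unknown.
      // Names not in the table are offered to try_to_get(); a hit is
      // cached so that later lookups find it directly.
      const Algorithm* retrieve(const std::string& name)
         {
         std::map<std::string, std::unique_ptr<Algorithm> >::iterator i =
            table.find(name);
         if(i != table.end())
            return i->second.get();

         Algorithm* found = try_to_get(name);
         if(found)
            table[name].reset(found);
         return found;
         }

      // Returns a fresh copy owned by the caller; throws if unknown.
      Algorithm* get(const std::string& name)
         {
         const Algorithm* proto = retrieve(name);
         if(!proto)
            throw std::invalid_argument("Unknown algorithm: " + name);
         return proto->clone();
         }

      std::size_t size() const { return table.size(); }
   private:
      // Fallback for names that carry arguments.
      static Algorithm* try_to_get(const std::string& name)
         {
         const std::string prefix = "Param(";
         if(name.size() > prefix.size() + 1 &&
            name.compare(0, prefix.size(), prefix) == 0 &&
            name.back() == ')')
            {
            return new Param_Algo(
               name.substr(prefix.size(), name.size() - prefix.size() - 1));
            }
         return 0;
         }

      std::map<std::string, std::unique_ptr<Algorithm> > table;
   };
```

In this sketch, \function{retrieve} never allocates for a name it already knows, which mirrors the point that queries about an algorithm can work directly on the prototype, while \function{get} hands the caller an independent copy.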
+ +The various functions like \function{output\_length\_of} call the retrieval +function for each type of object that the parameter in question (in this case, +\texttt{OUTPUT\_LENGTH}) might be meaningful for. If it manages to get back an +object, it will return (in this case) the \texttt{OUTPUT\_LENGTH} field of the +object. No allocations are required to call this function: all of its +operations work directly on the copies living in the lookup tables. + +\section{Allocators} + +A big (slow) mess. + +\section{BigInt} + +Read ``Handbook of Applied Cryptography''. + +\section{PEM/BER Identification} + +We have a specific algorithm for figuring out if something is PEM or +BER. Previous versions (everything before 1.3.0) required that the caller +specify which one it was, and they had to be right. Now we use a heuristic +(aka, an algorithm that sometimes doesn't work right) to figure it out. If the +first character is not 0x30 (equal to ASCII '0'), then it can't possibly be BER +(because everything we care about is enclosed in an ASN.1 SEQUENCE, which for +BER/DER is encoded as beginning with 0x30). Roughly 99.9\% of PEM blocks +\emph{won't} have a random 0 character in front of them, so we are mostly safe +(unless someone does it on purpose, in which case, please hit them for me). +But to be sure, if there is a 0, then we search the first \emph{N} bytes of the +block for the string ``-----BEGIN '', which marks the typical start of a PEM +block. The specific \emph{N} depends on the variable ``base/pem\_search'', +which defaults to 4 kilobytes. + +So, you can actually fool it either way: into thinking that a PEM file is really BER, or that +a BER file is actually PEM. To make it think that a BER file is PEM, just have the +string ``-----BEGIN '' somewhere (I can't imagine this string shows up in +certificates or CRLs too often, so if it is there it means somebody is being a +jerk). 
If a PEM file starts with 0 and has at least ``base/pem\_search'' bytes of +junk in the way, it won't notice that it's PEM at all. In either case, of +course, the loading will fail, and you'll get a nice exception saying that the +decoding failed. + +\end{document} diff --git a/doc/misc/log-07.txt b/doc/misc/log-07.txt new file mode 100644 index 000000000..a385bbbb7 --- /dev/null +++ b/doc/misc/log-07.txt @@ -0,0 +1,125 @@ + +* 0.7.10, April 7, 2002 + - Added EGD_EntropySource module (es_egd) + - Added a file tree walking EntropySource (es_ftw) + - Added MemoryLocking_Allocator module (alloc_mlock) + - Renamed the pthr_mux, unix_rnd, and mmap_mem modules + - Changed timer mechanism; the clock method can be switched on the fly. + - Renamed MmapDisk_Allocator to MemoryMapping_Allocator + - Renamed ent_file.h to es_file.h (ent_file.h is around, but deprecated) + - Fixed several bugs in MemoryMapping_Allocator + - Added more default sources for Unix_EntropySource + - Changed SecureBuffer to use same allocation methods as SecureVector + - Added bigint_divcore into mp_core to support BigInt alpha2 release + - Removed some Pipe functions deprecated since 0.7.8 + - Some fixes for the configure program + +* 0.7.9, March 19, 2002 + - Memory allocation substantially revamped + - Added memory allocation method based on mmap(2) in the mmap_mem module + - Added ECB and CTS block cipher modes (ecb.h, cts.h) + - Added a Mutex interface (mutex.h) + - Added module pthr_mux, implementing the Mutex interface + - Added Threaded Filter interface (thr_filt.h) + - All algorithms can now be keyed with SymmetricKey objects + - More testing occurs with --validate (expected failures) + - Fixed two bugs reported by Hany Greiss, in Luby-Rackoff and RC6 + - Fixed a buffering bug in Bzip_Decompress and Zlib_Decompress + - Made X917 safer (and about 1/3 as fast) + - Documentation updates + +* 0.7.8, February 28, 2002 + - More capabilities for Pipe, inspired by SysV STREAMS, including peeking, + 
better buffering, and stack ops. NOT BACKWARDS COMPATIBLE: SEE DOCUMENTATION + - Added a BufferingFilter class + - Added popen() based EntropySource for generic Unix systems (unix_rnd) + - Moved 'devrand' module into main distribution (ent_file.h), renamed to + File_EntropySource, and changed interface somewhat. + - Made Randpool somewhat more conservative and also 25% faster + - Minor fixes and updates for the configure script + - Added some tweaks for memory allocation + - Documentation updates for the new Pipe interface + - Fixed various minor bugs + - Added a couple of new example programs (stack and hasher2) + +* 0.7.7, November 24, 2001 + - Filter::send now works in the constructor of a Filter subclass + - You may now have to include <opencl/pipe.h> explicitly in some code + - Added preliminary PK infrastructure classes in pubkey.h and pkbase.h + - Enhancements to SecureVector (append, destroy functions) + - New infrastructure for secure memory allocation + - Added IEEE P1363 primitives MGF1, EME1, KDF1 + - Rijndael optimizations and cleanups + - Changed CipherMode<B> to BlockCipherMode(B*) + - Fixed a nasty bug in pipe_unixfd + - Added portions of the BigInt code into the main library + - Support for VAX, SH, POWER, PowerPC-64, Intel C++ + +* 0.7.6, October 14, 2001 + - Fixed several serious bugs in SecureVector created in 0.7.5 + - Square optimizations + - Fixed shared objects on MacOS X and HP-UX + - Fixed static libs for KCC 4.0; works with KCC 3.4g as well + - Full support for Athlon and K6 processors using GCC + - Added a table of prime numbers < 2**16 (primes.h) + - Some minor documentation updates + +* 0.7.5, August 19, 2001 + - Split checksum.h into adler32.h, crc24.h, and crc32.h + - Split modes.h into cbc.h, cfb.h, and ofb.h + - CBC_wPadding* has been replaced by CBC_Encryption and CBC_Decryption + - Added OneAndZeros and NoPadding methods for CBC + - Added Lion, a very fast block cipher construction + - Added an S2K base class (s2k.h) and an 
OpenPGP_S2K class (pgp_s2k.h) + - Basic types (ciphers, hashes, etc) know their names now (call name()) + - Changed the EntropySource type somewhat + - Big speed-ups for ISAAC, Adler32, CRC24, and CRC32 + - Optimized CAST-256, DES, SAFER-SK, Serpent, SEAL, MD2, and RIPEMD-160 + - Some semantics of SecureVector have changed slightly + - The mlock module has been removed for the time being + - Added string handling functions for hashes and MACs + - Various non-user-visible cleanups + - Shared library soname is now set to the full version number + +* 0.7.4, July 15, 2001 + - New modules: Zlib, gettimeofday and x86 RTC timers, Unix I/O for Pipe + - Fixed a vast number of errors in the config script/makefile/specfile + - Pipe now has a stdio(3) interface as well as C++ iostreams + - ARC4 supports skipping the first N bytes of the cipher stream (ala MARK4) + - Bzip2 supports decompressing multiple concatenated streams, and flushing + - Added a simple 'overall average' score to the benchmarks + - Fixed a small bug in the POSIX timer module + - Removed a very-unlikely-to-occur bug in most of the hash functions + - filtbase.h now includes <iosfwd>, not <iostream> + - Minor documentation updates + +* 0.7.3, June 8, 2001 + - Fix build problems on Solaris/SPARC + - Fix build problems with Perl versions < 5.6 + - Fixed some stupid code that broke on a few compilers + - Added string handling functions to Pipe + - MISTY1 optimizations + +* 0.7.2, June 3, 2001 + - Build system supports modules + - Added modules for mlock, a /dev/random EntropySource, POSIX1.b timers + - Added Bzip2 compression filter, contributed by Peter Jones + - GNU make no longer required (tested with 4.4BSD pmake and Solaris make) + - Fixed minor bug in several of the hash functions + - Various other minor fixes and changes + - Updates to the documentation + +* 0.7.1, May 16, 2001 + - Rewrote configure script: more consistent and complete + - Made it easier to find out parameters of types at run time 
(opencl.h) + - New functions for finding the version being used (version.h) + - New SymmetricKey interface for Filters (symkey.h) + - InvalidKeyLength now records what the invalid key length was + - Optimized DES, CS-Cipher, MISTY1, Skipjack, XTEA + - Changed GOST to use correct S-box ordering (incompatible change) + - Benchmark code was almost totally rewritten + - Many more entries in the test vector file + - Fixed minor and idiotic bug in check.cpp + +* 0.7.0, March 1, 2001 + - First public release diff --git a/doc/misc/log-08.txt b/doc/misc/log-08.txt new file mode 100644 index 000000000..4476d1978 --- /dev/null +++ b/doc/misc/log-08.txt @@ -0,0 +1,120 @@ + +* 0.8.7, July 30, 2002 + - Fixed bugs in EME1 and EMSA4 + - Fixed a potential crash at shutdown + - Cipher modes returned an ill-formed name + - Removed various deprecated types and headers + - Cleaned up the Pipe interface a bit + - Minor additions to the documentation + - First stab at a Visual C++ makefile (doc/Makefile.vc7) + +* 0.8.6, July 25, 2002 + - Added EMSA4 (aka PSS) + - Brought the manual up to date; many corrections and additions + - Added a parallel hash function construction + - Lookup supports all available algorithms now + - Lazy initialization of the lookup tables + - Made more discrete logarithm groups available through get_dl_group() + - StreamCipher_Filter supports seeking (if the underlying cipher does) + - Minor optimization for GCD calculations + - Renamed SAFER_SK128 to SAFER_SK + - Removed many previously deprecated functions + - Some now-obsolete functions, headers, and types have been deprecated + - Fixed some bugs in DSA prime generation + - DL_Group had a constructor for DSA-style prime gen but it wasn't defined + - Reversed the ordering of the two arguments to SEAL's constructor + - Fixed a threading problem in the PK algorithms + - Fixed a minor memory leak in lookup.cpp + - Fixed pk_types.h (it was broken in 0.8.5) + - Made validation tests more verbose + - Updated the 
check and example applications + +* 0.8.5, July 21, 2002 + - Major changes to constructors for DL-based cryptosystems (DSA, NR, DH) + - Added a DL_Group class + - Reworking of the pubkey internals + - Support in lookup for aliases and PK algorithms + - Renamed CAST5 to CAST_128 and CAST256 to CAST_256 + - Added EMSA1 + - Reorganization of header files + - LibraryInitializer will install new allocator types if requested + - Fixed a bug in Diffie-Hellman key generation + - Did a workaround in pipe.cpp for GCC 2.95.x on Linux + - Removed some debugging code from init.cpp that made FTW ES useless + - Better checking for invalid arguments in the PK algorithms + - Reduced Base64 and Hex default line length (if line breaking is used) + - Fixes for HP's aCC compiler + - Cleanups in BigInt + +* 0.8.4, July 14, 2002 + - Added Nyberg-Rueppel signatures + - Added Diffie-Hellman key exchange (kex interface is subject to change) + - Added KDF2 + - Enhancements to the lookup API + - Many things formerly taking pointers to algorithms now take names + - Speedups for prime generation + - LibraryInitializer has support for seeding the global RNG + - Reduced SAFER-SK128 memory consumption + - Reversed the ordering of public and private key values in DSA constructor + - Fixed serious bugs in MemoryMapping_Allocator + - Fixed memory leak in Lion + - FTW_EntropySource was not closing the files it read + - Fixed line breaking problem in Hex_Encoder + +* 0.8.3, June 9, 2002 + - Added DSA and Rabin-Williams signature schemes + - Added EMSA3 + - Added PKCS#1 v1.5 encryption padding + - Added Filters for PK algorithms + - Added a Keyed_Filter class + - LibraryInitializer processes arguments now + - Major revamp of the PK interface classes + - Changed almost all of the Filters for non-template operation + - Changed HMAC, Lion, Luby-Rackoff to non-template classes + - Some fairly minor BigInt optimizations + - Added simple benchmarking for PK algorithms + - Added hooks for fixed base and fixed 
exponent modular exponentiation + - Added some examples for using RSA + - Numerous bugfixes and cleanups + - Documentation updates + +* 0.8.2, May 18, 2002 + - Added an (experimental) algorithm lookup interface + - Added code for directly testing BigInt + - Added SHA2-384 + - Optimized SHA2-512 + - Major optimization for Adler32 (thanks to Dan Nicolaescu) + - Various minor optimizations in BigInt and related areas + - Fixed two bugs in X9.19 MAC, both reported by Darren Starsmore + - Fixed a bug in BufferingFilter + - Made a few fixes for MacOS X + - Added a workaround in configure.pl for GCC 2.95.x + - Better support for PowerPC, ARM, and Alpha + - Some more cleanups + +* 0.8.1, May 6, 2002 + - Major code cleanup (check doc/deprecated.txt) + - Various bugs fixed, including several portability problems + - Renamed MessageAuthCode to MessageAuthenticationCode + - A replacement for X917 is in x917_rng.h + - Changed EMAC to non-template class + - Added ANSI X9.19 compatible CBC-MAC + - TripleDES now supports 128 bit keys + +* 0.8.0, April 24, 2002 + - Merged BigInt: many bugfixes and optimizations since alpha2 + - Added RSA (rsa.h) + - Added EMSA2 (emsa2.h) + - Lots of new interface code for public key algorithms (pk_base.h, pubkey.h) + - Changed some interfaces, including SymmetricKey, to support the global rng + - Fixed a serious bug in ManagedAllocator + - Renamed RIPEMD128 to RIPEMD_128 and RIPEMD160 to RIPEMD_160 + - Removed some deprecated stuff + - Added a global random number generator (rng.h) + - Added clone functions to most of the basic algorithms + - Added a library initializer class (init.h) + - Version macros in version.h + - Moved the base classes from opencl.h to base.h + - Renamed the bzip2 module to comp_bzip2 and zlib to comp_zlib + - Documentation updates for the new stuff (still incomplete) + - Many new deprecated things: check doc/deprecated.txt diff --git a/doc/misc/log-09.txt b/doc/misc/log-09.txt new file mode 100644 index 000000000..7e67d93c7 
--- /dev/null +++ b/doc/misc/log-09.txt @@ -0,0 +1,28 @@ + +* 0.9.2, August 18, 2002 + - DH_PrivateKey::public_value() was returning the wrong value + - Various BigInt optimizations + - The filters.h header now includes hex.h and base64.h + - Moved Counter mode to ctr.h + - Fixed a couple minor problems with VC++ 7 + - Fixed problems with the RPM spec file + +* 0.9.1, August 10, 2002 + - Grand rename from OpenCL to Botan + - Major optimizations for the PK algorithms + - Added ElGamal encryption + - Added Whirlpool + - Tweaked memory allocation parameters + - Improved the method of seeding the global RNG + - Moved pkcs1.h to eme_pkcs.h + - Added more test vectors for some algorithms + - Fixed error reporting in the BigInt tests + - Removed Default_Timer, it was pointless + - Added some new example applications + - Removed some old examples that weren't that interesting + - Documented the compression modules + +* 0.9.0, August 3, 2002 + - EMSA4 supports variable salt size + - PK_* can take a string naming the encoding method to use + - Started writing some internals documentation diff --git a/doc/misc/log-10.txt b/doc/misc/log-10.txt new file mode 100644 index 000000000..6222786e8 --- /dev/null +++ b/doc/misc/log-10.txt @@ -0,0 +1,17 @@ + +* 1.0.2, January 12, 2003 + - Fixed an obscure SEGFAULT causing bug in Pipe + - Fixed an obscure but dangerous bug in SecureVector::swap + +* 1.0.1, September 14, 2002 + - Fixed a minor bug in Randpool::random() + - Added some new aliases and typedefs for 1.1.x compatibility + - The 4096-bit RSA benchmark key was decimal instead of hex + - EMAC was returning an incorrect name + +* 1.0.0, August 26, 2002 + - Octal I/O of BigInt is now supported + - Fixed portability problems in the es_egd module + - Generalized IV handling in the block cipher modes + - Added Karatsuba multiplication and k-ary exponentiation + - Fixed a problem in the multiplication routines diff --git a/doc/misc/log-11.txt b/doc/misc/log-11.txt new file mode 100644 
index 000000000..9cbe3846f --- /dev/null +++ b/doc/misc/log-11.txt @@ -0,0 +1,153 @@ + +* 1.1.13, April 22, 2003 + - Added OMAC + - Added EAX authenticated cipher mode + - Diffie-Hellman would not do blinding in some cases + - Optimized the OFB and CTR modes + - Corrected Skipjack's word ordering, as per NIST clarification + - Support for all subject/issuer attribute types required by RFC 3280 + - The removeFromCRL CRL reason code is now handled correctly + - Increased the flexibility of the allocators + - Renamed Rijndael to AES, created aes.h, deleted rijndael.h + - Removed support for the 'no_timer' LibraryInitializer option + - Removed 'es_pthr' module, pending further testing + - Cleaned up get_ciph.cpp + +* 1.1.12, April 15, 2003 + - Fixed an ASN.1 string encoding bug + - Fixed a pair of X509_DN encoding problems + - Base64_Decoder and Hex_Decoder can now validate input + - Removed support for the LibraryInitializer option 'egd_path' + - Added tests for DSA X.509 and PKCS #8 key formats + - Removed a long deprecated feature of DH_PrivateKey's constructor + - Updated the RPM .spec file + - Major documentation updates + +* 1.1.11, April 7, 2003 + - Added PKCS #10 certificate requests + - Changed X509_Store searching interface to be more flexible + - Added a generic Certificate_Store interface + - Added a function for generating self-signed X.509 certs + - Cleanups and changes to X509_CA + - New examples for PKCS #10 and self-signed certificates + - Some documentation updates + +* 1.1.10, April 3, 2003 + - X509_CA can now generate new X.509 CRLs + - Added blinding for RSA, RW, DH, and ElGamal to prevent timing attacks + - More certificate and CRL extensions/attributes are supported + - Better DN handling in X.509 certificates/CRLs + - Added a DataSink hierarchy (suggested by Jim Darby) + - Consolidated SecureAllocator and ManagedAllocator + - Many cleanups and generalizations + - Added a (slow) pthreads based EntropySource + - Fixed some threading bugs + +* 
1.1.9, February 25, 2003 + - Added support for using X.509v2 CRLs + - Fixed several bugs in the path validation algorithm + - Certificates can be verified for a particular usage + - Algorithm for comparing distinguished names now follows X.509 + - Cleaned up the code for the es_beos, es_ftw, es_unix modules + - Documentation updates + +* 1.1.8, January 29, 2003 + - Fixes for the certificate path validation algorithm in X509_Store + - Fixed a bug affecting X509_Certificate::is_ca_cert() + - Added a general configuration interface for policy issues + - Cleanups and API changes in the X.509 CA, cert, and store code + - Made various options available for X509_CA users + - Changed X509_Time's interface to work around time_t problems + - Fixed a theoretical weakness in Randpool's entropy mixing function + - Fixed problems compiling with GCC 2.95.3 and GCC 2.96 + - Fixed a configure bug (reported by Jon Wilson) affecting MinGW + +* 1.1.7, January 12, 2003 + - Fixed an obscure but dangerous bug in SecureVector::swap + - Consolidated SHA-384 and SHA-512 to save code space + - Added SSL3-MAC and SSL3-PRF + - Documentation updates, including a new tutorial + +* 1.1.6, December 10, 2002 + - Initial support for X.509v3 certificates and CAs + - Major redesign/rewrite of the ASN.1 encoding/decoding code + - Added handling for DSA/NR signatures encoded as DER SEQUENCEs + - Documented the generic cipher lookup interface + - Added an (untested) entropy source for BeOS + - Various cleanups and bug fixes + +* 1.1.5, November 17, 2002 + - Added the discrete logarithm integrated encryption system (DLIES) + - Various optimizations for BigInt + - Added support for assembler optimizations in modules + - Added BigInt x86 optimizations module (mpi_ia32) + +* 1.1.4, November 10, 2002 + - Speedup of 15-30% for PK algorithms + - Implemented the PBES2 encryption scheme + - Fixed a potential bug in decoding RSA and RW private keys + - Changed the DL_Group class interface to handle different 
formats better + - Added support for PKCS #3 encoded DH parameters + - X9.42 DH parameters use a PEM label of 'X942 DH PARAMETERS' + - Added key pair consistency checking + - Fixed a compatibility problem with gcc 2.96 (pointed out by Hany Greiss) + - A botan-config script is generated at configure time + - Documentation updates + +* 1.1.3, November 3, 2002 + - Added a generic public/private key loading interface + - Fixed a small encoding bug in RSA, RW, and DH + - Changed the PK encryption/decryption interface classes + - ECB supports using padding methods + - Added a function-based interface for library initialization + - Added support for RIPEMD-128 and Tiger PKCS#1 v1.5 signatures + - The cipher mode benchmarks now use 128-bit AES instead of DES + - Removed some obsolete typedefs + - Removed OpenCL support (opencl.h, the OPENCL_* macros, etc) + - Added tests for PKCS #8 encoding/decoding + - Added more tests for ECB and CBC + +* 1.1.2, October 21, 2002 + - Support for PKCS #8 encoded RSA, DSA, and DH private keys + - Support for Diffie-Hellman X.509 public keys + - Major reorganization of how X.509 keys are handled + - Added PKCS #5 v2.0's PBES1 encryption scheme + - Added a generic cipher lookup interface + - Added the WiderWake4+1 stream cipher + - Added support for sync-able stream ciphers + - Added a 'paranoia level' option for the LibraryInitializer + - More security for RNG output meant for long term keys + - Added documentation for some of the new 1.1.x features + - CFB's feedback argument is now specified in bits + - Renamed CTR class to CTR_BE + - Updated the RSA and DSA examples to use X.509 and PKCS #8 key formats + +* 1.1.1, October 15, 2002 + - Added the Korean hash function HAS-160 + - Partial support for RSA and DSA X.509 public keys + - Added a mostly functional BER encoder/decoder + - Added support for nondeterministic MAC functions + - Initial support for PEM encoding/decoding + - Internal cleanups in the PK algorithms + - Several new 
convenience functions in Pipe + - Fixed two nasty bugs in Pipe + - Messed with the entropy sources for es_unix + - Discrete logarithm groups are checked for safety more closely now + - For compatibility with GnuPG, ElGamal now supports DSA-style groups + +* 1.1.0, September 14, 2002 + - Added entropy estimation to the RNGs + - Improved the overall design of both Randpool and ANSI_X917_RNG + - Added a separate RNG for nonce generation + - Added window exponentiation support in power_mod + - Added a get_s2k function and the PKCS #5 S2K algorithms + - Added the TLSv1 PRF + - Replaced BlockCipherModeIV typedef with InitializationVector class + - Renamed PK_Key_Agreement_Scheme to PK_Key_Agreement + - Renamed SHA1 -> SHA_160 and SHA2_x -> SHA_x + - Added support for RIPEMD-160 PKCS#1 v1.5 signatures + - Changed the key agreement scheme interface + - Changed the S2K and KDF interfaces + - Better SCAN compatibility for HAVAL, Tiger, MISTY1, SEAL, RC5, SAFER-SK + - Added support for variable-pass Tiger + - Major speedup for Rabin-Williams key generation diff --git a/doc/misc/log-12.txt b/doc/misc/log-12.txt new file mode 100644 index 000000000..e2f187031 --- /dev/null +++ b/doc/misc/log-12.txt @@ -0,0 +1,88 @@ + +* 1.2.8, November 21, 2003 + - Merged several important bug fixes from 1.3.x + +* 1.2.7, October 31, 2003 + - Added support for reading configuration files + - Added constructors so NR and RW keys can be imported easily + - Fixed mp_asm64, which was completely broken in 1.2.6 + - Removed tm_hw_ia32 module; replaced by tm_hard + - Added support for loading certain oddly formed RSA certificates + - Fixed spelling of NON_REPUDIATION enum + - Renamed the option default_to_ca to v1_assume_ca + - Fixed a minor bug in X.509 certificate generation + - Fixed a latent bug in the OID lookup code + - Updated the RPM spec file + - Added to the tutorial + +* 1.2.6, July 4, 2003 + - Major performance increase for PK algorithms on most 64-bit systems + - Cleanups in the low-level 
MPI code to support asm implementations + - Fixed build problems with some versions of Compaq's C++ compiler + - Removed useless constructors for NR public and private keys + - Removed support for the patch_file directive in module files + - Removed several deprecated functions + +* 1.2.5, June 22, 2003 + - Fixed a tricky and long-standing memory leak in Pipe + - Major cleanups and fixes in the memory allocation system + - Removed alloc_mlock, which has been superseded by the ml_unix module + - Removed a denial of service vulnerability in X509_Store + - Fixed compilation problems with VS .NET 2003 and Codewarrior 8 + - Added another variant of PKCS8::load_key, taking a memory buffer + - Fixed various minor/obscure bugs which occurred when MP_WORD_BITS != 32 + - BigInt::operator%=(word) was a no-op if the input was a power of 2 + - Fixed portability problems in BigInt::to_u32bit + - Fixed major bugs in SSL3-MAC + - Cleaned up some messes in the PK algorithms + - Cleanups and extensions for OMAC and EAX + - Made changes to the entropy estimation function + - Added a 'beos' module set for use on BeOS + - Officially deprecated a few X509:: and PKCS8:: functions + - Moved the contents of primes.h to numthry.h + - Moved the contents of x509opt.h to x509self.h + - Removed the (empty) desx.h header + - Documentation updates + +* 1.2.4, May 29, 2003 + - Fixed a bug in EMSA1 affecting NR signature verification + - Fixed a few latent bugs in BigInt related to word size + - Removed an unused function, mp_add2_nc, from the MPI implementation + - Reorganized the core MPI files + +* 1.2.3, May 20, 2003 + - Fixed a bug that prevented DSA/NR key generation + - Fixed a bug that prevented importing some root CA certs + - Fixed a bug in the BER decoder when handing optional bit or byte strings + - Fixed the encoding of authorityKeyIdentifier in X509_CA + - Added a sanity check in PBKDF2 for zero length passphrases + - Added versions of X509::load_key and PKCS8::load_key that take a 
file name + - X509_CA generates 128 bit serial numbers now + - Added tests to check PK key generation + - Added a simplistic X.509 CA example + - Cleaned up some of the examples + +* 1.2.2, May 13, 2003 + - Add checks to prevent any BigInt bugs from revealing an RSA or RW key + - Changed the interface of Global_RNG::seed + - Major improvements for the es_unix module + - Added another Win32 entropy source, es_win32 + - The Win32 CryptoAPI entropy source can now poll multiple providers + - Improved the BeOS entropy source + - Renamed pipe_unixfd module to fd_unix + - Fixed a file descriptor leak in the EGD module + - Fixed a few locking bugs + +* 1.2.1, May 6, 2003 + - Added ANSI X9.23 compatible CBC padding + - Added an entropy source using Win32 CryptoAPI + - Removed the Pipe I/O operators taking a FILE* + - Moved the BigInt encoding/decoding functions into the BigInt class + - Integrated several fixes for VC++ 7 (from Hany Greiss) + - Fixed the configure.pl script for Windows builds + +* 1.2.0, April 28, 2003 + - Tweaked the Karatsuba cut-off points + - Increased the allowed keylength of HMAC and Blowfish + - Removed the 'mpi_ia32' module, pending rewrite + - Workaround a GCC 2.95.x bug in eme1.cpp diff --git a/doc/misc/log-13.txt b/doc/misc/log-13.txt new file mode 100644 index 000000000..01a51cb02 --- /dev/null +++ b/doc/misc/log-13.txt @@ -0,0 +1,184 @@ + +* 1.3.14, June 12, 2004 + - Added support for AEP's AEP1000/AEP2000 crypto cards + - Added a Mutex module using Qt, from Justin Karneges + - Added support for engine loading in LibraryInitializer + - Tweaked SecureAllocator, giving 20% better performance under heavy load + - Added timer and memory locking modules for Win32 (tm_win32, ml_win32) + - Renamed PK_Engine to Engine_Core + - Improved the Karatsuba cutoff points + - Fixes for compiling with GCC 3.4 and Sun C++ 5.5 + - Fixes for Linux/s390, OpenBSD, and Solaris + - Added support for Linux/s390x + - The configure script was totally broken for 'generic' 
OS + - Removed Montgomery reduction due to bugs + - Removed an unused header, pkcs8alg.h + - check --validate returns an error code if any tests failed + - Removed duplicate entry in Unix command list for es_unix + - Moved the Cert_Usage enumeration into X509_Store + - Added new timing methods for PK benchmarks, clock_gettime and RDTSC + - Fixed a few minor bugs in the configure script + - Removed some deprecated functions from x509cert.h and pkcs10.h + - Removed the 'minimal' module, has to be updated for Engine support + - Changed MP_WORD_BITS macro to BOTAN_MP_WORD_BITS to clean up namespace + - Documentation updates + +* 1.3.13, May 15, 2004 + - Major fixes for Cygwin builds + - Minor MacOS X install fixes + - The configure script is a little better at picking the right modules + - Removed ml_unix from the 'unix' module set for Cygwin compatibility + - Fixed a stupid compile problem in pkcs10.h + +* 1.3.12, May 2, 2004 + - Added ability to remove old entries from CRLs + - Swapped the first two arguments of X509_CA::update_crl() + - Added an < operator for MemoryRegion, so it can be used as a std::map key + - Changed X.509 searching by DNS name from substring to full string compares + - Renamed a few X509_Certificate and PKCS10_Request member functions + - Fixed a problem when decoding some PKCS #10 requests + - Hex_Decoder would not check inputs, reported by Vaclav Ovsik + - Changed default CRL expire time from 30 days to 7 days + - X509_CRL's default PEM header is now "X509 CRL", for OpenSSL compatibility + - Corrected errors in the API doc, fixes from Ken Perano + - More documentation about the Pipe/Filter code + +* 1.3.11, April 1, 2004 + - Fixed two show-stopping bugs in PKCS10_Request + - Added some sanity checks in Pipe/Filter + - The DNS and URI entries would get swapped in subjectAlternativeNames + - MAC_Filter is now willing to not take a key at creation time + - Setting the expiration times of certs and CRLs is more flexible + - Fixed problems 
building on AIX with GCC + - Fixed some problems in the tutorial pointed out by Dominik Vogt + - Documentation updates + +* 1.3.10, March 27, 2004 + - Added support for OpenPGP's ASCII armor format + - Cleaned up the RNG system; seeding is much more flexible + - Added simple autoconfiguration abilities to configure.pl + - Fixed a GCC 2.95.x compile problem + - Updated the example configuration file + - Documentation updates + +* 1.3.9, March 7, 2004 + - Added an engine using OpenSSL (requires 0.9.7 or later) + - X509_Certificate would lose email addresses stored in the DN + - Fixed a missing initialization in a BigInt constructor + - Fixed several Visual C++ compile problems + - Fixed some BeOS build problems + - Fixed the WiderWake benchmark + +* 1.3.8, December 30, 2003 + - Internal changes to PK algorithms to divide data and algorithms + - DSA/DH/NR/ElGamal constructors accept taking just the private key again + - ElGamal keys now support being imported/exported as ASN.1 objects + - Much more consistent and complete error checking in PK algorithms + - Support for arbitrary backends (engines) for PK operations + - Added Montgomery reductions + - Added an engine that uses GNU MP (requires 4.1 or later) + - Removed the obsolete mp_gmp module + - Moved several initialization/shutdown functions to init.h + - Major refactoring of the memory containers + - New non-locking container, MemoryVector + - Fixed 64-bit problems in BigInt::set_bit/clear_bit + - Renamed PK_Key::check_params() to check_key() + - Some incompatible changes to OctetString + - Added version checking macros in version.h + - Removed the fips140 module pending rewrite + - Added some functions and hooks to help GUIs + - Moved more shared code into MDx_HashFunction + - Added a policy hook for specifying the encoding of X.509 strings + +* 1.3.7, December 12, 2003 + - Fixed a big security problem in es_unix + - Fixed several stability problems in es_unix + - Expanded the list of programs es_unix will try 
to use + - SecureAllocator now only preallocates blocks in special cases + - Added a special case in Global_RNG::seed for forcing a full poll + - Removed the FIPS 186 RNG added in 1.3.5 pending further testing + - Configure updates for PowerPC CPUs + - Removed the (never tested) VAX support + - Added support for S/390 Linux + +* 1.3.6, December 7, 2003 + - Added a new module 'minimal', which disables most algorithms + - SecureAllocator allocates a few blocks at startup + - A few minor MPI cleanups + - RPM spec file cleanups and fixes + +* 1.3.5, November 30, 2003 + - Major improvements in ASN.1 string handling + - Added partial support for ASN.1 UTF8 STRINGs and BMP STRINGs + - Added partial support for the X.509v3 certificate policies extension + - Centralized the handling of character set information + - Added FIPS 140-2 startup self tests + - Added a module (fips140) for doing extra FIPS 140-2 tests + - Added FIPS 186-2 RNG + - Improved ASN.1 BIT STRING handling + - Removed a memory leak in PKCS10_Request + - The encoding of DirectoryString now follows PKIX guidelines + - Fixed some of the character set dependencies + - Fixed a DER encoding error for tags greater than 30 + - The BER decoder can now handle tags larger than 30 + - Fixed tm_hard.cpp to recognize SPARC on more systems + - Workarounds for a GCC 2.95.x bug in x509find.cpp + - RPM changed to install into /usr instead of /usr/local + - Added support for QNX + +* 1.3.4, November 21, 2003 + - Added a module that does certain MPI operations using GNU MP + - Added the X9.42 Diffie-Hellman PRF + - The Zlib and Bzip2 objects now use custom allocators + - Added member functions for directly hashing/MACing SecureVectors + - Minor optimizations to the MPI addition and subtraction algorithms + - Some cleanups in the low-level MPI code + - Created separate AES-{128,192,256} objects + +* 1.3.3, November 17, 2003 + - The library can now be repeatedly initialized and shutdown without crashing + - Fixed an off-by-one 
error in the CTS code + - Fixed an error in the EMSA4 verification code + - Fixed a memory leak in mutex.cpp (pointed out by James Widener) + - Fixed a memory leak in Pthread_Mutex + - Fixed several memory leaks in the testing code + - Bulletproofed the EMSA/EME/KDF/MGF retrieval functions + - Minor cleanups in SecureAllocator + - Removed a needless mutex guarding the (stateless) global timer + - Fixed a piece of bash-specific code in botan-config + - X.509 objects report more information about decoding errors + - Cleaned up some of the exception handling + - Updated the example config file with new OIDs + - Moved the build instructions into a separate document, building.tex + +* 1.3.2, November 13, 2003 + - Fixed a bug preventing DSA signatures from verifying on X.509 objects + - Made the X509_Store search routines more efficient and flexible + - Added a function to X509_PublicKey to do easy public/private key matching + - Added support for decoding indefinite length BER data + - Changed Pipe's peek() to take an offset + - Removed Filter::set_owns in favor of the new incr_owns function + - Removed BigInt::zero() and BigInt::one() + - Renamed the PEM related options from base/pem_* to pem/* + - Added an option to specify the line width when encoding PEM + - Removed the "rng/safe_longterm" option; it's always on now + - Changed the cipher used for RNG super-encryption from ARC4 to WiderWake4+1 + - Cleaned up the base64/hex encoders and decoders + - Added an ASN.1/BER decoder as an example + - AES had its internals marked 'public' in previous versions + - Changed the value of the ASN.1 NO_OBJECT enum + - Various new hacks in the configure script + - Removed the already nominal support for SunOS + +* 1.3.1, November 4, 2003 + - Generalized a few pieces of the DER encoder + - PKCS8::load_key would fail if handed an unencrypted key + - Added a failsafe so PKCS #8 key decoding can't go into an infinite loop + +* 1.3.0, November 2, 2003 + - Major redesign of the PKCS #8
private key import/export system + - Added a small amount of UI interface code for getting passphrases + - Added heuristics that tell if a key, cert, etc is stored as PEM or BER + - Removed CS-Cipher, SHARK, ThreeWay, MD5-MAC, and EMAC + - Removed certain deprecated constructors of RSA, DSA, DH, RW, NR + - Made PEM decoding more forgiving of extra text before the header diff --git a/doc/misc/log-14.txt b/doc/misc/log-14.txt new file mode 100644 index 000000000..0406e8a8b --- /dev/null +++ b/doc/misc/log-14.txt @@ -0,0 +1,137 @@ + +* 1.4.12, January 15, 2006 + - Fixed an off-by-one memory read in MISTY1::key() + - Fixed a nasty memory leak in Output_Buffers::retire() + - Changed maximum HMAC keylength to 1024 bits + - Fixed a build problem in the hardware timer module on 64-bit PowerPC + +* 1.4.11, December 31, 2005 + - Changed Whirlpool diffusion matrix to match updated algorithm spec + - Fixed several engine module build errors introduced in 1.4.10 + - Fixed two build problems in es_capi; reported by Matthew Gregan + - Added a constructor to DataSource_Memory taking a std::string + - Placing the same Filter in multiple Pipes triggers an exception + - The configure script accepts --docdir and --libdir + - Merged doc/rngs.txt into the main API document + - Thanks to Joel Low for several bug reports on early tarballs of 1.4.11 + +* 1.4.10, December 18, 2005 + - Added an implementation of KASUMI, the block cipher used in 3G phones + - Refactored Pipe; output queues are now managed by a distinct class + - Made certain Filter facilities only available to subclasses of Fanout_Filter + - There is no longer any overhead in Pipe for a message that has been read out + - It is now possible to generate RSA keys as small as 128 bits + - Changed some of the core classes to derive from Algorithm as a virtual base + - Changed Randpool to use HMAC instead of a plain hash as the mixing function + - Fixed a bug in the allocators; found and fixed by Matthew Gregan + - Enabled the use
of binary file I/O, when requested by the application + - The OpenSSL engine's block cipher code was missing some deallocation calls + - Disabled the es_ftw module on NetBSD, due to header problems there + - Fixed a problem preventing tm_hard from building on MacOS X on PowerPC + - Some cleanups for the modules that use inline assembler + - config.h is now stored in build/ instead of build/include/botan/ + - The header util.h was split into bit_ops.h, parsing.h, and util.h + - Cleaned up some redundant include directives + +* 1.4.9, November 6, 2005 + - Added the IBM-created AES candidate algorithm MARS + - Added the South Korean block cipher SEED + - Added the stream cipher Turing + - Added the new hash function FORK-256 + - Deprecated the ISAAC stream cipher + - Twofish and RC6 are significantly faster with GCC + - Much better support for 64-bit PowerPC + - Added support for high-resolution PowerPC timers + - Fixed a bug in the configure script causing problems on FreeBSD + - Changed ANSI X9.31 to support arbitrary block ciphers + - Make the configure script a bit less noisy + - Added more test vectors for some algorithms, including all the AES finalists + - Various cosmetic source code cleanups + +* 1.4.8, October 16, 2005 + - Resolved a bad performance problem in the allocators; fix by Matt Johnston + - Worked around a Visual Studio 2003 compilation problem introduced in 1.4.7 + - Renamed OMAC to CMAC to match the official NIST naming + - Added single byte versions of update() to PK_Signer and PK_Verifier + - Removed the unused reverse_bits and reverse_bytes functions + +* 1.4.7, September 25, 2005 + - Fixed major performance problems with recent versions of GNU C++ + - Added an implementation of the X9.31 PRNG + - Removed the X9.17 and FIPS 186-2 PRNG algorithms + - Changed defaults to use X9.31 PRNGs as global PRNG objects + - Documentation updates to reflect the PRNG changes + - Some cleanups related to the engine code + - Removed two useless headers, 
base_eng.h and secalloc.h + - Removed PK_Verifier::valid_signature + - Fixed configure/build system bugs affecting MacOS X builds + - Added support for the EKOPath x86-64 compiler + - Added missing destructor for BlockCipherModePaddingMethod + - Fix some build problems with Visual C++ 2005 beta + - Fix some build problems with Visual C++ 2003 Workshop + +* 1.4.6, March 13, 2005 + - Fix an error in the shutdown code introduced in 1.4.5 + - Setting base/pkcs8_tries to 0 disables the builtin fail-out + - Support for XMPP identifiers in X.509 certificates + - Duplicate entries in X.509 DNs are removed + - More fixes for Borland C++, from Friedemann Kleint + - Add a workaround for buggy iostreams + +* 1.4.5, February 26, 2005 + - Add support for AES encryption of private keys + - Minor fixes for PBES2 parameter decoding + - Internal cleanups for global state variables + - GCC 3.x version detection was broken in non-English locales + - Work around a Sun Forte bug affecting mem_pool.h + - Several fixes for Borland C++ 5.5, from Friedemann Kleint + - Removed inclusion of init.h into base.h + - Fixed a major bug in reading from certificate stores + - Cleaned up a couple of mutex leaks + - Removed some left-over debugging code + - Removed SSL3_MAC, SSL3_PRF, and TLS_PRF + +* 1.4.4, December 2, 2004 + - Further tweaks to the pooling allocator + - Modified EMSA3 to support SSL/TLS signatures + - Changes to support Qt/QCA, from Justin Karneges + - Moved mux_qt module code into mod_qt + - Fixes for HP-UX from Mike Desjardins + +* 1.4.3, November 6, 2004 + - Split up SecureAllocator into Allocator and Pooling_Allocator + - Memory locking allocators are more likely to be used + - Fixed the placement of includes in some modules + - Fixed broken installation procedure + - Fixes in configure script to support alternate install programs + - Modules can specify the minimum version they support + +* 1.4.2, October 31, 2004 + - Fixed a major CRL handling bug + - Cipher and hash 
operations can be offloaded to engines + - Added support for cipher and hash offload in OpenSSL engine + - Improvements for 64-bit CPUs without a widening multiply instruction + - Support for SHA2-* and Whirlpool with EMSA2 + - Fixed a long-standing build problem with conflicting include files + - Fixed some examples that hadn't been updated for 1.4.x + - Portability fixes for Solaris, *BSD, HP-UX, and others + - Lots of fixes and cleanups in the configure script + - Updated the Gentoo ebuild file + +* 1.4.1, October 10, 2004 + - Fixed major errors in the X.509 and PKCS #8 copy_key functions + - Added a LAST_MESSAGE meta-message number for Pipe + - Added new aliases (3DES and DES-EDE) for Triple-DES + - Added some new functions to PK_Verifier + - Cleaned up the KDF interface + - Disabled tm_posix on *BSD due to header issues + - Fixed a build problem on PowerPC with GNU C++ pre-3.4 + +* 1.4.0, June 26, 2004 + - Added the FIPS 186 RNG back + - Added copy_key functions for X.509 public keys and PKCS #8 private keys + - Fixed PKCS #1 signatures with RIPEMD-128 + - Moved some code around to avoid warnings with Sun ONE compiler + - Fixed a bug in botan-config affecting OpenBSD + - Fixed some build problems on Tru64, HP-UX + - Fixed compile problems with Intel C++, Compaq C++ diff --git a/doc/packages/Botan.spec b/doc/packages/Botan.spec new file mode 100644 index 000000000..b0a3cb937 --- /dev/null +++ b/doc/packages/Botan.spec @@ -0,0 +1,172 @@ +# Botan base spec file + +# Note that some of the commands in here assume a GNU toolset, which is +# unfortunate and should probably be fixed. 
+ +################################################## +# Version numbers and config options # +################################################## +%define MAJOR $MAJOR +%define MINOR $MINOR +%define PATCH $PATCH + +%define ONLY_BASE_MODS 0 + +################################################## +# Hardware restrictions on various modules # +################################################## +%define USE_TM_HARD i586 i686 athlon x86_64 alpha sparcv9 sparc64 +%define USE_MP64 alpha ppc64 ia64 x86_64 + +################################################## +# Module settings # +################################################## +%define BASE_MODS alloc_mmap,ml_unix,es_egd,es_ftw,es_unix,fd_unix,tm_unix +%define EXTRA_MODS comp_bzip2,comp_zlib,mux_pthr,tm_posix,eng_gmp + +%ifarch %{USE_TM_HARD} + %{expand: %%define EXTRA_MODS %{EXTRA_MODS},tm_hard} +%endif + +%ifarch %{USE_MP64} + %{expand: %%define EXTRA_MODS %{EXTRA_MODS},mp_asm64} +%endif + +%if %{ONLY_BASE_MODS} + %define MODULES %{BASE_MODS} +%else + %define MODULES %{BASE_MODS},%{EXTRA_MODS} +%endif + +################################################## +# Descriptions # +################################################## +%define VERSION %{MAJOR}.%{MINOR}.%{PATCH} + +Name: Botan +Summary: A C++ crypto library +Version: %{VERSION} +Release: 1 +License: BSD +Group: System Environment/Libraries +Source: http://botan.randombit.net/files/%{name}-%{VERSION}.tgz +URL: http://botan.randombit.net/ +Packager: Jack Lloyd <[email protected]> +Prefix: /usr +BuildRequires: perl make + +%if ! %{ONLY_BASE_MODS} +Requires: zlib, bzip2 >= 1.0.2, gmp >= 4.1 +BuildRequires: zlib-devel, bzip2-devel >= 1.0.2, gmp-devel >= 4.1 +%endif + +BuildRoot: %{_tmppath}/%{name}-%{version}-root + +%description +Botan is a C++ library which provides support for many common cryptographic +operations, including encryption, authentication, and X.509v3 certificates and +CRLs. 
A wide variety of algorithms is supported, including RSA, DSA, DES, AES, +MD5, and SHA-1. + +%package devel +Summary: Development files for Botan +Group: Development/Libraries +Requires: Botan = %{VERSION} +%description devel +This package contains the header files and libraries needed to develop programs +that use the Botan library. + +################################################## +# Main Logic # +################################################## +%prep +%setup -n Botan-%{VERSION} + +%build +./configure.pl --noauto --modules=%{MODULES} gcc-%{_target_os}-%{_target_cpu} +make shared static + +%clean +rm -rf $RPM_BUILD_ROOT + +%install +ROOT="$RPM_BUILD_ROOT/usr" +make OWNER=`id -u` GROUP=`id -g` INSTALLROOT="$ROOT" install + +# Need this since we're installing shared libs... +%post +if ! grep "^$RPM_INSTALL_PREFIX/lib$" /etc/ld.so.conf 2>&1 >/dev/null +then + echo "$RPM_INSTALL_PREFIX/lib" >>/etc/ld.so.conf +fi +/sbin/ldconfig -X + +%postun +RMDIR_IGNORE_NONEMPTY="rmdir --ignore-fail-on-non-empty" +/sbin/ldconfig -X +if [ -d $RPM_INSTALL_PREFIX/share/doc/Botan-%{VERSION} ]; then + $RMDIR_IGNORE_NONEMPTY $RPM_INSTALL_PREFIX/share/doc/Botan-%{VERSION} +fi + +%postun devel +RMDIR_IGNORE_NONEMPTY="rmdir --ignore-fail-on-non-empty" +if [ -d $RPM_INSTALL_PREFIX/include/botan ]; then + $RMDIR_IGNORE_NONEMPTY $RPM_INSTALL_PREFIX/include/botan +fi + +################################################## +# File Lists # +################################################## +%files +%defattr(-,root,root) +%docdir /usr/share/doc/Botan-%{VERSION}/ +/usr/share/doc/Botan-%{VERSION}/license.txt +/usr/share/doc/Botan-%{VERSION}/readme.txt +/usr/share/doc/Botan-%{VERSION}/log.txt +/usr/share/doc/Botan-%{VERSION}/thanks.txt +/usr/share/doc/Botan-%{VERSION}/credits.txt +/usr/share/doc/Botan-%{VERSION}/pgpkeys.asc +/usr/lib/libbotan-%{MAJOR}.%{MINOR}.%{PATCH}.so + +%files devel +%defattr(-,root,root) +%docdir /usr/share/doc/Botan-%{VERSION}/ +/usr/share/doc/Botan-%{VERSION}/api.tex 
+/usr/share/doc/Botan-%{VERSION}/api.pdf +/usr/share/doc/Botan-%{VERSION}/tutorial.tex +/usr/share/doc/Botan-%{VERSION}/tutorial.pdf +/usr/share/doc/Botan-%{VERSION}/fips140.tex +/usr/share/doc/Botan-%{VERSION}/fips140.pdf +/usr/share/doc/Botan-%{VERSION}/deprecated.txt +/usr/share/doc/Botan-%{VERSION}/todo.txt +/usr/share/doc/Botan-%{VERSION}/bugs.txt +/usr/share/doc/Botan-%{VERSION}/rngs.txt +/usr/share/doc/Botan-%{VERSION}/botan.rc +/usr/lib/libbotan.so +/usr/lib/libbotan.a +/usr/include/botan/ +/usr/bin/botan-config + +################################################## +# Changelog # +################################################## +%changelog +* Wed Mar 17 2004 [email protected] + - Changed EXTRA_MODS to include eng_gmp, not mp_gmp + - Requires: included uneeded stuff if ONLY_BASE_MODS was used + +* Sun Feb 1 2004 [email protected] + - The Source: tag pointed to nowhere + - Removed the FIPS 140 stuff, it was messy and broken + +* Mon Dec 1 2003 [email protected] + - Cleaned up module handling + - Added a preliminary FIPS 140-2 toggle + - Use %defattr + +* Tue Nov 30 2003 [email protected] + - Default to installing into /usr instead of /usr/local + - Use tm_hard on sparcv9 + +* Tue Nov 23 2003 [email protected] + - Cleaned up the declaration of TIMERS diff --git a/doc/pgpkeys.asc b/doc/pgpkeys.asc new file mode 100644 index 000000000..32a7c411c --- /dev/null +++ b/doc/pgpkeys.asc @@ -0,0 +1,53 @@ +-----BEGIN PGP PUBLIC KEY BLOCK----- +Version: GnuPG v1.2.4 (GNU/Linux) + +mQILBEF/JS8BEAC3nJ0NyZNYmo04yqFK8lgLPKKw0wcYjpGELsA9YNNRnruVzHA8 +dMpKjz3q9evPEDdj/+C3OszeAdId0jZ/M5s/TCPWnwmi9wz8p6ICl+P8z70kgCE4 +ksrQpSjW6UkaKo6XV3qFFHFu4LPPnfNW2CbYAbQE7VRw2V8wzt9sGz09WYviHSPt +fUfOLYFmQQ4C2HUCGlGvhvo9eBnbm3OJxXz+of1Jlgu6+saNE3kLDS1L9nPZ8jHl +FCnimcAWq9N+PbFiooYy353vU1z4HRKYNAvX2AJcUzgWvoSGxElYnv+Acbb0h5Ps +FhaUWbQZEuN0gxeamWNE17mSZrd2IYl/85UbrdZ4S7asczVTCGbTWZA+DKzJe/ph +zGpyE5AjAUuTZqZl3ZQREkovnD1TA/dXxALiC3MyY/1QlJdaZ/N5CJQ/cMQzUHBX 
+TM7onLxZ3hLM1u1Anbz1At1HBUHLMQLmFBD6/x/UK2qIJvP1BO/n02/KU0rdOirG +Ud5zCyzCiAnIC6a2g0BDWRgnCvI7PNhbAfunHEOVgdZzA33Q22DckSr/24D5isTs +O+I6NuMEnoqOMxRwc1V2EAi5Xjudh/sOY8+LxF4QtyBKqC4SRwCFuNcpd6asdanm +dF8Dnxj+u50MyPmIVJ+ffjgZ4AFUPK3X1saKc1UyO6EZUGnxW08jf7EXkwAGKbQu +SmFjayBMbG95ZCAoU2lnbmluZyBLZXkpIDxsbG95ZEByYW5kb21iaXQubmV0PokC +LgQTAQIAGAIbAwIeAQIXgAUCQYA6eQMLCQIEFQIDCAAKCRBcD5boTsFta6E0D/kB +c0UlcCWMx1Hm69Z0EkuuW//RHl97UiPGZyhDACAGJ92L52IWcSsHRbwlimSV3z3G +iar/sVhLwK7GBG4+p+ACcO91du1Ei6r6jZwZXnkJbjNZ5vAGL5rpB11A5kzgkYai +9ayu9Ayk0Qd0NPLIJ5rkRfLDLhuk0T5vM5PTfO8yPH2G0+4VsaDuhPcFzzq/uHGI +q5rWz2NDYuW1r3fya0c1mtnv1Fa6b6AROpD2hepsSvKA0xpka/sioxLMbEymECW4 +Lsw2LYxL46A8UnGH3oJO5+T3ThsoYw1fxIwvyOx3CTQ1YO9rM5jS2jbJKIh2WhIo +kyg+tIFhUcECXzUWImZNCcde9O8DWQs4ylZHg+VPMeLlnIZFo137dF3QmLCnWUPx +k3qXCMM+vN8InHQlQDHFLb8uN4uQ2j3pTldtk6i+nKC6HpbKNQFERwMH4AXP3Kgr +MHlFaLTQPEr7GDE2l7jTJue/ArOOABF61mNLCxQmBLuqwOOx1exV6Gq7iu0I6PyK +KarC3eNoOxwUffd7RJe4+d5StFRh3RSNCOJMP7VklV6h+bytdli4NJnPttShqC45 +BknT7CjVD9sYWWoivHwAfdlqJ5xjAFVF8EOE7vYBswOCs42b+US1rzp20XPFYdaa +V16Eq5UZByV0+sB3zQ6732Sbjq7hWrlCZsNQhEHCspkBCwRBg9Y9AQgAx8TyZD3I +yRPuA5YrMdoVXNA4+iSNTEi65DPEftmzoVVEn3q8T09RDD4X6wHTh0wJaMr0U4Sc +7Z2bDSqnyy+PRhnAS3RBgFTZLNaRt+03s3uS8dVyLsyRa6My9rKgcuQupx6RYHdb +XA+vEWodsKdLtnEfaMF39SKinRH7qOutfjE7+8EwdeISle+Zwe2Tuuu+khGUSC2D +Pu0ScCgChJhDyXOoRH/eFV5/XnDc2k5R4GJI4c/XWyznd6HfV77lMPizEdXcAajM +W70fdDIl423nQxJGwK3GQfWDgCLxF/5Qt6mluFybeIsB1pf2FWRHwC3L0MsHtgg6 +x+YTu/eMHfTKQQAGKbQWQm90YW4gRGlzdHJpYnV0aW9uIEtleYkBNAQTAQIAHgUC +QYPWPQIbAwYLCQgHAwIDFQIDAxYCAQIeAQIXgAAKCRBiEevx77rfvLOYB/9VPMUd +Nj8mh6GZpeahB+WENPaC8cJGLVxa/SDWfE8wsCA3jgodFClySA7x507pBgj3NPHB +r5ZDG35apfyq/z8wAqOGqwsgH9cEXb4XBbpFdw723uDAoNjq59S43suO7QIhmtdh +VSpJ3AvPAboyJNzXvRpitqerWuJG2a1HCG5u8gxGjduTFldE9YembnfmHsJ+izsA +qP6Sr9DciUGGea5ItLYfTmtx4tDftUeL/Ek6qpFmImd1I/KA/e4ogjV85ou7vhgo +ZJ6td5a0LPMDlfMJlGi/TPvgKdNErBPm7mF/jOn12gyL4NWyO1dOpcMczyycDa28 +8BI1bru4r6/tadmZiQIcBBMBAgAGBQJBg9ZSAAoJEFwPluhOwW1r6v0QAJZWKzth 
+xoaQoqBEH/mjHyszfpdWExd2H+GecYe3mtqlttFGREdFxZTxYhwU7UDn35N/A013 +yA3UwS3/lpfkodZUocEN7EmAjGboObWmKxgd/92dhA8zXyGaqrA2Upj3uEWQ+kd3 +t6eu1z9K8ZtCGEtdiWnAGt0yXvp73EkueetFCSzF6AY+jxyv4L0dOzUvBrzPMHHI +HBgFhf7SagOQNKESG5fHqYQ5esa9Bqcm4i+mgjAYf7rYTrkxskaXBanD8wC2baMW +EEiEZXRuokIWn3MiB02B+yQFo+vSiKVu4lBmo/Qip7/oIK3heRLnXJZmrGMsAWkS +m8N5zg8bDfCK0p/+Sgaif1fwLTfn6cSMsGtbLXxD3W/DZRLIsj6ab12IwBLyljaC +iZ/EE7ZXBgcmvJnKSMJhCRxxEDW0e15GPiCpM5ujQCTjvLkyemVNbCLjSg/srZx+ +EJEGYyKjGZgRLjGwcS6l9EGjfGvUmr2rS3rzxZNPU1cc8w+i54X2YpkJ/alr1oey +oGZZObewJo1Aoa5sIMERSYlTKd2ZJq0nJgzs1VihSnoR+owAIW5M10GyMoyHtpZW +KRClskN8DUEhTeVl6qCOCuCFUBaQQ4F+EbQhVmP5FUImmyqhMnxqxjZWt4CtILKv +9WcsSd+4f9CmC8KIJt6mjcTfgMm3hJoA47Zm +=N8PX +-----END PGP PUBLIC KEY BLOCK----- diff --git a/doc/porting.txt b/doc/porting.txt new file mode 100644 index 000000000..48c095837 --- /dev/null +++ b/doc/porting.txt @@ -0,0 +1,144 @@ +* Botan Porting Guide, 1.2.x -> 1.4.x + +This is a guide for how to port your code from 1.2.x to 1.4.x. For the most +part, they are compatible, but there are a few cases that might cause +problems. The externally visible changes are much smaller than 1.0->1.2 changes +were. If you run into something that used to work that doesn't now and isn't +mentioned here, let me know so I can either fix it, or document it here. + +If you can provide a solid reason for 1.4.x to supply backwards-compatible +behavior with something that's mentioned here, I'll consider it, but no +promises. + +* Memory Containers + +The memory containers (SecureBuffer, SecureVector) have been changed to +subclass another object, MemoryRegion (a third subclass, MemoryVector, is just +like SecureVector, except it never locks memory - though it will clear it when +freeing). On it's own, this shouldn't cause any problems, but there are some +cases to be aware of. + +The ptr() function was renamed begin() to match the STL. This is probably the +change most likely to cause problems. 
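As a rough illustration of the ptr() to begin() change, here is a small sketch in plain C++. It does not use Botan: Region is a made-up stand-in for MemoryRegion<byte> with only begin(), end() and size(), and byte_sum and contains_zero are hypothetical helpers invented for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

typedef unsigned char byte;

// Hypothetical stand-in for MemoryRegion<byte>: just enough of the
// STL-style interface (begin/end/size) to show the porting pattern.
class Region
   {
   public:
      explicit Region(const std::vector<byte>& v) : buf(v) {}
      const byte* begin() const { return buf.empty() ? 0 : &buf[0]; }
      const byte* end() const { return begin() + buf.size(); }
      std::size_t size() const { return buf.size(); }
   private:
      std::vector<byte> buf;
   };

// A pointer/length routine of the kind that used to be handed buf.ptr()
unsigned byte_sum(const byte in[], std::size_t length)
   {
   unsigned sum = 0;
   for(std::size_t i = 0; i != length; ++i)
      sum += in[i];
   return sum;
   }

// 1.2.x style would have been byte_sum(buf.ptr(), buf.size())
unsigned byte_sum(const Region& buf)
   {
   return byte_sum(buf.begin(), buf.size());
   }

// With begin()/end() available, STL algorithms apply directly
bool contains_zero(const Region& buf)
   {
   return std::find(buf.begin(), buf.end(), byte(0)) != buf.end();
   }
```

In most code the port should be a mechanical search-and-replace of ptr() with begin(); the second helper only shows why matching the STL naming is convenient.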
+ +Various other functions (such as compare) were removed or renamed. The ones +that were removed can be replaced with STL algorithms (for example, compare -> +lexicographical_compare) and the renamed ones were typically renamed to match +the STL. + +SecureBuffer can now be resized, so the second template parameter shouldn't be +considered to be the same as the actual size; it is instead the *initial* size +of the buffer. You can get the current size by calling the size() member +function. While it's possible to modify the size of a SecureBuffer now, don't +do it: it's really confusing, and you should just use a SecureVector instead. + +Second (optional, but a good idea): convert any functions taking a "const +SecureVector<T>&" to "const MemoryRegion<T>&"; this will let them work with +arbitrary memory types; in particular, it will work with a MemoryVector, which +doesn't lock. In fact, the compiler will convert it for you, but this will slow +things down quite a bit (since copying it requires an allocation and a memcpy), +so it's a good idea to do the change. + +* OctetString / SymmetricKey / InitializationVector + +This probably wins the 'most likely to cause compile errors' award. There are +two changes: + +1) copy() was renamed bits_of() + +2) The implicit conversion to byte* was removed. If you were passing it to + another library function as a byte*/u32bit pair, there is probably a version + taking the object directly; use that instead. + + If you were using it for something else, do the following: 1) call bits_of() + and use the returned SecureVector<byte> to get the byte* value, and 2) email + me so I can figure out if what you're doing is worth supporting in the + library (obviously strike this last part if what you're doing is so totally + one-off that nobody would ever need it elsewhere). + +* BigInt::zero() / BigInt::one() + +They were removed.
Just use integer constants (0/1) instead; the performance +gain was extremely questionable, and there was lots of special glue to make +sure they worked correctly. + +* ASN.1 decoding + +If something took an X509_Encoding flag before, it probably doesn't +anymore. Some magical heuristics (BER::maybe_BER and PEM_Code::matches) have +been added which can successfully tell if something is PEM or BER/DER encoded +by looking at the data. The heuristics are not perfect (that's why I'm calling +them heuristics), but they work pretty well, and for the most part you would +have to go quite out of your way to fool them (you will be rewarded for your +hard work with an exception, when the decoding fails). + +The places that took it for encoding still do, as the library has no way to +guess which format you want it in. + +* General PK + +PK_Key::check_params() was renamed check_key() to better reflect what +operations are performed. + +* X.509 + +The first two arguments of X509_CA::update_crl (the list of new entries and the +CRL) have been swapped. Now the CRL is the first argument, and the list of +entries is the second. This just seemed a lot more natural. + +CRL_Usage was moved from being a global enumeration to a member of X509_Store, +which makes more sense as that is the only class that uses it. Just replace +CRL_Usage with X509_Store::CRL_Usage, and similarly for any elements of +CRL_Usage (ie, instead of TLS_SERVER, use X509_Store::TLS_SERVER). + +* PKCS #8 + +The PKCS #8 key loading interface has changed significantly in 1.4.0. First, +the versions taking memory objects have been completely removed. While it is, +in fact, quite useful to do this, it's not so useful that it's worth supporting +it in the library (IMO). Just create a DataSource_Memory and pass that to +load_key instead.
In fact, here's the code: + +PKCS8_PrivateKey* load_key(const SecureVector<byte>& buffer, + const std::string& pass) + { + DataSource_Memory source(buffer); + return PKCS8::load_key(source, pass); + } + +See, that was easy. :) + +Second, instead of passing a std::string, representing the passphrase, you pass +a User_Interface& object, which a) will not be used if the key isn't encrypted, +and b) will be called as many times as needed until the correct passphrase is +entered (or until the number of tries exceeds the config option +"base/pkcs8_tries", which defaults to 3 (or until the ui object returns +CANCEL_ACTION)). + +The base User_Interface class is pretty brain-dead. The constructor takes an +(optional) passphrase, which it spits back out the first time it's called. The +second time it gets called, it will return CANCEL_ACTION. This behavior is for +compatibility with the old std::string interface (the functions still exist as +forwarding functions, which just create the base UI object and pass it to the +real decoding functions). + +Updating your code to use the new PKCS #8 functions could make things much +nicer in your interface (for example, popping up a dialog box that asks for the +passphrase, but only if the key really is encrypted). There is a GTK+ example +that shows how to do this, check the web page. + +* Public/Private Keys + +This is a pretty big change. Almost all of the PK objects used to have a +constructor taking a DataSource&, from which it would read the key. However, +this was a poor design, because if you guess incorrectly as to what kind of key +was in use, bad stuff would happen. So basically it was impossible to use +safely. In addition, it was rather complex to support. + +Use {X509,PKCS8}::load_key and dynamic_cast<> instead. It's a bit more code, +but it's worth it for the flexibility and better error handling.
If you really +want something like the constructors, you can look at the try_load functions in +1.2.x's pkcs8.cpp and x509_key.cpp to see how they did it (you won't get the +same exact effect, since you can't add a constructor, but you can do something +that looks fairly similar). + diff --git a/doc/thanks.txt b/doc/thanks.txt new file mode 100644 index 000000000..0d1f538e7 --- /dev/null +++ b/doc/thanks.txt @@ -0,0 +1,44 @@ + +The following people (sorted alphabetically) contributed bug reports, useful +information, or were generally just helpful people to talk to: + +Jeff B +Mike Desjardins +Matthew Gregan +Hany Greiss +Friedemann Kleint +Ying-Chieh Liao +Dan Nicolaescu +Vaclav Ovsik +Ken Perano +Darren Starsmore +Kaushik Veeraraghavan +Dominik Vogt +James Widener + +Barry Kavanagh of AEP Systems Ltd kindly provided an AEP2000 crypto card and +drivers, enabling the creation of Botan's AEP engine module. + + +In addition, the following people have unknowingly contributed help: + + The implementation of DES is based off a public domain implementation by Phil + Karn from 1994 (he, in turn, credits Richard Outerbridge and Jim Gillogly). + + Rijndael and Square are based on the reference implementations written by + the inventors, Joan Daemen and Vincent Rijmen. + + The Serpent S-boxes used were discovered by Dag Arne Osvik and detailed in + his paper "Speeding Up Serpent". + + Matthew Skala's public domain twofish.c (as given in GnuPG 0.9.8) provided + the basis for my Twofish code (particularly the key schedule). + + Some of the hash functions (MD5, SHA-1, etc) use an optimized implementation + of one of the boolean functions, which was discovered by Colin Plumb. + + The design of Randpool takes some of its design principles from those + suggested by Eric A.
Young in his SSLeay documentation, Peter Gutmann's paper + "Software Generation of Practically Strong Random Numbers", and the paper + "Cryptanalytic Attacks on Pseudorandom Number Generators", by Kelsey, + Schneier, Wagner, and Hall. diff --git a/doc/todo.txt b/doc/todo.txt new file mode 100644 index 000000000..87647747c --- /dev/null +++ b/doc/todo.txt @@ -0,0 +1,56 @@ +Here are some notes about various things I should/could/might do. If you're +interested in working on something here (or something else!), drop me an email +and we can coordinate efforts. + +* Algorithms / Related + - Algorithms: ECDSA, ECDH, ... ? + +* X.509 / PKCS / ASN.1 + - X.509 code is in need of a major cleanup + - OCSP (RFC 2560) + - Attribute Certificates (RFC 3281) + - Support for Unicode (BMP STRING/UNIVERSAL STRING) strings in ASN1_String + - Support for Unicode/UTF-8 strings everywhere they may show up (certs, etc) + +* New Interfaces / Protocols + - SSL/TLS: Alpha release is available (http://ajisai.randombit.net) + - OpenPGP + - CMS: alpha1 is available as a separate download; currently stalled + - NIST's PKAPI: needs CMS + +* Modules + - EntropySources + z/OS, OS/400, VMS + - Compression: Zip, Gzip + - Dynamic Algorithm Loader + - Maybe, (maybe, maybe) integrate it with the stuff in algolist.cpp + so it can do automatic lookup. I'm rather skeptical of this approach + but it is a possibility. + - mp_asm64: z/Series + - HTTP certificate store access + - Engines + - VIA PadLock + - Broadcom BCM582x: Free Linux drivers are available, but I need a card + to test against. + - CryptoSwift: Rainbow blew me off when I contacted them. I have a card, + I just need drivers and API docs. + - Hifn: Sokretis sells them cheap, but drivers may be an issue. 
+ - IBM 4758 / CCA + - HP / Atalla + - Intel Performance Primitives library + - PKCS #11 + - Other suggestions welcome + +* Configure / Build System + - The build system doesn't handle GCC on Windows well + - Support for new OSes: + - z/OS + - OS/400 + - VMS + - Hurd + - Plan 9 + - Support more packaging systems + - Debian + - Solaris + - MacOS X [Fink?] + - Windows binary installer diff --git a/doc/tutorial.tex b/doc/tutorial.tex new file mode 100644 index 000000000..9dcd80ff8 --- /dev/null +++ b/doc/tutorial.tex @@ -0,0 +1,875 @@ +\documentclass{article} + +\setlength{\textwidth}{6.5in} % 1 inch side margins +\setlength{\textheight}{9in} % ~1 inch top and bottom margins + +\setlength{\headheight}{0in} +\setlength{\topmargin}{0in} +\setlength{\headsep}{0in} + +\setlength{\oddsidemargin}{0in} +\setlength{\evensidemargin}{0in} + +\title{\textbf{Botan 1.4.x Tutorial}} +\author{Jack Lloyd \\ + \texttt{[email protected]}} +\date{} + +\newcommand{\filename}[1]{\texttt{#1}} +\newcommand{\manpage}[2]{\texttt{#1}(#2)} + +\newcommand{\macro}[1]{\texttt{#1}} + +\newcommand{\function}[1]{\textbf{#1}} +\newcommand{\type}[1]{\texttt{#1}} +\renewcommand{\arg}[1]{\textsl{#1}} +\newcommand{\variable}[1]{\textsl{#1}} + +\begin{document} + +\maketitle + +\tableofcontents + +\parskip=5pt +\pagebreak + +\section{Introduction} + +This document essentially sets up various simple scenarios and then shows how +to solve the problems using Botan 1.4.x. It's fairly simple, and doesn't cover +many of the available APIs and algorithms, especially the more obscure or +unusual ones. It is a supplement to the API documentation and the example +applications, which are included in the distribution. + +To quote the Perl man page: '``There's more than one way to do it.'' Divining +how many more is left as an exercise to the reader.' + +This is \emph{not} a general introduction to cryptography, and most simple +terms and ideas are not explained in any great detail. 
+ +Finally, most of the code shown in this tutorial has not been tested; it was +just written down from memory. If you find errors, please let me know. + +\section{Initializing the Library} + +The first step to using Botan is to create a \type{LibraryInitializer} object, +which handles creating various internal structures, and also destroying them at +shutdown. Essentially: + +\begin{verbatim} +#include <botan/botan.h> +/* include other headers here */ + +int main() + { + LibraryInitializer init; + /* now do stuff */ + return 0; + } +\end{verbatim} + +\section{Hashing a File} + + + +\section{Symmetric Cryptography} + +\subsection{Encryption with a passphrase} + +Probably the most common crypto problem is encrypting a file (or some data that +is in-memory) using a passphrase. There are a million ways to do this, most of +them bad. In particular, you'll have to protect against weak passphrases, +people reusing a passphrase many times, accidental and deliberate modification, +and a dozen other potential problems. + +We'll start with a simple method that is commonly used, and show the problems +that can arise. Each subsequent solution will modify the previous one to +prevent one or more common problems, until we arrive at a good version. + +In these examples, we'll always use Blowfish in CBC mode. Blowfish has been +around almost 10 years at this point, and is well known and trusted. The main +reason for choosing Blowfish over, say, TripleDES, is because Blowfish supports +nearly arbitrary key lengths, allowing us to easily try many different ways of +generating the keys. For production code, another algorithm (such as TripleDES +or AES) may be more appropriate. Whenever we need a hash function, we'll use +SHA-1, since that is a common and well-known hash that is thought to be secure. + +In all examples, we choose to derive the IV from the passphrase.
Another +(probably more common) alternative is to generate the IV randomly and include +it at the beginning of the message. Either way is acceptable, and can be +secure. The method used here was chosen to make for more interesting examples +(because it's harder to get right), and may not be an appropriate choice for +some environments. + +First, some notation. The passphrase is stored as a \type{std::string} named +\variable{passphrase}. The input and output files (\variable{infile} and +\variable{outfile}) are of types \type{std::ifstream} and \type{std::ofstream} +(respectively). + +\subsubsection{First try} + +We hash the passphrase with SHA-1, and use the resulting hash to key Blowfish. +To generate the IV, we prepend a single '0' character to the passphrase, hash +it, and truncate it to 8 bytes (which is Blowfish's block size). + +\begin{verbatim} + HashFunction* hash = get_hash("SHA-1"); + + SymmetricKey key = hash->process(passphrase); + SecureVector<byte> raw_iv = hash->process('0' + passphrase); + InitializationVector iv(raw_iv, 8); + + Pipe pipe(get_cipher("Blowfish/CBC/PKCS7", key, iv, ENCRYPTION)); + + pipe.start_msg(); + infile >> pipe; + pipe.end_msg(); + outfile << pipe; +\end{verbatim} + +\subsubsection{Problem 1: Buffering} + +There is a problem with the above code, if the input file is fairly large as +compared to available memory. Specifically, all of the encrypted data is stored +in memory, and then flushed to \variable{outfile} in a single go at the very +end. If the input file is big (say, a gigabyte), this will be most problematic. + +The solution is to use a \type{DataSink} to handle the output for us (writing +to \arg{outfile} will be implicit with writing to the \type{Pipe}). 
We can do +this by replacing the last few lines with: + +\begin{verbatim} + Pipe pipe(get_cipher("Blowfish/CBC/PKCS7", key, iv, ENCRYPTION), + new DataSink_Stream(outfile)); + + pipe.start_msg(); + infile >> pipe; + pipe.end_msg(); +\end{verbatim} + +\subsubsection{Problem 2: Deriving the key and IV} + +Hash functions like SHA-1 are deterministic; if the same passphrase is supplied +twice, then the key (and in our case, the IV) will be the same. This is very +dangerous, and could easily open the whole system up to attack. What we need to +do is introduce a salt (or nonce) into the generation of the key from the +passphrase. This will mean that the key will not be the same each time the same +passphrase is typed in by a user. + +There is another problem with using a bare hash function to derive keys. While +it's inconceivable that an attacker could brute-force a 160-bit key, it would +be fairly simple for them to compute the SHA-1 hashes of various common +passwords ('password', the name of the dog, SO's middle name, etc) and try +those as keys. So we want to slow the attacker down if we can, and an easy way +to do that is to iterate the hash function a bunch of times (say, 1024 to 4096 +times). This will involve only a small amount of effort for a legitimate user +(since they only have to compute the hashes once, when they type in their +passphrase), but an attacker, trying out a large list of potential passphrases, +will be seriously annoyed by this. + +In this iteration of the example, we'll kill these two birds with one stone, +and derive the key from the passphrase using a S2K (string to key) algorithm +(these are also often called PBKDF algorithms, for Password Based Key +Derivation Function). In this example, we use PBKDF2 with HMAC(SHA-1), which is +specified in PKCS \#5. 
We replace the first four lines of code from the first +example with: + +\begin{verbatim} + S2K* s2k = get_s2k("PBKDF2(SHA-1)"); + // hard-coded iteration count for simplicity; should be sufficient + s2k->set_iterations(4096); + // 8 octets == 64 bit salt; again, good enough + s2k->new_random_salt(8); + SecureVector<byte> the_salt = s2k->current_salt(); + + // 28 octets == 20 for key + 8 for IV + SecureVector<byte> key_and_IV = s2k->derive_key(28, passphrase); + + SymmetricKey key(key_and_IV, 20); + InitializationVector iv(key_and_IV + 20, 8); +\end{verbatim} + +To complete the example, we have to remember to write out the salt (stored in +\variable{the\_salt}) at the beginning of the file. The receiving side needs to +know this value in order to restore it (by calling the \variable{s2k} object's +\function{change\_salt} function) so it can derive the same key and IV from the +passphrase. + +\subsubsection{Problem 3: Protecting against modification} + +As it is, an attacker can undetectably alter the message while it is in +transit. It is vital to remember that encryption does not imply authentication +(except when using special modes which are specifically designed to provide +authentication along with encryption, like OCB and EAX). For this purpose, we +will append a message authentication code to the encrypted +message. Specifically, we will generate an extra 160 bits of key data, and use +it to key the ``HMAC(SHA-1)'' MAC function. We don't want to have the MAC and +the cipher to share the same key; that is very much a no-no. 
+ +\begin{verbatim} + // 48 octets == 20 for blowfish key + 8 for IV + 20 for hmac key + SecureVector<byte> keys_and_IV = s2k->derive_key(48, passphrase); + + SymmetricKey key(keys_and_IV, 20); + InitializationVector iv(keys_and_IV + 20, 8); + SymmetricKey mac_key(keys_and_IV + 28, 20); + + Pipe pipe(new Fork( + new Chain( + get_cipher("Blowfish/CBC/PKCS7", key, iv, ENCRYPTION), + new DataSink_Stream(outfile) + ), + new MAC_Filter("HMAC(SHA-1)", mac_key) + ) + ); + + pipe.start_msg(); + infile >> pipe; + pipe.end_msg(); + + // now read the MAC from message #2. Message numbers start from 0 + SecureVector<byte> hmac = pipe.read_all(1); + outfile.write((const char*)hmac.ptr(), hmac.size()); +\end{verbatim} + +The receiver can check the size of the file (in bytes), and since it knows how +long the MAC is, can figure out how many bytes of ciphertext there are. Then it +reads in that many bytes, sending them to a Blowfish/CBC decryption object +(which could be obtained by calling \verb|get_cipher| with an argument of +\type{DECRYPTION} instead of \type{ENCRYPTION}), and storing the final bytes to +authenticate the message with. + +\subsubsection{Problem 4: Cleaning up the key generation} + +The method used to derive the keys and IV is rather inelegant, and it would be +nice to clean that up a bit, algorithmically speaking. A nice solution for this +is to generate a master key from the passphrase and salt, and then generate the +two keys and the IV (the cryptovariables) from that. + +Starting from the master key, we derive the cryptovariables using a KDF +algorithm, which is designed, among other things, to ``separate'' keys so that +we can derive several different keys from the single master key. For this +purpose, we will use KDF2, which is a generally useful KDF function (defined in +IEEE 1363a, among other standards). The use of different labels (``cipher +key'', etc) makes sure that each of the three derived variables will have +different values. 
+ +\begin{verbatim} + S2K* s2k = get_s2k("PBKDF2(SHA-1)"); + // hard-coded iteration count for simplicity; should be sufficient + s2k->set_iterations(4096); + // 8 octets == 64 bit salt; again, good enough + s2k->new_random_salt(8); + // store the salt so we can write it to a file later + SecureVector<byte> the_salt = s2k->current_salt(); + + SymmetricKey master_key = s2k->derive_key(48, passphrase); + + KDF* kdf = get_kdf("KDF2(SHA-1)"); + + SymmetricKey key = kdf->derive_key(20, master_key, "cipher key"); + SymmetricKey mac_key = kdf->derive_key(20, master_key, "hmac key"); + InitializationVector iv = kdf->derive_key(8, master_key, "cipher iv"); +\end{verbatim} + +\subsubsection{Final version} + +Here is the final version of the encryption code, with all of the changes we've +made: + +\begin{verbatim} + S2K* s2k = get_s2k("PBKDF2(SHA-1)"); + s2k->set_iterations(4096); + s2k->new_random_salt(8); + SecureVector<byte> the_salt = s2k->current_salt(); + + SymmetricKey master_key = s2k->derive_key(48, passphrase); + + KDF* kdf = get_kdf("KDF2(SHA-1)"); + + SymmetricKey key = kdf->derive_key(20, master_key, "cipher key"); + SymmetricKey mac_key = kdf->derive_key(20, master_key, "hmac key"); + InitializationVector iv = kdf->derive_key(8, master_key, "cipher iv"); + + Pipe pipe(new Fork( + new Chain( + get_cipher("Blowfish/CBC/PKCS7", key, iv, ENCRYPTION), + new DataSink_Stream(outfile) + ), + new MAC_Filter("HMAC(SHA-1)", mac_key) + ) + ); + + outfile.write((const char*)the_salt.ptr(), the_salt.size()); + + pipe.start_msg(); + infile >> pipe; + pipe.end_msg(); + + SecureVector<byte> hmac = pipe.read_all(1); + outfile.write((const char*)hmac.ptr(), hmac.size()); +\end{verbatim} + +\subsubsection{Another buffering technique} + +Sometimes the use of \type{DataSink\_Stream} is not practical for whatever +reason. In this case, an alternate buffering mechanism might be useful.
Here is +some code which will write all the processed data as quickly as possible, so +that memory pressure is reduced in the case of large inputs. + +\begin{verbatim} + pipe.start_msg(); + SecureBuffer<byte, 1024> buffer; + while(infile.good()) + { + infile.read((char*)buffer.ptr(), buffer.size()); + u32bit got_from_infile = infile.gcount(); + pipe.write(buffer, got_from_infile); + + if(infile.eof()) + pipe.end_msg(); + + while(pipe.remaining() > 0) + { + u32bit buffered = pipe.read(buffer, buffer.size()); + outfile.write((const char*)buffer.ptr(), buffered); + } + } + if(infile.bad() || (infile.fail() && !infile.eof())) + throw Some_Exception(); +\end{verbatim} + +\pagebreak + +\subsection{Authentication} + +After doing the encryption routines, doing message authentication keyed off a +passphrase is not very difficult. In fact it's much easier than the encryption +case, for the following reasons: a) we only need one key, and b) we don't have +to store anything, so all the input can be done in a single step without +worrying about it taking up a lot of memory if the input file is large. + +In this case, we'll hex-encode the salt and the MAC, and output them both to +standard output (the salt followed by the MAC). + +\begin{verbatim} + S2K* s2k = get_s2k("PBKDF2(SHA-1)"); + s2k->set_iterations(4096); + s2k->new_random_salt(8); + OctetString the_salt = s2k->current_salt(); + + SymmetricKey hmac_key = s2k->derive_key(20, passphrase); + + Pipe pipe(new MAC_Filter("HMAC(SHA-1)", hmac_key), + new Hex_Encoder + ); + + std::cout << the_salt.to_string(); // hex encoded + + pipe.start_msg(); + infile >> pipe; + pipe.end_msg(); + std::cout << pipe.read_all_as_string() << std::endl; +\end{verbatim} + +\subsection{User Authentication} + +Doing user authentication off a shared passphrase is fairly easy. Essentially, +a challenge-response protocol is used - the server sends a random challenge, +and the client responds with an appropriate response to the challenge.
The idea +is that only someone who knows the passphrase can generate a valid response, or +check whether a given response is valid. + +Let's say we use 160 bit (20 byte) challenges, which seems fairly +reasonable. We can create this challenge using the global RNG: + +\begin{verbatim} + byte challenge[20]; + Global_RNG::randomize(challenge, sizeof(challenge), Nonce); + // send challenge to client +\end{verbatim} + +After reading the challenge, the client generates a response based on the +challenge and the passphrase. In this case, we will do it by repeatedly hashing +the challenge, the passphrase, and (if applicable) the previous digest. We +iterate this construction 4096 times, to make brute force attacks on the +passphrase hard to do. Since we are already using 160 bit challenges, a 160 bit +response seems warranted, so we'll use SHA-1. + +\begin{verbatim} + HashFunction* hash = get_hash("SHA-1"); + SecureVector<byte> digest; + for(u32bit j = 0; j != 4096; j++) + { + hash->update(digest, digest.size()); + hash->update(passphrase); + hash->update(challenge, challenge.size()); + digest = hash->final(); + } + delete hash; + // send value of digest to the server +\end{verbatim} + +Upon receiving the response from the client, the server computes what the +response should have been based on the challenge it sent out, and the +passphrase. If the two responses match, the client is authenticated. +Otherwise, it is not. + +An alternate method is to use PBKDF2 again, using the challenge as the salt. In +this case, the response could (for example) be the hash of the key produced by +PBKDF2. There is no reason to have an explicit iteration loop, as PBKDF2 is +designed to prevent dictionary attacks (assuming PBKDF2 is set up for a large +iteration count internally).
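Whichever way the response is computed, the server still has to compare the value it expected with the value the client sent. The following is only a sketch of that comparison step, written in plain standard C++ with no Botan calls. The function name \texttt{same\_response} is a hypothetical helper, not part of the library, and it assumes both responses have already been read into byte vectors. It avoids returning at the first mismatching byte, so the time taken does not reveal how much of a guess was correct.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of Botan): compare two responses without
// an early exit, so timing does not depend on where the first mismatch is.
bool same_response(const std::vector<unsigned char>& expected,
                   const std::vector<unsigned char>& received)
   {
   // The response length (20 bytes for SHA-1) is public, so an early
   // return on a length mismatch leaks nothing useful.
   if(expected.size() != received.size())
      return false;

   unsigned char diff = 0;
   for(std::size_t j = 0; j != expected.size(); j++)
      diff = static_cast<unsigned char>(diff | (expected[j] ^ received[j]));

   // diff is zero only if every byte pair was identical
   return (diff == 0);
   }
```

The server would call this with its own computed digest and the bytes received from the client, and authenticate the client only if it returns true.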
+ +\pagebreak + +\section{Public Key Cryptography} + +\subsection{Basic Operations} + +In this section, we'll assume we have a \type{X509\_PublicKey*} named +\arg{pubkey}, and, if necessary, a private key type (a +\type{PKCS8\_PrivateKey*}) named \arg{privkey}. A description of these types, +how to create them, and related details appears later in this tutorial. In this +section, we will use various functions which are defined in +\filename{look\_pk.h} -- you will have to include this header explicitly. + +\subsubsection{Encryption} + +Basically, pick an encoding method, create a \type{PK\_Encryptor} (with +\function{get\_pk\_encryptor}()), and use it. But first we have to make sure +the public key can actually be used for public key encryption. For encryption +(and decryption), the key could be RSA, ElGamal, or (in future versions) some +other public key encryption scheme, like Rabin or an elliptic curve scheme. + +\begin{verbatim} + PK_Encrypting_Key* key = dynamic_cast<PK_Encrypting_Key*>(pubkey); + if(!key) + error(); + PK_Encryptor* enc = get_pk_encryptor(*key, "EME1(SHA-1)"); + + byte msg[] = { /* ... */ }; + + // will also accept a SecureVector<byte> as input + SecureVector<byte> ciphertext = enc->encrypt(msg, sizeof(msg)); +\end{verbatim} + +\subsubsection{Decryption} + +This is essentially the same as the encryption operation, but using a private +key instead. One major difference is that the decryption operation can fail due +to the fact that the ciphertext was invalid (most common padding schemes, such +as ``EME1(SHA-1)'', include various pieces of redundancy, which are checked +after decryption). + +\begin{verbatim} + PK_Decrypting_Key* key = dynamic_cast<PK_Decrypting_Key*>(privkey); + if(!key) + error(); + PK_Decryptor* dec = get_pk_decryptor(*key, "EME1(SHA-1)"); + + byte msg[] = { /* ... 
*/ }; + + SecureVector<byte> plaintext; + + try { + // will also accept a SecureVector<byte> as input + plaintext = dec->decrypt(msg, sizeof(msg)); + } + catch(Decoding_Error) + { + /* the ciphertext was invalid */ + } +\end{verbatim} + +\subsubsection{Signature Generation} + +There is one difficulty with signature generation that does not occur with +encryption or decryption. Specifically, there are various padding methods which +can be useful for different signature algorithms, and not all are appropriate +for all signature schemes. The following table breaks down which algorithms +support which encodings: + +\begin{tabular}{|c|c|c|} \hline +Signature Algorithm & Usable Encoding Methods & Preferred Encoding(s) \\ \hline +DSA / NR & EMSA1 & EMSA1 \\ \hline +RSA & EMSA1, EMSA2, EMSA3, EMSA4 & EMSA3, EMSA4 \\ \hline +Rabin-Williams & EMSA2, EMSA4 & EMSA2, EMSA4 \\ \hline +\end{tabular} + +For new applications, use EMSA4 with both RSA and Rabin-Williams, as it is +significantly more secure than the alternatives. However, most current +applications/libraries only support EMSA2 with Rabin-Williams and EMSA3 with +RSA. Given this, you may be forced to use less secure encoding methods for the +near future. In these examples, we punt on the problem, and hard-code using +EMSA1 with SHA-1. + +\begin{verbatim} + PK_Signing_Key* key = dynamic_cast<PK_Signing_Key*>(privkey); + if(!key) + error(); + PK_Signer* signer = get_pk_signer(*key, "EMSA1(SHA-1)"); + + byte msg[] = { /* ... */ }; + + /* + You can also repeatedly call update(const byte[], u32bit), followed by a + call to signature(), which will return the final sig of all the data that + was passed through update(). sign_message() is just a stub that calls + update() once, and returns the value of signature().
+ */ + SecureVector<byte> signature = signer->sign_message(msg, sizeof(msg)); +\end{verbatim} + +\pagebreak + +\subsubsection{Signature Verification} + +In addition to all of the problems with choosing the correct padding method, +there is yet another complication with verifying a signature. Namely, there are +two varieties of signature algorithms - those providing message recovery (that +is, the value that was signed can be directly recovered by someone verifying +the signature), and those without message recovery (the verify operation simply +returns if the signature was valid, without telling you exactly what was +signed). This leads to two slightly different implementations of the +verification operation, which user code has to work with. As you can see +however, the implementation is still not at all difficult. + +\begin{verbatim} + PK_Verifier* verifier = 0; + + PK_Verifying_with_MR_Key* key1 = + dynamic_cast<PK_Verifying_with_MR_Key*>(pubkey); + PK_Verifying_wo_MR_Key* key2 = + dynamic_cast<PK_Verifying_wo_MR_Key*>(pubkey); + + if(key1) + verifier = get_pk_verifier(*key1, "EMSA1(SHA-1)"); + else if(key2) + verifier = get_pk_verifier(*key2, "EMSA1(SHA-1)"); + else + error(); + + byte msg[] = { /* ... */ }; + byte sig[] = { /* ... */ }; + + /* + Like PK_Signer, you can also do repeated calls to + void update(const byte some_data[], u32bit length) + followed by a call to + bool check_signature(const byte the_sig[], u32bit length) + which will return true (valid signature) or false (bad signature). + The function verify_message() is a simple wrapper around update() and + check_signature(). + + */ + bool is_valid = verifier->verify_message(msg, sizeof(msg), sig, sizeof(sig)); +\end{verbatim} + +\subsubsection{Key Agreement} + +WRITEME + +\pagebreak + +\subsection{Working with Keys} + +\subsubsection{Reading Public Keys (X.509 format)} + +There are two separate ways to read X.509 public keys. Remember that the X.509 +public keys are simply that: public keys. 
There is no associated information +(such as the owner of that key) included with the public key itself. If you +need that kind of information, you'll need to use X.509 certificates. + +However, there are surely times when a simple public key is sufficient. The +most obvious is when the key is implicitly trusted, for example if access +and/or modification of it is controlled by something else (like filesystem +ACLs). In other cases, it is a perfectly reasonable proposition to use them +over the network as an anonymous key exchange mechanism. This is, admittedly, +vulnerable to man-in-the-middle attacks, but it's simple enough that it's hard +to mess up (see, for example, Peter Gutmann's paper ``Lessons Learned in +Implementing and Deploying Crypto Software'' in Usenix '02). + +The way to load a key is basically to set up a \type{DataSource} and then call +\function{X509::load\_key}, which will return a \type{X509\_PublicKey*}. For +example: + +\begin{verbatim} + DataSource_Stream somefile("somefile.pem"); // has 3 public keys + X509_PublicKey* key1 = X509::load_key(somefile); + X509_PublicKey* key2 = X509::load_key(somefile); + X509_PublicKey* key3 = X509::load_key(somefile); + // Now we have them all loaded. Huzzah! +\end{verbatim} + +At this point you can use \function{dynamic\_cast} to find the operations the +key supports (by seeing if a cast to \type{PK\_Encrypting\_Key}, +\type{PK\_Verifying\_with\_MR\_Key}, or \type{PK\_Verifying\_wo\_MR\_Key} +succeeds). + +There are variants of \function{X509::load\_key} and of +\function{PKCS8::load\_key} (described in the next section) which take a +filename (as a \type{std::string}). These are just convenience functions which +create the appropriate \type{DataSource} for you and call the main +\function{load\_key} function.
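The idea of probing a key's capabilities with \function{dynamic\_cast} can be sketched without the library at all. The classes below are stand-ins invented for this illustration (\texttt{Public\_Key\_Base}, \texttt{Encrypting\_Key}, \texttt{Verifying\_Key}, and the two demo key types are not Botan's real class definitions). The sketch only shows the pattern: hold a pointer to the common base type, and test each capability interface in turn.

```cpp
#include <string>

// Stand-in types for illustration only; these are NOT Botan's classes.
struct Public_Key_Base { virtual ~Public_Key_Base() {} };
struct Encrypting_Key : public virtual Public_Key_Base {};
struct Verifying_Key  : public virtual Public_Key_Base {};

// A key type that can both encrypt and verify, and one that can only verify
struct Demo_RSA_Key : public Encrypting_Key, public Verifying_Key {};
struct Demo_DSA_Key : public Verifying_Key {};

// Returns a space-separated list of the operations the key supports
std::string describe(Public_Key_Base* key)
   {
   std::string ops;
   // dynamic_cast yields a null pointer when the object lacks the interface
   if(dynamic_cast<Encrypting_Key*>(key))
      ops += "encrypt ";
   if(dynamic_cast<Verifying_Key*>(key))
      ops += "verify ";
   if(ops.empty())
      ops = "none";
   return ops;
   }
```

In real code the base pointer would be the \type{X509\_PublicKey*} returned by \function{X509::load\_key}, and the casts would target the \type{PK\_*\_Key} types named above.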
+ +\subsubsection{Reading Private Keys (PKCS \#8 format)} + +This is very similar to reading raw public keys, with the difference that the +key may be encrypted with a user passphrase: + +\begin{verbatim} + DataSource_Stream somefile("somefile"); + std::string a_passphrase = /* get from the user */; + PKCS8_PrivateKey* key = PKCS8::load_key(somefile, a_passphrase); +\end{verbatim} + +You can, by the way, convert a \type{PKCS8\_PrivateKey} to a +\type{X509\_PublicKey} simply by casting it (with \function{dynamic\_cast}), as +the private key type is derived from \type{X509\_PublicKey}. As with +\type{X509\_PublicKey}, you can use \function{dynamic\_cast} to figure out what +operations the private key is capable of; in particular, you can attempt to +cast it to \type{PK\_Decrypting\_Key}, \type{PK\_Signing\_Key}, or +\type{PK\_Key\_Agreement\_Key}. + +Sometimes you can get away with having a static passphrase passed to +\function{load\_key}. Typically, however, you'll have to do some user +interaction to get the appropriate passphrase. In that case you'll want to use +the \type{UI} related interface, which is fully described in the API +documentation. + +\subsubsection{Generating New Private Keys} + +Generating a new private key is the one operation which requires you to +explicitly name the type of key you are working with. There are (currently) two +kinds of public key algorithms in Botan: ones based on the integer +factorization (IF) problem (RSA and Rabin-Williams), and ones based on the +discrete logarithm (DL) problem (DSA, Diffie-Hellman, Nyberg-Rueppel, and +ElGamal). Since discrete logarithm parameters (primes and generators) can be +shared among many keys, there is the notion of these being a combined type +(called \type{DL\_Group}). + +To create a new DL-based private key, simply pass a desired \type{DL\_Group} to +the constructor of the private key - a new public/private key pair will be +generated.
Since the modulus used in IF-based algorithms isn't shared by other +keys, we don't use this notion there. Instead, you create a new key by passing in a +\type{u32bit} telling how long (in bits) the key should be. + +There are quite a few ways to get a \type{DL\_Group} object. The best is to use +the function \function{get\_dl\_group}, which takes a string naming a group; it +will either return that group, if it knows about it, or throw an +exception. Names it knows about include ``IETF-n'' where n is 768, 1024, 1536, +2048, 3072, or 4096, and ``DSA-n'', where n is 512, 768, or 1024. The IETF +groups are the ones specified for use with IPSec, and the DSA ones are the +default DSA parameters specified by Java's JCE. For DSA and Nyberg-Rueppel, use +the ``DSA-n'' groups, and for Diffie-Hellman and ElGamal, use the ``IETF-n'' +groups. + +You can also generate a new random group. This is not recommended, because it is +very slow, particularly for ``safe'' primes, which are needed for +Diffie-Hellman and ElGamal. + +Some examples: + +\begin{verbatim} + RSA_PrivateKey rsa1(512); // 512-bit RSA key + RSA_PrivateKey rsa2(2048); // 2048-bit RSA key + + RW_PrivateKey rw1(1024); // 1024-bit Rabin-Williams key + RW_PrivateKey rw2(1536); // 1536-bit Rabin-Williams key + + DSA_PrivateKey dsa(get_dl_group("DSA-512")); // 512-bit DSA key + DH_PrivateKey dh(get_dl_group("IETF-4096")); // 4096-bit DH key + NR_PrivateKey nr(get_dl_group("DSA-1024")); // 1024-bit NR key + ElGamal_PrivateKey elg(get_dl_group("IETF-1536")); // 1536-bit ElGamal key +\end{verbatim} + +To export your newly created private key, use the PKCS \#8 routines in +\filename{pkcs8.h}: + +\begin{verbatim} + std::string a_passphrase = /* get from the user */; + std::string the_key = PKCS8::PEM_encode(rsa2, a_passphrase); +\end{verbatim} + +You can read the key back in using \function{PKCS8::load\_key}, described in +the section ``Reading Private Keys (PKCS \#8 format)'', above.
Unfortunately, +this only works with keys that have an assigned algorithm identifier and +standardized format. Currently this is only the RSA, DSA, DH, and ElGamal +algorithms, though RW and NR keys can also be imported and exported by +assigning them an OID (this can be done either through a configuration file, or +by calling the function \function{OIDS::add\_oid} in \filename{oids.h}). Be +aware that the OID and format for ElGamal keys are not exactly standard, but +there does exist at least one other crypto library which will accept the +format. + +The raw public key can be exported using: + +\begin{verbatim} + std::string the_public_key = X509::PEM_encode(rsa2); +\end{verbatim} + +\pagebreak + +\section{X.509v3 Certificates} + +Using certificates is rather complicated, so only the very basic mechanisms are +going to be covered here. The section ``Setting up a CA'' goes into reasonable +detail about CRLs and certificate requests, but there is a lot that isn't +covered (else this section would get quite long and complicated). + +\subsection{Importing and Exporting Certificates} + +Importing and exporting X.509 certificates is easy. Simply call the constructor +with either a \type{DataSource\&}, or the name of a file: + +\begin{verbatim} + X509_Certificate cert1("cert1.pem"); + + /* This file contains two certificates, concatenated */ + DataSource_Stream in("certs2_and_3.pem"); + + X509_Certificate cert2(in); // read the first cert + X509_Certificate cert3(in); // read the second cert +\end{verbatim} + +Exporting the certificate is a simple matter of calling the member function +\function{PEM\_encode}(), which returns a \type{std::string} containing the +certificate in PEM encoding.
+ +\begin{verbatim} + std::cout << cert3.PEM_encode(); + some_ostream_object << cert1.PEM_encode(); + std::string cert2_str = cert2.PEM_encode(); +\end{verbatim} + +\subsection{Verifying Certificates} + +Verifying a certificate requires that we build up a chain of trust, starting +from the root (usually a commercial CA), down through some number of +intermediate CAs, and finally reaching the actual certificate in +question. Thus, to verify, we actually have to have all of those certificates +on hand (or at the very least, know where we can get the ones we need). + +The class which handles both storing certificates, and verifying them, is +called \type{X509\_Store}. We'll start by assuming that we have all of the +certificates we need, and just want to verify a cert. This is done by calling +the member function \function{validate\_cert}, which takes the +\type{X509\_Certificate} in question, and an optional argument of type +\type{Cert\_Usage} (which is ignored here; read the section in the API doc +titled ``Verifying Certificates'' for information). It returns an enum, +\type{X509\_Code}, which, for most purposes, is either \type{VERIFIED}, or +something else (which specifies what circumstance caused the certificate to be +considered invalid). Really, that's it. + +Now, how to let \type{X509\_Store} know about all those certificates and CRLs +we have lying around? The simplest method is to add them directly, using the +functions \function{add\_cert}, \function{add\_certs}, +\function{add\_trusted\_certs}, and \function{add\_crl}; for details, consult +the API doc or read the \filename{x509stor.h} header. There is also a much more +elegant and powerful method: \type{Certificate\_Store}s. A certificate store +refers to an object which knows how to retrieve certificates from some external +source (a file, an LDAP directory, a HTTP server, a SQL database, or anything +else).
By calling the function \function{add\_new\_certstore}, you can register +a new certificate store, which \type{X509\_Store} will use to find certificates +it needs. Thus, you can get away with only adding whichever root CA cert(s) you +want to use, letting some online source handle the storage of all intermediate +X.509 certificates. The API documentation has a more complete discussion of +\type{Certificate\_Store}. + +\subsection{Setting up a CA} + +WRITEME + +\pagebreak + +\section{Special Topics} + +This chapter is for subjects which don't really fit into the API documentation +or into other chapters of the tutorial. + +\subsection{GUIs} + +There is nothing particularly special about using Botan in a GUI-based +application. However, there are a few tricky spots, as well as a few ways to +take advantage of an event-based application. + +\subsubsection{Initialization} + +Generally you will create the \type{LibraryInitializer} somewhere in +\texttt{main}, before entering the event loop. One problem is that some GUI +libraries take over \texttt{main} and drop you right into the event loop; the +question then is how to initialize the library. The simplest way is probably to +have a static flag that marks whether you have already initialized the library. +When you enter the event loop, check to see if this flag has not been set, +and if so, initialize the library using the function-based initializers. Using +\type{LibraryInitializer} obviously won't work in this case, since it would be +destroyed as soon as the current event handler finished. You then deinitialize +the library whenever your application is signaled to quit. + +\subsubsection{Interacting With the Library} + +In the simple case, the user will do stuff asynchronously, and then in response +your code will do things like encrypt a file or whatever, which can all be done +synchronously, since the data is right there for you.
An application doing +something like this is basically going to look exactly like a command line +application that uses Botan, the only major difference being that the calls to +the library are inside event handlers. + +Much harder is something like an SSH client, where you're acting as a layer +between two asynchronous things (the user and the network). This actually isn't +restricted to GUIs at all (text-mode SSH clients have to deal with most of the +same problems), but it is probably more common with a GUI app. The following +discussion is fairly vague, but hopefully somewhat useful. + +There are a few facilities in Botan that are primarily designed to be used by +applications based on an event loop. See the section ``User Interfaces'' in the +main API doc for details. + +\subsubsection{Entropy} + +One nice advantage of using a GUI is opening a new method of gathering entropy +for the library. This is especially handy on Windows, where the available +sources of entropy are pretty questionable. In many versions, +\texttt{CryptGenRandom} is really rather poor, and the Toolhelp functions may +not provide much data on a small system (such as a handheld). For example, in +GTK+, you can use the following callback to get information about mouse +movements: + +\begin{verbatim} +static gint add_entropy(GtkWidget* widget, GdkEventMotion* event) + { + if(event) + Global_RNG::add_entropy(event, sizeof(GdkEventMotion)); + return FALSE; + } +\end{verbatim} + +And then register it with your main GTK window (presumably named +\variable{window}) as follows: + +\begin{verbatim} +gtk_signal_connect(GTK_OBJECT(window), "motion_notify_event", + GTK_SIGNAL_FUNC(add_entropy), NULL); + +gtk_widget_set_events(window, GDK_POINTER_MOTION_MASK); +\end{verbatim} + +Even though we're catching all mouse movements, and hashing the results into +the entropy pool, this doesn't use up more than a few percent of even a +relatively slow desktop CPU. 
Note that in the case of using X over a network, +catching all mouse events would cause large amounts of X traffic over the +network, which might make your application slow, or even unusable (I haven't +tried it, though). + +This could be made nicer if the collection function did something like +calculating deltas between each run, storing them into a buffer, and then when +enough of them have been added, hashing them and sending them all to the PRNG in +one shot. This would not only reduce load, but also prevent the PRNG from +overestimating the amount of entropy it's getting, since its estimates don't +(can't) take history into account. For example, you could move the mouse back +and forth one pixel, and the PRNG would think it was getting a full load of +entropy each time, when actually it was getting (at best) a bit or two. + +\end{document}