aboutsummaryrefslogtreecommitdiffstats
path: root/doc
diff options
context:
space:
mode:
authorlloyd <[email protected]>2010-08-09 15:46:47 +0000
committerlloyd <[email protected]>2010-08-09 15:46:47 +0000
commitb741408407984fe65e038bc3c8ea92763e02aabc (patch)
treed1d6061123a994fdfe500ee25750eb24c29a5ab5 /doc
parent2bf61fc97ac16b6ae852ad8534c83018d6d17409 (diff)
Move the tutorial to old_tutorial since it's badly out of date and
doesn't really answer questions people commonly have. Add the first bits of a new tutorial that will hopefully be more helpful; more of a "Q: I want to do X, how do I do this?" "A: You do X with this code ..." and spending less time doing things like incrementally building code starting from poorly done versions since that really is probably just confusing people.
Diffstat (limited to 'doc')
-rw-r--r--doc/old_tutorial.tex880
-rw-r--r--doc/tutorial.tex843
2 files changed, 932 insertions, 791 deletions
diff --git a/doc/old_tutorial.tex b/doc/old_tutorial.tex
new file mode 100644
index 000000000..ee3bc936b
--- /dev/null
+++ b/doc/old_tutorial.tex
@@ -0,0 +1,880 @@
+\documentclass{article}
+
+\setlength{\textwidth}{6.5in} % 1 inch side margins
+\setlength{\textheight}{9in} % ~1 inch top and bottom margins
+
+\setlength{\headheight}{0in}
+\setlength{\topmargin}{0in}
+\setlength{\headsep}{0in}
+
+\setlength{\oddsidemargin}{0in}
+\setlength{\evensidemargin}{0in}
+
+\title{\textbf{Botan Tutorial}}
+\author{Jack Lloyd \\
+ \texttt{[email protected]}}
+\date{2009/07/08}
+
+\newcommand{\filename}[1]{\texttt{#1}}
+\newcommand{\manpage}[2]{\texttt{#1}(#2)}
+
+\newcommand{\macro}[1]{\texttt{#1}}
+
+\newcommand{\function}[1]{\textbf{#1}}
+\newcommand{\type}[1]{\texttt{#1}}
+\renewcommand{\arg}[1]{\textsl{#1}}
+\newcommand{\variable}[1]{\textsl{#1}}
+
+\begin{document}
+
+\maketitle
+
+\tableofcontents
+
+\parskip=5pt
+\pagebreak
+
+\section{Introduction}
+
+This document essentially sets up various simple scenarios and then
+shows how to solve the problems using Botan. It's fairly simple, and
+doesn't cover many of the available APIs and algorithms, especially
+the more obscure or unusual ones. It is a supplement to the API
+documentation and the example applications, which are included in the
+distribution.
+
+To quote the Perl man page: '``There's more than one way to do it.''
+Divining how many more is left as an exercise to the reader.'
+
+This is \emph{not} a general introduction to cryptography, and most simple
+terms and ideas are not explained in any great detail.
+
+Finally, most of the code shown in this tutorial has not been tested, it was
+just written down from memory. If you find errors, please let me know.
+
+\section{Initializing the Library}
+
+The first step to using Botan is to create a \type{LibraryInitializer} object,
+which handles creating various internal structures, and also destroying them at
+shutdown. Essentially:
+
+\begin{verbatim}
+#include <botan/botan.h>
+/* include other headers here */
+
+int main()
+ {
+ LibraryInitializer init;
+ /* now do stuff */
+ return 0;
+ }
+\end{verbatim}
+
+\section{Hashing a File}
+
+\section{Symmetric Cryptography}
+
+\subsection{Encryption with a passphrase}
+
+Probably the most common crypto problem is encrypting a file (or some data that
+is in-memory) using a passphrase. There are a million ways to do this, most of
+them bad. In particular, you have to protect against weak passphrases,
+people reusing a passphrase many times, accidental and deliberate modification,
+and a dozen other potential problems.
+
+We'll start with a simple method that is commonly used, and show the problems
+that can arise. Each subsequent solution will modify the previous one to
+prevent one or more common problems, until we arrive at a good version.
+
+In these examples, we'll always use Serpent in Cipher-Block Chaining
+(CBC) mode. Whenever we need a hash function, we'll use SHA-256, since
+that is a common and well-known hash that is thought to be secure.
+
+In all examples, we choose to derive the Initialization Vector (IV) from the
+passphrase. Another (probably more common) alternative is to generate the IV
+randomly and include it at the beginning of the message. Either way is
+acceptable, and can be secure. The method used here was chosen to make for more
+interesting examples (because it's harder to get right), and may not be an
+appropriate choice for some environments.
+
+First, some notation. The passphrase is stored as a \type{std::string} named
+\variable{passphrase}. The input and output files (\variable{infile} and
+\variable{outfile}) are of types \type{std::ifstream} and \type{std::ofstream}
+(respectively).
+
+\subsubsection{First try}
+
+We hash the passphrase with SHA-256, and use the resulting hash to key
+Serpent. To generate the IV, we prepend a single '0' character to the
+passphrase, hash it, and truncate it to 16 bytes (which is Serpent's
+block size).
+
+\begin{verbatim}
+ HashFunction* hash = get_hash("SHA-256");
+
+ SymmetricKey key = hash->process(passphrase);
+ SecureVector<byte> raw_iv = hash->process('0' + passphrase);
+ InitializationVector iv(raw_iv, 16);
+
+ Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION));
+
+ pipe.start_msg();
+ infile >> pipe;
+ pipe.end_msg();
+ outfile << pipe;
+\end{verbatim}
+
+\subsubsection{Problem 1: Buffering}
+
+There is a problem with the above code, if the input file is fairly large as
+compared to available memory. Specifically, all the encrypted data is stored
+in memory, and then flushed to \variable{outfile} in a single go at the very
+end. If the input file is big (say, a gigabyte), this will be most problematic.
+
+The solution is to use a \type{DataSink} to handle the output for us (writing
+to \arg{outfile} will be implicit with writing to the \type{Pipe}). We can do
+this by replacing the last few lines with:
+
+\begin{verbatim}
+ Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
+ new DataSink_Stream(outfile));
+
+ pipe.start_msg();
+ infile >> pipe;
+ pipe.end_msg();
+\end{verbatim}
+
+\subsubsection{Problem 2: Deriving the key and IV}
+
+Hash functions like SHA-256 are deterministic; if the same passphrase
+is supplied twice, then the key (and in our case, the IV) will be the
+same. This is very dangerous, and could easily open the whole system
+up to attack. What we need to do is introduce a salt (or nonce) into
+the generation of the key from the passphrase. This will mean that the
+key will not be the same each time the same passphrase is typed in by
+a user.
+
+There is another problem with using a bare hash function to derive
+keys. While it's inconceivable (based on our current understanding of
+thermodynamics and theories of computation) that an attacker could
+brute-force a 256-bit key, it would be fairly simple for them to
+compute the SHA-256 hashes of various common passwords ('password',
+the name of the dog, the significant other's middle name, favorite
+sports team) and try those as keys. So we want to slow the attacker
+down if we can, and an easy way to do that is to iterate the hash
+function a bunch of times (say, 1024 to 4096 times). This will involve
+only a small amount of effort for a legitimate user (since they only
+have to compute the hashes once, when they type in their passphrase),
+but an attacker, trying out a large list of potential passphrases,
+will be seriously annoyed (and slowed down) by this.
+
+In this iteration of the example, we'll kill these two birds with one
+stone, and derive the key from the passphrase using a PBKDF
+(Password-Based Key Derivation Function). In this example, we use
+PBKDF2 with Hash Message Authentication Code (HMAC(SHA-256)), which is
+specified in PKCS \#5. We replace the first four lines of code from
+the first example with:
+
+\begin{verbatim}
+ PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
+ // hard-coded iteration count for simplicity; should be sufficient
+ pbkdf->set_iterations(10000);
+ // 8 octets == 64-bit salt; again, good enough
+ pbkdf->new_random_salt(8);
+ SecureVector<byte> the_salt = pbkdf->current_salt();
+
+ // 48 octets == 32 for key + 16 for IV
+ SecureVector<byte> key_and_IV = pbkdf->derive_key(48, passphrase).bits_of();
+
+ SymmetricKey key(key_and_IV, 32);
+ InitializationVector iv(key_and_IV + 32, 16);
+\end{verbatim}
+
+To complete the example, we have to remember to write out the salt (stored in
+\variable{the\_salt}) at the beginning of the file. The receiving side needs to
+know this value in order to restore it (by calling the \variable{pbkdf} object's
+\function{change\_salt} function) so it can derive the same key and IV from the
+passphrase.
+
+\subsubsection{Problem 3: Protecting against modification}
+
+As it is, an attacker can undetectably alter the message while it is
+in transit. It is vital to remember that encryption does not imply
+authentication (except when using special modes that are specifically
+designed to provide authentication along with encryption, like OCB and
+EAX). For this purpose, we will append a message authentication code
+to the encrypted message. Specifically, we will generate an extra 256
+bits of key data, and use it to key the ``HMAC(SHA-256)'' MAC
+function. We don't want to have the MAC and the cipher to share the
+same key; that is very much a no-no.
+
+\begin{verbatim}
+ // 80 octets == 32 for cipher key + 16 for IV + 32 for hmac key
+ SecureVector<byte> keys_and_IV = pbkdf->derive_key(80, passphrase);
+
+ SymmetricKey key(keys_and_IV, 32);
+ InitializationVector iv(keys_and_IV + 32, 16);
+ SymmetricKey mac_key(keys_and_IV + 32 + 16, 32);
+
+ Pipe pipe(new Fork(
+ new Chain(
+ get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
+ new DataSink_Stream(outfile)
+ ),
+ new MAC_Filter("HMAC(SHA-256)", mac_key)
+ )
+ );
+
+ pipe.start_msg();
+ infile >> pipe;
+ pipe.end_msg();
+
+ // now read the MAC from message #2. Message numbers start from 0
+ SecureVector<byte> hmac = pipe.read_all(1);
+ outfile.write((const char*)hmac.ptr(), hmac.size());
+\end{verbatim}
+
+The receiver can check the size of the file (in bytes), and since it knows how
+long the MAC is, can figure out how many bytes of ciphertext there are. Then it
+reads in that many bytes, sending them to a Serpent/CBC decryption object
+(which could be obtained by calling \verb|get_cipher| with an argument of
+\type{DECRYPTION} instead of \type{ENCRYPTION}), and storing the final bytes to
+authenticate the message with.
+
+\subsubsection{Problem 4: Cleaning up the key generation}
+
+The method used to derive the keys and IV is rather inelegant, and it would be
+nice to clean that up a bit, algorithmically speaking. A nice solution for this
+is to generate a master key from the passphrase and salt, and then generate the
+two keys and the IV (the cryptovariables) from that.
+
+Starting from the master key, we derive the cryptovariables using a KDF
+algorithm, which is designed, among other things, to ``separate'' keys so that
+we can derive several different keys from the single master key. For this
+purpose, we will use KDF2, which is a generally useful KDF function (defined in
+IEEE 1363a, among other standards). The use of different labels (``cipher
+key'', etc) makes sure that each of the three derived variables will have
+different values.
+
+\begin{verbatim}
+ PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
+ // hard-coded iteration count for simplicity; should be sufficient
+ pbkdf->set_iterations(10000);
+ // 8 octet == 64-bit salt; again, good enough
+ pbkdf->new_random_salt(8);
+ // store the salt so we can write it to a file later
+ SecureVector<byte> the_salt = pbkdf->current_salt();
+
+ SymmetricKey master_key = pbkdf->derive_key(48, passphrase);
+
+ KDF* kdf = get_kdf("KDF2(SHA-256)");
+
+ SymmetricKey key = kdf->derive_key(32, master_key, "cipher key");
+ SymmetricKey mac_key = kdf->derive_key(32, master_key, "hmac key");
+ InitializationVector iv = kdf->derive_key(16, master_key, "cipher iv");
+\end{verbatim}
+
+\subsubsection{Final version}
+
+Here is the final version of the encryption code, with all the changes we've
+made:
+
+\begin{verbatim}
+ PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
+ pbkdf->set_iterations(10000);
+ pbkdf->new_random_salt(8);
+ SecureVector<byte> the_salt = pbkdf->current_salt();
+
+ SymmetricKey master_key = pbkdf->derive_key(48, passphrase);
+
+ KDF* kdf = get_kdf("KDF2(SHA-256)");
+
+ SymmetricKey key = kdf->derive_key(32, master_key, "cipher key");
+ SymmetricKey mac_key = kdf->derive_key(32, masterkey, "hmac key");
+ InitializationVector iv = kdf->derive_key(16, masterkey, "cipher iv");
+
+ Pipe pipe(new Fork(
+ new Chain(
+ get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
+ new DataSink_Stream(outfile)
+ ),
+ new MAC_Filter("HMAC(SHA-256)", mac_key)
+ )
+ );
+
+ outfile.write((const char*)the_salt.ptr(), the_salt.size());
+
+ pipe.start_msg();
+ infile >> pipe;
+ pipe.end_msg();
+
+ SecureVector<byte> hmac = pipe.read_all(1);
+ outfile.write((const char*)hmac.ptr(), hmac.size());
+\end{verbatim}
+
+\subsubsection{Another buffering technique}
+
+Sometimes the use of \type{DataSink\_Stream} is not practical for whatever
+reason. In this case, an alternate buffering mechanism might be useful. Here is
+some code which will write all the processed data as quickly as possible, so
+that memory pressure is reduced in the case of large inputs.
+
+\begin{verbatim}
+ pipe.start_msg();
+ SecureBuffer<byte, 1024> buffer;
+ while(infile.good())
+ {
+ infile.read((char*)buffer.ptr(), buffer.size());
+ u32bit got_from_infile = infile.gcount();
+ pipe.write(buffer, got_from_infile);
+
+ if(infile.eof())
+ pipe.end_msg();
+
+ while(pipe.remaining() > 0)
+ {
+ u32bit buffered = pipe.read(buffer, buffer.size());
+ outfile.write((const char*)buffer.ptr(), buffered);
+ }
+ }
+ if(infile.bad() || (infile.fail() && !infile.eof()))
+ throw Some_Exception();
+\end{verbatim}
+
+\pagebreak
+
+\subsection{Authentication}
+
+After doing the encryption routines, doing message authentication keyed off a
+passphrase is not very difficult. In fact it's much easier than the encryption
+case, for the following reasons: a) we only need one key, and b) we don't have
+to store anything, so all the input can be done in a single step without
+worrying about it taking up a lot of memory if the input file is large.
+
+In this case, we'll hex-encode the salt and the MAC, and output them both to
+standard output (the salt followed by the MAC).
+
+\begin{verbatim}
+ PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
+ pbkdf->set_iterations(10000);
+ pbkdf->new_random_salt(8);
+ OctetString the_salt = pbkdf->current_salt();
+
+ SymmetricKey hmac_key = pbkdf->derive_key(32, passphrase);
+
+ Pipe pipe(new MAC_Filter("HMAC(SHA-256)", mac_key),
+ new Hex_Encoder
+ );
+
+ std::cout << the_salt.to_string(); // hex encoded
+
+ pipe.start_msg();
+ infile >> pipe;
+ pipe.end_msg();
+ std::cout << pipe.read_all_as_string() << std::endl;
+\end{verbatim}
+
+\subsection{User Authentication}
+
+Doing user authentication off a shared passphrase is fairly easy. Essentially,
+a challenge-response protocol is used - the server sends a random challenge,
+and the client responds with an appropriate response to the challenge. The idea
+is that only someone who knows the passphrase can generate or check to see if a
+response is valid.
+
+Let's say we use 160-bit (20 byte) challenges, which seems fairly
+reasonable. We can create this challenge using the global random
+number generator (RNG):
+
+\begin{verbatim}
+ byte challenge[20];
+ Global_RNG::randomize(challenge, sizeof(challenge), Nonce);
+ // send challenge to client
+\end{verbatim}
+
+After reading the challenge, the client generates a response based on
+the challenge and the passphrase. In this case, we will do it by
+repeatedly hashing the challenge, the passphrase, and (if applicable)
+the previous digest. We iterate this construction 10000 times, to make
+brute force attacks on the passphrase hard to do. Since we are already
+using 160-bit challenges, a 160-bit response seems warranted, so we'll
+use SHA-1.
+
+\begin{verbatim}
+ HashFunction* hash = get_hash("SHA-1");
+ SecureVector<byte> digest;
+ for(u32bit j = 0; j != 10000; j++)
+ {
+ hash->update(digest, digest.size());
+ hash->update(passphrase);
+ hash->update(challenge, challenge.size());
+ digest = hash->final();
+ }
+ delete hash;
+ // send value of digest to the server
+\end{verbatim}
+
+Upon receiving the response from the client, the server computes what the
+response should have been based on the challenge it sent out, and the
+passphrase. If the two responses match, the client is authenticated.
+Otherwise, it is not.
+
+An alternate method is to use PBKDF2 again, using the challenge as the salt. In
+this case, the response could (for example) be the hash of the key produced by
+PBKDF2. There is no reason to have an explicit iteration loop, as PBKDF2 is
+designed to prevent dictionary attacks (assuming PBKDF2 is set up for a large
+iteration count internally).
+
+\pagebreak
+
+\section{Public Key Cryptography}
+
+\subsection{Basic Operations}
+
+In this section, we'll assume we have a \type{X509\_PublicKey*} named
+\arg{pubkey}, and, if necessary, a private key type (a
+\type{PKCS8\_PrivateKey*}) named \arg{privkey}. A description of these types,
+how to create them, and related details appears later in this tutorial. In this
+section, we will use various functions that are defined in
+\filename{look\_pk.h} -- you will have to include this header explicitly.
+
+\subsubsection{Encryption}
+
+Basically, pick an encoding method, create a \type{PK\_Encryptor} (with
+\function{get\_pk\_encryptor}()), and use it. But first we have to make sure
+the public key can actually be used for public key encryption. For encryption
+(and decryption), the key could be RSA, ElGamal, or (in future versions) some
+other public key encryption scheme, like Rabin or an elliptic curve scheme.
+
+\begin{verbatim}
+ PK_Encrypting_Key* key = dynamic_cast<PK_Encrypting_Key*>(pubkey);
+ if(!key)
+ error();
+ PK_Encryptor* enc = get_pk_encryptor(*key, "EME1(SHA-256)");
+
+ byte msg[] = { /* ... */ };
+
+ // will also accept a SecureVector<byte> as input
+ SecureVector<byte> ciphertext = enc->encrypt(msg, sizeof(msg));
+\end{verbatim}
+
+\subsubsection{Decryption}
+
+This is essentially the same as the encryption operation, but using a private
+key instead. One major difference is that the decryption operation can fail due
+to the fact that the ciphertext was invalid (most common padding schemes, such
+as ``EME1(SHA-256)'', include various pieces of redundancy, which are checked
+after decryption).
+
+\begin{verbatim}
+ PK_Decrypting_Key* key = dynamic_cast<PK_Decrypting_Key*>(privkey);
+ if(!key)
+ error();
+ PK_Decryptor* dec = get_pk_decryptor(*key, "EME1(SHA-256)");
+
+ byte msg[] = { /* ... */ };
+
+ SecureVector<byte> plaintext;
+
+ try {
+ // will also accept a SecureVector<byte> as input
+ plaintext = dec->decrypt(msg, sizeof(msg));
+ }
+ catch(Decoding_Error)
+ {
+ /* the ciphertext was invalid */
+ }
+\end{verbatim}
+
+\subsubsection{Signature Generation}
+
+There is one difficulty with signature generation that does not occur with
+encryption or decryption. Specifically, there are various padding methods which
+can be useful for different signature algorithms, and not all are appropriate
+for all signature schemes. The following table breaks down what algorithms
+support which encodings:
+
+\begin{tabular}{|c|c|c|} \hline
+Signature Algorithm & Usable Encoding Methods & Preferred Encoding(s) \\ \hline
+DSA / NR & EMSA1 & EMSA1 \\ \hline
+RSA & EMSA1, EMSA2, EMSA3, EMSA4 & EMSA3, EMSA4 \\ \hline
+Rabin-Williams & EMSA2, EMSA4 & EMSA2, EMSA4 \\ \hline
+\end{tabular}
+
+For new applications, use EMSA4 with both RSA and Rabin-Williams, as it is
+significantly more secure than the alternatives. However, most current
+applications/libraries only support EMSA2 with Rabin-Williams and EMSA3 with
+RSA. Given this, you may be forced to use less secure encoding methods for the
+near future. In these examples, we punt on the problem, and hard-code using
+EMSA1 with SHA-256.
+
+\begin{verbatim}
+ Public_Key* key = /* loaded or generated somehow */
+ PK_Signer* signer = get_pk_signer(*key, "EMSA1(SHA-256)");
+
+ byte msg[] = { /* ... */ };
+
+ /*
+ You can also repeatedly call update(const byte[], u32bit), followed
+ by a call to signature(), which will return the final signature of
+ all the data that was passed through update(). sign_message() is
+ just a stub that calls update() once, and returns the value of
+ signature().
+ */
+
+ SecureVector<byte> signature = signer->sign_message(msg, sizeof(msg));
+\end{verbatim}
+
+\pagebreak
+
+\subsubsection{Signature Verification}
+
+In addition to all the problems with choosing the correct padding method,
+there is yet another complication with verifying a signature. Namely, there are
+two varieties of signature algorithms - those providing message recovery (that
+is, the value that was signed can be directly recovered by someone verifying
+the signature), and those without message recovery (the verify operation simply
+returns if the signature was valid, without telling you exactly what was
+signed). This leads to two slightly different implementations of the
+verification operation, which user code has to work with. As you can see
+however, the implementation is still not at all difficult.
+
+\begin{verbatim}
+ PK_Verifier* verifier = 0;
+
+ PK_Verifying_with_MR_Key* key1 =
+ dynamic_cast<PK_Verifying_with_MR_Key*>(pubkey);
+ PK_Verifying_wo_MR_Key* key2 =
+ dynamic_cast<PK_Verifying_wo_MR_Key*>(pubkey);
+
+ if(key1)
+ verifier = get_pk_verifier(*key1, "EMSA1(SHA-256)");
+ else if(key2)
+ verifier = get_pk_verifier(*key2, "EMSA1(SHA-256)");
+ else
+ error();
+
+ byte msg[] = { /* ... */ };
+ byte sig[] = { /* ... */ };
+
+ /*
+ Like PK_Signer, you can also do repeated calls to
+ void update(const byte some_data[], u32bit length)
+ followed by a call to
+ bool check_signature(const byte the_sig[], u32bit length)
+ which will return true (valid signature) or false (bad signature).
+ The function verify_message() is a simple wrapper around update() and
+ check_signature().
+
+ */
+ bool is_valid = verifier->verify_message(msg, sizeof(msg), sig, sizeof(sig));
+\end{verbatim}
+
+\subsubsection{Key Agreement}
+
+WRITEME
+
+\pagebreak
+
+\subsection{Working with Keys}
+
+\subsubsection{Reading Public Keys (X.509 format)}
+
+There are two separate ways to read X.509 public keys. Remember that the X.509
+public keys are simply that: public keys. There is no associated information
+(such as the owner of that key) included with the public key itself. If you
+need that kind of information, you'll need to use X.509 certificates.
+
+However, there are surely times when a simple public key is sufficient. The
+most obvious is when the key is implicitly trusted, for example if access
+and/or modification of it is controlled by something else (like filesystem
+ACLs). In other cases, it is a perfectly reasonable proposition to use them
+over the network as an anonymous key exchange mechanism. This is, admittedly,
+vulnerable to man-in-the-middle attacks, but it's simple enough that it's hard
+to mess up (see, for example, Peter Guttman's paper ``Lessons Learned in
+Implementing and Deploying Crypto Software'' in Usenix '02).
+
+The way to load a key is basically to set up a \type{DataSource} and then call
+\function{X509::load\_key}, which will return a \type{X509\_PublicKey*}. For
+example:
+
+\begin{verbatim}
+ DataSource_Stream somefile("somefile.pem"); // has 3 public keys
+ X509_PublicKey* key1 = X509::load_key(somefile);
+ X509_PublicKey* key2 = X509::load_key(somefile);
+ X509_PublicKey* key3 = X509::load_key(somefile);
+ // Now we have them all loaded. Huzah!
+\end{verbatim}
+
+At this point you can use \function{dynamic\_cast} to find the operations the
+key supports (by seeing if a cast to \type{PK\_Encrypting\_Key},
+\type{PK\_Verifying\_with\_MR\_Key}, or \type{PK\_Verifying\_wo\_MR\_Key}
+succeeds).
+
+There is a variant of \function{X509::load\_key} (and of
+\function{PKCS8::load\_key}, described in the next section) which take a
+filename (as a \type{std::string}). These are just convenience functions which
+create the appropriate \type{DataSource} for you and call the main
+\function{X509::load\_key}.
+
+\subsubsection{Reading Private Keys (PKCS \#8 format)}
+
+This is very similar to reading raw public keys, with the difference that the
+key may be encrypted with a user passphrase:
+
+\begin{verbatim}
+ // rng is a RandomNumberGenerator, like AutoSeeded_RNG
+
+ DataSource_Stream somefile("somefile");
+ std::string a_passphrase = /* get from the user */
+ PKCS8_PrivateKey* key = PKCS8::load_key(somefile, rng, a_passphrase);
+\end{verbatim}
+
+You can, by the way, convert a \type{PKCS8\_PrivateKey} to a
+\type{X509\_PublicKey} simply by casting it (with \function{dynamic\_cast}), as
+the private key type is derived from \type{X509\_PublicKey}. As with
+\type{X509\_PublicKey}, you can use \function{dynamic\_cast} to figure out what
+operations the private key is capable of; in particular, you can attempt to
+cast it to \type{PK\_Decrypting\_Key}, \type{PK\_Signing\_Key}, or
+\type{PK\_Key\_Agreement\_Key}.
+
+Sometimes you can get away with having a static passphrase passed to
+\function{load\_key}. Typically, however, you'll have to do some user
+interaction to get the appropriate passphrase. In that case you'll want to use
+the \type{UI} related interface, which is fully described in the API
+documentation.
+
+\subsubsection{Generating New Private Keys}
+
+Generate a new private key is the one operation which requires you to
+explicitly name the type of key you are working with. There are (currently) two
+kinds of public key algorithms in Botan: ones based on the integer
+factorization (IF) problem (RSA and Rabin-Williams), and ones based on the
+discrete logarithm (DL) problem (DSA, Diffie-Hellman, Nyberg-Rueppel, and
+ElGamal). Since discrete logarithm parameters (primes and generators) can be
+shared among many keys, there is the notion of these being a combined type
+(called \type{DL\_Group}).
+
+To create a new DL-based private key, simply pass a desired \type{DL\_Group} to
+the constructor of the private key - a new public/private key pair will be
+generated. Since in IF-based algorithms, the modulus used isn't shared by other
+keys, we don't use this notion. You can create a new key by passing in a
+\type{u32bit} telling how long (in bits) the key should be.
+
+There are quite a few ways to get a \type{DL\_Group} object. The best is to use
+the function \function{get\_dl\_group}, which takes a string naming a group; it
+will either return that group, if it knows about it, or throw an
+exception. Names it knows about include ``IETF-n'' where n is 768, 1024, 1536,
+2048, 3072, or 4096, and ``DSA-n'', where n is 512, 768, or 1024. The IETF
+groups are the ones specified for use with IPSec, and the DSA ones are the
+default DSA parameters specified by Java's JCE. For DSA and Nyberg-Rueppel, use
+the ``DSA-n'' groups, and for Diffie-Hellman and ElGamal, use the ``IETF-n''
+groups.
+
+You can also generate a new random group. This is not recommend, because it is
+very slow, particularly for ``safe'' primes, which are needed for
+Diffie-Hellman and ElGamal.
+
+Some examples:
+
+\begin{verbatim}
+ RSA_PrivateKey rsa1(512); // 512-bit RSA key
+ RSA_PrivateKey rsa2(2048); // 2048-bit RSA key
+
+ RW_PrivateKey rw1(1024); // 1024-bit Rabin-Williams key
+ RW_PrivateKey rw2(1536); // 1536-bit Rabin-Williams key
+
+ DSA_PrivateKey dsa(get_dl_group("DSA-512")); // 512-bit DSA key
+ DH_PrivateKey dh(get_dl_group("IETF-4096")); // 4096-bit DH key
+ NR_PrivateKey nr(get_dl_group("DSA-1024")); // 1024-bit NR key
+ ElGamal_PrivateKey elg(get_dl_group("IETF-1536")); // 1536-bit ElGamal key
+\end{verbatim}
+
+To export your newly created private key, use the PKCS \#8 routines in
+\filename{pkcs8.h}:
+
+\begin{verbatim}
+ std::string a_passphrase = /* get from the user */
+ std::string the_key = PKCS8::PEM_encode(rsa2, a_passphrase);
+\end{verbatim}
+
+You can read the key back in using \function{PKCS8::load\_key}, described in
+the section ``Reading Private Keys (PKCS \#8 format)'', above. Unfortunately,
+this only works with keys that have an assigned algorithm identifier and
+standardized format. Currently this is only the RSA, DSA, DH, and ElGamal
+algorithms, though RW and NR keys can also be imported and exported by
+assigning them an OID (this can be done either through a configuration file, or
+by calling the function \function{OIDS::add\_oid} in \filename{oids.h}). Be
+aware that the OID and format for ElGamal keys is not exactly standard, but
+there does exist at least one other crypto library which will accept the
+format.
+
+The raw public can be exported using:
+
+\begin{verbatim}
+ std::string the_public_key = X509::PEM_encode(rsa2);
+\end{verbatim}
+
+\pagebreak
+
+\section{X.509v3 Certificates}
+
+Using certificates is rather complicated, so only the very basic mechanisms are
+going to be covered here. The section ``Setting up a CA'' goes into reasonable
+detail about CRLs and certificate requests, but there is a lot that isn't
+covered (else this section would get quite long and complicated).
+
+\subsection{Importing and Exporting Certificates}
+
+Importing and exporting X.509 certificates is easy. Simply call the constructor
+with either a \type{DataSource\&}, or the name of a file:
+
+\begin{verbatim}
+ X509_Certificate cert1("cert1.pem");
+
+ /* This file contains two certificates, concatenated */
+ DataSource_Stream in("certs2_and_3.pem");
+
+ X509_Certificate cert2(in); // read the first cert
+ X509_Certificate cert3(in); // read the second cert
+\end{verbatim}
+
+Exporting the certificate is a simple matter of calling the member function
+\function{PEM\_encode}(), which returns a \type{std::string} containing the
+certificate in PEM encoding.
+
+\begin{verbatim}
+ std::cout << cert3.PEM_encode();
+ some_ostream_object << cert1.PEM_encode();
+ std::string cert2_str = cert2.PEM_encode();
+\end{verbatim}
+
+\subsection{Verifying Certificates}
+
+Verifying a certificate requires that we build up a chain of trust, starting
+from the root (usually a commercial CA), down through some number of
+intermediate CAs, and finally reaching the actual certificate in
+question. Thus, to verify, we actually have to have all those certificates
+on hand (or at the very least, know where we can get the ones we need).
+
+The class which handles both storing certificates, and verifying them, is
+called \type{X509\_Store}. We'll start by assuming that we have all the
+certificates we need, and just want to verify a cert. This is done by calling
+the member function \function{validate\_cert}, which takes the
+\type{X509\_Certificate} in question, and an optional argument of type
+\type{Cert\_Usage} (which is ignored here; read the section in the API doc
+titled ``Verifying Certificates for information). It returns an enum;
+\type{X509\_Code}, which, for most purposes, is either \type{VERIFIED}, or
+something else (which specifies what circumstance caused the certificate to be
+considered invalid). Really, that's it.
+
+Now, how to let \type{X509\_Store} know about all those certificates and CRLs
+we have lying around? The simplest method is to add them directly, using the
+functions \function{add\_cert}, \function{add\_certs},
+\function{add\_trusted\_certs}, and \function{add\_crl}; for details, consult
+the API doc or read the \filename{x509stor.h} header. There is also a much more
+elegant and powerful method: \type{Certificate\_Store}s. A certificate store
+refers to an object that knows how to retrieve certificates from some external
+source (a file, an LDAP directory, a HTTP server, a SQL database, or anything
+else). By calling the function \function{add\_new\_certstore}, you can register
+a new certificate store, which \type{X509\_Store} will use to find certificates
+it needs. Thus, you can get away with only adding whichever root CA cert(s) you
+want to use, letting some online source handle the storage of all intermediate
+X.509 certificates. The API documentation has a more complete discussion of
+\type{Certificate\_Store}.
+
+\subsection{Setting up a CA}
+
+WRITEME
+
+\pagebreak
+
+\section{Special Topics}
+
+This chapter is for subjects that don't really fit into the API documentation
+or into other chapters of the tutorial.
+
+\subsection{GUIs}
+
+There is nothing particularly special about using Botan in a GUI-based
+application. However there are a few tricky spots, as well as a few ways to
+take advantage of an event-based application.
+
+\subsubsection{Initialization}
+
+Generally you will create the \type{LibraryInitializer} somewhere in
+\texttt{main}, before entering the event loop. One problem is that some GUI
+libraries take over \texttt{main} and drop you right into the event loop; the
+question then is how to initialize the library? The simplest way is probably to
+have a static flag that marks if you have already initialized the library or
+not. When you enter the event loop, check to see if this flag has not been set,
+and if so, initialize the library using the function-based initializers. Using
+\type{LibraryInitializer} obviously won't work in this case, since it would be
+destroyed as soon as the current event handler finished. You then deinitialize
+the library whenever your application is signaled to quit.
+
+\subsubsection{Interacting With the Library}
+
+In the simple case, the user will do stuff asynchronously, and then in response
+your code will do things like encrypt a file or whatever, which can all be done
+synchronously, since the data is right there for you. An application doing
+something like this is basically going to look exactly like a command line
+application that uses Botan, the only major difference being that the calls to
+the library are inside event handlers.
+
+Much harder is something like an SSH client, where you're acting as a layer
+between two asynchronous things (the user and the network). This actually isn't
+restricted to GUIs at all (text-mode SSH clients have to deal with most of the
+same problems), but it is probably more common with a GUI app. The following
+discussion is fairly vague, but hopefully somewhat useful.
+
+There are a few facilities in Botan that are primarily designed to be used by
+applications based on an event loop. See the section ``User Interfaces'' in the
+main API doc for details.
+
+\subsubsection{Entropy}
+
+One nice advantage of using a GUI is opening a new method of gathering entropy
+for the library. This is especially handy on Windows, where the available
+sources of entropy are pretty questionable. In many versions,
+\texttt{CryptGenRandom} is really rather poor, and the Toolhelp functions may
+not provide much data on a small system (such as a handheld). For example, in
+GTK+, you can use the following callback to get information about mouse
+movements:
+
+\begin{verbatim}
+static gint add_entropy(GtkWidget* widget, GdkEventMotion* event)
+ {
+ if(event)
+ Global_RNG::add_entropy(event, sizeof(GdkEventMotion));
+ return FALSE;
+ }
+\end{verbatim}
+
+And then register it with your main GTK window (presumably named
+\variable{window}) as follows:
+
+\begin{verbatim}
+gtk_signal_connect(GTK_OBJECT(window), "motion_notify_event",
+ GTK_SIGNAL_FUNC(add_entropy), NULL);
+
+gtk_widget_set_events(window, GDK_POINTER_MOTION_MASK);
+\end{verbatim}
+
+Even though we're catching all mouse movements, and hashing the results into
+the entropy pool, this doesn't use up more than a few percent of even a
+relatively slow desktop CPU. Note that in the case of using X over a network,
+catching all mouse events would cause large amounts of X traffic over the
+network, which might make your application slow, or even unusable (I haven't
+tried it, though).
+
+This could be made nicer if the collection function did something like
+calculating deltas between each run, storing them into a buffer, and then when
+enough of them have been added, hashing them and send them all to the PRNG in
+one shot. This would not only reduce load, but also prevent the PRNG from
+overestimating the amount of entropy it's getting, since its estimates don't
+(can't) take history into account. For example, you could move the mouse back
+and forth one pixel, and the PRNG would think it was getting a full load of
+entropy each time, when actually it was getting (at best) a bit or two.
+
+\end{document}
diff --git a/doc/tutorial.tex b/doc/tutorial.tex
index ee3bc936b..840679d10 100644
--- a/doc/tutorial.tex
+++ b/doc/tutorial.tex
@@ -13,7 +13,7 @@
\title{\textbf{Botan Tutorial}}
\author{Jack Lloyd \\
-\date{2009/07/08}
+\date{2010/08/07}
\newcommand{\filename}[1]{\texttt{#1}}
\newcommand{\manpage}[2]{\texttt{#1}(#2)}
@@ -37,844 +37,105 @@
\section{Introduction}
This document essentially sets up various simple scenarios and then
-shows how to solve the problems using Botan. It's fairly simple, and
+shows how to solve the problems using botan. It's fairly simple, and
doesn't cover many of the available APIs and algorithms, especially
the more obscure or unusual ones. It is a supplement to the API
documentation and the example applications, which are included in the
distribution.
-To quote the Perl man page: '``There's more than one way to do it.''
-Divining how many more is left as an exercise to the reader.'
-
-This is \emph{not} a general introduction to cryptography, and most simple
-terms and ideas are not explained in any great detail.
-
-Finally, most of the code shown in this tutorial has not been tested, it was
-just written down from memory. If you find errors, please let me know.
-
\section{Initializing the Library}
-The first step to using Botan is to create a \type{LibraryInitializer} object,
-which handles creating various internal structures, and also destroying them at
-shutdown. Essentially:
+The first step to using botan is to create a \type{LibraryInitializer}
+object, which handles creating various internal structures, and also
+destroying them at shutdown.
\begin{verbatim}
#include <botan/botan.h>
-/* include other headers here */
int main()
{
- LibraryInitializer init;
- /* now do stuff */
+ Botan::LibraryInitializer init;
return 0;
}
\end{verbatim}
-\section{Hashing a File}
-
-\section{Symmetric Cryptography}
-
-\subsection{Encryption with a passphrase}
-
-Probably the most common crypto problem is encrypting a file (or some data that
-is in-memory) using a passphrase. There are a million ways to do this, most of
-them bad. In particular, you have to protect against weak passphrases,
-people reusing a passphrase many times, accidental and deliberate modification,
-and a dozen other potential problems.
-
-We'll start with a simple method that is commonly used, and show the problems
-that can arise. Each subsequent solution will modify the previous one to
-prevent one or more common problems, until we arrive at a good version.
-
-In these examples, we'll always use Serpent in Cipher-Block Chaining
-(CBC) mode. Whenever we need a hash function, we'll use SHA-256, since
-that is a common and well-known hash that is thought to be secure.
-
-In all examples, we choose to derive the Initialization Vector (IV) from the
-passphrase. Another (probably more common) alternative is to generate the IV
-randomly and include it at the beginning of the message. Either way is
-acceptable, and can be secure. The method used here was chosen to make for more
-interesting examples (because it's harder to get right), and may not be an
-appropriate choice for some environments.
-
-First, some notation. The passphrase is stored as a \type{std::string} named
-\variable{passphrase}. The input and output files (\variable{infile} and
-\variable{outfile}) are of types \type{std::ifstream} and \type{std::ofstream}
-(respectively).
-
-\subsubsection{First try}
-
-We hash the passphrase with SHA-256, and use the resulting hash to key
-Serpent. To generate the IV, we prepend a single '0' character to the
-passphrase, hash it, and truncate it to 16 bytes (which is Serpent's
-block size).
+If your application is multi-threaded, you need to tell botan this so
+that it will use locking where necessary. This is done by passing a
+string to the constructor of \type{LibraryInitializer}:
\begin{verbatim}
- HashFunction* hash = get_hash("SHA-256");
-
- SymmetricKey key = hash->process(passphrase);
- SecureVector<byte> raw_iv = hash->process('0' + passphrase);
- InitializationVector iv(raw_iv, 16);
-
- Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION));
-
- pipe.start_msg();
- infile >> pipe;
- pipe.end_msg();
- outfile << pipe;
+ Botan::LibraryInitializer init("thread_safe=yes");
\end{verbatim}
-\subsubsection{Problem 1: Buffering}
+\section{Introduction to Pipe}
-There is a problem with the above code, if the input file is fairly large as
-compared to available memory. Specifically, all the encrypted data is stored
-in memory, and then flushed to \variable{outfile} in a single go at the very
-end. If the input file is big (say, a gigabyte), this will be most problematic.
+Most operations in botan are specified in terms of transformations on
+streams. The class that handles the I/O and management for these
+streams is called \type{Pipe}. You can construct a \type{Pipe} with
+one or more \type{Filter}s, which sequentially process messages. You
+can only update a single message at a time, but you can leave the
+final output contents in a \type{Pipe} and read them out as desired.
-The solution is to use a \type{DataSink} to handle the output for us (writing
-to \arg{outfile} will be implicit with writing to the \type{Pipe}). We can do
-this by replacing the last few lines with:
+Here is how you might hex encode two messages:
\begin{verbatim}
- Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
- new DataSink_Stream(outfile));
+ std::string message1 = "this is the first message";
+ const byte message2[] = "a second message";
+ Pipe pipe(new Hex_Encoder);
- pipe.start_msg();
- infile >> pipe;
- pipe.end_msg();
-\end{verbatim}
+ pipe.start_msg(); // must be called before writing to the pipe
+ pipe.write(message1);
+ pipe.end_msg(); // must be called to signal completion
-\subsubsection{Problem 2: Deriving the key and IV}
-
-Hash functions like SHA-256 are deterministic; if the same passphrase
-is supplied twice, then the key (and in our case, the IV) will be the
-same. This is very dangerous, and could easily open the whole system
-up to attack. What we need to do is introduce a salt (or nonce) into
-the generation of the key from the passphrase. This will mean that the
-key will not be the same each time the same passphrase is typed in by
-a user.
-
-There is another problem with using a bare hash function to derive
-keys. While it's inconceivable (based on our current understanding of
-thermodynamics and theories of computation) that an attacker could
-brute-force a 256-bit key, it would be fairly simple for them to
-compute the SHA-256 hashes of various common passwords ('password',
-the name of the dog, the significant other's middle name, favorite
-sports team) and try those as keys. So we want to slow the attacker
-down if we can, and an easy way to do that is to iterate the hash
-function a bunch of times (say, 1024 to 4096 times). This will involve
-only a small amount of effort for a legitimate user (since they only
-have to compute the hashes once, when they type in their passphrase),
-but an attacker, trying out a large list of potential passphrases,
-will be seriously annoyed (and slowed down) by this.
-
-In this iteration of the example, we'll kill these two birds with one
-stone, and derive the key from the passphrase using a PBKDF
-(Password-Based Key Derivation Function). In this example, we use
-PBKDF2 with Hash Message Authentication Code (HMAC(SHA-256)), which is
-specified in PKCS \#5. We replace the first four lines of code from
-the first example with:
+ /*
+ process_msg(x) is equivalent to calling
+ start_msg(); write(x); end_msg();
+ */
+ pipe.process_msg(message2);
-\begin{verbatim}
- PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
- // hard-coded iteration count for simplicity; should be sufficient
- pbkdf->set_iterations(10000);
- // 8 octets == 64-bit salt; again, good enough
- pbkdf->new_random_salt(8);
- SecureVector<byte> the_salt = pbkdf->current_salt();
-
- // 48 octets == 32 for key + 16 for IV
- SecureVector<byte> key_and_IV = pbkdf->derive_key(48, passphrase).bits_of();
-
- SymmetricKey key(key_and_IV, 32);
- InitializationVector iv(key_and_IV + 32, 16);
-\end{verbatim}
+ Pipe::message_id n = pipe.message_count(); // returns 2
-To complete the example, we have to remember to write out the salt (stored in
-\variable{the\_salt}) at the beginning of the file. The receiving side needs to
-know this value in order to restore it (by calling the \variable{pbkdf} object's
-\function{change\_salt} function) so it can derive the same key and IV from the
-passphrase.
+ /* you can read a message as a string, here we read message 0 */
+ std::string first_result = pipe.read_all_as_string(0);
-\subsubsection{Problem 3: Protecting against modification}
+ /* or a piece at a time using array/length, now we'll read the
+ second message (message id 1)
+ */
-As it is, an attacker can undetectably alter the message while it is
-in transit. It is vital to remember that encryption does not imply
-authentication (except when using special modes that are specifically
-designed to provide authentication along with encryption, like OCB and
-EAX). For this purpose, we will append a message authentication code
-to the encrypted message. Specifically, we will generate an extra 256
-bits of key data, and use it to key the ``HMAC(SHA-256)'' MAC
-function. We don't want to have the MAC and the cipher to share the
-same key; that is very much a no-no.
-
-\begin{verbatim}
- // 80 octets == 32 for cipher key + 16 for IV + 32 for hmac key
- SecureVector<byte> keys_and_IV = pbkdf->derive_key(80, passphrase);
-
- SymmetricKey key(keys_and_IV, 32);
- InitializationVector iv(keys_and_IV + 32, 16);
- SymmetricKey mac_key(keys_and_IV + 32 + 16, 32);
-
- Pipe pipe(new Fork(
- new Chain(
- get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
- new DataSink_Stream(outfile)
- ),
- new MAC_Filter("HMAC(SHA-256)", mac_key)
- )
- );
-
- pipe.start_msg();
- infile >> pipe;
- pipe.end_msg();
-
- // now read the MAC from message #2. Message numbers start from 0
- SecureVector<byte> hmac = pipe.read_all(1);
- outfile.write((const char*)hmac.ptr(), hmac.size());
+ byte output[4096] = { 0 };
+ u32bit got = read(output, sizeof(output), 1);
+ if(got >= sizeof(output))
+ // have to read again to get more of the message
\end{verbatim}
-The receiver can check the size of the file (in bytes), and since it knows how
-long the MAC is, can figure out how many bytes of ciphertext there are. Then it
-reads in that many bytes, sending them to a Serpent/CBC decryption object
-(which could be obtained by calling \verb|get_cipher| with an argument of
-\type{DECRYPTION} instead of \type{ENCRYPTION}), and storing the final bytes to
-authenticate the message with.
+You can also read output while the message is still active (before the
+call to \function{end\_msg}), using the same interfaces. You can find
+out how much data is currently available for a particular \type{Pipe}
+by calling the member function \function{remaining}, which takes a
+message sequence number and returns the number of bytes that are
+currently available to read from that message.
-\subsubsection{Problem 4: Cleaning up the key generation}
-
-The method used to derive the keys and IV is rather inelegant, and it would be
-nice to clean that up a bit, algorithmically speaking. A nice solution for this
-is to generate a master key from the passphrase and salt, and then generate the
-two keys and the IV (the cryptovariables) from that.
+\section{Hashing a File}
-Starting from the master key, we derive the cryptovariables using a KDF
-algorithm, which is designed, among other things, to ``separate'' keys so that
-we can derive several different keys from the single master key. For this
-purpose, we will use KDF2, which is a generally useful KDF function (defined in
-IEEE 1363a, among other standards). The use of different labels (``cipher
-key'', etc) makes sure that each of the three derived variables will have
-different values.
+Hashing a file is done using a \type{Hash\_Filter}, which takes a string
+which specifies which hash function you want to use:
\begin{verbatim}
- PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
- // hard-coded iteration count for simplicity; should be sufficient
- pbkdf->set_iterations(10000);
- // 8 octet == 64-bit salt; again, good enough
- pbkdf->new_random_salt(8);
- // store the salt so we can write it to a file later
- SecureVector<byte> the_salt = pbkdf->current_salt();
-
- SymmetricKey master_key = pbkdf->derive_key(48, passphrase);
-
- KDF* kdf = get_kdf("KDF2(SHA-256)");
-
- SymmetricKey key = kdf->derive_key(32, master_key, "cipher key");
- SymmetricKey mac_key = kdf->derive_key(32, master_key, "hmac key");
- InitializationVector iv = kdf->derive_key(16, master_key, "cipher iv");
+ Pipe pipe(new Hash_Filter("SHA-256"));
\end{verbatim}
-\subsubsection{Final version}
-
-Here is the final version of the encryption code, with all the changes we've
-made:
-
-\begin{verbatim}
- PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
- pbkdf->set_iterations(10000);
- pbkdf->new_random_salt(8);
- SecureVector<byte> the_salt = pbkdf->current_salt();
-
- SymmetricKey master_key = pbkdf->derive_key(48, passphrase);
-
- KDF* kdf = get_kdf("KDF2(SHA-256)");
-
- SymmetricKey key = kdf->derive_key(32, master_key, "cipher key");
- SymmetricKey mac_key = kdf->derive_key(32, masterkey, "hmac key");
- InitializationVector iv = kdf->derive_key(16, masterkey, "cipher iv");
+The output of a \type{Hash\_Filter} is raw binary. The filter will not
+produce any output at all until you call \function{end\_msg}.
- Pipe pipe(new Fork(
- new Chain(
- get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
- new DataSink_Stream(outfile)
- ),
- new MAC_Filter("HMAC(SHA-256)", mac_key)
- )
- );
-
- outfile.write((const char*)the_salt.ptr(), the_salt.size());
-
- pipe.start_msg();
- infile >> pipe;
- pipe.end_msg();
-
- SecureVector<byte> hmac = pipe.read_all(1);
- outfile.write((const char*)hmac.ptr(), hmac.size());
-\end{verbatim}
-
-\subsubsection{Another buffering technique}
+\section{Symmetric Cryptography}
-Sometimes the use of \type{DataSink\_Stream} is not practical for whatever
-reason. In this case, an alternate buffering mechanism might be useful. Here is
-some code which will write all the processed data as quickly as possible, so
-that memory pressure is reduced in the case of large inputs.
-\begin{verbatim}
- pipe.start_msg();
- SecureBuffer<byte, 1024> buffer;
- while(infile.good())
- {
- infile.read((char*)buffer.ptr(), buffer.size());
- u32bit got_from_infile = infile.gcount();
- pipe.write(buffer, got_from_infile);
-
- if(infile.eof())
- pipe.end_msg();
-
- while(pipe.remaining() > 0)
- {
- u32bit buffered = pipe.read(buffer, buffer.size());
- outfile.write((const char*)buffer.ptr(), buffered);
- }
- }
- if(infile.bad() || (infile.fail() && !infile.eof()))
- throw Some_Exception();
-\end{verbatim}
-
-\pagebreak
\subsection{Authentication}
-After doing the encryption routines, doing message authentication keyed off a
-passphrase is not very difficult. In fact it's much easier than the encryption
-case, for the following reasons: a) we only need one key, and b) we don't have
-to store anything, so all the input can be done in a single step without
-worrying about it taking up a lot of memory if the input file is large.
-
-In this case, we'll hex-encode the salt and the MAC, and output them both to
-standard output (the salt followed by the MAC).
-
-\begin{verbatim}
- PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
- pbkdf->set_iterations(10000);
- pbkdf->new_random_salt(8);
- OctetString the_salt = pbkdf->current_salt();
-
- SymmetricKey hmac_key = pbkdf->derive_key(32, passphrase);
-
- Pipe pipe(new MAC_Filter("HMAC(SHA-256)", mac_key),
- new Hex_Encoder
- );
-
- std::cout << the_salt.to_string(); // hex encoded
-
- pipe.start_msg();
- infile >> pipe;
- pipe.end_msg();
- std::cout << pipe.read_all_as_string() << std::endl;
-\end{verbatim}
-
\subsection{User Authentication}
-Doing user authentication off a shared passphrase is fairly easy. Essentially,
-a challenge-response protocol is used - the server sends a random challenge,
-and the client responds with an appropriate response to the challenge. The idea
-is that only someone who knows the passphrase can generate or check to see if a
-response is valid.
-
-Let's say we use 160-bit (20 byte) challenges, which seems fairly
-reasonable. We can create this challenge using the global random
-number generator (RNG):
-
-\begin{verbatim}
- byte challenge[20];
- Global_RNG::randomize(challenge, sizeof(challenge), Nonce);
- // send challenge to client
-\end{verbatim}
-
-After reading the challenge, the client generates a response based on
-the challenge and the passphrase. In this case, we will do it by
-repeatedly hashing the challenge, the passphrase, and (if applicable)
-the previous digest. We iterate this construction 10000 times, to make
-brute force attacks on the passphrase hard to do. Since we are already
-using 160-bit challenges, a 160-bit response seems warranted, so we'll
-use SHA-1.
-
-\begin{verbatim}
- HashFunction* hash = get_hash("SHA-1");
- SecureVector<byte> digest;
- for(u32bit j = 0; j != 10000; j++)
- {
- hash->update(digest, digest.size());
- hash->update(passphrase);
- hash->update(challenge, challenge.size());
- digest = hash->final();
- }
- delete hash;
- // send value of digest to the server
-\end{verbatim}
-
-Upon receiving the response from the client, the server computes what the
-response should have been based on the challenge it sent out, and the
-passphrase. If the two responses match, the client is authenticated.
-Otherwise, it is not.
-
-An alternate method is to use PBKDF2 again, using the challenge as the salt. In
-this case, the response could (for example) be the hash of the key produced by
-PBKDF2. There is no reason to have an explicit iteration loop, as PBKDF2 is
-designed to prevent dictionary attacks (assuming PBKDF2 is set up for a large
-iteration count internally).
-
-\pagebreak
-
\section{Public Key Cryptography}
-\subsection{Basic Operations}
-
-In this section, we'll assume we have a \type{X509\_PublicKey*} named
-\arg{pubkey}, and, if necessary, a private key type (a
-\type{PKCS8\_PrivateKey*}) named \arg{privkey}. A description of these types,
-how to create them, and related details appears later in this tutorial. In this
-section, we will use various functions that are defined in
-\filename{look\_pk.h} -- you will have to include this header explicitly.
-
-\subsubsection{Encryption}
-
-Basically, pick an encoding method, create a \type{PK\_Encryptor} (with
-\function{get\_pk\_encryptor}()), and use it. But first we have to make sure
-the public key can actually be used for public key encryption. For encryption
-(and decryption), the key could be RSA, ElGamal, or (in future versions) some
-other public key encryption scheme, like Rabin or an elliptic curve scheme.
-
-\begin{verbatim}
- PK_Encrypting_Key* key = dynamic_cast<PK_Encrypting_Key*>(pubkey);
- if(!key)
- error();
- PK_Encryptor* enc = get_pk_encryptor(*key, "EME1(SHA-256)");
-
- byte msg[] = { /* ... */ };
-
- // will also accept a SecureVector<byte> as input
- SecureVector<byte> ciphertext = enc->encrypt(msg, sizeof(msg));
-\end{verbatim}
-
-\subsubsection{Decryption}
-
-This is essentially the same as the encryption operation, but using a private
-key instead. One major difference is that the decryption operation can fail due
-to the fact that the ciphertext was invalid (most common padding schemes, such
-as ``EME1(SHA-256)'', include various pieces of redundancy, which are checked
-after decryption).
-
-\begin{verbatim}
- PK_Decrypting_Key* key = dynamic_cast<PK_Decrypting_Key*>(privkey);
- if(!key)
- error();
- PK_Decryptor* dec = get_pk_decryptor(*key, "EME1(SHA-256)");
-
- byte msg[] = { /* ... */ };
-
- SecureVector<byte> plaintext;
-
- try {
- // will also accept a SecureVector<byte> as input
- plaintext = dec->decrypt(msg, sizeof(msg));
- }
- catch(Decoding_Error)
- {
- /* the ciphertext was invalid */
- }
-\end{verbatim}
-
-\subsubsection{Signature Generation}
-
-There is one difficulty with signature generation that does not occur with
-encryption or decryption. Specifically, there are various padding methods which
-can be useful for different signature algorithms, and not all are appropriate
-for all signature schemes. The following table breaks down what algorithms
-support which encodings:
-
-\begin{tabular}{|c|c|c|} \hline
-Signature Algorithm & Usable Encoding Methods & Preferred Encoding(s) \\ \hline
-DSA / NR & EMSA1 & EMSA1 \\ \hline
-RSA & EMSA1, EMSA2, EMSA3, EMSA4 & EMSA3, EMSA4 \\ \hline
-Rabin-Williams & EMSA2, EMSA4 & EMSA2, EMSA4 \\ \hline
-\end{tabular}
-
-For new applications, use EMSA4 with both RSA and Rabin-Williams, as it is
-significantly more secure than the alternatives. However, most current
-applications/libraries only support EMSA2 with Rabin-Williams and EMSA3 with
-RSA. Given this, you may be forced to use less secure encoding methods for the
-near future. In these examples, we punt on the problem, and hard-code using
-EMSA1 with SHA-256.
-
-\begin{verbatim}
- Public_Key* key = /* loaded or generated somehow */
- PK_Signer* signer = get_pk_signer(*key, "EMSA1(SHA-256)");
-
- byte msg[] = { /* ... */ };
-
- /*
- You can also repeatedly call update(const byte[], u32bit), followed
- by a call to signature(), which will return the final signature of
- all the data that was passed through update(). sign_message() is
- just a stub that calls update() once, and returns the value of
- signature().
- */
-
- SecureVector<byte> signature = signer->sign_message(msg, sizeof(msg));
-\end{verbatim}
-
-\pagebreak
-
-\subsubsection{Signature Verification}
-
-In addition to all the problems with choosing the correct padding method,
-there is yet another complication with verifying a signature. Namely, there are
-two varieties of signature algorithms - those providing message recovery (that
-is, the value that was signed can be directly recovered by someone verifying
-the signature), and those without message recovery (the verify operation simply
-returns if the signature was valid, without telling you exactly what was
-signed). This leads to two slightly different implementations of the
-verification operation, which user code has to work with. As you can see
-however, the implementation is still not at all difficult.
-
-\begin{verbatim}
- PK_Verifier* verifier = 0;
-
- PK_Verifying_with_MR_Key* key1 =
- dynamic_cast<PK_Verifying_with_MR_Key*>(pubkey);
- PK_Verifying_wo_MR_Key* key2 =
- dynamic_cast<PK_Verifying_wo_MR_Key*>(pubkey);
-
- if(key1)
- verifier = get_pk_verifier(*key1, "EMSA1(SHA-256)");
- else if(key2)
- verifier = get_pk_verifier(*key2, "EMSA1(SHA-256)");
- else
- error();
-
- byte msg[] = { /* ... */ };
- byte sig[] = { /* ... */ };
-
- /*
- Like PK_Signer, you can also do repeated calls to
- void update(const byte some_data[], u32bit length)
- followed by a call to
- bool check_signature(const byte the_sig[], u32bit length)
- which will return true (valid signature) or false (bad signature).
- The function verify_message() is a simple wrapper around update() and
- check_signature().
-
- */
- bool is_valid = verifier->verify_message(msg, sizeof(msg), sig, sizeof(sig));
-\end{verbatim}
-
-\subsubsection{Key Agreement}
-
-WRITEME
-
-\pagebreak
-
-\subsection{Working with Keys}
-
-\subsubsection{Reading Public Keys (X.509 format)}
-
-There are two separate ways to read X.509 public keys. Remember that the X.509
-public keys are simply that: public keys. There is no associated information
-(such as the owner of that key) included with the public key itself. If you
-need that kind of information, you'll need to use X.509 certificates.
-
-However, there are surely times when a simple public key is sufficient. The
-most obvious is when the key is implicitly trusted, for example if access
-and/or modification of it is controlled by something else (like filesystem
-ACLs). In other cases, it is a perfectly reasonable proposition to use them
-over the network as an anonymous key exchange mechanism. This is, admittedly,
-vulnerable to man-in-the-middle attacks, but it's simple enough that it's hard
-to mess up (see, for example, Peter Guttman's paper ``Lessons Learned in
-Implementing and Deploying Crypto Software'' in Usenix '02).
-
-The way to load a key is basically to set up a \type{DataSource} and then call
-\function{X509::load\_key}, which will return a \type{X509\_PublicKey*}. For
-example:
-
-\begin{verbatim}
- DataSource_Stream somefile("somefile.pem"); // has 3 public keys
- X509_PublicKey* key1 = X509::load_key(somefile);
- X509_PublicKey* key2 = X509::load_key(somefile);
- X509_PublicKey* key3 = X509::load_key(somefile);
- // Now we have them all loaded. Huzah!
-\end{verbatim}
-
-At this point you can use \function{dynamic\_cast} to find the operations the
-key supports (by seeing if a cast to \type{PK\_Encrypting\_Key},
-\type{PK\_Verifying\_with\_MR\_Key}, or \type{PK\_Verifying\_wo\_MR\_Key}
-succeeds).
-
-There is a variant of \function{X509::load\_key} (and of
-\function{PKCS8::load\_key}, described in the next section) which take a
-filename (as a \type{std::string}). These are just convenience functions which
-create the appropriate \type{DataSource} for you and call the main
-\function{X509::load\_key}.
-
-\subsubsection{Reading Private Keys (PKCS \#8 format)}
-
-This is very similar to reading raw public keys, with the difference that the
-key may be encrypted with a user passphrase:
-
-\begin{verbatim}
- // rng is a RandomNumberGenerator, like AutoSeeded_RNG
-
- DataSource_Stream somefile("somefile");
- std::string a_passphrase = /* get from the user */
- PKCS8_PrivateKey* key = PKCS8::load_key(somefile, rng, a_passphrase);
-\end{verbatim}
-
-You can, by the way, convert a \type{PKCS8\_PrivateKey} to a
-\type{X509\_PublicKey} simply by casting it (with \function{dynamic\_cast}), as
-the private key type is derived from \type{X509\_PublicKey}. As with
-\type{X509\_PublicKey}, you can use \function{dynamic\_cast} to figure out what
-operations the private key is capable of; in particular, you can attempt to
-cast it to \type{PK\_Decrypting\_Key}, \type{PK\_Signing\_Key}, or
-\type{PK\_Key\_Agreement\_Key}.
-
-Sometimes you can get away with having a static passphrase passed to
-\function{load\_key}. Typically, however, you'll have to do some user
-interaction to get the appropriate passphrase. In that case you'll want to use
-the \type{UI} related interface, which is fully described in the API
-documentation.
-
-\subsubsection{Generating New Private Keys}
-
-Generate a new private key is the one operation which requires you to
-explicitly name the type of key you are working with. There are (currently) two
-kinds of public key algorithms in Botan: ones based on the integer
-factorization (IF) problem (RSA and Rabin-Williams), and ones based on the
-discrete logarithm (DL) problem (DSA, Diffie-Hellman, Nyberg-Rueppel, and
-ElGamal). Since discrete logarithm parameters (primes and generators) can be
-shared among many keys, there is the notion of these being a combined type
-(called \type{DL\_Group}).
-
-To create a new DL-based private key, simply pass a desired \type{DL\_Group} to
-the constructor of the private key - a new public/private key pair will be
-generated. Since in IF-based algorithms, the modulus used isn't shared by other
-keys, we don't use this notion. You can create a new key by passing in a
-\type{u32bit} telling how long (in bits) the key should be.
-
-There are quite a few ways to get a \type{DL\_Group} object. The best is to use
-the function \function{get\_dl\_group}, which takes a string naming a group; it
-will either return that group, if it knows about it, or throw an
-exception. Names it knows about include ``IETF-n'' where n is 768, 1024, 1536,
-2048, 3072, or 4096, and ``DSA-n'', where n is 512, 768, or 1024. The IETF
-groups are the ones specified for use with IPSec, and the DSA ones are the
-default DSA parameters specified by Java's JCE. For DSA and Nyberg-Rueppel, use
-the ``DSA-n'' groups, and for Diffie-Hellman and ElGamal, use the ``IETF-n''
-groups.
-
-You can also generate a new random group. This is not recommend, because it is
-very slow, particularly for ``safe'' primes, which are needed for
-Diffie-Hellman and ElGamal.
-
-Some examples:
-
-\begin{verbatim}
- RSA_PrivateKey rsa1(512); // 512-bit RSA key
- RSA_PrivateKey rsa2(2048); // 2048-bit RSA key
-
- RW_PrivateKey rw1(1024); // 1024-bit Rabin-Williams key
- RW_PrivateKey rw2(1536); // 1536-bit Rabin-Williams key
-
- DSA_PrivateKey dsa(get_dl_group("DSA-512")); // 512-bit DSA key
- DH_PrivateKey dh(get_dl_group("IETF-4096")); // 4096-bit DH key
- NR_PrivateKey nr(get_dl_group("DSA-1024")); // 1024-bit NR key
- ElGamal_PrivateKey elg(get_dl_group("IETF-1536")); // 1536-bit ElGamal key
-\end{verbatim}
-
-To export your newly created private key, use the PKCS \#8 routines in
-\filename{pkcs8.h}:
-
-\begin{verbatim}
- std::string a_passphrase = /* get from the user */
- std::string the_key = PKCS8::PEM_encode(rsa2, a_passphrase);
-\end{verbatim}
-
-You can read the key back in using \function{PKCS8::load\_key}, described in
-the section ``Reading Private Keys (PKCS \#8 format)'', above. Unfortunately,
-this only works with keys that have an assigned algorithm identifier and
-standardized format. Currently this is only the RSA, DSA, DH, and ElGamal
-algorithms, though RW and NR keys can also be imported and exported by
-assigning them an OID (this can be done either through a configuration file, or
-by calling the function \function{OIDS::add\_oid} in \filename{oids.h}). Be
-aware that the OID and format for ElGamal keys is not exactly standard, but
-there does exist at least one other crypto library which will accept the
-format.
-
-The raw public can be exported using:
-
-\begin{verbatim}
- std::string the_public_key = X509::PEM_encode(rsa2);
-\end{verbatim}
-
-\pagebreak
-
-\section{X.509v3 Certificates}
-
-Using certificates is rather complicated, so only the very basic mechanisms are
-going to be covered here. The section ``Setting up a CA'' goes into reasonable
-detail about CRLs and certificate requests, but there is a lot that isn't
-covered (else this section would get quite long and complicated).
-
-\subsection{Importing and Exporting Certificates}
-
-Importing and exporting X.509 certificates is easy. Simply call the constructor
-with either a \type{DataSource\&}, or the name of a file:
-
-\begin{verbatim}
- X509_Certificate cert1("cert1.pem");
-
- /* This file contains two certificates, concatenated */
- DataSource_Stream in("certs2_and_3.pem");
-
- X509_Certificate cert2(in); // read the first cert
- X509_Certificate cert3(in); // read the second cert
-\end{verbatim}
-
-Exporting the certificate is a simple matter of calling the member function
-\function{PEM\_encode}(), which returns a \type{std::string} containing the
-certificate in PEM encoding.
-
-\begin{verbatim}
- std::cout << cert3.PEM_encode();
- some_ostream_object << cert1.PEM_encode();
- std::string cert2_str = cert2.PEM_encode();
-\end{verbatim}
-
-\subsection{Verifying Certificates}
-
-Verifying a certificate requires that we build up a chain of trust, starting
-from the root (usually a commercial CA), down through some number of
-intermediate CAs, and finally reaching the actual certificate in
-question. Thus, to verify, we actually have to have all those certificates
-on hand (or at the very least, know where we can get the ones we need).
-
-The class which handles both storing certificates, and verifying them, is
-called \type{X509\_Store}. We'll start by assuming that we have all the
-certificates we need, and just want to verify a cert. This is done by calling
-the member function \function{validate\_cert}, which takes the
-\type{X509\_Certificate} in question, and an optional argument of type
-\type{Cert\_Usage} (which is ignored here; read the section in the API doc
-titled ``Verifying Certificates for information). It returns an enum;
-\type{X509\_Code}, which, for most purposes, is either \type{VERIFIED}, or
-something else (which specifies what circumstance caused the certificate to be
-considered invalid). Really, that's it.
-
-Now, how to let \type{X509\_Store} know about all those certificates and CRLs
-we have lying around? The simplest method is to add them directly, using the
-functions \function{add\_cert}, \function{add\_certs},
-\function{add\_trusted\_certs}, and \function{add\_crl}; for details, consult
-the API doc or read the \filename{x509stor.h} header. There is also a much more
-elegant and powerful method: \type{Certificate\_Store}s. A certificate store
-refers to an object that knows how to retrieve certificates from some external
-source (a file, an LDAP directory, a HTTP server, a SQL database, or anything
-else). By calling the function \function{add\_new\_certstore}, you can register
-a new certificate store, which \type{X509\_Store} will use to find certificates
-it needs. Thus, you can get away with only adding whichever root CA cert(s) you
-want to use, letting some online source handle the storage of all intermediate
-X.509 certificates. The API documentation has a more complete discussion of
-\type{Certificate\_Store}.
-
-\subsection{Setting up a CA}
-
-WRITEME
-
-\pagebreak
-
-\section{Special Topics}
-
-This chapter is for subjects that don't really fit into the API documentation
-or into other chapters of the tutorial.
-
-\subsection{GUIs}
-
-There is nothing particularly special about using Botan in a GUI-based
-application. However there are a few tricky spots, as well as a few ways to
-take advantage of an event-based application.
-
-\subsubsection{Initialization}
-
-Generally you will create the \type{LibraryInitializer} somewhere in
-\texttt{main}, before entering the event loop. One problem is that some GUI
-libraries take over \texttt{main} and drop you right into the event loop; the
-question then is how to initialize the library? The simplest way is probably to
-have a static flag that marks if you have already initialized the library or
-not. When you enter the event loop, check to see if this flag has not been set,
-and if so, initialize the library using the function-based initializers. Using
-\type{LibraryInitializer} obviously won't work in this case, since it would be
-destroyed as soon as the current event handler finished. You then deinitialize
-the library whenever your application is signaled to quit.
-
-\subsubsection{Interacting With the Library}
-
-In the simple case, the user will do stuff asynchronously, and then in response
-your code will do things like encrypt a file or whatever, which can all be done
-synchronously, since the data is right there for you. An application doing
-something like this is basically going to look exactly like a command line
-application that uses Botan, the only major difference being that the calls to
-the library are inside event handlers.
-
-Much harder is something like an SSH client, where you're acting as a layer
-between two asynchronous things (the user and the network). This actually isn't
-restricted to GUIs at all (text-mode SSH clients have to deal with most of the
-same problems), but it is probably more common with a GUI app. The following
-discussion is fairly vague, but hopefully somewhat useful.
-
-There are a few facilities in Botan that are primarily designed to be used by
-applications based on an event loop. See the section ``User Interfaces'' in the
-main API doc for details.
-
-\subsubsection{Entropy}
-
-One nice advantage of using a GUI is opening a new method of gathering entropy
-for the library. This is especially handy on Windows, where the available
-sources of entropy are pretty questionable. In many versions,
-\texttt{CryptGenRandom} is really rather poor, and the Toolhelp functions may
-not provide much data on a small system (such as a handheld). For example, in
-GTK+, you can use the following callback to get information about mouse
-movements:
-
-\begin{verbatim}
-static gint add_entropy(GtkWidget* widget, GdkEventMotion* event)
- {
- if(event)
- Global_RNG::add_entropy(event, sizeof(GdkEventMotion));
- return FALSE;
- }
-\end{verbatim}
-
-And then register it with your main GTK window (presumably named
-\variable{window}) as follows:
-
-\begin{verbatim}
-gtk_signal_connect(GTK_OBJECT(window), "motion_notify_event",
- GTK_SIGNAL_FUNC(add_entropy), NULL);
-
-gtk_widget_set_events(window, GDK_POINTER_MOTION_MASK);
-\end{verbatim}
-
-Even though we're catching all mouse movements, and hashing the results into
-the entropy pool, this doesn't use up more than a few percent of even a
-relatively slow desktop CPU. Note that in the case of using X over a network,
-catching all mouse events would cause large amounts of X traffic over the
-network, which might make your application slow, or even unusable (I haven't
-tried it, though).
-
-This could be made nicer if the collection function did something like
-calculating deltas between each run, storing them into a buffer, and then when
-enough of them have been added, hashing them and send them all to the PRNG in
-one shot. This would not only reduce load, but also prevent the PRNG from
-overestimating the amount of entropy it's getting, since its estimates don't
-(can't) take history into account. For example, you could move the mouse back
-and forth one pixel, and the PRNG would think it was getting a full load of
-entropy each time, when actually it was getting (at best) a bit or two.
\end{document}