aboutsummaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
-rw-r--r--doc/old_tutorial.tex880
-rw-r--r--doc/tutorial.tex843
2 files changed, 932 insertions, 791 deletions
diff --git a/doc/old_tutorial.tex b/doc/old_tutorial.tex
new file mode 100644
index 000000000..ee3bc936b
--- /dev/null
+++ b/doc/old_tutorial.tex
@@ -0,0 +1,880 @@
+\documentclass{article}
+
+\setlength{\textwidth}{6.5in} % 1 inch side margins
+\setlength{\textheight}{9in} % ~1 inch top and bottom margins
+
+\setlength{\headheight}{0in}
+\setlength{\topmargin}{0in}
+\setlength{\headsep}{0in}
+
+\setlength{\oddsidemargin}{0in}
+\setlength{\evensidemargin}{0in}
+
+\title{\textbf{Botan Tutorial}}
+\author{Jack Lloyd \\
+ \texttt{[email protected]}}
+\date{2009/07/08}
+
+\newcommand{\filename}[1]{\texttt{#1}}
+\newcommand{\manpage}[2]{\texttt{#1}(#2)}
+
+\newcommand{\macro}[1]{\texttt{#1}}
+
+\newcommand{\function}[1]{\textbf{#1}}
+\newcommand{\type}[1]{\texttt{#1}}
+\renewcommand{\arg}[1]{\textsl{#1}}
+\newcommand{\variable}[1]{\textsl{#1}}
+
+\begin{document}
+
+\maketitle
+
+\tableofcontents
+
+\parskip=5pt
+\pagebreak
+
+\section{Introduction}
+
+This document essentially sets up various simple scenarios and then
+shows how to solve the problems using Botan. It's fairly simple, and
+doesn't cover many of the available APIs and algorithms, especially
+the more obscure or unusual ones. It is a supplement to the API
+documentation and the example applications, which are included in the
+distribution.
+
+To quote the Perl man page: '``There's more than one way to do it.''
+Divining how many more is left as an exercise to the reader.'
+
+This is \emph{not} a general introduction to cryptography, and most simple
+terms and ideas are not explained in any great detail.
+
+Finally, most of the code shown in this tutorial has not been tested, it was
+just written down from memory. If you find errors, please let me know.
+
+\section{Initializing the Library}
+
+The first step to using Botan is to create a \type{LibraryInitializer} object,
+which handles creating various internal structures, and also destroying them at
+shutdown. Essentially:
+
+\begin{verbatim}
+#include <botan/botan.h>
+/* include other headers here */
+
+int main()
+ {
+ LibraryInitializer init;
+ /* now do stuff */
+ return 0;
+ }
+\end{verbatim}
+
+\section{Hashing a File}
+
+\section{Symmetric Cryptography}
+
+\subsection{Encryption with a passphrase}
+
+Probably the most common crypto problem is encrypting a file (or some data that
+is in-memory) using a passphrase. There are a million ways to do this, most of
+them bad. In particular, you have to protect against weak passphrases,
+people reusing a passphrase many times, accidental and deliberate modification,
+and a dozen other potential problems.
+
+We'll start with a simple method that is commonly used, and show the problems
+that can arise. Each subsequent solution will modify the previous one to
+prevent one or more common problems, until we arrive at a good version.
+
+In these examples, we'll always use Serpent in Cipher-Block Chaining
+(CBC) mode. Whenever we need a hash function, we'll use SHA-256, since
+that is a common and well-known hash that is thought to be secure.
+
+In all examples, we choose to derive the Initialization Vector (IV) from the
+passphrase. Another (probably more common) alternative is to generate the IV
+randomly and include it at the beginning of the message. Either way is
+acceptable, and can be secure. The method used here was chosen to make for more
+interesting examples (because it's harder to get right), and may not be an
+appropriate choice for some environments.
+
+First, some notation. The passphrase is stored as a \type{std::string} named
+\variable{passphrase}. The input and output files (\variable{infile} and
+\variable{outfile}) are of types \type{std::ifstream} and \type{std::ofstream}
+(respectively).
+
+\subsubsection{First try}
+
+We hash the passphrase with SHA-256, and use the resulting hash to key
+Serpent. To generate the IV, we prepend a single '0' character to the
+passphrase, hash it, and truncate it to 16 bytes (which is Serpent's
+block size).
+
+\begin{verbatim}
+ HashFunction* hash = get_hash("SHA-256");
+
+ SymmetricKey key = hash->process(passphrase);
+ SecureVector<byte> raw_iv = hash->process('0' + passphrase);
+ InitializationVector iv(raw_iv, 16);
+
+ Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION));
+
+ pipe.start_msg();
+ infile >> pipe;
+ pipe.end_msg();
+ outfile << pipe;
+\end{verbatim}
+
+\subsubsection{Problem 1: Buffering}
+
+There is a problem with the above code, if the input file is fairly large as
+compared to available memory. Specifically, all the encrypted data is stored
+in memory, and then flushed to \variable{outfile} in a single go at the very
+end. If the input file is big (say, a gigabyte), this will be most problematic.
+
+The solution is to use a \type{DataSink} to handle the output for us (writing
+to \arg{outfile} will be implicit with writing to the \type{Pipe}). We can do
+this by replacing the last few lines with:
+
+\begin{verbatim}
+ Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
+ new DataSink_Stream(outfile));
+
+ pipe.start_msg();
+ infile >> pipe;
+ pipe.end_msg();
+\end{verbatim}
+
+\subsubsection{Problem 2: Deriving the key and IV}
+
+Hash functions like SHA-256 are deterministic; if the same passphrase
+is supplied twice, then the key (and in our case, the IV) will be the
+same. This is very dangerous, and could easily open the whole system
+up to attack. What we need to do is introduce a salt (or nonce) into
+the generation of the key from the passphrase. This will mean that the
+key will not be the same each time the same passphrase is typed in by
+a user.
+
+There is another problem with using a bare hash function to derive
+keys. While it's inconceivable (based on our current understanding of
+thermodynamics and theories of computation) that an attacker could
+brute-force a 256-bit key, it would be fairly simple for them to
+compute the SHA-256 hashes of various common passwords ('password',
+the name of the dog, the significant other's middle name, favorite
+sports team) and try those as keys. So we want to slow the attacker
+down if we can, and an easy way to do that is to iterate the hash
+function a bunch of times (say, 1024 to 4096 times). This will involve
+only a small amount of effort for a legitimate user (since they only
+have to compute the hashes once, when they type in their passphrase),
+but an attacker, trying out a large list of potential passphrases,
+will be seriously annoyed (and slowed down) by this.
+
+In this iteration of the example, we'll kill these two birds with one
+stone, and derive the key from the passphrase using a PBKDF
+(Password-Based Key Derivation Function). In this example, we use
+PBKDF2 with Hash Message Authentication Code (HMAC(SHA-256)), which is
+specified in PKCS \#5. We replace the first four lines of code from
+the first example with:
+
+\begin{verbatim}
+ PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
+ // hard-coded iteration count for simplicity; should be sufficient
+ pbkdf->set_iterations(10000);
+ // 8 octets == 64-bit salt; again, good enough
+ pbkdf->new_random_salt(8);
+ SecureVector<byte> the_salt = pbkdf->current_salt();
+
+ // 48 octets == 32 for key + 16 for IV
+ SecureVector<byte> key_and_IV = pbkdf->derive_key(48, passphrase).bits_of();
+
+ SymmetricKey key(key_and_IV, 32);
+ InitializationVector iv(key_and_IV + 32, 16);
+\end{verbatim}
+
+To complete the example, we have to remember to write out the salt (stored in
+\variable{the\_salt}) at the beginning of the file. The receiving side needs to
+know this value in order to restore it (by calling the \variable{pbkdf} object's
+\function{change\_salt} function) so it can derive the same key and IV from the
+passphrase.
+
+\subsubsection{Problem 3: Protecting against modification}
+
+As it is, an attacker can undetectably alter the message while it is
+in transit. It is vital to remember that encryption does not imply
+authentication (except when using special modes that are specifically
+designed to provide authentication along with encryption, like OCB and
+EAX). For this purpose, we will append a message authentication code
+to the encrypted message. Specifically, we will generate an extra 256
+bits of key data, and use it to key the ``HMAC(SHA-256)'' MAC
+function. We don't want to have the MAC and the cipher to share the
+same key; that is very much a no-no.
+
+\begin{verbatim}
+ // 80 octets == 32 for cipher key + 16 for IV + 32 for hmac key
+ SecureVector<byte> keys_and_IV = pbkdf->derive_key(80, passphrase);
+
+ SymmetricKey key(keys_and_IV, 32);
+ InitializationVector iv(keys_and_IV + 32, 16);
+ SymmetricKey mac_key(keys_and_IV + 32 + 16, 32);
+
+ Pipe pipe(new Fork(
+ new Chain(
+ get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
+ new DataSink_Stream(outfile)
+ ),
+ new MAC_Filter("HMAC(SHA-256)", mac_key)
+ )
+ );
+
+ pipe.start_msg();
+ infile >> pipe;
+ pipe.end_msg();
+
+ // now read the MAC from message #2. Message numbers start from 0
+ SecureVector<byte> hmac = pipe.read_all(1);
+ outfile.write((const char*)hmac.ptr(), hmac.size());
+\end{verbatim}
+
+The receiver can check the size of the file (in bytes), and since it knows how
+long the MAC is, can figure out how many bytes of ciphertext there are. Then it
+reads in that many bytes, sending them to a Serpent/CBC decryption object
+(which could be obtained by calling \verb|get_cipher| with an argument of
+\type{DECRYPTION} instead of \type{ENCRYPTION}), and storing the final bytes to
+authenticate the message with.
+
+\subsubsection{Problem 4: Cleaning up the key generation}
+
+The method used to derive the keys and IV is rather inelegant, and it would be
+nice to clean that up a bit, algorithmically speaking. A nice solution for this
+is to generate a master key from the passphrase and salt, and then generate the
+two keys and the IV (the cryptovariables) from that.
+
+Starting from the master key, we derive the cryptovariables using a KDF
+algorithm, which is designed, among other things, to ``separate'' keys so that
+we can derive several different keys from the single master key. For this
+purpose, we will use KDF2, which is a generally useful KDF function (defined in
+IEEE 1363a, among other standards). The use of different labels (``cipher
+key'', etc) makes sure that each of the three derived variables will have
+different values.
+
+\begin{verbatim}
+ PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
+ // hard-coded iteration count for simplicity; should be sufficient
+ pbkdf->set_iterations(10000);
+ // 8 octet == 64-bit salt; again, good enough
+ pbkdf->new_random_salt(8);
+ // store the salt so we can write it to a file later
+ SecureVector<byte> the_salt = pbkdf->current_salt();
+
+ SymmetricKey master_key = pbkdf->derive_key(48, passphrase);
+
+ KDF* kdf = get_kdf("KDF2(SHA-256)");
+
+ SymmetricKey key = kdf->derive_key(32, master_key, "cipher key");
+ SymmetricKey mac_key = kdf->derive_key(32, master_key, "hmac key");
+ InitializationVector iv = kdf->derive_key(16, master_key, "cipher iv");
+\end{verbatim}
+
+\subsubsection{Final version}
+
+Here is the final version of the encryption code, with all the changes we've
+made:
+
+\begin{verbatim}
+ PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
+ pbkdf->set_iterations(10000);
+ pbkdf->new_random_salt(8);
+ SecureVector<byte> the_salt = pbkdf->current_salt();
+
+ SymmetricKey master_key = pbkdf->derive_key(48, passphrase);
+
+ KDF* kdf = get_kdf("KDF2(SHA-256)");
+
+ SymmetricKey key = kdf->derive_key(32, master_key, "cipher key");
+ SymmetricKey mac_key = kdf->derive_key(32, masterkey, "hmac key");
+ InitializationVector iv = kdf->derive_key(16, masterkey, "cipher iv");
+
+ Pipe pipe(new Fork(
+ new Chain(
+ get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
+ new DataSink_Stream(outfile)
+ ),
+ new MAC_Filter("HMAC(SHA-256)", mac_key)
+ )
+ );
+
+ outfile.write((const char*)the_salt.ptr(), the_salt.size());
+
+ pipe.start_msg();
+ infile >> pipe;
+ pipe.end_msg();
+
+ SecureVector<byte> hmac = pipe.read_all(1);
+ outfile.write((const char*)hmac.ptr(), hmac.size());
+\end{verbatim}
+
+\subsubsection{Another buffering technique}
+
+Sometimes the use of \type{DataSink\_Stream} is not practical for whatever
+reason. In this case, an alternate buffering mechanism might be useful. Here is
+some code which will write all the processed data as quickly as possible, so
+that memory pressure is reduced in the case of large inputs.
+
+\begin{verbatim}
+ pipe.start_msg();
+ SecureBuffer<byte, 1024> buffer;
+ while(infile.good())
+ {
+ infile.read((char*)buffer.ptr(), buffer.size());
+ u32bit got_from_infile = infile.gcount();
+ pipe.write(buffer, got_from_infile);
+
+ if(infile.eof())
+ pipe.end_msg();
+
+ while(pipe.remaining() > 0)
+ {
+ u32bit buffered = pipe.read(buffer, buffer.size());
+ outfile.write((const char*)buffer.ptr(), buffered);
+ }
+ }
+ if(infile.bad() || (infile.fail() && !infile.eof()))
+ throw Some_Exception();
+\end{verbatim}
+
+\pagebreak
+
+\subsection{Authentication}
+
+After doing the encryption routines, doing message authentication keyed off a
+passphrase is not very difficult. In fact it's much easier than the encryption
+case, for the following reasons: a) we only need one key, and b) we don't have
+to store anything, so all the input can be done in a single step without
+worrying about it taking up a lot of memory if the input file is large.
+
+In this case, we'll hex-encode the salt and the MAC, and output them both to
+standard output (the salt followed by the MAC).
+
+\begin{verbatim}
+ PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
+ pbkdf->set_iterations(10000);
+ pbkdf->new_random_salt(8);
+ OctetString the_salt = pbkdf->current_salt();
+
+ SymmetricKey hmac_key = pbkdf->derive_key(32, passphrase);
+
+ Pipe pipe(new MAC_Filter("HMAC(SHA-256)", mac_key),
+ new Hex_Encoder
+ );
+
+ std::cout << the_salt.to_string(); // hex encoded
+
+ pipe.start_msg();
+ infile >> pipe;
+ pipe.end_msg();
+ std::cout << pipe.read_all_as_string() << std::endl;
+\end{verbatim}
+
+\subsection{User Authentication}
+
+Doing user authentication off a shared passphrase is fairly easy. Essentially,
+a challenge-response protocol is used - the server sends a random challenge,
+and the client responds with an appropriate response to the challenge. The idea
+is that only someone who knows the passphrase can generate or check to see if a
+response is valid.
+
+Let's say we use 160-bit (20 byte) challenges, which seems fairly
+reasonable. We can create this challenge using the global random
+number generator (RNG):
+
+\begin{verbatim}
+ byte challenge[20];
+ Global_RNG::randomize(challenge, sizeof(challenge), Nonce);
+ // send challenge to client
+\end{verbatim}
+
+After reading the challenge, the client generates a response based on
+the challenge and the passphrase. In this case, we will do it by
+repeatedly hashing the challenge, the passphrase, and (if applicable)
+the previous digest. We iterate this construction 10000 times, to make
+brute force attacks on the passphrase hard to do. Since we are already
+using 160-bit challenges, a 160-bit response seems warranted, so we'll
+use SHA-1.
+
+\begin{verbatim}
+ HashFunction* hash = get_hash("SHA-1");
+ SecureVector<byte> digest;
+ for(u32bit j = 0; j != 10000; j++)
+ {
+ hash->update(digest, digest.size());
+ hash->update(passphrase);
+ hash->update(challenge, challenge.size());
+ digest = hash->final();
+ }
+ delete hash;
+ // send value of digest to the server
+\end{verbatim}
+
+Upon receiving the response from the client, the server computes what the
+response should have been based on the challenge it sent out, and the
+passphrase. If the two responses match, the client is authenticated.
+Otherwise, it is not.
+
+An alternate method is to use PBKDF2 again, using the challenge as the salt. In
+this case, the response could (for example) be the hash of the key produced by
+PBKDF2. There is no reason to have an explicit iteration loop, as PBKDF2 is
+designed to prevent dictionary attacks (assuming PBKDF2 is set up for a large
+iteration count internally).
+
+\pagebreak
+
+\section{Public Key Cryptography}
+
+\subsection{Basic Operations}
+
+In this section, we'll assume we have a \type{X509\_PublicKey*} named
+\arg{pubkey}, and, if necessary, a private key type (a
+\type{PKCS8\_PrivateKey*}) named \arg{privkey}. A description of these types,
+how to create them, and related details appears later in this tutorial. In this
+section, we will use various functions that are defined in
+\filename{look\_pk.h} -- you will have to include this header explicitly.
+
+\subsubsection{Encryption}
+
+Basically, pick an encoding method, create a \type{PK\_Encryptor} (with
+\function{get\_pk\_encryptor}()), and use it. But first we have to make sure
+the public key can actually be used for public key encryption. For encryption
+(and decryption), the key could be RSA, ElGamal, or (in future versions) some
+other public key encryption scheme, like Rabin or an elliptic curve scheme.
+
+\begin{verbatim}
+ PK_Encrypting_Key* key = dynamic_cast<PK_Encrypting_Key*>(pubkey);
+ if(!key)
+ error();
+ PK_Encryptor* enc = get_pk_encryptor(*key, "EME1(SHA-256)");
+
+ byte msg[] = { /* ... */ };
+
+ // will also accept a SecureVector<byte> as input
+ SecureVector<byte> ciphertext = enc->encrypt(msg, sizeof(msg));
+\end{verbatim}
+
+\subsubsection{Decryption}
+
+This is essentially the same as the encryption operation, but using a private
+key instead. One major difference is that the decryption operation can fail due
+to the fact that the ciphertext was invalid (most common padding schemes, such
+as ``EME1(SHA-256)'', include various pieces of redundancy, which are checked
+after decryption).
+
+\begin{verbatim}
+ PK_Decrypting_Key* key = dynamic_cast<PK_Decrypting_Key*>(privkey);
+ if(!key)
+ error();
+ PK_Decryptor* dec = get_pk_decryptor(*key, "EME1(SHA-256)");
+
+ byte msg[] = { /* ... */ };
+
+ SecureVector<byte> plaintext;
+
+ try {
+ // will also accept a SecureVector<byte> as input
+ plaintext = dec->decrypt(msg, sizeof(msg));
+ }
+ catch(Decoding_Error)
+ {
+ /* the ciphertext was invalid */
+ }
+\end{verbatim}
+
+\subsubsection{Signature Generation}
+
+There is one difficulty with signature generation that does not occur with
+encryption or decryption. Specifically, there are various padding methods which
+can be useful for different signature algorithms, and not all are appropriate
+for all signature schemes. The following table breaks down what algorithms
+support which encodings:
+
+\begin{tabular}{|c|c|c|} \hline
+Signature Algorithm & Usable Encoding Methods & Preferred Encoding(s) \\ \hline
+DSA / NR & EMSA1 & EMSA1 \\ \hline
+RSA & EMSA1, EMSA2, EMSA3, EMSA4 & EMSA3, EMSA4 \\ \hline
+Rabin-Williams & EMSA2, EMSA4 & EMSA2, EMSA4 \\ \hline
+\end{tabular}
+
+For new applications, use EMSA4 with both RSA and Rabin-Williams, as it is
+significantly more secure than the alternatives. However, most current
+applications/libraries only support EMSA2 with Rabin-Williams and EMSA3 with
+RSA. Given this, you may be forced to use less secure encoding methods for the
+near future. In these examples, we punt on the problem, and hard-code using
+EMSA1 with SHA-256.
+
+\begin{verbatim}
+ Public_Key* key = /* loaded or generated somehow */
+ PK_Signer* signer = get_pk_signer(*key, "EMSA1(SHA-256)");
+
+ byte msg[] = { /* ... */ };
+
+ /*
+ You can also repeatedly call update(const byte[], u32bit), followed
+ by a call to signature(), which will return the final signature of
+ all the data that was passed through update(). sign_message() is
+ just a stub that calls update() once, and returns the value of
+ signature().
+ */
+
+ SecureVector<byte> signature = signer->sign_message(msg, sizeof(msg));
+\end{verbatim}
+
+\pagebreak
+
+\subsubsection{Signature Verification}
+
+In addition to all the problems with choosing the correct padding method,
+there is yet another complication with verifying a signature. Namely, there are
+two varieties of signature algorithms - those providing message recovery (that
+is, the value that was signed can be directly recovered by someone verifying
+the signature), and those without message recovery (the verify operation simply
+returns if the signature was valid, without telling you exactly what was
+signed). This leads to two slightly different implementations of the
+verification operation, which user code has to work with. As you can see
+however, the implementation is still not at all difficult.
+
+\begin{verbatim}
+ PK_Verifier* verifier = 0;
+
+ PK_Verifying_with_MR_Key* key1 =
+ dynamic_cast<PK_Verifying_with_MR_Key*>(pubkey);
+ PK_Verifying_wo_MR_Key* key2 =
+ dynamic_cast<PK_Verifying_wo_MR_Key*>(pubkey);
+
+ if(key1)
+ verifier = get_pk_verifier(*key1, "EMSA1(SHA-256)");
+ else if(key2)
+ verifier = get_pk_verifier(*key2, "EMSA1(SHA-256)");
+ else
+ error();
+
+ byte msg[] = { /* ... */ };
+ byte sig[] = { /* ... */ };
+
+ /*
+ Like PK_Signer, you can also do repeated calls to
+ void update(const byte some_data[], u32bit length)
+ followed by a call to
+ bool check_signature(const byte the_sig[], u32bit length)
+ which will return true (valid signature) or false (bad signature).
+ The function verify_message() is a simple wrapper around update() and
+ check_signature().
+
+ */
+ bool is_valid = verifier->verify_message(msg, sizeof(msg), sig, sizeof(sig));
+\end{verbatim}
+
+\subsubsection{Key Agreement}
+
+WRITEME
+
+\pagebreak
+
+\subsection{Working with Keys}
+
+\subsubsection{Reading Public Keys (X.509 format)}
+
+There are two separate ways to read X.509 public keys. Remember that the X.509
+public keys are simply that: public keys. There is no associated information
+(such as the owner of that key) included with the public key itself. If you
+need that kind of information, you'll need to use X.509 certificates.
+
+However, there are surely times when a simple public key is sufficient. The
+most obvious is when the key is implicitly trusted, for example if access
+and/or modification of it is controlled by something else (like filesystem
+ACLs). In other cases, it is a perfectly reasonable proposition to use them
+over the network as an anonymous key exchange mechanism. This is, admittedly,
+vulnerable to man-in-the-middle attacks, but it's simple enough that it's hard
+to mess up (see, for example, Peter Guttman's paper ``Lessons Learned in
+Implementing and Deploying Crypto Software'' in Usenix '02).
+
+The way to load a key is basically to set up a \type{DataSource} and then call
+\function{X509::load\_key}, which will return a \type{X509\_PublicKey*}. For
+example:
+
+\begin{verbatim}
+ DataSource_Stream somefile("somefile.pem"); // has 3 public keys
+ X509_PublicKey* key1 = X509::load_key(somefile);
+ X509_PublicKey* key2 = X509::load_key(somefile);
+ X509_PublicKey* key3 = X509::load_key(somefile);
+ // Now we have them all loaded. Huzah!
+\end{verbatim}
+
+At this point you can use \function{dynamic\_cast} to find the operations the
+key supports (by seeing if a cast to \type{PK\_Encrypting\_Key},
+\type{PK\_Verifying\_with\_MR\_Key}, or \type{PK\_Verifying\_wo\_MR\_Key}
+succeeds).
+
+There is a variant of \function{X509::load\_key} (and of
+\function{PKCS8::load\_key}, described in the next section) which take a
+filename (as a \type{std::string}). These are just convenience functions which
+create the appropriate \type{DataSource} for you and call the main
+\function{X509::load\_key}.
+
+\subsubsection{Reading Private Keys (PKCS \#8 format)}
+
+This is very similar to reading raw public keys, with the difference that the
+key may be encrypted with a user passphrase:
+
+\begin{verbatim}
+ // rng is a RandomNumberGenerator, like AutoSeeded_RNG
+
+ DataSource_Stream somefile("somefile");
+ std::string a_passphrase = /* get from the user */
+ PKCS8_PrivateKey* key = PKCS8::load_key(somefile, rng, a_passphrase);
+\end{verbatim}
+
+You can, by the way, convert a \type{PKCS8\_PrivateKey} to a
+\type{X509\_PublicKey} simply by casting it (with \function{dynamic\_cast}), as
+the private key type is derived from \type{X509\_PublicKey}. As with
+\type{X509\_PublicKey}, you can use \function{dynamic\_cast} to figure out what
+operations the private key is capable of; in particular, you can attempt to
+cast it to \type{PK\_Decrypting\_Key}, \type{PK\_Signing\_Key}, or
+\type{PK\_Key\_Agreement\_Key}.
+
+Sometimes you can get away with having a static passphrase passed to
+\function{load\_key}. Typically, however, you'll have to do some user
+interaction to get the appropriate passphrase. In that case you'll want to use
+the \type{UI} related interface, which is fully described in the API
+documentation.
+
+\subsubsection{Generating New Private Keys}
+
+Generate a new private key is the one operation which requires you to
+explicitly name the type of key you are working with. There are (currently) two
+kinds of public key algorithms in Botan: ones based on the integer
+factorization (IF) problem (RSA and Rabin-Williams), and ones based on the
+discrete logarithm (DL) problem (DSA, Diffie-Hellman, Nyberg-Rueppel, and
+ElGamal). Since discrete logarithm parameters (primes and generators) can be
+shared among many keys, there is the notion of these being a combined type
+(called \type{DL\_Group}).
+
+To create a new DL-based private key, simply pass a desired \type{DL\_Group} to
+the constructor of the private key - a new public/private key pair will be
+generated. Since in IF-based algorithms, the modulus used isn't shared by other
+keys, we don't use this notion. You can create a new key by passing in a
+\type{u32bit} telling how long (in bits) the key should be.
+
+There are quite a few ways to get a \type{DL\_Group} object. The best is to use
+the function \function{get\_dl\_group}, which takes a string naming a group; it
+will either return that group, if it knows about it, or throw an
+exception. Names it knows about include ``IETF-n'' where n is 768, 1024, 1536,
+2048, 3072, or 4096, and ``DSA-n'', where n is 512, 768, or 1024. The IETF
+groups are the ones specified for use with IPSec, and the DSA ones are the
+default DSA parameters specified by Java's JCE. For DSA and Nyberg-Rueppel, use
+the ``DSA-n'' groups, and for Diffie-Hellman and ElGamal, use the ``IETF-n''
+groups.
+
+You can also generate a new random group. This is not recommend, because it is
+very slow, particularly for ``safe'' primes, which are needed for
+Diffie-Hellman and ElGamal.
+
+Some examples:
+
+\begin{verbatim}
+ RSA_PrivateKey rsa1(512); // 512-bit RSA key
+ RSA_PrivateKey rsa2(2048); // 2048-bit RSA key
+
+ RW_PrivateKey rw1(1024); // 1024-bit Rabin-Williams key
+ RW_PrivateKey rw2(1536); // 1536-bit Rabin-Williams key
+
+ DSA_PrivateKey dsa(get_dl_group("DSA-512")); // 512-bit DSA key
+ DH_PrivateKey dh(get_dl_group("IETF-4096")); // 4096-bit DH key
+ NR_PrivateKey nr(get_dl_group("DSA-1024")); // 1024-bit NR key
+ ElGamal_PrivateKey elg(get_dl_group("IETF-1536")); // 1536-bit ElGamal key
+\end{verbatim}
+
+To export your newly created private key, use the PKCS \#8 routines in
+\filename{pkcs8.h}:
+
+\begin{verbatim}
+ std::string a_passphrase = /* get from the user */
+ std::string the_key = PKCS8::PEM_encode(rsa2, a_passphrase);
+\end{verbatim}
+
+You can read the key back in using \function{PKCS8::load\_key}, described in
+the section ``Reading Private Keys (PKCS \#8 format)'', above. Unfortunately,
+this only works with keys that have an assigned algorithm identifier and
+standardized format. Currently this is only the RSA, DSA, DH, and ElGamal
+algorithms, though RW and NR keys can also be imported and exported by
+assigning them an OID (this can be done either through a configuration file, or
+by calling the function \function{OIDS::add\_oid} in \filename{oids.h}). Be
+aware that the OID and format for ElGamal keys is not exactly standard, but
+there does exist at least one other crypto library which will accept the
+format.
+
+The raw public can be exported using:
+
+\begin{verbatim}
+ std::string the_public_key = X509::PEM_encode(rsa2);
+\end{verbatim}
+
+\pagebreak
+
+\section{X.509v3 Certificates}
+
+Using certificates is rather complicated, so only the very basic mechanisms are
+going to be covered here. The section ``Setting up a CA'' goes into reasonable
+detail about CRLs and certificate requests, but there is a lot that isn't
+covered (else this section would get quite long and complicated).
+
+\subsection{Importing and Exporting Certificates}
+
+Importing and exporting X.509 certificates is easy. Simply call the constructor
+with either a \type{DataSource\&}, or the name of a file:
+
+\begin{verbatim}
+ X509_Certificate cert1("cert1.pem");
+
+ /* This file contains two certificates, concatenated */
+ DataSource_Stream in("certs2_and_3.pem");
+
+ X509_Certificate cert2(in); // read the first cert
+ X509_Certificate cert3(in); // read the second cert
+\end{verbatim}
+
+Exporting the certificate is a simple matter of calling the member function
+\function{PEM\_encode}(), which returns a \type{std::string} containing the
+certificate in PEM encoding.
+
+\begin{verbatim}
+ std::cout << cert3.PEM_encode();
+ some_ostream_object << cert1.PEM_encode();
+ std::string cert2_str = cert2.PEM_encode();
+\end{verbatim}
+
+\subsection{Verifying Certificates}
+
+Verifying a certificate requires that we build up a chain of trust, starting
+from the root (usually a commercial CA), down through some number of
+intermediate CAs, and finally reaching the actual certificate in
+question. Thus, to verify, we actually have to have all those certificates
+on hand (or at the very least, know where we can get the ones we need).
+
+The class which handles both storing certificates, and verifying them, is
+called \type{X509\_Store}. We'll start by assuming that we have all the
+certificates we need, and just want to verify a cert. This is done by calling
+the member function \function{validate\_cert}, which takes the
+\type{X509\_Certificate} in question, and an optional argument of type
+\type{Cert\_Usage} (which is ignored here; read the section in the API doc
+titled ``Verifying Certificates for information). It returns an enum;
+\type{X509\_Code}, which, for most purposes, is either \type{VERIFIED}, or
+something else (which specifies what circumstance caused the certificate to be
+considered invalid). Really, that's it.
+
+Now, how to let \type{X509\_Store} know about all those certificates and CRLs
+we have lying around? The simplest method is to add them directly, using the
+functions \function{add\_cert}, \function{add\_certs},
+\function{add\_trusted\_certs}, and \function{add\_crl}; for details, consult
+the API doc or read the \filename{x509stor.h} header. There is also a much more
+elegant and powerful method: \type{Certificate\_Store}s. A certificate store
+refers to an object that knows how to retrieve certificates from some external
+source (a file, an LDAP directory, a HTTP server, a SQL database, or anything
+else). By calling the function \function{add\_new\_certstore}, you can register
+a new certificate store, which \type{X509\_Store} will use to find certificates
+it needs. Thus, you can get away with only adding whichever root CA cert(s) you
+want to use, letting some online source handle the storage of all intermediate
+X.509 certificates. The API documentation has a more complete discussion of
+\type{Certificate\_Store}.
+
+\subsection{Setting up a CA}
+
+WRITEME
+
+\pagebreak
+
+\section{Special Topics}
+
+This chapter is for subjects that don't really fit into the API documentation
+or into other chapters of the tutorial.
+
+\subsection{GUIs}
+
+There is nothing particularly special about using Botan in a GUI-based
+application. However there are a few tricky spots, as well as a few ways to
+take advantage of an event-based application.
+
+\subsubsection{Initialization}
+
+Generally you will create the \type{LibraryInitializer} somewhere in
+\texttt{main}, before entering the event loop. One problem is that some GUI
+libraries take over \texttt{main} and drop you right into the event loop; the
+question then is how to initialize the library? The simplest way is probably to
+have a static flag that marks if you have already initialized the library or
+not. When you enter the event loop, check to see if this flag has not been set,
+and if so, initialize the library using the function-based initializers. Using
+\type{LibraryInitializer} obviously won't work in this case, since it would be
+destroyed as soon as the current event handler finished. You then deinitialize
+the library whenever your application is signaled to quit.
+
+\subsubsection{Interacting With the Library}
+
+In the simple case, the user will do stuff asynchronously, and then in response
+your code will do things like encrypt a file or whatever, which can all be done
+synchronously, since the data is right there for you. An application doing
+something like this is basically going to look exactly like a command line
+application that uses Botan, the only major difference being that the calls to
+the library are inside event handlers.
+
+Much harder is something like an SSH client, where you're acting as a layer
+between two asynchronous things (the user and the network). This actually isn't
+restricted to GUIs at all (text-mode SSH clients have to deal with most of the
+same problems), but it is probably more common with a GUI app. The following
+discussion is fairly vague, but hopefully somewhat useful.
+
+There are a few facilities in Botan that are primarily designed to be used by
+applications based on an event loop. See the section ``User Interfaces'' in the
+main API doc for details.
+
+\subsubsection{Entropy}
+
+One nice advantage of using a GUI is opening a new method of gathering entropy
+for the library. This is especially handy on Windows, where the available
+sources of entropy are pretty questionable. In many versions,
+\texttt{CryptGenRandom} is really rather poor, and the Toolhelp functions may
+not provide much data on a small system (such as a handheld). For example, in
+GTK+, you can use the following callback to get information about mouse
+movements:
+
+\begin{verbatim}
+static gint add_entropy(GtkWidget* widget, GdkEventMotion* event)
+ {
+ if(event)
+ Global_RNG::add_entropy(event, sizeof(GdkEventMotion));
+ return FALSE;
+ }
+\end{verbatim}
+
+And then register it with your main GTK window (presumably named
+\variable{window}) as follows:
+
+\begin{verbatim}
+gtk_signal_connect(GTK_OBJECT(window), "motion_notify_event",
+ GTK_SIGNAL_FUNC(add_entropy), NULL);
+
+gtk_widget_set_events(window, GDK_POINTER_MOTION_MASK);
+\end{verbatim}
+
+Even though we're catching all mouse movements, and hashing the results into
+the entropy pool, this doesn't use up more than a few percent of even a
+relatively slow desktop CPU. Note that in the case of using X over a network,
+catching all mouse events would cause large amounts of X traffic over the
+network, which might make your application slow, or even unusable (I haven't
+tried it, though).
+
+This could be made nicer if the collection function did something like
+calculating deltas between each run, storing them into a buffer, and then when
+enough of them have been added, hashing them and send them all to the PRNG in
+one shot. This would not only reduce load, but also prevent the PRNG from
+overestimating the amount of entropy it's getting, since its estimates don't
+(can't) take history into account. For example, you could move the mouse back
+and forth one pixel, and the PRNG would think it was getting a full load of
+entropy each time, when actually it was getting (at best) a bit or two.
+
+\end{document}
diff --git a/doc/tutorial.tex b/doc/tutorial.tex
index ee3bc936b..840679d10 100644
--- a/doc/tutorial.tex
+++ b/doc/tutorial.tex
@@ -13,7 +13,7 @@
\title{\textbf{Botan Tutorial}}
\author{Jack Lloyd \\
-\date{2009/07/08}
+\date{2010/08/07}
\newcommand{\filename}[1]{\texttt{#1}}
\newcommand{\manpage}[2]{\texttt{#1}(#2)}
@@ -37,844 +37,105 @@
\section{Introduction}
This document essentially sets up various simple scenarios and then
-shows how to solve the problems using Botan. It's fairly simple, and
+shows how to solve the problems using botan. It's fairly simple, and
doesn't cover many of the available APIs and algorithms, especially
the more obscure or unusual ones. It is a supplement to the API
documentation and the example applications, which are included in the
distribution.
-To quote the Perl man page: '``There's more than one way to do it.''
-Divining how many more is left as an exercise to the reader.'
-
-This is \emph{not} a general introduction to cryptography, and most simple
-terms and ideas are not explained in any great detail.
-
-Finally, most of the code shown in this tutorial has not been tested, it was
-just written down from memory. If you find errors, please let me know.
-
\section{Initializing the Library}
-The first step to using Botan is to create a \type{LibraryInitializer} object,
-which handles creating various internal structures, and also destroying them at
-shutdown. Essentially:
+The first step to using botan is to create a \type{LibraryInitializer}
+object, which handles creating various internal structures, and also
+destroying them at shutdown.
\begin{verbatim}
#include <botan/botan.h>
-/* include other headers here */
int main()
{
- LibraryInitializer init;
- /* now do stuff */
+ Botan::LibraryInitializer init;
return 0;
}
\end{verbatim}
-\section{Hashing a File}
-
-\section{Symmetric Cryptography}
-
-\subsection{Encryption with a passphrase}
-
-Probably the most common crypto problem is encrypting a file (or some data that
-is in-memory) using a passphrase. There are a million ways to do this, most of
-them bad. In particular, you have to protect against weak passphrases,
-people reusing a passphrase many times, accidental and deliberate modification,
-and a dozen other potential problems.
-
-We'll start with a simple method that is commonly used, and show the problems
-that can arise. Each subsequent solution will modify the previous one to
-prevent one or more common problems, until we arrive at a good version.
-
-In these examples, we'll always use Serpent in Cipher-Block Chaining
-(CBC) mode. Whenever we need a hash function, we'll use SHA-256, since
-that is a common and well-known hash that is thought to be secure.
-
-In all examples, we choose to derive the Initialization Vector (IV) from the
-passphrase. Another (probably more common) alternative is to generate the IV
-randomly and include it at the beginning of the message. Either way is
-acceptable, and can be secure. The method used here was chosen to make for more
-interesting examples (because it's harder to get right), and may not be an
-appropriate choice for some environments.
-
-First, some notation. The passphrase is stored as a \type{std::string} named
-\variable{passphrase}. The input and output files (\variable{infile} and
-\variable{outfile}) are of types \type{std::ifstream} and \type{std::ofstream}
-(respectively).
-
-\subsubsection{First try}
-
-We hash the passphrase with SHA-256, and use the resulting hash to key
-Serpent. To generate the IV, we prepend a single '0' character to the
-passphrase, hash it, and truncate it to 16 bytes (which is Serpent's
-block size).
+If your application is multi-threaded, you need to tell botan this so
+that it will use locking where necessary. This is done by passing a
+string to the constructor of \type{LibraryInitializer}:
\begin{verbatim}
- HashFunction* hash = get_hash("SHA-256");
-
- SymmetricKey key = hash->process(passphrase);
- SecureVector<byte> raw_iv = hash->process('0' + passphrase);
- InitializationVector iv(raw_iv, 16);
-
- Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION));
-
- pipe.start_msg();
- infile >> pipe;
- pipe.end_msg();
- outfile << pipe;
+ Botan::LibraryInitializer init("thread_safe=yes");
\end{verbatim}
-\subsubsection{Problem 1: Buffering}
+\section{Introduction to Pipe}
-There is a problem with the above code, if the input file is fairly large as
-compared to available memory. Specifically, all the encrypted data is stored
-in memory, and then flushed to \variable{outfile} in a single go at the very
-end. If the input file is big (say, a gigabyte), this will be most problematic.
+Most operations in botan are specified in terms of transformations on
+streams. The class that handles the I/O and management for these
+streams is called \type{Pipe}. You can construct a \type{Pipe} with
+one or more \type{Filter}s, which sequentially process messages. You
+can only update a single message at a time, but you can leave the
+final output contents in a \type{Pipe} and read them out as desired.
-The solution is to use a \type{DataSink} to handle the output for us (writing
-to \arg{outfile} will be implicit with writing to the \type{Pipe}). We can do
-this by replacing the last few lines with:
+Here is how you might hex encode two messages:
\begin{verbatim}
- Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
- new DataSink_Stream(outfile));
+ std::string message1 = "this is the first message";
+ const byte message2[] = "a second message";
+ Pipe pipe(new Hex_Encoder);
- pipe.start_msg();
- infile >> pipe;
- pipe.end_msg();
-\end{verbatim}
+ pipe.start_msg(); // must be called before writing to the pipe
+ pipe.write(message1);
+ pipe.end_msg(); // must be called to signal completion
-\subsubsection{Problem 2: Deriving the key and IV}
-
-Hash functions like SHA-256 are deterministic; if the same passphrase
-is supplied twice, then the key (and in our case, the IV) will be the
-same. This is very dangerous, and could easily open the whole system
-up to attack. What we need to do is introduce a salt (or nonce) into
-the generation of the key from the passphrase. This will mean that the
-key will not be the same each time the same passphrase is typed in by
-a user.
-
-There is another problem with using a bare hash function to derive
-keys. While it's inconceivable (based on our current understanding of
-thermodynamics and theories of computation) that an attacker could
-brute-force a 256-bit key, it would be fairly simple for them to
-compute the SHA-256 hashes of various common passwords ('password',
-the name of the dog, the significant other's middle name, favorite
-sports team) and try those as keys. So we want to slow the attacker
-down if we can, and an easy way to do that is to iterate the hash
-function a bunch of times (say, 1024 to 4096 times). This will involve
-only a small amount of effort for a legitimate user (since they only
-have to compute the hashes once, when they type in their passphrase),
-but an attacker, trying out a large list of potential passphrases,
-will be seriously annoyed (and slowed down) by this.
-
-In this iteration of the example, we'll kill these two birds with one
-stone, and derive the key from the passphrase using a PBKDF
-(Password-Based Key Derivation Function). In this example, we use
-PBKDF2 with Hash Message Authentication Code (HMAC(SHA-256)), which is
-specified in PKCS \#5. We replace the first four lines of code from
-the first example with:
+ /*
+ process_msg(x) is equivalent to calling
+ start_msg(); write(x); end_msg();
+ */
+ pipe.process_msg(message2);
-\begin{verbatim}
- PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
- // hard-coded iteration count for simplicity; should be sufficient
- pbkdf->set_iterations(10000);
- // 8 octets == 64-bit salt; again, good enough
- pbkdf->new_random_salt(8);
- SecureVector<byte> the_salt = pbkdf->current_salt();
-
- // 48 octets == 32 for key + 16 for IV
- SecureVector<byte> key_and_IV = pbkdf->derive_key(48, passphrase).bits_of();
-
- SymmetricKey key(key_and_IV, 32);
- InitializationVector iv(key_and_IV + 32, 16);
-\end{verbatim}
+ Pipe::message_id n = pipe.message_count(); // returns 2
-To complete the example, we have to remember to write out the salt (stored in
-\variable{the\_salt}) at the beginning of the file. The receiving side needs to
-know this value in order to restore it (by calling the \variable{pbkdf} object's
-\function{change\_salt} function) so it can derive the same key and IV from the
-passphrase.
+ /* you can read a message as a string, here we read message 0 */
+ std::string first_result = pipe.read_all_as_string(0);
-\subsubsection{Problem 3: Protecting against modification}
+ /* or a piece at a time using array/length, now we'll read the
+ second message (message id 1)
+ */
-As it is, an attacker can undetectably alter the message while it is
-in transit. It is vital to remember that encryption does not imply
-authentication (except when using special modes that are specifically
-designed to provide authentication along with encryption, like OCB and
-EAX). For this purpose, we will append a message authentication code
-to the encrypted message. Specifically, we will generate an extra 256
-bits of key data, and use it to key the ``HMAC(SHA-256)'' MAC
-function. We don't want to have the MAC and the cipher to share the
-same key; that is very much a no-no.
-
-\begin{verbatim}
- // 80 octets == 32 for cipher key + 16 for IV + 32 for hmac key
- SecureVector<byte> keys_and_IV = pbkdf->derive_key(80, passphrase);
-
- SymmetricKey key(keys_and_IV, 32);
- InitializationVector iv(keys_and_IV + 32, 16);
- SymmetricKey mac_key(keys_and_IV + 32 + 16, 32);
-
- Pipe pipe(new Fork(
- new Chain(
- get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
- new DataSink_Stream(outfile)
- ),
- new MAC_Filter("HMAC(SHA-256)", mac_key)
- )
- );
-
- pipe.start_msg();
- infile >> pipe;
- pipe.end_msg();
-
- // now read the MAC from message #2. Message numbers start from 0
- SecureVector<byte> hmac = pipe.read_all(1);
- outfile.write((const char*)hmac.ptr(), hmac.size());
+ byte output[4096] = { 0 };
+ u32bit got = read(output, sizeof(output), 1);
+ if(got >= sizeof(output))
+ // have to read again to get more of the message
\end{verbatim}
-The receiver can check the size of the file (in bytes), and since it knows how
-long the MAC is, can figure out how many bytes of ciphertext there are. Then it
-reads in that many bytes, sending them to a Serpent/CBC decryption object
-(which could be obtained by calling \verb|get_cipher| with an argument of
-\type{DECRYPTION} instead of \type{ENCRYPTION}), and storing the final bytes to
-authenticate the message with.
+You can also read output while the message is still active (before the
+call to \function{end\_msg}), using the same interfaces. You can find
+out how much data is currently available for a particular \type{Pipe}
+by calling the member function \function{remaining}, which takes a
+message sequence number and returns the number of bytes that are
+currently available to read from that message.
-\subsubsection{Problem 4: Cleaning up the key generation}
-
-The method used to derive the keys and IV is rather inelegant, and it would be
-nice to clean that up a bit, algorithmically speaking. A nice solution for this
-is to generate a master key from the passphrase and salt, and then generate the
-two keys and the IV (the cryptovariables) from that.
+\section{Hashing a File}
-Starting from the master key, we derive the cryptovariables using a KDF
-algorithm, which is designed, among other things, to ``separate'' keys so that
-we can derive several different keys from the single master key. For this
-purpose, we will use KDF2, which is a generally useful KDF function (defined in
-IEEE 1363a, among other standards). The use of different labels (``cipher
-key'', etc) makes sure that each of the three derived variables will have
-different values.
+Hashing a file is done using a \type{Hash\_Filter}, which takes a string
+which specifies which hash function you want to use:
\begin{verbatim}
- PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
- // hard-coded iteration count for simplicity; should be sufficient
- pbkdf->set_iterations(10000);
- // 8 octet == 64-bit salt; again, good enough
- pbkdf->new_random_salt(8);
- // store the salt so we can write it to a file later
- SecureVector<byte> the_salt = pbkdf->current_salt();
-
- SymmetricKey master_key = pbkdf->derive_key(48, passphrase);
-
- KDF* kdf = get_kdf("KDF2(SHA-256)");
-
- SymmetricKey key = kdf->derive_key(32, master_key, "cipher key");
- SymmetricKey mac_key = kdf->derive_key(32, master_key, "hmac key");
- InitializationVector iv = kdf->derive_key(16, master_key, "cipher iv");
+ Pipe pipe(new Hash_Filter("SHA-256"));
\end{verbatim}
-\subsubsection{Final version}
-
-Here is the final version of the encryption code, with all the changes we've
-made:
-
-\begin{verbatim}
- PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
- pbkdf->set_iterations(10000);
- pbkdf->new_random_salt(8);
- SecureVector<byte> the_salt = pbkdf->current_salt();
-
- SymmetricKey master_key = pbkdf->derive_key(48, passphrase);
-
- KDF* kdf = get_kdf("KDF2(SHA-256)");
-
- SymmetricKey key = kdf->derive_key(32, master_key, "cipher key");
- SymmetricKey mac_key = kdf->derive_key(32, masterkey, "hmac key");
- InitializationVector iv = kdf->derive_key(16, masterkey, "cipher iv");
+The output of a \type{Hash\_Filter} is raw binary. The filter will not
+produce any output at all until you call \function{end\_msg}.
- Pipe pipe(new Fork(
- new Chain(
- get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
- new DataSink_Stream(outfile)
- ),
- new MAC_Filter("HMAC(SHA-256)", mac_key)
- )
- );
-
- outfile.write((const char*)the_salt.ptr(), the_salt.size());
-
- pipe.start_msg();
- infile >> pipe;
- pipe.end_msg();
-
- SecureVector<byte> hmac = pipe.read_all(1);
- outfile.write((const char*)hmac.ptr(), hmac.size());
-\end{verbatim}
-
-\subsubsection{Another buffering technique}
+\section{Symmetric Cryptography}
-Sometimes the use of \type{DataSink\_Stream} is not practical for whatever
-reason. In this case, an alternate buffering mechanism might be useful. Here is
-some code which will write all the processed data as quickly as possible, so
-that memory pressure is reduced in the case of large inputs.
-\begin{verbatim}
- pipe.start_msg();
- SecureBuffer<byte, 1024> buffer;
- while(infile.good())
- {
- infile.read((char*)buffer.ptr(), buffer.size());
- u32bit got_from_infile = infile.gcount();
- pipe.write(buffer, got_from_infile);
-
- if(infile.eof())
- pipe.end_msg();
-
- while(pipe.remaining() > 0)
- {
- u32bit buffered = pipe.read(buffer, buffer.size());
- outfile.write((const char*)buffer.ptr(), buffered);
- }
- }
- if(infile.bad() || (infile.fail() && !infile.eof()))
- throw Some_Exception();
-\end{verbatim}
-
-\pagebreak
\subsection{Authentication}
-After doing the encryption routines, doing message authentication keyed off a
-passphrase is not very difficult. In fact it's much easier than the encryption
-case, for the following reasons: a) we only need one key, and b) we don't have
-to store anything, so all the input can be done in a single step without
-worrying about it taking up a lot of memory if the input file is large.
-
-In this case, we'll hex-encode the salt and the MAC, and output them both to
-standard output (the salt followed by the MAC).
-
-\begin{verbatim}
- PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)");
- pbkdf->set_iterations(10000);
- pbkdf->new_random_salt(8);
- OctetString the_salt = pbkdf->current_salt();
-
- SymmetricKey hmac_key = pbkdf->derive_key(32, passphrase);
-
- Pipe pipe(new MAC_Filter("HMAC(SHA-256)", mac_key),
- new Hex_Encoder
- );
-
- std::cout << the_salt.to_string(); // hex encoded
-
- pipe.start_msg();
- infile >> pipe;
- pipe.end_msg();
- std::cout << pipe.read_all_as_string() << std::endl;
-\end{verbatim}
-
\subsection{User Authentication}
-Doing user authentication off a shared passphrase is fairly easy. Essentially,
-a challenge-response protocol is used - the server sends a random challenge,
-and the client responds with an appropriate response to the challenge. The idea
-is that only someone who knows the passphrase can generate or check to see if a
-response is valid.
-
-Let's say we use 160-bit (20 byte) challenges, which seems fairly
-reasonable. We can create this challenge using the global random
-number generator (RNG):
-
-\begin{verbatim}
- byte challenge[20];
- Global_RNG::randomize(challenge, sizeof(challenge), Nonce);
- // send challenge to client
-\end{verbatim}
-
-After reading the challenge, the client generates a response based on
-the challenge and the passphrase. In this case, we will do it by
-repeatedly hashing the challenge, the passphrase, and (if applicable)
-the previous digest. We iterate this construction 10000 times, to make
-brute force attacks on the passphrase hard to do. Since we are already
-using 160-bit challenges, a 160-bit response seems warranted, so we'll
-use SHA-1.
-
-\begin{verbatim}
- HashFunction* hash = get_hash("SHA-1");
- SecureVector<byte> digest;
- for(u32bit j = 0; j != 10000; j++)
- {
- hash->update(digest, digest.size());
- hash->update(passphrase);
- hash->update(challenge, challenge.size());
- digest = hash->final();
- }
- delete hash;
- // send value of digest to the server
-\end{verbatim}
-
-Upon receiving the response from the client, the server computes what the
-response should have been based on the challenge it sent out, and the
-passphrase. If the two responses match, the client is authenticated.
-Otherwise, it is not.
-
-An alternate method is to use PBKDF2 again, using the challenge as the salt. In
-this case, the response could (for example) be the hash of the key produced by
-PBKDF2. There is no reason to have an explicit iteration loop, as PBKDF2 is
-designed to prevent dictionary attacks (assuming PBKDF2 is set up for a large
-iteration count internally).
-
-\pagebreak
-
\section{Public Key Cryptography}
-\subsection{Basic Operations}
-
-In this section, we'll assume we have a \type{X509\_PublicKey*} named
-\arg{pubkey}, and, if necessary, a private key type (a
-\type{PKCS8\_PrivateKey*}) named \arg{privkey}. A description of these types,
-how to create them, and related details appears later in this tutorial. In this
-section, we will use various functions that are defined in
-\filename{look\_pk.h} -- you will have to include this header explicitly.
-
-\subsubsection{Encryption}
-
-Basically, pick an encoding method, create a \type{PK\_Encryptor} (with
-\function{get\_pk\_encryptor}()), and use it. But first we have to make sure
-the public key can actually be used for public key encryption. For encryption
-(and decryption), the key could be RSA, ElGamal, or (in future versions) some
-other public key encryption scheme, like Rabin or an elliptic curve scheme.
-
-\begin{verbatim}
- PK_Encrypting_Key* key = dynamic_cast<PK_Encrypting_Key*>(pubkey);
- if(!key)
- error();
- PK_Encryptor* enc = get_pk_encryptor(*key, "EME1(SHA-256)");
-
- byte msg[] = { /* ... */ };
-
- // will also accept a SecureVector<byte> as input
- SecureVector<byte> ciphertext = enc->encrypt(msg, sizeof(msg));
-\end{verbatim}
-
-\subsubsection{Decryption}
-
-This is essentially the same as the encryption operation, but using a private
-key instead. One major difference is that the decryption operation can fail due
-to the fact that the ciphertext was invalid (most common padding schemes, such
-as ``EME1(SHA-256)'', include various pieces of redundancy, which are checked
-after decryption).
-
-\begin{verbatim}
- PK_Decrypting_Key* key = dynamic_cast<PK_Decrypting_Key*>(privkey);
- if(!key)
- error();
- PK_Decryptor* dec = get_pk_decryptor(*key, "EME1(SHA-256)");
-
- byte msg[] = { /* ... */ };
-
- SecureVector<byte> plaintext;
-
- try {
- // will also accept a SecureVector<byte> as input
- plaintext = dec->decrypt(msg, sizeof(msg));
- }
- catch(Decoding_Error)
- {
- /* the ciphertext was invalid */
- }
-\end{verbatim}
-
-\subsubsection{Signature Generation}
-
-There is one difficulty with signature generation that does not occur with
-encryption or decryption. Specifically, there are various padding methods which
-can be useful for different signature algorithms, and not all are appropriate
-for all signature schemes. The following table breaks down what algorithms
-support which encodings:
-
-\begin{tabular}{|c|c|c|} \hline
-Signature Algorithm & Usable Encoding Methods & Preferred Encoding(s) \\ \hline
-DSA / NR & EMSA1 & EMSA1 \\ \hline
-RSA & EMSA1, EMSA2, EMSA3, EMSA4 & EMSA3, EMSA4 \\ \hline
-Rabin-Williams & EMSA2, EMSA4 & EMSA2, EMSA4 \\ \hline
-\end{tabular}
-
-For new applications, use EMSA4 with both RSA and Rabin-Williams, as it is
-significantly more secure than the alternatives. However, most current
-applications/libraries only support EMSA2 with Rabin-Williams and EMSA3 with
-RSA. Given this, you may be forced to use less secure encoding methods for the
-near future. In these examples, we punt on the problem, and hard-code using
-EMSA1 with SHA-256.
-
-\begin{verbatim}
- Public_Key* key = /* loaded or generated somehow */
- PK_Signer* signer = get_pk_signer(*key, "EMSA1(SHA-256)");
-
- byte msg[] = { /* ... */ };
-
- /*
- You can also repeatedly call update(const byte[], u32bit), followed
- by a call to signature(), which will return the final signature of
- all the data that was passed through update(). sign_message() is
- just a stub that calls update() once, and returns the value of
- signature().
- */
-
- SecureVector<byte> signature = signer->sign_message(msg, sizeof(msg));
-\end{verbatim}
-
-\pagebreak
-
-\subsubsection{Signature Verification}
-
-In addition to all the problems with choosing the correct padding method,
-there is yet another complication with verifying a signature. Namely, there are
-two varieties of signature algorithms - those providing message recovery (that
-is, the value that was signed can be directly recovered by someone verifying
-the signature), and those without message recovery (the verify operation simply
-returns if the signature was valid, without telling you exactly what was
-signed). This leads to two slightly different implementations of the
-verification operation, which user code has to work with. As you can see
-however, the implementation is still not at all difficult.
-
-\begin{verbatim}
- PK_Verifier* verifier = 0;
-
- PK_Verifying_with_MR_Key* key1 =
- dynamic_cast<PK_Verifying_with_MR_Key*>(pubkey);
- PK_Verifying_wo_MR_Key* key2 =
- dynamic_cast<PK_Verifying_wo_MR_Key*>(pubkey);
-
- if(key1)
- verifier = get_pk_verifier(*key1, "EMSA1(SHA-256)");
- else if(key2)
- verifier = get_pk_verifier(*key2, "EMSA1(SHA-256)");
- else
- error();
-
- byte msg[] = { /* ... */ };
- byte sig[] = { /* ... */ };
-
- /*
- Like PK_Signer, you can also do repeated calls to
- void update(const byte some_data[], u32bit length)
- followed by a call to
- bool check_signature(const byte the_sig[], u32bit length)
- which will return true (valid signature) or false (bad signature).
- The function verify_message() is a simple wrapper around update() and
- check_signature().
-
- */
- bool is_valid = verifier->verify_message(msg, sizeof(msg), sig, sizeof(sig));
-\end{verbatim}
-
-\subsubsection{Key Agreement}
-
-WRITEME
-
-\pagebreak
-
-\subsection{Working with Keys}
-
-\subsubsection{Reading Public Keys (X.509 format)}
-
-There are two separate ways to read X.509 public keys. Remember that the X.509
-public keys are simply that: public keys. There is no associated information
-(such as the owner of that key) included with the public key itself. If you
-need that kind of information, you'll need to use X.509 certificates.
-
-However, there are surely times when a simple public key is sufficient. The
-most obvious is when the key is implicitly trusted, for example if access
-and/or modification of it is controlled by something else (like filesystem
-ACLs). In other cases, it is a perfectly reasonable proposition to use them
-over the network as an anonymous key exchange mechanism. This is, admittedly,
-vulnerable to man-in-the-middle attacks, but it's simple enough that it's hard
-to mess up (see, for example, Peter Guttman's paper ``Lessons Learned in
-Implementing and Deploying Crypto Software'' in Usenix '02).
-
-The way to load a key is basically to set up a \type{DataSource} and then call
-\function{X509::load\_key}, which will return a \type{X509\_PublicKey*}. For
-example:
-
-\begin{verbatim}
- DataSource_Stream somefile("somefile.pem"); // has 3 public keys
- X509_PublicKey* key1 = X509::load_key(somefile);
- X509_PublicKey* key2 = X509::load_key(somefile);
- X509_PublicKey* key3 = X509::load_key(somefile);
- // Now we have them all loaded. Huzah!
-\end{verbatim}
-
-At this point you can use \function{dynamic\_cast} to find the operations the
-key supports (by seeing if a cast to \type{PK\_Encrypting\_Key},
-\type{PK\_Verifying\_with\_MR\_Key}, or \type{PK\_Verifying\_wo\_MR\_Key}
-succeeds).
-
-There is a variant of \function{X509::load\_key} (and of
-\function{PKCS8::load\_key}, described in the next section) which take a
-filename (as a \type{std::string}). These are just convenience functions which
-create the appropriate \type{DataSource} for you and call the main
-\function{X509::load\_key}.
-
-\subsubsection{Reading Private Keys (PKCS \#8 format)}
-
-This is very similar to reading raw public keys, with the difference that the
-key may be encrypted with a user passphrase:
-
-\begin{verbatim}
- // rng is a RandomNumberGenerator, like AutoSeeded_RNG
-
- DataSource_Stream somefile("somefile");
- std::string a_passphrase = /* get from the user */
- PKCS8_PrivateKey* key = PKCS8::load_key(somefile, rng, a_passphrase);
-\end{verbatim}
-
-You can, by the way, convert a \type{PKCS8\_PrivateKey} to a
-\type{X509\_PublicKey} simply by casting it (with \function{dynamic\_cast}), as
-the private key type is derived from \type{X509\_PublicKey}. As with
-\type{X509\_PublicKey}, you can use \function{dynamic\_cast} to figure out what
-operations the private key is capable of; in particular, you can attempt to
-cast it to \type{PK\_Decrypting\_Key}, \type{PK\_Signing\_Key}, or
-\type{PK\_Key\_Agreement\_Key}.
-
-Sometimes you can get away with having a static passphrase passed to
-\function{load\_key}. Typically, however, you'll have to do some user
-interaction to get the appropriate passphrase. In that case you'll want to use
-the \type{UI} related interface, which is fully described in the API
-documentation.
-
-\subsubsection{Generating New Private Keys}
-
-Generate a new private key is the one operation which requires you to
-explicitly name the type of key you are working with. There are (currently) two
-kinds of public key algorithms in Botan: ones based on the integer
-factorization (IF) problem (RSA and Rabin-Williams), and ones based on the
-discrete logarithm (DL) problem (DSA, Diffie-Hellman, Nyberg-Rueppel, and
-ElGamal). Since discrete logarithm parameters (primes and generators) can be
-shared among many keys, there is the notion of these being a combined type
-(called \type{DL\_Group}).
-
-To create a new DL-based private key, simply pass a desired \type{DL\_Group} to
-the constructor of the private key - a new public/private key pair will be
-generated. Since in IF-based algorithms, the modulus used isn't shared by other
-keys, we don't use this notion. You can create a new key by passing in a
-\type{u32bit} telling how long (in bits) the key should be.
-
-There are quite a few ways to get a \type{DL\_Group} object. The best is to use
-the function \function{get\_dl\_group}, which takes a string naming a group; it
-will either return that group, if it knows about it, or throw an
-exception. Names it knows about include ``IETF-n'' where n is 768, 1024, 1536,
-2048, 3072, or 4096, and ``DSA-n'', where n is 512, 768, or 1024. The IETF
-groups are the ones specified for use with IPSec, and the DSA ones are the
-default DSA parameters specified by Java's JCE. For DSA and Nyberg-Rueppel, use
-the ``DSA-n'' groups, and for Diffie-Hellman and ElGamal, use the ``IETF-n''
-groups.
-
-You can also generate a new random group. This is not recommend, because it is
-very slow, particularly for ``safe'' primes, which are needed for
-Diffie-Hellman and ElGamal.
-
-Some examples:
-
-\begin{verbatim}
- RSA_PrivateKey rsa1(512); // 512-bit RSA key
- RSA_PrivateKey rsa2(2048); // 2048-bit RSA key
-
- RW_PrivateKey rw1(1024); // 1024-bit Rabin-Williams key
- RW_PrivateKey rw2(1536); // 1536-bit Rabin-Williams key
-
- DSA_PrivateKey dsa(get_dl_group("DSA-512")); // 512-bit DSA key
- DH_PrivateKey dh(get_dl_group("IETF-4096")); // 4096-bit DH key
- NR_PrivateKey nr(get_dl_group("DSA-1024")); // 1024-bit NR key
- ElGamal_PrivateKey elg(get_dl_group("IETF-1536")); // 1536-bit ElGamal key
-\end{verbatim}
-
-To export your newly created private key, use the PKCS \#8 routines in
-\filename{pkcs8.h}:
-
-\begin{verbatim}
- std::string a_passphrase = /* get from the user */
- std::string the_key = PKCS8::PEM_encode(rsa2, a_passphrase);
-\end{verbatim}
-
-You can read the key back in using \function{PKCS8::load\_key}, described in
-the section ``Reading Private Keys (PKCS \#8 format)'', above. Unfortunately,
-this only works with keys that have an assigned algorithm identifier and
-standardized format. Currently this is only the RSA, DSA, DH, and ElGamal
-algorithms, though RW and NR keys can also be imported and exported by
-assigning them an OID (this can be done either through a configuration file, or
-by calling the function \function{OIDS::add\_oid} in \filename{oids.h}). Be
-aware that the OID and format for ElGamal keys is not exactly standard, but
-there does exist at least one other crypto library which will accept the
-format.
-
-The raw public can be exported using:
-
-\begin{verbatim}
- std::string the_public_key = X509::PEM_encode(rsa2);
-\end{verbatim}
-
-\pagebreak
-
-\section{X.509v3 Certificates}
-
-Using certificates is rather complicated, so only the very basic mechanisms are
-going to be covered here. The section ``Setting up a CA'' goes into reasonable
-detail about CRLs and certificate requests, but there is a lot that isn't
-covered (else this section would get quite long and complicated).
-
-\subsection{Importing and Exporting Certificates}
-
-Importing and exporting X.509 certificates is easy. Simply call the constructor
-with either a \type{DataSource\&}, or the name of a file:
-
-\begin{verbatim}
- X509_Certificate cert1("cert1.pem");
-
- /* This file contains two certificates, concatenated */
- DataSource_Stream in("certs2_and_3.pem");
-
- X509_Certificate cert2(in); // read the first cert
- X509_Certificate cert3(in); // read the second cert
-\end{verbatim}
-
-Exporting the certificate is a simple matter of calling the member function
-\function{PEM\_encode}(), which returns a \type{std::string} containing the
-certificate in PEM encoding.
-
-\begin{verbatim}
- std::cout << cert3.PEM_encode();
- some_ostream_object << cert1.PEM_encode();
- std::string cert2_str = cert2.PEM_encode();
-\end{verbatim}
-
-\subsection{Verifying Certificates}
-
-Verifying a certificate requires that we build up a chain of trust, starting
-from the root (usually a commercial CA), down through some number of
-intermediate CAs, and finally reaching the actual certificate in
-question. Thus, to verify, we actually have to have all those certificates
-on hand (or at the very least, know where we can get the ones we need).
-
-The class which handles both storing certificates, and verifying them, is
-called \type{X509\_Store}. We'll start by assuming that we have all the
-certificates we need, and just want to verify a cert. This is done by calling
-the member function \function{validate\_cert}, which takes the
-\type{X509\_Certificate} in question, and an optional argument of type
-\type{Cert\_Usage} (which is ignored here; read the section in the API doc
-titled ``Verifying Certificates for information). It returns an enum;
-\type{X509\_Code}, which, for most purposes, is either \type{VERIFIED}, or
-something else (which specifies what circumstance caused the certificate to be
-considered invalid). Really, that's it.
-
-Now, how to let \type{X509\_Store} know about all those certificates and CRLs
-we have lying around? The simplest method is to add them directly, using the
-functions \function{add\_cert}, \function{add\_certs},
-\function{add\_trusted\_certs}, and \function{add\_crl}; for details, consult
-the API doc or read the \filename{x509stor.h} header. There is also a much more
-elegant and powerful method: \type{Certificate\_Store}s. A certificate store
-refers to an object that knows how to retrieve certificates from some external
-source (a file, an LDAP directory, a HTTP server, a SQL database, or anything
-else). By calling the function \function{add\_new\_certstore}, you can register
-a new certificate store, which \type{X509\_Store} will use to find certificates
-it needs. Thus, you can get away with only adding whichever root CA cert(s) you
-want to use, letting some online source handle the storage of all intermediate
-X.509 certificates. The API documentation has a more complete discussion of
-\type{Certificate\_Store}.
-
-\subsection{Setting up a CA}
-
-WRITEME
-
-\pagebreak
-
-\section{Special Topics}
-
-This chapter is for subjects that don't really fit into the API documentation
-or into other chapters of the tutorial.
-
-\subsection{GUIs}
-
-There is nothing particularly special about using Botan in a GUI-based
-application. However there are a few tricky spots, as well as a few ways to
-take advantage of an event-based application.
-
-\subsubsection{Initialization}
-
-Generally you will create the \type{LibraryInitializer} somewhere in
-\texttt{main}, before entering the event loop. One problem is that some GUI
-libraries take over \texttt{main} and drop you right into the event loop; the
-question then is how to initialize the library? The simplest way is probably to
-have a static flag that marks if you have already initialized the library or
-not. When you enter the event loop, check to see if this flag has not been set,
-and if so, initialize the library using the function-based initializers. Using
-\type{LibraryInitializer} obviously won't work in this case, since it would be
-destroyed as soon as the current event handler finished. You then deinitialize
-the library whenever your application is signaled to quit.
-
-\subsubsection{Interacting With the Library}
-
-In the simple case, the user will do stuff asynchronously, and then in response
-your code will do things like encrypt a file or whatever, which can all be done
-synchronously, since the data is right there for you. An application doing
-something like this is basically going to look exactly like a command line
-application that uses Botan, the only major difference being that the calls to
-the library are inside event handlers.
-
-Much harder is something like an SSH client, where you're acting as a layer
-between two asynchronous things (the user and the network). This actually isn't
-restricted to GUIs at all (text-mode SSH clients have to deal with most of the
-same problems), but it is probably more common with a GUI app. The following
-discussion is fairly vague, but hopefully somewhat useful.
-
-There are a few facilities in Botan that are primarily designed to be used by
-applications based on an event loop. See the section ``User Interfaces'' in the
-main API doc for details.
-
-\subsubsection{Entropy}
-
-One nice advantage of using a GUI is opening a new method of gathering entropy
-for the library. This is especially handy on Windows, where the available
-sources of entropy are pretty questionable. In many versions,
-\texttt{CryptGenRandom} is really rather poor, and the Toolhelp functions may
-not provide much data on a small system (such as a handheld). For example, in
-GTK+, you can use the following callback to get information about mouse
-movements:
-
-\begin{verbatim}
-static gint add_entropy(GtkWidget* widget, GdkEventMotion* event)
- {
- if(event)
- Global_RNG::add_entropy(event, sizeof(GdkEventMotion));
- return FALSE;
- }
-\end{verbatim}
-
-And then register it with your main GTK window (presumably named
-\variable{window}) as follows:
-
-\begin{verbatim}
-gtk_signal_connect(GTK_OBJECT(window), "motion_notify_event",
- GTK_SIGNAL_FUNC(add_entropy), NULL);
-
-gtk_widget_set_events(window, GDK_POINTER_MOTION_MASK);
-\end{verbatim}
-
-Even though we're catching all mouse movements, and hashing the results into
-the entropy pool, this doesn't use up more than a few percent of even a
-relatively slow desktop CPU. Note that in the case of using X over a network,
-catching all mouse events would cause large amounts of X traffic over the
-network, which might make your application slow, or even unusable (I haven't
-tried it, though).
-
-This could be made nicer if the collection function did something like
-calculating deltas between each run, storing them into a buffer, and then when
-enough of them have been added, hashing them and send them all to the PRNG in
-one shot. This would not only reduce load, but also prevent the PRNG from
-overestimating the amount of entropy it's getting, since its estimates don't
-(can't) take history into account. For example, you could move the mouse back
-and forth one pixel, and the PRNG would think it was getting a full load of
-entropy each time, when actually it was getting (at best) a bit or two.
\end{document}