diff options
-rw-r--r-- | doc/old_tutorial.tex | 880 | ||||
-rw-r--r-- | doc/tutorial.tex | 843 |
2 files changed, 932 insertions, 791 deletions
diff --git a/doc/old_tutorial.tex b/doc/old_tutorial.tex new file mode 100644 index 000000000..ee3bc936b --- /dev/null +++ b/doc/old_tutorial.tex @@ -0,0 +1,880 @@ +\documentclass{article} + +\setlength{\textwidth}{6.5in} % 1 inch side margins +\setlength{\textheight}{9in} % ~1 inch top and bottom margins + +\setlength{\headheight}{0in} +\setlength{\topmargin}{0in} +\setlength{\headsep}{0in} + +\setlength{\oddsidemargin}{0in} +\setlength{\evensidemargin}{0in} + +\title{\textbf{Botan Tutorial}} +\author{Jack Lloyd \\ + \texttt{[email protected]}} +\date{2009/07/08} + +\newcommand{\filename}[1]{\texttt{#1}} +\newcommand{\manpage}[2]{\texttt{#1}(#2)} + +\newcommand{\macro}[1]{\texttt{#1}} + +\newcommand{\function}[1]{\textbf{#1}} +\newcommand{\type}[1]{\texttt{#1}} +\renewcommand{\arg}[1]{\textsl{#1}} +\newcommand{\variable}[1]{\textsl{#1}} + +\begin{document} + +\maketitle + +\tableofcontents + +\parskip=5pt +\pagebreak + +\section{Introduction} + +This document essentially sets up various simple scenarios and then +shows how to solve the problems using Botan. It's fairly simple, and +doesn't cover many of the available APIs and algorithms, especially +the more obscure or unusual ones. It is a supplement to the API +documentation and the example applications, which are included in the +distribution. + +To quote the Perl man page: '``There's more than one way to do it.'' +Divining how many more is left as an exercise to the reader.' + +This is \emph{not} a general introduction to cryptography, and most simple +terms and ideas are not explained in any great detail. + +Finally, most of the code shown in this tutorial has not been tested, it was +just written down from memory. If you find errors, please let me know. + +\section{Initializing the Library} + +The first step to using Botan is to create a \type{LibraryInitializer} object, +which handles creating various internal structures, and also destroying them at +shutdown. Essentially: + +\begin{verbatim} +#include <botan/botan.h> +/* include other headers here */ + +int main() + { + LibraryInitializer init; + /* now do stuff */ + return 0; + } +\end{verbatim} + +\section{Hashing a File} + +\section{Symmetric Cryptography} + +\subsection{Encryption with a passphrase} + +Probably the most common crypto problem is encrypting a file (or some data that +is in-memory) using a passphrase. There are a million ways to do this, most of +them bad. In particular, you have to protect against weak passphrases, +people reusing a passphrase many times, accidental and deliberate modification, +and a dozen other potential problems. + +We'll start with a simple method that is commonly used, and show the problems +that can arise. Each subsequent solution will modify the previous one to +prevent one or more common problems, until we arrive at a good version. + +In these examples, we'll always use Serpent in Cipher-Block Chaining +(CBC) mode. Whenever we need a hash function, we'll use SHA-256, since +that is a common and well-known hash that is thought to be secure. + +In all examples, we choose to derive the Initialization Vector (IV) from the +passphrase. Another (probably more common) alternative is to generate the IV +randomly and include it at the beginning of the message. Either way is +acceptable, and can be secure. The method used here was chosen to make for more +interesting examples (because it's harder to get right), and may not be an +appropriate choice for some environments. + +First, some notation. The passphrase is stored as a \type{std::string} named +\variable{passphrase}. The input and output files (\variable{infile} and +\variable{outfile}) are of types \type{std::ifstream} and \type{std::ofstream} +(respectively). + +\subsubsection{First try} + +We hash the passphrase with SHA-256, and use the resulting hash to key +Serpent. To generate the IV, we prepend a single '0' character to the +passphrase, hash it, and truncate it to 16 bytes (which is Serpent's +block size). + +\begin{verbatim} + HashFunction* hash = get_hash("SHA-256"); + + SymmetricKey key = hash->process(passphrase); + SecureVector<byte> raw_iv = hash->process('0' + passphrase); + InitializationVector iv(raw_iv, 16); + + Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION)); + + pipe.start_msg(); + infile >> pipe; + pipe.end_msg(); + outfile << pipe; +\end{verbatim} + +\subsubsection{Problem 1: Buffering} + +There is a problem with the above code, if the input file is fairly large as +compared to available memory. Specifically, all the encrypted data is stored +in memory, and then flushed to \variable{outfile} in a single go at the very +end. If the input file is big (say, a gigabyte), this will be most problematic. + +The solution is to use a \type{DataSink} to handle the output for us (writing +to \arg{outfile} will be implicit with writing to the \type{Pipe}). We can do +this by replacing the last few lines with: + +\begin{verbatim} + Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION), + new DataSink_Stream(outfile)); + + pipe.start_msg(); + infile >> pipe; + pipe.end_msg(); +\end{verbatim} + +\subsubsection{Problem 2: Deriving the key and IV} + +Hash functions like SHA-256 are deterministic; if the same passphrase +is supplied twice, then the key (and in our case, the IV) will be the +same. This is very dangerous, and could easily open the whole system +up to attack. What we need to do is introduce a salt (or nonce) into +the generation of the key from the passphrase. This will mean that the +key will not be the same each time the same passphrase is typed in by +a user. + +There is another problem with using a bare hash function to derive +keys. While it's inconceivable (based on our current understanding of +thermodynamics and theories of computation) that an attacker could +brute-force a 256-bit key, it would be fairly simple for them to +compute the SHA-256 hashes of various common passwords ('password', +the name of the dog, the significant other's middle name, favorite +sports team) and try those as keys. So we want to slow the attacker +down if we can, and an easy way to do that is to iterate the hash +function a bunch of times (say, 1024 to 4096 times). This will involve +only a small amount of effort for a legitimate user (since they only +have to compute the hashes once, when they type in their passphrase), +but an attacker, trying out a large list of potential passphrases, +will be seriously annoyed (and slowed down) by this. + +In this iteration of the example, we'll kill these two birds with one +stone, and derive the key from the passphrase using a PBKDF +(Password-Based Key Derivation Function). In this example, we use +PBKDF2 with Hash Message Authentication Code (HMAC(SHA-256)), which is +specified in PKCS \#5. We replace the first four lines of code from +the first example with: + +\begin{verbatim} + PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)"); + // hard-coded iteration count for simplicity; should be sufficient + pbkdf->set_iterations(10000); + // 8 octets == 64-bit salt; again, good enough + pbkdf->new_random_salt(8); + SecureVector<byte> the_salt = pbkdf->current_salt(); + + // 48 octets == 32 for key + 16 for IV + SecureVector<byte> key_and_IV = pbkdf->derive_key(48, passphrase).bits_of(); + + SymmetricKey key(key_and_IV, 32); + InitializationVector iv(key_and_IV + 32, 16); +\end{verbatim} + +To complete the example, we have to remember to write out the salt (stored in +\variable{the\_salt}) at the beginning of the file. The receiving side needs to +know this value in order to restore it (by calling the \variable{pbkdf} object's +\function{change\_salt} function) so it can derive the same key and IV from the +passphrase. + +\subsubsection{Problem 3: Protecting against modification} + +As it is, an attacker can undetectably alter the message while it is +in transit. It is vital to remember that encryption does not imply +authentication (except when using special modes that are specifically +designed to provide authentication along with encryption, like OCB and +EAX). For this purpose, we will append a message authentication code +to the encrypted message. Specifically, we will generate an extra 256 +bits of key data, and use it to key the ``HMAC(SHA-256)'' MAC +function. We don't want to have the MAC and the cipher to share the +same key; that is very much a no-no. + +\begin{verbatim} + // 80 octets == 32 for cipher key + 16 for IV + 32 for hmac key + SecureVector<byte> keys_and_IV = pbkdf->derive_key(80, passphrase); + + SymmetricKey key(keys_and_IV, 32); + InitializationVector iv(keys_and_IV + 32, 16); + SymmetricKey mac_key(keys_and_IV + 32 + 16, 32); + + Pipe pipe(new Fork( + new Chain( + get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION), + new DataSink_Stream(outfile) + ), + new MAC_Filter("HMAC(SHA-256)", mac_key) + ) + ); + + pipe.start_msg(); + infile >> pipe; + pipe.end_msg(); + + // now read the MAC from message #2. Message numbers start from 0 + SecureVector<byte> hmac = pipe.read_all(1); + outfile.write((const char*)hmac.ptr(), hmac.size()); +\end{verbatim} + +The receiver can check the size of the file (in bytes), and since it knows how +long the MAC is, can figure out how many bytes of ciphertext there are. Then it +reads in that many bytes, sending them to a Serpent/CBC decryption object +(which could be obtained by calling \verb|get_cipher| with an argument of +\type{DECRYPTION} instead of \type{ENCRYPTION}), and storing the final bytes to +authenticate the message with. + +\subsubsection{Problem 4: Cleaning up the key generation} + +The method used to derive the keys and IV is rather inelegant, and it would be +nice to clean that up a bit, algorithmically speaking. A nice solution for this +is to generate a master key from the passphrase and salt, and then generate the +two keys and the IV (the cryptovariables) from that. + +Starting from the master key, we derive the cryptovariables using a KDF +algorithm, which is designed, among other things, to ``separate'' keys so that +we can derive several different keys from the single master key. For this +purpose, we will use KDF2, which is a generally useful KDF function (defined in +IEEE 1363a, among other standards). The use of different labels (``cipher +key'', etc) makes sure that each of the three derived variables will have +different values. + +\begin{verbatim} + PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)"); + // hard-coded iteration count for simplicity; should be sufficient + pbkdf->set_iterations(10000); + // 8 octet == 64-bit salt; again, good enough + pbkdf->new_random_salt(8); + // store the salt so we can write it to a file later + SecureVector<byte> the_salt = pbkdf->current_salt(); + + SymmetricKey master_key = pbkdf->derive_key(48, passphrase); + + KDF* kdf = get_kdf("KDF2(SHA-256)"); + + SymmetricKey key = kdf->derive_key(32, master_key, "cipher key"); + SymmetricKey mac_key = kdf->derive_key(32, master_key, "hmac key"); + InitializationVector iv = kdf->derive_key(16, master_key, "cipher iv"); +\end{verbatim} + +\subsubsection{Final version} + +Here is the final version of the encryption code, with all the changes we've +made: + +\begin{verbatim} + PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)"); + pbkdf->set_iterations(10000); + pbkdf->new_random_salt(8); + SecureVector<byte> the_salt = pbkdf->current_salt(); + + SymmetricKey master_key = pbkdf->derive_key(48, passphrase); + + KDF* kdf = get_kdf("KDF2(SHA-256)"); + + SymmetricKey key = kdf->derive_key(32, master_key, "cipher key"); + SymmetricKey mac_key = kdf->derive_key(32, masterkey, "hmac key"); + InitializationVector iv = kdf->derive_key(16, masterkey, "cipher iv"); + + Pipe pipe(new Fork( + new Chain( + get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION), + new DataSink_Stream(outfile) + ), + new MAC_Filter("HMAC(SHA-256)", mac_key) + ) + ); + + outfile.write((const char*)the_salt.ptr(), the_salt.size()); + + pipe.start_msg(); + infile >> pipe; + pipe.end_msg(); + + SecureVector<byte> hmac = pipe.read_all(1); + outfile.write((const char*)hmac.ptr(), hmac.size()); +\end{verbatim} + +\subsubsection{Another buffering technique} + +Sometimes the use of \type{DataSink\_Stream} is not practical for whatever +reason. In this case, an alternate buffering mechanism might be useful. Here is +some code which will write all the processed data as quickly as possible, so +that memory pressure is reduced in the case of large inputs. + +\begin{verbatim} + pipe.start_msg(); + SecureBuffer<byte, 1024> buffer; + while(infile.good()) + { + infile.read((char*)buffer.ptr(), buffer.size()); + u32bit got_from_infile = infile.gcount(); + pipe.write(buffer, got_from_infile); + + if(infile.eof()) + pipe.end_msg(); + + while(pipe.remaining() > 0) + { + u32bit buffered = pipe.read(buffer, buffer.size()); + outfile.write((const char*)buffer.ptr(), buffered); + } + } + if(infile.bad() || (infile.fail() && !infile.eof())) + throw Some_Exception(); +\end{verbatim} + +\pagebreak + +\subsection{Authentication} + +After doing the encryption routines, doing message authentication keyed off a +passphrase is not very difficult. In fact it's much easier than the encryption +case, for the following reasons: a) we only need one key, and b) we don't have +to store anything, so all the input can be done in a single step without +worrying about it taking up a lot of memory if the input file is large. + +In this case, we'll hex-encode the salt and the MAC, and output them both to +standard output (the salt followed by the MAC). + +\begin{verbatim} + PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)"); + pbkdf->set_iterations(10000); + pbkdf->new_random_salt(8); + OctetString the_salt = pbkdf->current_salt(); + + SymmetricKey hmac_key = pbkdf->derive_key(32, passphrase); + + Pipe pipe(new MAC_Filter("HMAC(SHA-256)", mac_key), + new Hex_Encoder + ); + + std::cout << the_salt.to_string(); // hex encoded + + pipe.start_msg(); + infile >> pipe; + pipe.end_msg(); + std::cout << pipe.read_all_as_string() << std::endl; +\end{verbatim} + +\subsection{User Authentication} + +Doing user authentication off a shared passphrase is fairly easy. Essentially, +a challenge-response protocol is used - the server sends a random challenge, +and the client responds with an appropriate response to the challenge. The idea +is that only someone who knows the passphrase can generate or check to see if a +response is valid. + +Let's say we use 160-bit (20 byte) challenges, which seems fairly +reasonable. We can create this challenge using the global random +number generator (RNG): + +\begin{verbatim} + byte challenge[20]; + Global_RNG::randomize(challenge, sizeof(challenge), Nonce); + // send challenge to client +\end{verbatim} + +After reading the challenge, the client generates a response based on +the challenge and the passphrase. In this case, we will do it by +repeatedly hashing the challenge, the passphrase, and (if applicable) +the previous digest. We iterate this construction 10000 times, to make +brute force attacks on the passphrase hard to do. Since we are already +using 160-bit challenges, a 160-bit response seems warranted, so we'll +use SHA-1. + +\begin{verbatim} + HashFunction* hash = get_hash("SHA-1"); + SecureVector<byte> digest; + for(u32bit j = 0; j != 10000; j++) + { + hash->update(digest, digest.size()); + hash->update(passphrase); + hash->update(challenge, challenge.size()); + digest = hash->final(); + } + delete hash; + // send value of digest to the server +\end{verbatim} + +Upon receiving the response from the client, the server computes what the +response should have been based on the challenge it sent out, and the +passphrase. If the two responses match, the client is authenticated. +Otherwise, it is not. + +An alternate method is to use PBKDF2 again, using the challenge as the salt. In +this case, the response could (for example) be the hash of the key produced by +PBKDF2. There is no reason to have an explicit iteration loop, as PBKDF2 is +designed to prevent dictionary attacks (assuming PBKDF2 is set up for a large +iteration count internally). + +\pagebreak + +\section{Public Key Cryptography} + +\subsection{Basic Operations} + +In this section, we'll assume we have a \type{X509\_PublicKey*} named +\arg{pubkey}, and, if necessary, a private key type (a +\type{PKCS8\_PrivateKey*}) named \arg{privkey}. A description of these types, +how to create them, and related details appears later in this tutorial. In this +section, we will use various functions that are defined in +\filename{look\_pk.h} -- you will have to include this header explicitly. + +\subsubsection{Encryption} + +Basically, pick an encoding method, create a \type{PK\_Encryptor} (with +\function{get\_pk\_encryptor}()), and use it. But first we have to make sure +the public key can actually be used for public key encryption. For encryption +(and decryption), the key could be RSA, ElGamal, or (in future versions) some +other public key encryption scheme, like Rabin or an elliptic curve scheme. + +\begin{verbatim} + PK_Encrypting_Key* key = dynamic_cast<PK_Encrypting_Key*>(pubkey); + if(!key) + error(); + PK_Encryptor* enc = get_pk_encryptor(*key, "EME1(SHA-256)"); + + byte msg[] = { /* ... */ }; + + // will also accept a SecureVector<byte> as input + SecureVector<byte> ciphertext = enc->encrypt(msg, sizeof(msg)); +\end{verbatim} + +\subsubsection{Decryption} + +This is essentially the same as the encryption operation, but using a private +key instead. One major difference is that the decryption operation can fail due +to the fact that the ciphertext was invalid (most common padding schemes, such +as ``EME1(SHA-256)'', include various pieces of redundancy, which are checked +after decryption). + +\begin{verbatim} + PK_Decrypting_Key* key = dynamic_cast<PK_Decrypting_Key*>(privkey); + if(!key) + error(); + PK_Decryptor* dec = get_pk_decryptor(*key, "EME1(SHA-256)"); + + byte msg[] = { /* ... */ }; + + SecureVector<byte> plaintext; + + try { + // will also accept a SecureVector<byte> as input + plaintext = dec->decrypt(msg, sizeof(msg)); + } + catch(Decoding_Error) + { + /* the ciphertext was invalid */ + } +\end{verbatim} + +\subsubsection{Signature Generation} + +There is one difficulty with signature generation that does not occur with +encryption or decryption. Specifically, there are various padding methods which +can be useful for different signature algorithms, and not all are appropriate +for all signature schemes. The following table breaks down what algorithms +support which encodings: + +\begin{tabular}{|c|c|c|} \hline +Signature Algorithm & Usable Encoding Methods & Preferred Encoding(s) \\ \hline +DSA / NR & EMSA1 & EMSA1 \\ \hline +RSA & EMSA1, EMSA2, EMSA3, EMSA4 & EMSA3, EMSA4 \\ \hline +Rabin-Williams & EMSA2, EMSA4 & EMSA2, EMSA4 \\ \hline +\end{tabular} + +For new applications, use EMSA4 with both RSA and Rabin-Williams, as it is +significantly more secure than the alternatives. However, most current +applications/libraries only support EMSA2 with Rabin-Williams and EMSA3 with +RSA. Given this, you may be forced to use less secure encoding methods for the +near future. In these examples, we punt on the problem, and hard-code using +EMSA1 with SHA-256. + +\begin{verbatim} + Public_Key* key = /* loaded or generated somehow */ + PK_Signer* signer = get_pk_signer(*key, "EMSA1(SHA-256)"); + + byte msg[] = { /* ... */ }; + + /* + You can also repeatedly call update(const byte[], u32bit), followed + by a call to signature(), which will return the final signature of + all the data that was passed through update(). sign_message() is + just a stub that calls update() once, and returns the value of + signature(). + */ + + SecureVector<byte> signature = signer->sign_message(msg, sizeof(msg)); +\end{verbatim} + +\pagebreak + +\subsubsection{Signature Verification} + +In addition to all the problems with choosing the correct padding method, +there is yet another complication with verifying a signature. Namely, there are +two varieties of signature algorithms - those providing message recovery (that +is, the value that was signed can be directly recovered by someone verifying +the signature), and those without message recovery (the verify operation simply +returns if the signature was valid, without telling you exactly what was +signed). This leads to two slightly different implementations of the +verification operation, which user code has to work with. As you can see +however, the implementation is still not at all difficult. + +\begin{verbatim} + PK_Verifier* verifier = 0; + + PK_Verifying_with_MR_Key* key1 = + dynamic_cast<PK_Verifying_with_MR_Key*>(pubkey); + PK_Verifying_wo_MR_Key* key2 = + dynamic_cast<PK_Verifying_wo_MR_Key*>(pubkey); + + if(key1) + verifier = get_pk_verifier(*key1, "EMSA1(SHA-256)"); + else if(key2) + verifier = get_pk_verifier(*key2, "EMSA1(SHA-256)"); + else + error(); + + byte msg[] = { /* ... */ }; + byte sig[] = { /* ... */ }; + + /* + Like PK_Signer, you can also do repeated calls to + void update(const byte some_data[], u32bit length) + followed by a call to + bool check_signature(const byte the_sig[], u32bit length) + which will return true (valid signature) or false (bad signature). + The function verify_message() is a simple wrapper around update() and + check_signature(). + + */ + bool is_valid = verifier->verify_message(msg, sizeof(msg), sig, sizeof(sig)); +\end{verbatim} + +\subsubsection{Key Agreement} + +WRITEME + +\pagebreak + +\subsection{Working with Keys} + +\subsubsection{Reading Public Keys (X.509 format)} + +There are two separate ways to read X.509 public keys. Remember that the X.509 +public keys are simply that: public keys. There is no associated information +(such as the owner of that key) included with the public key itself. If you +need that kind of information, you'll need to use X.509 certificates. + +However, there are surely times when a simple public key is sufficient. The +most obvious is when the key is implicitly trusted, for example if access +and/or modification of it is controlled by something else (like filesystem +ACLs). In other cases, it is a perfectly reasonable proposition to use them +over the network as an anonymous key exchange mechanism. This is, admittedly, +vulnerable to man-in-the-middle attacks, but it's simple enough that it's hard +to mess up (see, for example, Peter Guttman's paper ``Lessons Learned in +Implementing and Deploying Crypto Software'' in Usenix '02). + +The way to load a key is basically to set up a \type{DataSource} and then call +\function{X509::load\_key}, which will return a \type{X509\_PublicKey*}. For +example: + +\begin{verbatim} + DataSource_Stream somefile("somefile.pem"); // has 3 public keys + X509_PublicKey* key1 = X509::load_key(somefile); + X509_PublicKey* key2 = X509::load_key(somefile); + X509_PublicKey* key3 = X509::load_key(somefile); + // Now we have them all loaded. Huzah! +\end{verbatim} + +At this point you can use \function{dynamic\_cast} to find the operations the +key supports (by seeing if a cast to \type{PK\_Encrypting\_Key}, +\type{PK\_Verifying\_with\_MR\_Key}, or \type{PK\_Verifying\_wo\_MR\_Key} +succeeds). + +There is a variant of \function{X509::load\_key} (and of +\function{PKCS8::load\_key}, described in the next section) which take a +filename (as a \type{std::string}). These are just convenience functions which +create the appropriate \type{DataSource} for you and call the main +\function{X509::load\_key}. + +\subsubsection{Reading Private Keys (PKCS \#8 format)} + +This is very similar to reading raw public keys, with the difference that the +key may be encrypted with a user passphrase: + +\begin{verbatim} + // rng is a RandomNumberGenerator, like AutoSeeded_RNG + + DataSource_Stream somefile("somefile"); + std::string a_passphrase = /* get from the user */ + PKCS8_PrivateKey* key = PKCS8::load_key(somefile, rng, a_passphrase); +\end{verbatim} + +You can, by the way, convert a \type{PKCS8\_PrivateKey} to a +\type{X509\_PublicKey} simply by casting it (with \function{dynamic\_cast}), as +the private key type is derived from \type{X509\_PublicKey}. As with +\type{X509\_PublicKey}, you can use \function{dynamic\_cast} to figure out what +operations the private key is capable of; in particular, you can attempt to +cast it to \type{PK\_Decrypting\_Key}, \type{PK\_Signing\_Key}, or +\type{PK\_Key\_Agreement\_Key}. + +Sometimes you can get away with having a static passphrase passed to +\function{load\_key}. Typically, however, you'll have to do some user +interaction to get the appropriate passphrase. In that case you'll want to use +the \type{UI} related interface, which is fully described in the API +documentation. + +\subsubsection{Generating New Private Keys} + +Generate a new private key is the one operation which requires you to +explicitly name the type of key you are working with. There are (currently) two +kinds of public key algorithms in Botan: ones based on the integer +factorization (IF) problem (RSA and Rabin-Williams), and ones based on the +discrete logarithm (DL) problem (DSA, Diffie-Hellman, Nyberg-Rueppel, and +ElGamal). Since discrete logarithm parameters (primes and generators) can be +shared among many keys, there is the notion of these being a combined type +(called \type{DL\_Group}). + +To create a new DL-based private key, simply pass a desired \type{DL\_Group} to +the constructor of the private key - a new public/private key pair will be +generated. Since in IF-based algorithms, the modulus used isn't shared by other +keys, we don't use this notion. You can create a new key by passing in a +\type{u32bit} telling how long (in bits) the key should be. + +There are quite a few ways to get a \type{DL\_Group} object. The best is to use +the function \function{get\_dl\_group}, which takes a string naming a group; it +will either return that group, if it knows about it, or throw an +exception. Names it knows about include ``IETF-n'' where n is 768, 1024, 1536, +2048, 3072, or 4096, and ``DSA-n'', where n is 512, 768, or 1024. The IETF +groups are the ones specified for use with IPSec, and the DSA ones are the +default DSA parameters specified by Java's JCE. For DSA and Nyberg-Rueppel, use +the ``DSA-n'' groups, and for Diffie-Hellman and ElGamal, use the ``IETF-n'' +groups. + +You can also generate a new random group. This is not recommend, because it is +very slow, particularly for ``safe'' primes, which are needed for +Diffie-Hellman and ElGamal. + +Some examples: + +\begin{verbatim} + RSA_PrivateKey rsa1(512); // 512-bit RSA key + RSA_PrivateKey rsa2(2048); // 2048-bit RSA key + + RW_PrivateKey rw1(1024); // 1024-bit Rabin-Williams key + RW_PrivateKey rw2(1536); // 1536-bit Rabin-Williams key + + DSA_PrivateKey dsa(get_dl_group("DSA-512")); // 512-bit DSA key + DH_PrivateKey dh(get_dl_group("IETF-4096")); // 4096-bit DH key + NR_PrivateKey nr(get_dl_group("DSA-1024")); // 1024-bit NR key + ElGamal_PrivateKey elg(get_dl_group("IETF-1536")); // 1536-bit ElGamal key +\end{verbatim} + +To export your newly created private key, use the PKCS \#8 routines in +\filename{pkcs8.h}: + +\begin{verbatim} + std::string a_passphrase = /* get from the user */ + std::string the_key = PKCS8::PEM_encode(rsa2, a_passphrase); +\end{verbatim} + +You can read the key back in using \function{PKCS8::load\_key}, described in +the section ``Reading Private Keys (PKCS \#8 format)'', above. Unfortunately, +this only works with keys that have an assigned algorithm identifier and +standardized format. Currently this is only the RSA, DSA, DH, and ElGamal +algorithms, though RW and NR keys can also be imported and exported by +assigning them an OID (this can be done either through a configuration file, or +by calling the function \function{OIDS::add\_oid} in \filename{oids.h}). Be +aware that the OID and format for ElGamal keys is not exactly standard, but +there does exist at least one other crypto library which will accept the +format. + +The raw public can be exported using: + +\begin{verbatim} + std::string the_public_key = X509::PEM_encode(rsa2); +\end{verbatim} + +\pagebreak + +\section{X.509v3 Certificates} + +Using certificates is rather complicated, so only the very basic mechanisms are +going to be covered here. The section ``Setting up a CA'' goes into reasonable +detail about CRLs and certificate requests, but there is a lot that isn't +covered (else this section would get quite long and complicated). + +\subsection{Importing and Exporting Certificates} + +Importing and exporting X.509 certificates is easy. Simply call the constructor +with either a \type{DataSource\&}, or the name of a file: + +\begin{verbatim} + X509_Certificate cert1("cert1.pem"); + + /* This file contains two certificates, concatenated */ + DataSource_Stream in("certs2_and_3.pem"); + + X509_Certificate cert2(in); // read the first cert + X509_Certificate cert3(in); // read the second cert +\end{verbatim} + +Exporting the certificate is a simple matter of calling the member function +\function{PEM\_encode}(), which returns a \type{std::string} containing the +certificate in PEM encoding. + +\begin{verbatim} + std::cout << cert3.PEM_encode(); + some_ostream_object << cert1.PEM_encode(); + std::string cert2_str = cert2.PEM_encode(); +\end{verbatim} + +\subsection{Verifying Certificates} + +Verifying a certificate requires that we build up a chain of trust, starting +from the root (usually a commercial CA), down through some number of +intermediate CAs, and finally reaching the actual certificate in +question. Thus, to verify, we actually have to have all those certificates +on hand (or at the very least, know where we can get the ones we need). + +The class which handles both storing certificates, and verifying them, is +called \type{X509\_Store}. We'll start by assuming that we have all the +certificates we need, and just want to verify a cert. This is done by calling +the member function \function{validate\_cert}, which takes the +\type{X509\_Certificate} in question, and an optional argument of type +\type{Cert\_Usage} (which is ignored here; read the section in the API doc +titled ``Verifying Certificates for information). It returns an enum; +\type{X509\_Code}, which, for most purposes, is either \type{VERIFIED}, or +something else (which specifies what circumstance caused the certificate to be +considered invalid). Really, that's it. + +Now, how to let \type{X509\_Store} know about all those certificates and CRLs +we have lying around? The simplest method is to add them directly, using the +functions \function{add\_cert}, \function{add\_certs}, +\function{add\_trusted\_certs}, and \function{add\_crl}; for details, consult +the API doc or read the \filename{x509stor.h} header. There is also a much more +elegant and powerful method: \type{Certificate\_Store}s. A certificate store +refers to an object that knows how to retrieve certificates from some external +source (a file, an LDAP directory, a HTTP server, a SQL database, or anything +else). By calling the function \function{add\_new\_certstore}, you can register +a new certificate store, which \type{X509\_Store} will use to find certificates +it needs. Thus, you can get away with only adding whichever root CA cert(s) you +want to use, letting some online source handle the storage of all intermediate +X.509 certificates. The API documentation has a more complete discussion of +\type{Certificate\_Store}. + +\subsection{Setting up a CA} + +WRITEME + +\pagebreak + +\section{Special Topics} + +This chapter is for subjects that don't really fit into the API documentation +or into other chapters of the tutorial. + +\subsection{GUIs} + +There is nothing particularly special about using Botan in a GUI-based +application. However there are a few tricky spots, as well as a few ways to +take advantage of an event-based application. + +\subsubsection{Initialization} + +Generally you will create the \type{LibraryInitializer} somewhere in +\texttt{main}, before entering the event loop. One problem is that some GUI +libraries take over \texttt{main} and drop you right into the event loop; the +question then is how to initialize the library? The simplest way is probably to +have a static flag that marks if you have already initialized the library or +not. When you enter the event loop, check to see if this flag has not been set, +and if so, initialize the library using the function-based initializers. Using +\type{LibraryInitializer} obviously won't work in this case, since it would be +destroyed as soon as the current event handler finished. You then deinitialize +the library whenever your application is signaled to quit. + +\subsubsection{Interacting With the Library} + +In the simple case, the user will do stuff asynchronously, and then in response +your code will do things like encrypt a file or whatever, which can all be done +synchronously, since the data is right there for you. An application doing +something like this is basically going to look exactly like a command line +application that uses Botan, the only major difference being that the calls to +the library are inside event handlers. + +Much harder is something like an SSH client, where you're acting as a layer +between two asynchronous things (the user and the network). This actually isn't +restricted to GUIs at all (text-mode SSH clients have to deal with most of the +same problems), but it is probably more common with a GUI app. The following +discussion is fairly vague, but hopefully somewhat useful. + +There are a few facilities in Botan that are primarily designed to be used by +applications based on an event loop. See the section ``User Interfaces'' in the +main API doc for details. + +\subsubsection{Entropy} + +One nice advantage of using a GUI is opening a new method of gathering entropy +for the library. This is especially handy on Windows, where the available +sources of entropy are pretty questionable. In many versions, +\texttt{CryptGenRandom} is really rather poor, and the Toolhelp functions may +not provide much data on a small system (such as a handheld). For example, in +GTK+, you can use the following callback to get information about mouse +movements: + +\begin{verbatim} +static gint add_entropy(GtkWidget* widget, GdkEventMotion* event) + { + if(event) + Global_RNG::add_entropy(event, sizeof(GdkEventMotion)); + return FALSE; + } +\end{verbatim} + +And then register it with your main GTK window (presumably named +\variable{window}) as follows: + +\begin{verbatim} +gtk_signal_connect(GTK_OBJECT(window), "motion_notify_event", + GTK_SIGNAL_FUNC(add_entropy), NULL); + +gtk_widget_set_events(window, GDK_POINTER_MOTION_MASK); +\end{verbatim} + +Even though we're catching all mouse movements, and hashing the results into +the entropy pool, this doesn't use up more than a few percent of even a +relatively slow desktop CPU. Note that in the case of using X over a network, +catching all mouse events would cause large amounts of X traffic over the +network, which might make your application slow, or even unusable (I haven't +tried it, though). + +This could be made nicer if the collection function did something like +calculating deltas between each run, storing them into a buffer, and then when +enough of them have been added, hashing them and send them all to the PRNG in +one shot. This would not only reduce load, but also prevent the PRNG from +overestimating the amount of entropy it's getting, since its estimates don't +(can't) take history into account. For example, you could move the mouse back +and forth one pixel, and the PRNG would think it was getting a full load of +entropy each time, when actually it was getting (at best) a bit or two. + +\end{document} diff --git a/doc/tutorial.tex b/doc/tutorial.tex index ee3bc936b..840679d10 100644 --- a/doc/tutorial.tex +++ b/doc/tutorial.tex @@ -13,7 +13,7 @@ \title{\textbf{Botan Tutorial}} \author{Jack Lloyd \\ \texttt{[email protected]}} -\date{2009/07/08} +\date{2010/08/07} \newcommand{\filename}[1]{\texttt{#1}} \newcommand{\manpage}[2]{\texttt{#1}(#2)} @@ -37,844 +37,105 @@ \section{Introduction} This document essentially sets up various simple scenarios and then -shows how to solve the problems using Botan. It's fairly simple, and +shows how to solve the problems using botan. It's fairly simple, and doesn't cover many of the available APIs and algorithms, especially the more obscure or unusual ones. It is a supplement to the API documentation and the example applications, which are included in the distribution. -To quote the Perl man page: '``There's more than one way to do it.'' -Divining how many more is left as an exercise to the reader.' - -This is \emph{not} a general introduction to cryptography, and most simple -terms and ideas are not explained in any great detail. - -Finally, most of the code shown in this tutorial has not been tested, it was -just written down from memory. If you find errors, please let me know. - \section{Initializing the Library} -The first step to using Botan is to create a \type{LibraryInitializer} object, -which handles creating various internal structures, and also destroying them at -shutdown. Essentially: +The first step to using botan is to create a \type{LibraryInitializer} +object, which handles creating various internal structures, and also +destroying them at shutdown. \begin{verbatim} #include <botan/botan.h> -/* include other headers here */ int main() { - LibraryInitializer init; - /* now do stuff */ + Botan::LibraryInitializer init; return 0; } \end{verbatim} -\section{Hashing a File} - -\section{Symmetric Cryptography} - -\subsection{Encryption with a passphrase} - -Probably the most common crypto problem is encrypting a file (or some data that -is in-memory) using a passphrase. There are a million ways to do this, most of -them bad. In particular, you have to protect against weak passphrases, -people reusing a passphrase many times, accidental and deliberate modification, -and a dozen other potential problems. - -We'll start with a simple method that is commonly used, and show the problems -that can arise. Each subsequent solution will modify the previous one to -prevent one or more common problems, until we arrive at a good version. - -In these examples, we'll always use Serpent in Cipher-Block Chaining -(CBC) mode. Whenever we need a hash function, we'll use SHA-256, since -that is a common and well-known hash that is thought to be secure. - -In all examples, we choose to derive the Initialization Vector (IV) from the -passphrase. Another (probably more common) alternative is to generate the IV -randomly and include it at the beginning of the message. Either way is -acceptable, and can be secure. The method used here was chosen to make for more -interesting examples (because it's harder to get right), and may not be an -appropriate choice for some environments. - -First, some notation. The passphrase is stored as a \type{std::string} named -\variable{passphrase}. The input and output files (\variable{infile} and -\variable{outfile}) are of types \type{std::ifstream} and \type{std::ofstream} -(respectively). - -\subsubsection{First try} - -We hash the passphrase with SHA-256, and use the resulting hash to key -Serpent. To generate the IV, we prepend a single '0' character to the -passphrase, hash it, and truncate it to 16 bytes (which is Serpent's -block size). +If your application is multi-threaded, you need to tell botan this so +that it will use locking where necessary. This is done by passing a +string to the constructor of \type{LibraryInitializer}: \begin{verbatim} - HashFunction* hash = get_hash("SHA-256"); - - SymmetricKey key = hash->process(passphrase); - SecureVector<byte> raw_iv = hash->process('0' + passphrase); - InitializationVector iv(raw_iv, 16); - - Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION)); - - pipe.start_msg(); - infile >> pipe; - pipe.end_msg(); - outfile << pipe; + Botan::LibraryInitializer init("thread_safe=yes"); \end{verbatim} -\subsubsection{Problem 1: Buffering} +\section{Introduction to Pipe} -There is a problem with the above code, if the input file is fairly large as -compared to available memory. Specifically, all the encrypted data is stored -in memory, and then flushed to \variable{outfile} in a single go at the very -end. If the input file is big (say, a gigabyte), this will be most problematic. +Most operations in botan are specified in terms of transformations on +streams. The class that handles the I/O and management for these +streams is called \type{Pipe}. You can construct a \type{Pipe} with +one or more \type{Filter}s, which sequentially process messages. You +can only update a single message at a time, but you can leave the +final output contents in a \type{Pipe} and read them out as desired. -The solution is to use a \type{DataSink} to handle the output for us (writing -to \arg{outfile} will be implicit with writing to the \type{Pipe}). We can do -this by replacing the last few lines with: +Here is how you might hex encode two messages: \begin{verbatim} - Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION), - new DataSink_Stream(outfile)); + std::string message1 = "this is the first message"; + const byte message2[] = "a second message"; + Pipe pipe(new Hex_Encoder); - pipe.start_msg(); - infile >> pipe; - pipe.end_msg(); -\end{verbatim} + pipe.start_msg(); // must be called before writing to the pipe + pipe.write(message1); + pipe.end_msg(); // must be called to signal completion -\subsubsection{Problem 2: Deriving the key and IV} - -Hash functions like SHA-256 are deterministic; if the same passphrase -is supplied twice, then the key (and in our case, the IV) will be the -same. This is very dangerous, and could easily open the whole system -up to attack. What we need to do is introduce a salt (or nonce) into -the generation of the key from the passphrase. This will mean that the -key will not be the same each time the same passphrase is typed in by -a user. - -There is another problem with using a bare hash function to derive -keys. While it's inconceivable (based on our current understanding of -thermodynamics and theories of computation) that an attacker could -brute-force a 256-bit key, it would be fairly simple for them to -compute the SHA-256 hashes of various common passwords ('password', -the name of the dog, the significant other's middle name, favorite -sports team) and try those as keys. So we want to slow the attacker -down if we can, and an easy way to do that is to iterate the hash -function a bunch of times (say, 1024 to 4096 times). This will involve -only a small amount of effort for a legitimate user (since they only -have to compute the hashes once, when they type in their passphrase), -but an attacker, trying out a large list of potential passphrases, -will be seriously annoyed (and slowed down) by this. - -In this iteration of the example, we'll kill these two birds with one -stone, and derive the key from the passphrase using a PBKDF -(Password-Based Key Derivation Function). In this example, we use -PBKDF2 with Hash Message Authentication Code (HMAC(SHA-256)), which is -specified in PKCS \#5. We replace the first four lines of code from -the first example with: + /* + process_msg(x) is equivalent to calling + start_msg(); write(x); end_msg(); + */ + pipe.process_msg(message2); -\begin{verbatim} - PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)"); - // hard-coded iteration count for simplicity; should be sufficient - pbkdf->set_iterations(10000); - // 8 octets == 64-bit salt; again, good enough - pbkdf->new_random_salt(8); - SecureVector<byte> the_salt = pbkdf->current_salt(); - - // 48 octets == 32 for key + 16 for IV - SecureVector<byte> key_and_IV = pbkdf->derive_key(48, passphrase).bits_of(); - - SymmetricKey key(key_and_IV, 32); - InitializationVector iv(key_and_IV + 32, 16); -\end{verbatim} + Pipe::message_id n = pipe.message_count(); // returns 2 -To complete the example, we have to remember to write out the salt (stored in -\variable{the\_salt}) at the beginning of the file. The receiving side needs to -know this value in order to restore it (by calling the \variable{pbkdf} object's -\function{change\_salt} function) so it can derive the same key and IV from the -passphrase. + /* you can read a message as a string, here we read message 0 */ + std::string first_result = pipe.read_all_as_string(0); -\subsubsection{Problem 3: Protecting against modification} + /* or a piece at a time using array/length, now we'll read the + second message (message id 1) + */ -As it is, an attacker can undetectably alter the message while it is -in transit. It is vital to remember that encryption does not imply -authentication (except when using special modes that are specifically -designed to provide authentication along with encryption, like OCB and -EAX). For this purpose, we will append a message authentication code -to the encrypted message. Specifically, we will generate an extra 256 -bits of key data, and use it to key the ``HMAC(SHA-256)'' MAC -function. We don't want to have the MAC and the cipher to share the -same key; that is very much a no-no. - -\begin{verbatim} - // 80 octets == 32 for cipher key + 16 for IV + 32 for hmac key - SecureVector<byte> keys_and_IV = pbkdf->derive_key(80, passphrase); - - SymmetricKey key(keys_and_IV, 32); - InitializationVector iv(keys_and_IV + 32, 16); - SymmetricKey mac_key(keys_and_IV + 32 + 16, 32); - - Pipe pipe(new Fork( - new Chain( - get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION), - new DataSink_Stream(outfile) - ), - new MAC_Filter("HMAC(SHA-256)", mac_key) - ) - ); - - pipe.start_msg(); - infile >> pipe; - pipe.end_msg(); - - // now read the MAC from message #2. Message numbers start from 0 - SecureVector<byte> hmac = pipe.read_all(1); - outfile.write((const char*)hmac.ptr(), hmac.size()); + byte output[4096] = { 0 }; + u32bit got = read(output, sizeof(output), 1); + if(got >= sizeof(output)) + // have to read again to get more of the message \end{verbatim} -The receiver can check the size of the file (in bytes), and since it knows how -long the MAC is, can figure out how many bytes of ciphertext there are. Then it -reads in that many bytes, sending them to a Serpent/CBC decryption object -(which could be obtained by calling \verb|get_cipher| with an argument of -\type{DECRYPTION} instead of \type{ENCRYPTION}), and storing the final bytes to -authenticate the message with. +You can also read output while the message is still active (before the +call to \function{end\_msg}), using the same interfaces. You can find +out how much data is currently available for a particular \type{Pipe} +by calling the member function \function{remaining}, which takes a +message sequence number and returns the number of bytes that are +currently available to read from that message. -\subsubsection{Problem 4: Cleaning up the key generation} - -The method used to derive the keys and IV is rather inelegant, and it would be -nice to clean that up a bit, algorithmically speaking. A nice solution for this -is to generate a master key from the passphrase and salt, and then generate the -two keys and the IV (the cryptovariables) from that. +\section{Hashing a File} -Starting from the master key, we derive the cryptovariables using a KDF -algorithm, which is designed, among other things, to ``separate'' keys so that -we can derive several different keys from the single master key. For this -purpose, we will use KDF2, which is a generally useful KDF function (defined in -IEEE 1363a, among other standards). The use of different labels (``cipher -key'', etc) makes sure that each of the three derived variables will have -different values. +Hashing a file is done using a \type{Hash\_Filter}, which takes a string +which specifies which hash function you want to use: \begin{verbatim} - PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)"); - // hard-coded iteration count for simplicity; should be sufficient - pbkdf->set_iterations(10000); - // 8 octet == 64-bit salt; again, good enough - pbkdf->new_random_salt(8); - // store the salt so we can write it to a file later - SecureVector<byte> the_salt = pbkdf->current_salt(); - - SymmetricKey master_key = pbkdf->derive_key(48, passphrase); - - KDF* kdf = get_kdf("KDF2(SHA-256)"); - - SymmetricKey key = kdf->derive_key(32, master_key, "cipher key"); - SymmetricKey mac_key = kdf->derive_key(32, master_key, "hmac key"); - InitializationVector iv = kdf->derive_key(16, master_key, "cipher iv"); + Pipe pipe(new Hash_Filter("SHA-256")); \end{verbatim} -\subsubsection{Final version} - -Here is the final version of the encryption code, with all the changes we've -made: - -\begin{verbatim} - PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)"); - pbkdf->set_iterations(10000); - pbkdf->new_random_salt(8); - SecureVector<byte> the_salt = pbkdf->current_salt(); - - SymmetricKey master_key = pbkdf->derive_key(48, passphrase); - - KDF* kdf = get_kdf("KDF2(SHA-256)"); - - SymmetricKey key = kdf->derive_key(32, master_key, "cipher key"); - SymmetricKey mac_key = kdf->derive_key(32, masterkey, "hmac key"); - InitializationVector iv = kdf->derive_key(16, masterkey, "cipher iv"); +The output of a \type{Hash\_Filter} is raw binary. The filter will not +produce any output at all until you call \function{end\_msg}. - Pipe pipe(new Fork( - new Chain( - get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION), - new DataSink_Stream(outfile) - ), - new MAC_Filter("HMAC(SHA-256)", mac_key) - ) - ); - - outfile.write((const char*)the_salt.ptr(), the_salt.size()); - - pipe.start_msg(); - infile >> pipe; - pipe.end_msg(); - - SecureVector<byte> hmac = pipe.read_all(1); - outfile.write((const char*)hmac.ptr(), hmac.size()); -\end{verbatim} - -\subsubsection{Another buffering technique} +\section{Symmetric Cryptography} -Sometimes the use of \type{DataSink\_Stream} is not practical for whatever -reason. In this case, an alternate buffering mechanism might be useful. Here is -some code which will write all the processed data as quickly as possible, so -that memory pressure is reduced in the case of large inputs. -\begin{verbatim} - pipe.start_msg(); - SecureBuffer<byte, 1024> buffer; - while(infile.good()) - { - infile.read((char*)buffer.ptr(), buffer.size()); - u32bit got_from_infile = infile.gcount(); - pipe.write(buffer, got_from_infile); - - if(infile.eof()) - pipe.end_msg(); - - while(pipe.remaining() > 0) - { - u32bit buffered = pipe.read(buffer, buffer.size()); - outfile.write((const char*)buffer.ptr(), buffered); - } - } - if(infile.bad() || (infile.fail() && !infile.eof())) - throw Some_Exception(); -\end{verbatim} - -\pagebreak \subsection{Authentication} -After doing the encryption routines, doing message authentication keyed off a -passphrase is not very difficult. In fact it's much easier than the encryption -case, for the following reasons: a) we only need one key, and b) we don't have -to store anything, so all the input can be done in a single step without -worrying about it taking up a lot of memory if the input file is large. - -In this case, we'll hex-encode the salt and the MAC, and output them both to -standard output (the salt followed by the MAC). - -\begin{verbatim} - PBKDF* pbkdf = get_pbkdf("PBKDF2(SHA-256)"); - pbkdf->set_iterations(10000); - pbkdf->new_random_salt(8); - OctetString the_salt = pbkdf->current_salt(); - - SymmetricKey hmac_key = pbkdf->derive_key(32, passphrase); - - Pipe pipe(new MAC_Filter("HMAC(SHA-256)", mac_key), - new Hex_Encoder - ); - - std::cout << the_salt.to_string(); // hex encoded - - pipe.start_msg(); - infile >> pipe; - pipe.end_msg(); - std::cout << pipe.read_all_as_string() << std::endl; -\end{verbatim} - \subsection{User Authentication} -Doing user authentication off a shared passphrase is fairly easy. Essentially, -a challenge-response protocol is used - the server sends a random challenge, -and the client responds with an appropriate response to the challenge. The idea -is that only someone who knows the passphrase can generate or check to see if a -response is valid. - -Let's say we use 160-bit (20 byte) challenges, which seems fairly -reasonable. We can create this challenge using the global random -number generator (RNG): - -\begin{verbatim} - byte challenge[20]; - Global_RNG::randomize(challenge, sizeof(challenge), Nonce); - // send challenge to client -\end{verbatim} - -After reading the challenge, the client generates a response based on -the challenge and the passphrase. In this case, we will do it by -repeatedly hashing the challenge, the passphrase, and (if applicable) -the previous digest. We iterate this construction 10000 times, to make -brute force attacks on the passphrase hard to do. Since we are already -using 160-bit challenges, a 160-bit response seems warranted, so we'll -use SHA-1. - -\begin{verbatim} - HashFunction* hash = get_hash("SHA-1"); - SecureVector<byte> digest; - for(u32bit j = 0; j != 10000; j++) - { - hash->update(digest, digest.size()); - hash->update(passphrase); - hash->update(challenge, challenge.size()); - digest = hash->final(); - } - delete hash; - // send value of digest to the server -\end{verbatim} - -Upon receiving the response from the client, the server computes what the -response should have been based on the challenge it sent out, and the -passphrase. If the two responses match, the client is authenticated. -Otherwise, it is not. - -An alternate method is to use PBKDF2 again, using the challenge as the salt. In -this case, the response could (for example) be the hash of the key produced by -PBKDF2. There is no reason to have an explicit iteration loop, as PBKDF2 is -designed to prevent dictionary attacks (assuming PBKDF2 is set up for a large -iteration count internally). - -\pagebreak - \section{Public Key Cryptography} -\subsection{Basic Operations} - -In this section, we'll assume we have a \type{X509\_PublicKey*} named -\arg{pubkey}, and, if necessary, a private key type (a -\type{PKCS8\_PrivateKey*}) named \arg{privkey}. A description of these types, -how to create them, and related details appears later in this tutorial. In this -section, we will use various functions that are defined in -\filename{look\_pk.h} -- you will have to include this header explicitly. - -\subsubsection{Encryption} - -Basically, pick an encoding method, create a \type{PK\_Encryptor} (with -\function{get\_pk\_encryptor}()), and use it. But first we have to make sure -the public key can actually be used for public key encryption. For encryption -(and decryption), the key could be RSA, ElGamal, or (in future versions) some -other public key encryption scheme, like Rabin or an elliptic curve scheme. - -\begin{verbatim} - PK_Encrypting_Key* key = dynamic_cast<PK_Encrypting_Key*>(pubkey); - if(!key) - error(); - PK_Encryptor* enc = get_pk_encryptor(*key, "EME1(SHA-256)"); - - byte msg[] = { /* ... */ }; - - // will also accept a SecureVector<byte> as input - SecureVector<byte> ciphertext = enc->encrypt(msg, sizeof(msg)); -\end{verbatim} - -\subsubsection{Decryption} - -This is essentially the same as the encryption operation, but using a private -key instead. One major difference is that the decryption operation can fail due -to the fact that the ciphertext was invalid (most common padding schemes, such -as ``EME1(SHA-256)'', include various pieces of redundancy, which are checked -after decryption). - -\begin{verbatim} - PK_Decrypting_Key* key = dynamic_cast<PK_Decrypting_Key*>(privkey); - if(!key) - error(); - PK_Decryptor* dec = get_pk_decryptor(*key, "EME1(SHA-256)"); - - byte msg[] = { /* ... */ }; - - SecureVector<byte> plaintext; - - try { - // will also accept a SecureVector<byte> as input - plaintext = dec->decrypt(msg, sizeof(msg)); - } - catch(Decoding_Error) - { - /* the ciphertext was invalid */ - } -\end{verbatim} - -\subsubsection{Signature Generation} - -There is one difficulty with signature generation that does not occur with -encryption or decryption. Specifically, there are various padding methods which -can be useful for different signature algorithms, and not all are appropriate -for all signature schemes. The following table breaks down what algorithms -support which encodings: - -\begin{tabular}{|c|c|c|} \hline -Signature Algorithm & Usable Encoding Methods & Preferred Encoding(s) \\ \hline -DSA / NR & EMSA1 & EMSA1 \\ \hline -RSA & EMSA1, EMSA2, EMSA3, EMSA4 & EMSA3, EMSA4 \\ \hline -Rabin-Williams & EMSA2, EMSA4 & EMSA2, EMSA4 \\ \hline -\end{tabular} - -For new applications, use EMSA4 with both RSA and Rabin-Williams, as it is -significantly more secure than the alternatives. However, most current -applications/libraries only support EMSA2 with Rabin-Williams and EMSA3 with -RSA. Given this, you may be forced to use less secure encoding methods for the -near future. In these examples, we punt on the problem, and hard-code using -EMSA1 with SHA-256. - -\begin{verbatim} - Public_Key* key = /* loaded or generated somehow */ - PK_Signer* signer = get_pk_signer(*key, "EMSA1(SHA-256)"); - - byte msg[] = { /* ... */ }; - - /* - You can also repeatedly call update(const byte[], u32bit), followed - by a call to signature(), which will return the final signature of - all the data that was passed through update(). sign_message() is - just a stub that calls update() once, and returns the value of - signature(). - */ - - SecureVector<byte> signature = signer->sign_message(msg, sizeof(msg)); -\end{verbatim} - -\pagebreak - -\subsubsection{Signature Verification} - -In addition to all the problems with choosing the correct padding method, -there is yet another complication with verifying a signature. Namely, there are -two varieties of signature algorithms - those providing message recovery (that -is, the value that was signed can be directly recovered by someone verifying -the signature), and those without message recovery (the verify operation simply -returns if the signature was valid, without telling you exactly what was -signed). This leads to two slightly different implementations of the -verification operation, which user code has to work with. As you can see -however, the implementation is still not at all difficult. - -\begin{verbatim} - PK_Verifier* verifier = 0; - - PK_Verifying_with_MR_Key* key1 = - dynamic_cast<PK_Verifying_with_MR_Key*>(pubkey); - PK_Verifying_wo_MR_Key* key2 = - dynamic_cast<PK_Verifying_wo_MR_Key*>(pubkey); - - if(key1) - verifier = get_pk_verifier(*key1, "EMSA1(SHA-256)"); - else if(key2) - verifier = get_pk_verifier(*key2, "EMSA1(SHA-256)"); - else - error(); - - byte msg[] = { /* ... */ }; - byte sig[] = { /* ... */ }; - - /* - Like PK_Signer, you can also do repeated calls to - void update(const byte some_data[], u32bit length) - followed by a call to - bool check_signature(const byte the_sig[], u32bit length) - which will return true (valid signature) or false (bad signature). - The function verify_message() is a simple wrapper around update() and - check_signature(). - - */ - bool is_valid = verifier->verify_message(msg, sizeof(msg), sig, sizeof(sig)); -\end{verbatim} - -\subsubsection{Key Agreement} - -WRITEME - -\pagebreak - -\subsection{Working with Keys} - -\subsubsection{Reading Public Keys (X.509 format)} - -There are two separate ways to read X.509 public keys. Remember that the X.509 -public keys are simply that: public keys. There is no associated information -(such as the owner of that key) included with the public key itself. If you -need that kind of information, you'll need to use X.509 certificates. - -However, there are surely times when a simple public key is sufficient. The -most obvious is when the key is implicitly trusted, for example if access -and/or modification of it is controlled by something else (like filesystem -ACLs). In other cases, it is a perfectly reasonable proposition to use them -over the network as an anonymous key exchange mechanism. This is, admittedly, -vulnerable to man-in-the-middle attacks, but it's simple enough that it's hard -to mess up (see, for example, Peter Guttman's paper ``Lessons Learned in -Implementing and Deploying Crypto Software'' in Usenix '02). - -The way to load a key is basically to set up a \type{DataSource} and then call -\function{X509::load\_key}, which will return a \type{X509\_PublicKey*}. For -example: - -\begin{verbatim} - DataSource_Stream somefile("somefile.pem"); // has 3 public keys - X509_PublicKey* key1 = X509::load_key(somefile); - X509_PublicKey* key2 = X509::load_key(somefile); - X509_PublicKey* key3 = X509::load_key(somefile); - // Now we have them all loaded. Huzah! -\end{verbatim} - -At this point you can use \function{dynamic\_cast} to find the operations the -key supports (by seeing if a cast to \type{PK\_Encrypting\_Key}, -\type{PK\_Verifying\_with\_MR\_Key}, or \type{PK\_Verifying\_wo\_MR\_Key} -succeeds). - -There is a variant of \function{X509::load\_key} (and of -\function{PKCS8::load\_key}, described in the next section) which take a -filename (as a \type{std::string}). These are just convenience functions which -create the appropriate \type{DataSource} for you and call the main -\function{X509::load\_key}. - -\subsubsection{Reading Private Keys (PKCS \#8 format)} - -This is very similar to reading raw public keys, with the difference that the -key may be encrypted with a user passphrase: - -\begin{verbatim} - // rng is a RandomNumberGenerator, like AutoSeeded_RNG - - DataSource_Stream somefile("somefile"); - std::string a_passphrase = /* get from the user */ - PKCS8_PrivateKey* key = PKCS8::load_key(somefile, rng, a_passphrase); -\end{verbatim} - -You can, by the way, convert a \type{PKCS8\_PrivateKey} to a -\type{X509\_PublicKey} simply by casting it (with \function{dynamic\_cast}), as -the private key type is derived from \type{X509\_PublicKey}. As with -\type{X509\_PublicKey}, you can use \function{dynamic\_cast} to figure out what -operations the private key is capable of; in particular, you can attempt to -cast it to \type{PK\_Decrypting\_Key}, \type{PK\_Signing\_Key}, or -\type{PK\_Key\_Agreement\_Key}. - -Sometimes you can get away with having a static passphrase passed to -\function{load\_key}. Typically, however, you'll have to do some user -interaction to get the appropriate passphrase. In that case you'll want to use -the \type{UI} related interface, which is fully described in the API -documentation. - -\subsubsection{Generating New Private Keys} - -Generate a new private key is the one operation which requires you to -explicitly name the type of key you are working with. There are (currently) two -kinds of public key algorithms in Botan: ones based on the integer -factorization (IF) problem (RSA and Rabin-Williams), and ones based on the -discrete logarithm (DL) problem (DSA, Diffie-Hellman, Nyberg-Rueppel, and -ElGamal). Since discrete logarithm parameters (primes and generators) can be -shared among many keys, there is the notion of these being a combined type -(called \type{DL\_Group}). - -To create a new DL-based private key, simply pass a desired \type{DL\_Group} to -the constructor of the private key - a new public/private key pair will be -generated. Since in IF-based algorithms, the modulus used isn't shared by other -keys, we don't use this notion. You can create a new key by passing in a -\type{u32bit} telling how long (in bits) the key should be. - -There are quite a few ways to get a \type{DL\_Group} object. The best is to use -the function \function{get\_dl\_group}, which takes a string naming a group; it -will either return that group, if it knows about it, or throw an -exception. Names it knows about include ``IETF-n'' where n is 768, 1024, 1536, -2048, 3072, or 4096, and ``DSA-n'', where n is 512, 768, or 1024. The IETF -groups are the ones specified for use with IPSec, and the DSA ones are the -default DSA parameters specified by Java's JCE. For DSA and Nyberg-Rueppel, use -the ``DSA-n'' groups, and for Diffie-Hellman and ElGamal, use the ``IETF-n'' -groups. - -You can also generate a new random group. This is not recommend, because it is -very slow, particularly for ``safe'' primes, which are needed for -Diffie-Hellman and ElGamal. - -Some examples: - -\begin{verbatim} - RSA_PrivateKey rsa1(512); // 512-bit RSA key - RSA_PrivateKey rsa2(2048); // 2048-bit RSA key - - RW_PrivateKey rw1(1024); // 1024-bit Rabin-Williams key - RW_PrivateKey rw2(1536); // 1536-bit Rabin-Williams key - - DSA_PrivateKey dsa(get_dl_group("DSA-512")); // 512-bit DSA key - DH_PrivateKey dh(get_dl_group("IETF-4096")); // 4096-bit DH key - NR_PrivateKey nr(get_dl_group("DSA-1024")); // 1024-bit NR key - ElGamal_PrivateKey elg(get_dl_group("IETF-1536")); // 1536-bit ElGamal key -\end{verbatim} - -To export your newly created private key, use the PKCS \#8 routines in -\filename{pkcs8.h}: - -\begin{verbatim} - std::string a_passphrase = /* get from the user */ - std::string the_key = PKCS8::PEM_encode(rsa2, a_passphrase); -\end{verbatim} - -You can read the key back in using \function{PKCS8::load\_key}, described in -the section ``Reading Private Keys (PKCS \#8 format)'', above. Unfortunately, -this only works with keys that have an assigned algorithm identifier and -standardized format. Currently this is only the RSA, DSA, DH, and ElGamal -algorithms, though RW and NR keys can also be imported and exported by -assigning them an OID (this can be done either through a configuration file, or -by calling the function \function{OIDS::add\_oid} in \filename{oids.h}). Be -aware that the OID and format for ElGamal keys is not exactly standard, but -there does exist at least one other crypto library which will accept the -format. - -The raw public can be exported using: - -\begin{verbatim} - std::string the_public_key = X509::PEM_encode(rsa2); -\end{verbatim} - -\pagebreak - -\section{X.509v3 Certificates} - -Using certificates is rather complicated, so only the very basic mechanisms are -going to be covered here. The section ``Setting up a CA'' goes into reasonable -detail about CRLs and certificate requests, but there is a lot that isn't -covered (else this section would get quite long and complicated). - -\subsection{Importing and Exporting Certificates} - -Importing and exporting X.509 certificates is easy. Simply call the constructor -with either a \type{DataSource\&}, or the name of a file: - -\begin{verbatim} - X509_Certificate cert1("cert1.pem"); - - /* This file contains two certificates, concatenated */ - DataSource_Stream in("certs2_and_3.pem"); - - X509_Certificate cert2(in); // read the first cert - X509_Certificate cert3(in); // read the second cert -\end{verbatim} - -Exporting the certificate is a simple matter of calling the member function -\function{PEM\_encode}(), which returns a \type{std::string} containing the -certificate in PEM encoding. - -\begin{verbatim} - std::cout << cert3.PEM_encode(); - some_ostream_object << cert1.PEM_encode(); - std::string cert2_str = cert2.PEM_encode(); -\end{verbatim} - -\subsection{Verifying Certificates} - -Verifying a certificate requires that we build up a chain of trust, starting -from the root (usually a commercial CA), down through some number of -intermediate CAs, and finally reaching the actual certificate in -question. Thus, to verify, we actually have to have all those certificates -on hand (or at the very least, know where we can get the ones we need). - -The class which handles both storing certificates, and verifying them, is -called \type{X509\_Store}. We'll start by assuming that we have all the -certificates we need, and just want to verify a cert. This is done by calling -the member function \function{validate\_cert}, which takes the -\type{X509\_Certificate} in question, and an optional argument of type -\type{Cert\_Usage} (which is ignored here; read the section in the API doc -titled ``Verifying Certificates for information). It returns an enum; -\type{X509\_Code}, which, for most purposes, is either \type{VERIFIED}, or -something else (which specifies what circumstance caused the certificate to be -considered invalid). Really, that's it. - -Now, how to let \type{X509\_Store} know about all those certificates and CRLs -we have lying around? The simplest method is to add them directly, using the -functions \function{add\_cert}, \function{add\_certs}, -\function{add\_trusted\_certs}, and \function{add\_crl}; for details, consult -the API doc or read the \filename{x509stor.h} header. There is also a much more -elegant and powerful method: \type{Certificate\_Store}s. A certificate store -refers to an object that knows how to retrieve certificates from some external -source (a file, an LDAP directory, a HTTP server, a SQL database, or anything -else). By calling the function \function{add\_new\_certstore}, you can register -a new certificate store, which \type{X509\_Store} will use to find certificates -it needs. Thus, you can get away with only adding whichever root CA cert(s) you -want to use, letting some online source handle the storage of all intermediate -X.509 certificates. The API documentation has a more complete discussion of -\type{Certificate\_Store}. - -\subsection{Setting up a CA} - -WRITEME - -\pagebreak - -\section{Special Topics} - -This chapter is for subjects that don't really fit into the API documentation -or into other chapters of the tutorial. - -\subsection{GUIs} - -There is nothing particularly special about using Botan in a GUI-based -application. However there are a few tricky spots, as well as a few ways to -take advantage of an event-based application. - -\subsubsection{Initialization} - -Generally you will create the \type{LibraryInitializer} somewhere in -\texttt{main}, before entering the event loop. One problem is that some GUI -libraries take over \texttt{main} and drop you right into the event loop; the -question then is how to initialize the library? The simplest way is probably to -have a static flag that marks if you have already initialized the library or -not. When you enter the event loop, check to see if this flag has not been set, -and if so, initialize the library using the function-based initializers. Using -\type{LibraryInitializer} obviously won't work in this case, since it would be -destroyed as soon as the current event handler finished. You then deinitialize -the library whenever your application is signaled to quit. - -\subsubsection{Interacting With the Library} - -In the simple case, the user will do stuff asynchronously, and then in response -your code will do things like encrypt a file or whatever, which can all be done -synchronously, since the data is right there for you. An application doing -something like this is basically going to look exactly like a command line -application that uses Botan, the only major difference being that the calls to -the library are inside event handlers. - -Much harder is something like an SSH client, where you're acting as a layer -between two asynchronous things (the user and the network). This actually isn't -restricted to GUIs at all (text-mode SSH clients have to deal with most of the -same problems), but it is probably more common with a GUI app. The following -discussion is fairly vague, but hopefully somewhat useful. - -There are a few facilities in Botan that are primarily designed to be used by -applications based on an event loop. See the section ``User Interfaces'' in the -main API doc for details. - -\subsubsection{Entropy} - -One nice advantage of using a GUI is opening a new method of gathering entropy -for the library. This is especially handy on Windows, where the available -sources of entropy are pretty questionable. In many versions, -\texttt{CryptGenRandom} is really rather poor, and the Toolhelp functions may -not provide much data on a small system (such as a handheld). For example, in -GTK+, you can use the following callback to get information about mouse -movements: - -\begin{verbatim} -static gint add_entropy(GtkWidget* widget, GdkEventMotion* event) - { - if(event) - Global_RNG::add_entropy(event, sizeof(GdkEventMotion)); - return FALSE; - } -\end{verbatim} - -And then register it with your main GTK window (presumably named -\variable{window}) as follows: - -\begin{verbatim} -gtk_signal_connect(GTK_OBJECT(window), "motion_notify_event", - GTK_SIGNAL_FUNC(add_entropy), NULL); - -gtk_widget_set_events(window, GDK_POINTER_MOTION_MASK); -\end{verbatim} - -Even though we're catching all mouse movements, and hashing the results into -the entropy pool, this doesn't use up more than a few percent of even a -relatively slow desktop CPU. Note that in the case of using X over a network, -catching all mouse events would cause large amounts of X traffic over the -network, which might make your application slow, or even unusable (I haven't -tried it, though). - -This could be made nicer if the collection function did something like -calculating deltas between each run, storing them into a buffer, and then when -enough of them have been added, hashing them and send them all to the PRNG in -one shot. This would not only reduce load, but also prevent the PRNG from -overestimating the amount of entropy it's getting, since its estimates don't -(can't) take history into account. For example, you could move the mouse back -and forth one pixel, and the PRNG would think it was getting a full load of -entropy each time, when actually it was getting (at best) a bit or two. \end{document} |