aboutsummaryrefslogtreecommitdiffstats
path: root/doc/tutorial.tex
blob: 4023ab20d8a48886a212ef8873c4b8e142e72e8d (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
\documentclass{article}

\setlength{\textwidth}{6.5in} % 1 inch side margins
\setlength{\textheight}{9in} % ~1 inch top and bottom margins

\setlength{\headheight}{0in}
\setlength{\topmargin}{0in}
\setlength{\headsep}{0in}

\setlength{\oddsidemargin}{0in}
\setlength{\evensidemargin}{0in}

\title{\textbf{Botan Tutorial}}
\author{Jack Lloyd \\
        \texttt{lloyd@randombit.net}}
\date{2009/07/08}

\newcommand{\filename}[1]{\texttt{#1}}
\newcommand{\manpage}[2]{\texttt{#1}(#2)}

\newcommand{\macro}[1]{\texttt{#1}}

\newcommand{\function}[1]{\textbf{#1}}
\newcommand{\type}[1]{\texttt{#1}}
\renewcommand{\arg}[1]{\textsl{#1}}
\newcommand{\variable}[1]{\textsl{#1}}

\begin{document}

\maketitle

\tableofcontents

\parskip=5pt
\pagebreak

\section{Introduction}

This document essentially sets up various simple scenarios and then
shows how to solve the problems using Botan. It's fairly simple, and
doesn't cover many of the available APIs and algorithms, especially
the more obscure or unusual ones. It is a supplement to the API
documentation and the example applications, which are included in the
distribution.

To quote the Perl man page: '``There's more than one way to do it.''
Divining how many more is left as an exercise to the reader.'

This is \emph{not} a general introduction to cryptography, and most simple
terms and ideas are not explained in any great detail.

Finally, most of the code shown in this tutorial has not been tested, it was
just written down from memory. If you find errors, please let me know.

\section{Initializing the Library}

The first step to using Botan is to create a \type{LibraryInitializer} object,
which handles creating various internal structures, and also destroying them at
shutdown. Essentially:

\begin{verbatim}
#include <botan/botan.h>
/* include other headers here */

int main()
   {
   LibraryInitializer init;
   /* now do stuff */
   return 0;
   }
\end{verbatim}

\section{Hashing a File}

\section{Symmetric Cryptography}

\subsection{Encryption with a passphrase}

Probably the most common crypto problem is encrypting a file (or some data that
is in-memory) using a passphrase. There are a million ways to do this, most of
them bad. In particular, you have to protect against weak passphrases,
people reusing a passphrase many times, accidental and deliberate modification,
and a dozen other potential problems.

We'll start with a simple method that is commonly used, and show the problems
that can arise. Each subsequent solution will modify the previous one to
prevent one or more common problems, until we arrive at a good version.

In these examples, we'll always use Serpent in Cipher-Block Chaining
(CBC) mode. Whenever we need a hash function, we'll use SHA-256, since
that is a common and well-known hash that is thought to be secure.

In all examples, we choose to derive the Initialization Vector (IV) from the
passphrase. Another (probably more common) alternative is to generate the IV
randomly and include it at the beginning of the message. Either way is
acceptable, and can be secure. The method used here was chosen to make for more
interesting examples (because it's harder to get right), and may not be an
appropriate choice for some environments.

First, some notation. The passphrase is stored as a \type{std::string} named
\variable{passphrase}. The input and output files (\variable{infile} and
\variable{outfile}) are of types \type{std::ifstream} and \type{std::ofstream}
(respectively).

\subsubsection{First try}

We hash the passphrase with SHA-256, and use the resulting hash to key
Serpent. To generate the IV, we prepend a single '0' character to the
passphrase, hash it, and truncate it to 16 bytes (which is Serpent's
block size).

\begin{verbatim}
   HashFunction* hash = get_hash("SHA-256");

   SymmetricKey key = hash->process(passphrase);
   SecureVector<byte> raw_iv = hash->process('0' + passphrase);
   InitializationVector iv(raw_iv, 16);

   Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION));

   pipe.start_msg();
   infile >> pipe;
   pipe.end_msg();
   outfile << pipe;
\end{verbatim}

\subsubsection{Problem 1: Buffering}

There is a problem with the above code, if the input file is fairly large as
compared to available memory. Specifically, all the encrypted data is stored
in memory, and then flushed to \variable{outfile} in a single go at the very
end. If the input file is big (say, a gigabyte), this will be most problematic.

The solution is to use a \type{DataSink} to handle the output for us (writing
to \arg{outfile} will be implicit with writing to the \type{Pipe}). We can do
this by replacing the last few lines with:

\begin{verbatim}
   Pipe pipe(get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
             new DataSink_Stream(outfile));

   pipe.start_msg();
   infile >> pipe;
   pipe.end_msg();
\end{verbatim}

\subsubsection{Problem 2: Deriving the key and IV}

Hash functions like SHA-256 are deterministic; if the same passphrase
is supplied twice, then the key (and in our case, the IV) will be the
same. This is very dangerous, and could easily open the whole system
up to attack. What we need to do is introduce a salt (or nonce) into
the generation of the key from the passphrase. This will mean that the
key will not be the same each time the same passphrase is typed in by
a user.

There is another problem with using a bare hash function to derive
keys. While it's inconceivable (based on our current understanding of
thermodynamics and theories of computation) that an attacker could
brute-force a 256-bit key, it would be fairly simple for them to
compute the SHA-256 hashes of various common passwords ('password',
the name of the dog, the significant other's middle name, favorite
sports team) and try those as keys. So we want to slow the attacker
down if we can, and an easy way to do that is to iterate the hash
function a bunch of times (say, 1024 to 4096 times). This will involve
only a small amount of effort for a legitimate user (since they only
have to compute the hashes once, when they type in their passphrase),
but an attacker, trying out a large list of potential passphrases,
will be seriously annoyed (and slowed down) by this.

In this iteration of the example, we'll kill these two birds with one
stone, and derive the key from the passphrase using a S2K (string to
key) algorithm (these are also often called PBKDF algorithms, for
Password-Based Key Derivation Function). In this example, we use
PBKDF2 with Hash Message Authentication Code (HMAC(SHA-256)), which is
specified in PKCS \#5. We replace the first four lines of code from
the first example with:

\begin{verbatim}
   S2K* s2k = get_s2k("PBKDF2(SHA-256)");
   // hard-coded iteration count for simplicity; should be sufficient
   s2k->set_iterations(4096);
   // 8 octets == 64-bit salt; again, good enough
   s2k->new_random_salt(8);
   SecureVector<byte> the_salt = s2k->current_salt();

   // 48 octets == 32 for key + 16 for IV
   SecureVector<byte> key_and_IV = s2k->derive_key(48, passphrase).bits_of();

   SymmetricKey key(key_and_IV, 32);
   InitializationVector iv(key_and_IV + 32, 16);
\end{verbatim}

To complete the example, we have to remember to write out the salt (stored in
\variable{the\_salt}) at the beginning of the file. The receiving side needs to
know this value in order to restore it (by calling the \variable{s2k} object's
\function{change\_salt} function) so it can derive the same key and IV from the
passphrase.

\subsubsection{Problem 3: Protecting against modification}

As it is, an attacker can undetectably alter the message while it is
in transit. It is vital to remember that encryption does not imply
authentication (except when using special modes that are specifically
designed to provide authentication along with encryption, like OCB and
EAX). For this purpose, we will append a message authentication code
to the encrypted message. Specifically, we will generate an extra 256
bits of key data, and use it to key the ``HMAC(SHA-256)'' MAC
function. We don't want to have the MAC and the cipher to share the
same key; that is very much a no-no.

\begin{verbatim}
   // 80 octets == 32 for cipher key + 16 for IV + 32 for hmac key
   SecureVector<byte> keys_and_IV = s2k->derive_key(80, passphrase);

   SymmetricKey key(keys_and_IV, 32);
   InitializationVector iv(keys_and_IV + 32, 16);
   SymmetricKey mac_key(keys_and_IV + 32 + 16, 32);

   Pipe pipe(new Fork(
                new Chain(
                   get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
                   new DataSink_Stream(outfile)
                   ),
                new MAC_Filter("HMAC(SHA-256)", mac_key)
                )
      );

   pipe.start_msg();
   infile >> pipe;
   pipe.end_msg();

   // now read the MAC from message #2. Message numbers start from 0
   SecureVector<byte> hmac = pipe.read_all(1);
   outfile.write((const char*)hmac.ptr(), hmac.size());
\end{verbatim}

The receiver can check the size of the file (in bytes), and since it knows how
long the MAC is, can figure out how many bytes of ciphertext there are. Then it
reads in that many bytes, sending them to a Serpent/CBC decryption object
(which could be obtained by calling \verb|get_cipher| with an argument of
\type{DECRYPTION} instead of \type{ENCRYPTION}), and storing the final bytes to
authenticate the message with.

\subsubsection{Problem 4: Cleaning up the key generation}

The method used to derive the keys and IV is rather inelegant, and it would be
nice to clean that up a bit, algorithmically speaking. A nice solution for this
is to generate a master key from the passphrase and salt, and then generate the
two keys and the IV (the cryptovariables) from that.

Starting from the master key, we derive the cryptovariables using a KDF
algorithm, which is designed, among other things, to ``separate'' keys so that
we can derive several different keys from the single master key. For this
purpose, we will use KDF2, which is a generally useful KDF function (defined in
IEEE 1363a, among other standards). The use of different labels (``cipher
key'', etc) makes sure that each of the three derived variables will have
different values.

\begin{verbatim}
   S2K* s2k = get_s2k("PBKDF2(SHA-256)");
   // hard-coded iteration count for simplicity; should be sufficient
   s2k->set_iterations(4096);
   // 8 octet == 64-bit salt; again, good enough
   s2k->new_random_salt(8);
   // store the salt so we can write it to a file later
   SecureVector<byte> the_salt = s2k->current_salt();

   SymmetricKey master_key = s2k->derive_key(48, passphrase);

   KDF* kdf = get_kdf("KDF2(SHA-256)");

   SymmetricKey key = kdf->derive_key(32, master_key, "cipher key");
   SymmetricKey mac_key = kdf->derive_key(32, master_key, "hmac key");
   InitializationVector iv = kdf->derive_key(16, master_key, "cipher iv");
\end{verbatim}

\subsubsection{Final version}

Here is the final version of the encryption code, with all the changes we've
made:

\begin{verbatim}
   S2K* s2k = get_s2k("PBKDF2(SHA-256)");
   s2k->set_iterations(4096);
   s2k->new_random_salt(8);
   SecureVector<byte> the_salt = s2k->current_salt();

   SymmetricKey master_key = s2k->derive_key(48, passphrase);

   KDF* kdf = get_kdf("KDF2(SHA-256)");

   SymmetricKey key = kdf->derive_key(32, master_key, "cipher key");
   SymmetricKey mac_key = kdf->derive_key(32, masterkey, "hmac key");
   InitializationVector iv = kdf->derive_key(16, masterkey, "cipher iv");

   Pipe pipe(new Fork(
                new Chain(
                   get_cipher("Serpent/CBC/PKCS7", key, iv, ENCRYPTION),
                   new DataSink_Stream(outfile)
                   ),
                new MAC_Filter("HMAC(SHA-256)", mac_key)
                )
      );

   outfile.write((const char*)the_salt.ptr(), the_salt.size());

   pipe.start_msg();
   infile >> pipe;
   pipe.end_msg();

   SecureVector<byte> hmac = pipe.read_all(1);
   outfile.write((const char*)hmac.ptr(), hmac.size());
\end{verbatim}

\subsubsection{Another buffering technique}

Sometimes the use of \type{DataSink\_Stream} is not practical for whatever
reason. In this case, an alternate buffering mechanism might be useful. Here is
some code which will write all the processed data as quickly as possible, so
that memory pressure is reduced in the case of large inputs.

\begin{verbatim}
   pipe.start_msg();
   SecureBuffer<byte, 1024> buffer;
   while(infile.good())
      {
      infile.read((char*)buffer.ptr(), buffer.size());
      u32bit got_from_infile = infile.gcount();
      pipe.write(buffer, got_from_infile);

      if(infile.eof())
         pipe.end_msg();

      while(pipe.remaining() > 0)
         {
         u32bit buffered = pipe.read(buffer, buffer.size());
         outfile.write((const char*)buffer.ptr(), buffered);
         }
      }
   if(infile.bad() || (infile.fail() && !infile.eof()))
      throw Some_Exception();
\end{verbatim}

\pagebreak

\subsection{Authentication}

After doing the encryption routines, doing message authentication keyed off a
passphrase is not very difficult. In fact it's much easier than the encryption
case, for the following reasons: a) we only need one key, and b) we don't have
to store anything, so all the input can be done in a single step without
worrying about it taking up a lot of memory if the input file is large.

In this case, we'll hex-encode the salt and the MAC, and output them both to
standard output (the salt followed by the MAC).

\begin{verbatim}
   S2K* s2k = get_s2k("PBKDF2(SHA-256)");
   s2k->set_iterations(4096);
   s2k->new_random_salt(8);
   OctetString the_salt = s2k->current_salt();

   SymmetricKey hmac_key = s2k->derive_key(32, passphrase);

   Pipe pipe(new MAC_Filter("HMAC(SHA-256)", mac_key),
             new Hex_Encoder
   );

   std::cout << the_salt.to_string(); // hex encoded

   pipe.start_msg();
   infile >> pipe;
   pipe.end_msg();
   std::cout << pipe.read_all_as_string() << std::endl;
\end{verbatim}

\subsection{User Authentication}

Doing user authentication off a shared passphrase is fairly easy. Essentially,
a challenge-response protocol is used - the server sends a random challenge,
and the client responds with an appropriate response to the challenge. The idea
is that only someone who knows the passphrase can generate or check to see if a
response is valid.

Let's say we use 160-bit (20 byte) challenges, which seems fairly
reasonable. We can create this challenge using the global random
number generator (RNG):

\begin{verbatim}
   byte challenge[20];
   Global_RNG::randomize(challenge, sizeof(challenge), Nonce);
   // send challenge to client
\end{verbatim}

After reading the challenge, the client generates a response based on
the challenge and the passphrase. In this case, we will do it by
repeatedly hashing the challenge, the passphrase, and (if applicable)
the previous digest. We iterate this construction 4096 times, to make
brute force attacks on the passphrase hard to do. Since we are already
using 160-bit challenges, a 160-bit response seems warranted, so we'll
use SHA-1.

\begin{verbatim}
   HashFunction* hash = get_hash("SHA-1");
   SecureVector<byte> digest;
   for(u32bit j = 0; j != 4096; j++)
      {
      hash->update(digest, digest.size());
      hash->update(passphrase);
      hash->update(challenge, challenge.size());
      digest = hash->final();
      }
   delete hash;
   // send value of digest to the server
\end{verbatim}

Upon receiving the response from the client, the server computes what the
response should have been based on the challenge it sent out, and the
passphrase. If the two responses match, the client is authenticated.
Otherwise, it is not.

An alternate method is to use PBKDF2 again, using the challenge as the salt. In
this case, the response could (for example) be the hash of the key produced by
PBKDF2. There is no reason to have an explicit iteration loop, as PBKDF2 is
designed to prevent dictionary attacks (assuming PBKDF2 is set up for a large
iteration count internally).

\pagebreak

\section{Public Key Cryptography}

\subsection{Basic Operations}

In this section, we'll assume we have a \type{X509\_PublicKey*} named
\arg{pubkey}, and, if necessary, a private key type (a
\type{PKCS8\_PrivateKey*}) named \arg{privkey}. A description of these types,
how to create them, and related details appears later in this tutorial. In this
section, we will use various functions that are defined in
\filename{look\_pk.h} -- you will have to include this header explicitly.

\subsubsection{Encryption}

Basically, pick an encoding method, create a \type{PK\_Encryptor} (with
\function{get\_pk\_encryptor}()), and use it. But first we have to make sure
the public key can actually be used for public key encryption. For encryption
(and decryption), the key could be RSA, ElGamal, or (in future versions) some
other public key encryption scheme, like Rabin or an elliptic curve scheme.

\begin{verbatim}
  PK_Encrypting_Key* key = dynamic_cast<PK_Encrypting_Key*>(pubkey);
  if(!key)
     error();
  PK_Encryptor* enc = get_pk_encryptor(*key, "EME1(SHA-256)");

  byte msg[] = { /* ... */ };

  // will also accept a SecureVector<byte> as input
  SecureVector<byte> ciphertext = enc->encrypt(msg, sizeof(msg));
\end{verbatim}

\subsubsection{Decryption}

This is essentially the same as the encryption operation, but using a private
key instead. One major difference is that the decryption operation can fail due
to the fact that the ciphertext was invalid (most common padding schemes, such
as ``EME1(SHA-256)'', include various pieces of redundancy, which are checked
after decryption).

\begin{verbatim}
  PK_Decrypting_Key* key = dynamic_cast<PK_Decrypting_Key*>(privkey);
  if(!key)
     error();
  PK_Decryptor* dec = get_pk_decryptor(*key, "EME1(SHA-256)");

  byte msg[] = { /* ... */ };

  SecureVector<byte> plaintext;

  try {
    // will also accept a SecureVector<byte> as input
    plaintext = dec->decrypt(msg, sizeof(msg));
  }
  catch(Decoding_Error)
  {
    /* the ciphertext was invalid */
  }
\end{verbatim}

\subsubsection{Signature Generation}

There is one difficulty with signature generation that does not occur with
encryption or decryption. Specifically, there are various padding methods which
can be useful for different signature algorithms, and not all are appropriate
for all signature schemes. The following table breaks down what algorithms
support which encodings:

\begin{tabular}{|c|c|c|} \hline
Signature Algorithm & Usable Encoding Methods & Preferred Encoding(s) \\ \hline
DSA / NR            & EMSA1                   & EMSA1 \\ \hline
RSA                 & EMSA1, EMSA2, EMSA3, EMSA4 & EMSA3, EMSA4 \\ \hline
Rabin-Williams      & EMSA2, EMSA4            & EMSA2, EMSA4 \\ \hline
\end{tabular}

For new applications, use EMSA4 with both RSA and Rabin-Williams, as it is
significantly more secure than the alternatives. However, most current
applications/libraries only support EMSA2 with Rabin-Williams and EMSA3 with
RSA. Given this, you may be forced to use less secure encoding methods for the
near future. In these examples, we punt on the problem, and hard-code using
EMSA1 with SHA-256.

\begin{verbatim}
  Public_Key* key = /* loaded or generated somehow */
  PK_Signer* signer = get_pk_signer(*key, "EMSA1(SHA-256)");

  byte msg[] = { /* ... */ };

  /*
  You can also repeatedly call update(const byte[], u32bit), followed
  by a call to signature(), which will return the final signature of
  all the data that was passed through update(). sign_message() is
  just a stub that calls update() once, and returns the value of
  signature().
  */

  SecureVector<byte> signature = signer->sign_message(msg, sizeof(msg));
\end{verbatim}

\pagebreak

\subsubsection{Signature Verification}

In addition to all the problems with choosing the correct padding method,
there is yet another complication with verifying a signature. Namely, there are
two varieties of signature algorithms - those providing message recovery (that
is, the value that was signed can be directly recovered by someone verifying
the signature), and those without message recovery (the verify operation simply
returns if the signature was valid, without telling you exactly what was
signed). This leads to two slightly different implementations of the
verification operation, which user code has to work with. As you can see
however, the implementation is still not at all difficult.

\begin{verbatim}
  PK_Verifier* verifier = 0;

  PK_Verifying_with_MR_Key* key1 =
     dynamic_cast<PK_Verifying_with_MR_Key*>(pubkey);
  PK_Verifying_wo_MR_Key* key2 =
     dynamic_cast<PK_Verifying_wo_MR_Key*>(pubkey);

  if(key1)
     verifier = get_pk_verifier(*key1, "EMSA1(SHA-256)");
  else if(key2)
     verifier = get_pk_verifier(*key2, "EMSA1(SHA-256)");
  else
     error();

  byte msg[] = { /* ... */ };
  byte sig[] = { /* ... */ };

  /*
     Like PK_Signer, you can also do repeated calls to
        void update(const byte some_data[], u32bit length)
     followed by a call to
        bool check_signature(const byte the_sig[], u32bit length)
     which will return true (valid signature) or false (bad signature).
     The function verify_message() is a simple wrapper around update() and
     check_signature().

  */
  bool is_valid = verifier->verify_message(msg, sizeof(msg), sig, sizeof(sig));
\end{verbatim}

\subsubsection{Key Agreement}

WRITEME

\pagebreak

\subsection{Working with Keys}

\subsubsection{Reading Public Keys (X.509 format)}

There are two separate ways to read X.509 public keys. Remember that the X.509
public keys are simply that: public keys. There is no associated information
(such as the owner of that key) included with the public key itself. If you
need that kind of information, you'll need to use X.509 certificates.

However, there are surely times when a simple public key is sufficient. The
most obvious is when the key is implicitly trusted, for example if access
and/or modification of it is controlled by something else (like filesystem
ACLs). In other cases, it is a perfectly reasonable proposition to use them
over the network as an anonymous key exchange mechanism. This is, admittedly,
vulnerable to man-in-the-middle attacks, but it's simple enough that it's hard
to mess up (see, for example, Peter Guttman's paper ``Lessons Learned in
Implementing and Deploying Crypto Software'' in Usenix '02).

The way to load a key is basically to set up a \type{DataSource} and then call
\function{X509::load\_key}, which will return a \type{X509\_PublicKey*}. For
example:

\begin{verbatim}
   DataSource_Stream somefile("somefile.pem"); // has 3 public keys
   X509_PublicKey* key1 = X509::load_key(somefile);
   X509_PublicKey* key2 = X509::load_key(somefile);
   X509_PublicKey* key3 = X509::load_key(somefile);
   // Now we have them all loaded. Huzah!
\end{verbatim}

At this point you can use \function{dynamic\_cast} to find the operations the
key supports (by seeing if a cast to \type{PK\_Encrypting\_Key},
\type{PK\_Verifying\_with\_MR\_Key}, or \type{PK\_Verifying\_wo\_MR\_Key}
succeeds).

There is a variant of \function{X509::load\_key} (and of
\function{PKCS8::load\_key}, described in the next section) which take a
filename (as a \type{std::string}). These are just convenience functions which
create the appropriate \type{DataSource} for you and call the main
\function{X509::load\_key}.

\subsubsection{Reading Private Keys (PKCS \#8 format)}

This is very similar to reading raw public keys, with the difference that the
key may be encrypted with a user passphrase:

\begin{verbatim}
   // rng is a RandomNumberGenerator, like AutoSeeded_RNG

   DataSource_Stream somefile("somefile");
   std::string a_passphrase = /* get from the user */
   PKCS8_PrivateKey* key = PKCS8::load_key(somefile, rng, a_passphrase);
\end{verbatim}

You can, by the way, convert a \type{PKCS8\_PrivateKey} to a
\type{X509\_PublicKey} simply by casting it (with \function{dynamic\_cast}), as
the private key type is derived from \type{X509\_PublicKey}. As with
\type{X509\_PublicKey}, you can use \function{dynamic\_cast} to figure out what
operations the private key is capable of; in particular, you can attempt to
cast it to \type{PK\_Decrypting\_Key}, \type{PK\_Signing\_Key}, or
\type{PK\_Key\_Agreement\_Key}.

Sometimes you can get away with having a static passphrase passed to
\function{load\_key}. Typically, however, you'll have to do some user
interaction to get the appropriate passphrase. In that case you'll want to use
the \type{UI} related interface, which is fully described in the API
documentation.

\subsubsection{Generating New Private Keys}

Generate a new private key is the one operation which requires you to
explicitly name the type of key you are working with. There are (currently) two
kinds of public key algorithms in Botan: ones based on the integer
factorization (IF) problem (RSA and Rabin-Williams), and ones based on the
discrete logarithm (DL) problem (DSA, Diffie-Hellman, Nyberg-Rueppel, and
ElGamal). Since discrete logarithm parameters (primes and generators) can be
shared among many keys, there is the notion of these being a combined type
(called \type{DL\_Group}).

To create a new DL-based private key, simply pass a desired \type{DL\_Group} to
the constructor of the private key - a new public/private key pair will be
generated. Since in IF-based algorithms, the modulus used isn't shared by other
keys, we don't use this notion. You can create a new key by passing in a
\type{u32bit} telling how long (in bits) the key should be.

There are quite a few ways to get a \type{DL\_Group} object. The best is to use
the function \function{get\_dl\_group}, which takes a string naming a group; it
will either return that group, if it knows about it, or throw an
exception. Names it knows about include ``IETF-n'' where n is 768, 1024, 1536,
2048, 3072, or 4096, and ``DSA-n'', where n is 512, 768, or 1024. The IETF
groups are the ones specified for use with IPSec, and the DSA ones are the
default DSA parameters specified by Java's JCE. For DSA and Nyberg-Rueppel, use
the ``DSA-n'' groups, and for Diffie-Hellman and ElGamal, use the ``IETF-n''
groups.

You can also generate a new random group. This is not recommend, because it is
very slow, particularly for ``safe'' primes, which are needed for
Diffie-Hellman and ElGamal.

Some examples:

\begin{verbatim}
   RSA_PrivateKey rsa1(512); // 512-bit RSA key
   RSA_PrivateKey rsa2(2048); // 2048-bit RSA key

   RW_PrivateKey rw1(1024); // 1024-bit Rabin-Williams key
   RW_PrivateKey rw2(1536); // 1536-bit Rabin-Williams key

   DSA_PrivateKey dsa(get_dl_group("DSA-512")); // 512-bit DSA key
   DH_PrivateKey dh(get_dl_group("IETF-4096")); // 4096-bit DH key
   NR_PrivateKey nr(get_dl_group("DSA-1024")); // 1024-bit NR key
   ElGamal_PrivateKey elg(get_dl_group("IETF-1536")); // 1536-bit ElGamal key
\end{verbatim}

To export your newly created private key, use the PKCS \#8 routines in
\filename{pkcs8.h}:

\begin{verbatim}
   std::string a_passphrase = /* get from the user */
   std::string the_key = PKCS8::PEM_encode(rsa2, a_passphrase);
\end{verbatim}

You can read the key back in using \function{PKCS8::load\_key}, described in
the section ``Reading Private Keys (PKCS \#8 format)'', above. Unfortunately,
this only works with keys that have an assigned algorithm identifier and
standardized format. Currently this is only the RSA, DSA, DH, and ElGamal
algorithms, though RW and NR keys can also be imported and exported by
assigning them an OID (this can be done either through a configuration file, or
by calling the function \function{OIDS::add\_oid} in \filename{oids.h}). Be
aware that the OID and format for ElGamal keys is not exactly standard, but
there does exist at least one other crypto library which will accept the
format.

The raw public can be exported using:

\begin{verbatim}
   std::string the_public_key = X509::PEM_encode(rsa2);
\end{verbatim}

\pagebreak

\section{X.509v3 Certificates}

Using certificates is rather complicated, so only the very basic mechanisms are
going to be covered here. The section ``Setting up a CA'' goes into reasonable
detail about CRLs and certificate requests, but there is a lot that isn't
covered (else this section would get quite long and complicated).

\subsection{Importing and Exporting Certificates}

Importing and exporting X.509 certificates is easy. Simply call the constructor
with either a \type{DataSource\&}, or the name of a file:

\begin{verbatim}
   X509_Certificate cert1("cert1.pem");

   /* This file contains two certificates, concatenated */
   DataSource_Stream in("certs2_and_3.pem");

   X509_Certificate cert2(in); // read the first cert
   X509_Certificate cert3(in); // read the second cert
\end{verbatim}

Exporting the certificate is a simple matter of calling the member function
\function{PEM\_encode}(), which returns a \type{std::string} containing the
certificate in PEM encoding.

\begin{verbatim}
   std::cout << cert3.PEM_encode();
   some_ostream_object << cert1.PEM_encode();
   std::string cert2_str = cert2.PEM_encode();
\end{verbatim}

\subsection{Verifying Certificates}

Verifying a certificate requires that we build up a chain of trust, starting
from the root (usually a commercial CA), down through some number of
intermediate CAs, and finally reaching the actual certificate in
question. Thus, to verify, we actually have to have all those certificates
on hand (or at the very least, know where we can get the ones we need).

The class which handles both storing certificates, and verifying them, is
called \type{X509\_Store}. We'll start by assuming that we have all the
certificates we need, and just want to verify a cert. This is done by calling
the member function \function{validate\_cert}, which takes the
\type{X509\_Certificate} in question, and an optional argument of type
\type{Cert\_Usage} (which is ignored here; read the section in the API doc
titled ``Verifying Certificates for information). It returns an enum;
\type{X509\_Code}, which, for most purposes, is either \type{VERIFIED}, or
something else (which specifies what circumstance caused the certificate to be
considered invalid). Really, that's it.

Now, how to let \type{X509\_Store} know about all those certificates and CRLs
we have lying around? The simplest method is to add them directly, using the
functions \function{add\_cert}, \function{add\_certs},
\function{add\_trusted\_certs}, and \function{add\_crl}; for details, consult
the API doc or read the \filename{x509stor.h} header. There is also a much more
elegant and powerful method: \type{Certificate\_Store}s. A certificate store
refers to an object that knows how to retrieve certificates from some external
source (a file, an LDAP directory, a HTTP server, a SQL database, or anything
else). By calling the function \function{add\_new\_certstore}, you can register
a new certificate store, which \type{X509\_Store} will use to find certificates
it needs. Thus, you can get away with only adding whichever root CA cert(s) you
want to use, letting some online source handle the storage of all intermediate
X.509 certificates. The API documentation has a more complete discussion of
\type{Certificate\_Store}.

\subsection{Setting up a CA}

WRITEME

\pagebreak

\section{Special Topics}

This chapter is for subjects that don't really fit into the API documentation
or into other chapters of the tutorial.

\subsection{GUIs}

There is nothing particularly special about using Botan in a GUI-based
application. However there are a few tricky spots, as well as a few ways to
take advantage of an event-based application.

\subsubsection{Initialization}

Generally you will create the \type{LibraryInitializer} somewhere in
\texttt{main}, before entering the event loop. One problem is that some GUI
libraries take over \texttt{main} and drop you right into the event loop; the
question then is how to initialize the library? The simplest way is probably to
have a static flag that marks if you have already initialized the library or
not. When you enter the event loop, check to see if this flag has not been set,
and if so, initialize the library using the function-based initializers. Using
\type{LibraryInitializer} obviously won't work in this case, since it would be
destroyed as soon as the current event handler finished. You then deinitialize
the library whenever your application is signaled to quit.

\subsubsection{Interacting With the Library}

In the simple case, the user will do stuff asynchronously, and then in response
your code will do things like encrypt a file or whatever, which can all be done
synchronously, since the data is right there for you. An application doing
something like this is basically going to look exactly like a command line
application that uses Botan, the only major difference being that the calls to
the library are inside event handlers.

Much harder is something like an SSH client, where you're acting as a layer
between two asynchronous things (the user and the network). This actually isn't
restricted to GUIs at all (text-mode SSH clients have to deal with most of the
same problems), but it is probably more common with a GUI app. The following
discussion is fairly vague, but hopefully somewhat useful.

There are a few facilities in Botan that are primarily designed to be used by
applications based on an event loop. See the section ``User Interfaces'' in the
main API doc for details.

\subsubsection{Entropy}

One nice advantage of using a GUI is opening a new method of gathering entropy
for the library. This is especially handy on Windows, where the available
sources of entropy are pretty questionable. In many versions,
\texttt{CryptGenRandom} is really rather poor, and the Toolhelp functions may
not provide much data on a small system (such as a handheld). For example, in
GTK+, you can use the following callback to get information about mouse
movements:

\begin{verbatim}
static gint add_entropy(GtkWidget* widget, GdkEventMotion* event)
   {
   if(event)
      Global_RNG::add_entropy(event, sizeof(GdkEventMotion));
   return FALSE;
   }
\end{verbatim}

And then register it with your main GTK window (presumably named
\variable{window}) as follows:

\begin{verbatim}
gtk_signal_connect(GTK_OBJECT(window), "motion_notify_event",
                   GTK_SIGNAL_FUNC(add_entropy), NULL);

gtk_widget_set_events(window, GDK_POINTER_MOTION_MASK);
\end{verbatim}

Even though we're catching all mouse movements, and hashing the results into
the entropy pool, this doesn't use up more than a few percent of even a
relatively slow desktop CPU. Note that in the case of using X over a network,
catching all mouse events would cause large amounts of X traffic over the
network, which might make your application slow, or even unusable (I haven't
tried it, though).

This could be made nicer if the collection function did something like
calculating deltas between each run, storing them into a buffer, and then when
enough of them have been added, hashing them and send them all to the PRNG in
one shot. This would not only reduce load, but also prevent the PRNG from
overestimating the amount of entropy it's getting, since its estimates don't
(can't) take history into account. For example, you could move the mouse back
and forth one pixel, and the PRNG would think it was getting a full load of
entropy each time, when actually it was getting (at best) a bit or two.

\end{document}