.. _public_key_crypto:

Public Key Cryptography
=================================

Public key cryptography (also called asymmetric cryptography) is a
collection of techniques allowing for encryption, signatures, and key
agreement.

Key Objects
----------------------------------------

Public and private keys are represented by classes ``Public_Key`` and
its subclass ``Private_Key``. The use of inheritance here means that
a ``Private_Key`` can be converted into a reference to a public key.

None of the functions on ``Public_Key`` and ``Private_Key`` themselves
are particularly useful for users of the library, because 'bare' public
key operations are *very insecure*. The only purpose of these functions
is to provide a clean interface that higher level operations can be
built on. So really the only thing you need to know is that when a
function takes a reference to a ``Public_Key``, it can take any public
key or private key, and similarly a function taking a reference to a
``Private_Key`` can take any private key.
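
The substitution rule can be sketched with stand-in classes. The names
below (``Demo_Public_Key``, ``Demo_Private_Key``, ``describe_key``) are
hypothetical and only mirror the inheritance relationship; they are not
part of Botan.

.. code-block:: cpp

  #include <string>

  // Hypothetical stand-ins for Public_Key and Private_Key (not Botan classes).
  class Demo_Public_Key
  {
  public:
    virtual ~Demo_Public_Key() {}
    virtual std::string algo_name() const { return "Demo"; }
  };

  // A private key "is a" public key, as in the library's class hierarchy.
  class Demo_Private_Key : public Demo_Public_Key
  {
  };

  // Takes a reference to a public key, so it accepts either kind of key.
  std::string describe_key(const Demo_Public_Key& key)
  {
    return "key for " + key.algo_name();
  }

Passing a ``Demo_Private_Key`` to ``describe_key`` compiles without any
cast, which is the behavior the real classes rely on.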

Types of ``Public_Key`` include ``RSA_PublicKey``, ``DSA_PublicKey``,
``ECDSA_PublicKey``, ``DH_PublicKey``, ``ECDH_PublicKey``,
``RW_PublicKey``, ``NR_PublicKey``, and ``GOST_3410_PublicKey``.
There are corresponding ``Private_Key`` classes for each of these
algorithms.

.. _creating_new_private_keys:

Creating New Private Keys
----------------------------------------

Creating a new private key requires two things: a source of
random numbers (see :ref:`random_number_generators`) and some
algorithm specific parameters that define the *security level*
of the resulting key. For instance, the security level of an RSA
key is (at least in part) defined by the length of the public key
modulus in bits. So to create a new RSA private key, you would call

.. cpp:function:: RSA_PrivateKey::RSA_PrivateKey(RandomNumberGenerator& rng, size_t bits)

  A constructor that creates a new random RSA private key with a modulus
  of length *bits*.

Algorithms based on the discrete-logarithm problem use what is called
a *group*. A group can safely be used with many keys, and for some
operations, like key agreement, the two keys *must* use the same
group.  There are currently two kinds of discrete logarithm groups
supported in Botan: the integers modulo a prime, represented by
:ref:`dl_group`, and elliptic curves in GF(p), represented by
:ref:`ec_group`. A rough generalization is that the larger the group
is, the more secure the algorithm is, but correspondingly the slower
the operations will be.

Given a ``DL_Group``, you can create new DSA, Diffie-Hellman,
Nyberg-Rueppel, and ElGamal key pairs with

.. cpp:function:: DSA_PrivateKey::DSA_PrivateKey(RandomNumberGenerator& rng, const DL_Group& group, const BigInt& x = 0)

.. cpp:function:: DH_PrivateKey::DH_PrivateKey(RandomNumberGenerator& rng, const DL_Group& group, const BigInt& x = 0)

.. cpp:function:: NR_PrivateKey::NR_PrivateKey(RandomNumberGenerator& rng, const DL_Group& group, const BigInt& x = 0)

.. cpp:function:: ElGamal_PrivateKey::ElGamal_PrivateKey(RandomNumberGenerator& rng, const DL_Group& group, const BigInt& x = 0)

  The optional *x* parameter to each of these constructors is a private
  key value. This allows you to create keys where the private key is
  formed by some special technique; for instance you can use the hash
  of a password (see :ref:`pbkdf` for how to do that) as a private key
  value. Normally, you would leave the value as zero, letting the
  class generate a new random key.

Finally, given an ``EC_Group`` object, you can create a new
ECDSA, ECDH, or GOST 34.10 private key with

.. cpp:function:: ECDSA_PrivateKey::ECDSA_PrivateKey(RandomNumberGenerator& rng, const EC_Group& domain, const BigInt& x = 0)

.. cpp:function:: ECDH_PrivateKey::ECDH_PrivateKey(RandomNumberGenerator& rng, const EC_Group& domain, const BigInt& x = 0)

.. cpp:function:: GOST_3410_PrivateKey::GOST_3410_PrivateKey(RandomNumberGenerator& rng, const EC_Group& domain, const BigInt& x = 0)

.. _serializing_private_keys:

Serializing Private Keys Using PKCS #8
----------------------------------------

The standard format for serializing a private key is PKCS #8, the
operations for which are defined in ``pkcs8.h``. It supports both
unencrypted and encrypted storage.

.. cpp:function:: SecureVector<byte> PKCS8::BER_encode(const Private_Key& key, RandomNumberGenerator& rng, const std::string& password, const std::string& pbe_algo = "")

  Takes any private key object, serializes it, encrypts it using
  *password*, and returns a binary structure representing the private
  key.

  The final (optional) argument, *pbe_algo*, specifies a particular
  password based encryption (or PBE) algorithm. If you don't specify a
  PBE, a sensible default will be used.

.. cpp:function:: std::string PKCS8::PEM_encode(const Private_Key& key, RandomNumberGenerator& rng, const std::string& pass, const std::string& pbe_algo = "")

  This formats the key in the same manner as ``BER_encode``, but
  additionally encodes it into a text format with identifying
  headers. Using PEM encoding is *highly* recommended for many
  reasons, including compatibility with other software, for
  transmission over 8-bit unclean channels, because it can be
  identified by a human without special tools, and because it
  sometimes allows more sane behavior of tools that process the data.

Unencrypted serialization is also supported.

.. warning::

  In most situations, using unencrypted private key storage is a bad
  idea, because anyone can come along and grab the private key without
  having to know any passwords or other secrets. Unless you have very
  particular security requirements, always use the versions that
  encrypt the key based on a passphrase, described above.

.. cpp:function:: SecureVector<byte> PKCS8::BER_encode(const Private_Key& key)

  Serializes the private key and returns the result.

.. cpp:function:: std::string PKCS8::PEM_encode(const Private_Key& key)

  Serializes the private key, base64 encodes it, and returns the
  result.
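
As a rough illustration of what the PEM text format consists of, the
sketch below base64 encodes a byte string and wraps it in identifying
header and footer lines. This is not Botan's implementation: the
function names are hypothetical, and the 64-character line width is an
assumption made for the example.

.. code-block:: cpp

  #include <cstddef>
  #include <string>
  #include <vector>

  // Standard base64 (RFC 4648 alphabet, '=' padding).
  std::string base64_encode(const std::vector<unsigned char>& in)
  {
    static const char alphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for(std::size_t i = 0; i < in.size(); i += 3)
    {
      const std::size_t remaining = in.size() - i;
      // Pack up to three input bytes into a 24-bit chunk.
      unsigned long chunk = static_cast<unsigned long>(in[i]) << 16;
      if(remaining > 1) chunk |= static_cast<unsigned long>(in[i + 1]) << 8;
      if(remaining > 2) chunk |= in[i + 2];
      out += alphabet[(chunk >> 18) & 0x3F];
      out += alphabet[(chunk >> 12) & 0x3F];
      out += (remaining > 1) ? alphabet[(chunk >> 6) & 0x3F] : '=';
      out += (remaining > 2) ? alphabet[chunk & 0x3F] : '=';
    }
    return out;
  }

  // Hypothetical helper: wrap binary data in PEM-style header lines.
  std::string pem_wrap(const std::string& label,
                       const std::vector<unsigned char>& data)
  {
    const std::string b64 = base64_encode(data);
    std::string out = "-----BEGIN " + label + "-----\n";
    for(std::size_t i = 0; i < b64.size(); i += 64) // assumed line width
      out += b64.substr(i, 64) + "\n";
    out += "-----END " + label + "-----\n";
    return out;
  }

The header lines are what let a human (or a tool) identify the contents
without decoding them.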

Last but not least, there are some functions that will load (and
decrypt, if necessary) a PKCS #8 private key:

.. cpp:function:: Private_Key* PKCS8::load_key(DataSource& in, RandomNumberGenerator& rng, const User_Interface& ui)

.. cpp:function:: Private_Key* PKCS8::load_key(DataSource& in, RandomNumberGenerator& rng, std::string passphrase = "")

.. cpp:function:: Private_Key* PKCS8::load_key(const std::string& filename, RandomNumberGenerator& rng, const User_Interface& ui)

.. cpp:function:: Private_Key* PKCS8::load_key(const std::string& filename, RandomNumberGenerator& rng, const std::string& passphrase = "")

These functions return a newly allocated key object based on
the data from whatever source is used (assuming, of course, that the
source does in fact store a representation of a private key, and that
decryption was successful). The encoding used (PEM or BER) need not be
specified; the format will be detected automatically. The key is
allocated with ``new``, and should be released with ``delete`` when
you are done with it. The versions taking a ``DataSource`` require you
to create the source yourself; the others are simple wrappers that
take a filename and create the appropriate ``DataSource``.
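
Because the caller owns the returned pointer, one way to avoid leaking
it is to hand it to a smart pointer immediately. The sketch below shows
the pattern with a hypothetical stand-in type and loader
(``Demo_Key``, ``demo_load_key``); it assumes a C++11 compiler and does
not call Botan.

.. code-block:: cpp

  #include <memory>

  // Hypothetical stand-in for a key object; counts live instances.
  struct Demo_Key
  {
    static int live;
    Demo_Key() { ++live; }
    ~Demo_Key() { --live; }
  };

  int Demo_Key::live = 0;

  // Hypothetical stand-in for a loader that returns a pointer from new.
  Demo_Key* demo_load_key()
  {
    return new Demo_Key;
  }

  // Takes ownership right away; delete runs when 'key' goes out of scope.
  int use_key_and_count_live()
  {
    std::unique_ptr<Demo_Key> key(demo_load_key());
    return Demo_Key::live; // the key is still alive at this point
  }

After ``use_key_and_count_live`` returns, the key has been released
even though no explicit ``delete`` appears in the code.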

The versions that take the passphrase as a ``std::string`` are
primarily for compatibility, but they are useful in limited
circumstances. The ``User_Interface`` versions are how ``load_key`` is
implemented, and provide much more flexibility. With the
``std::string`` versions, if the passphrase passed in is not correct,
an exception is thrown and that is that. However, if you pass in a UI
object, then the UI object can keep asking the user for the passphrase
until they get it right (or until they cancel the action, through the
UI interface). A ``User_Interface`` has very little to do with talking
to users; it's just a way to glue together Botan and whatever user
interface you happen to be using. You can think of it as a user
interface interface. The default ``User_Interface`` is rather dumb,
and acts rather like the versions taking the ``std::string``; it tries
the passphrase passed in first, and then it cancels.
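
The retry behavior can be sketched as a small loop driven by callbacks.
Everything here is hypothetical (``unlock_with_retries``, the ``ask``
and ``try_decrypt`` callbacks, and the retry limit); it shows the
control flow only and is not the ``User_Interface`` API.

.. code-block:: cpp

  #include <cstddef>
  #include <functional>
  #include <string>

  // ask() fills in a passphrase and returns false if the user cancels.
  // try_decrypt() returns true if the passphrase unlocks the key.
  bool unlock_with_retries(
    const std::function<bool (std::string&)>& ask,
    const std::function<bool (const std::string&)>& try_decrypt,
    std::size_t max_tries)
  {
    for(std::size_t i = 0; i != max_tries; ++i)
    {
      std::string passphrase;
      if(!ask(passphrase))
        return false; // user cancelled
      if(try_decrypt(passphrase))
        return true;  // correct passphrase
    }
    return false;     // gave up after max_tries attempts
  }

With ``max_tries`` set to 1 and an ``ask`` callback that returns a fixed
string, this behaves like the single-attempt case described above.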

.. note::

  In a future version, it is likely that ``User_Interface`` will be
  replaced by a simple callback using ``std::function``.

Serializing Public Keys
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To import and export public keys, use:

.. cpp:function:: MemoryVector<byte> X509::BER_encode(const Public_Key& key)

.. cpp:function:: std::string X509::PEM_encode(const Public_Key& key)

.. cpp:function:: Public_Key* X509::load_key(DataSource& in)

.. cpp:function:: Public_Key* X509::load_key(const SecureVector<byte>& buffer)

.. cpp:function:: Public_Key* X509::load_key(const std::string& filename)

  These functions operate in the same way as the ones described in
  :ref:`serializing_private_keys`, except that no encryption option is
  available.

.. _dl_group:

DL_Group
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

As described in :ref:`creating_new_private_keys`, a discrete logarithm
group can be shared among many keys, even keys created by users who do
not trust each other. However, it is necessary to trust the entity who
created the group; that is why organizations like NIST use algorithms
which generate groups in a deterministic way such that creating a
bogus group would require breaking some trusted cryptographic
primitive like SHA-2.

Instantiating a ``DL_Group`` simply requires calling

.. cpp:function:: DL_Group::DL_Group(const std::string& name)

  The *name* parameter is a specially formatted string that consists
  of three things: the type of the group ("modp" or "dsa"), the
  creator of the group, and the size of the group in bits, all
  delimited by '/' characters.

  Currently all "modp" groups included in Botan are ones defined by
  the Internet Engineering Task Force, so the provider is "ietf", and
  the strings look like "modp/ietf/N" where N can be any of 768, 1024,
  1536, 2048, 3072, 4096, 6144, or 8192. This group type is used
  for Diffie-Hellman and ElGamal algorithms.

  The other type, "dsa" is used for DSA and Nyberg-Rueppel keys.  They
  can also be used with Diffie-Hellman and ElGamal, but this is less
  common. The currently available groups are "dsa/jce/N" for N in 512,
  768, or 1024, and "dsa/botan/N" with N being 2048 or 3072.  The
  "jce" groups are the standard DSA groups used in the Java
  Cryptography Extensions, while the "botan" groups were randomly
  generated using the FIPS 186-3 algorithm by the library maintainers.
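
To make the three-part name format concrete, the sketch below splits a
string such as "modp/ietf/2048" into its components. It is only an
illustration of the format: ``Group_Name_Parts`` and
``split_group_name`` are hypothetical names, and the sketch does not
check whether a group with that name actually exists in the library.

.. code-block:: cpp

  #include <cstdlib>
  #include <string>

  // Hypothetical holder for the three parts of a group name.
  struct Group_Name_Parts
  {
    std::string type;     // "modp" or "dsa"
    std::string provider; // for example "ietf", "jce", or "botan"
    unsigned long bits;   // size of the group in bits
    bool valid;           // false if the string did not have three parts
  };

  Group_Name_Parts split_group_name(const std::string& name)
  {
    Group_Name_Parts parts = { "", "", 0, false };

    const std::string::size_type first = name.find('/');
    if(first == std::string::npos)
      return parts;
    const std::string::size_type second = name.find('/', first + 1);
    if(second == std::string::npos)
      return parts;

    // The last field must be a non-empty run of decimal digits.
    const std::string bits_str = name.substr(second + 1);
    if(bits_str.empty() ||
       bits_str.find_first_not_of("0123456789") != std::string::npos)
      return parts;

    parts.type = name.substr(0, first);
    parts.provider = name.substr(first + 1, second - first - 1);
    parts.bits = std::strtoul(bits_str.c_str(), 0, 10);
    parts.valid = true;
    return parts;
  }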

You can generate a new random group using

.. cpp:function:: DL_Group::DL_Group(RandomNumberGenerator& rng, PrimeType type, size_t pbits, size_t qbits = 0)

  The *type* can be either ``Strong``, ``Prime_Subgroup``, or
  ``DSA_Kosherizer``. *pbits* specifies the size of the prime in
  bits. If the *type* is ``Prime_Subgroup`` or ``DSA_Kosherizer``,
  then *qbits* specifies the size of the subgroup.

You can serialize a ``DL_Group`` using

.. cpp:function:: SecureVector<byte> DL_Group::DER_encode(Format format)

or

.. cpp:function:: std::string DL_Group::PEM_encode(Format format)

where *format* is any of

* ``ANSI_X9_42`` (or ``DH_PARAMETERS``) for modp groups
* ``ANSI_X9_57`` (or ``DSA_PARAMETERS``) for DSA-style groups
* ``PKCS_3`` is an older format for modp groups; it should only
  be used for backwards compatibility.

You can reload a serialized group using

.. cpp:function:: void DL_Group::BER_decode(DataSource& source, Format format)

.. cpp:function:: void DL_Group::PEM_decode(DataSource& source)

.. _ec_group:

EC_Group
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

An ``EC_Group`` is initialized by passing the name of the
group to be used to the constructor. These groups have
semi-standardized names like "secp256r1" and "brainpool512r1".

Key Checking
---------------------------------

Most public key algorithms have limitations or restrictions on their
parameters. For example, RSA requires an odd exponent, and algorithms
based on the discrete logarithm problem need a generator greater than 1.

Each public key type has a function

.. cpp:function:: bool Public_Key::check_key(RandomNumberGenerator& rng, bool strong)

  This function performs a number of algorithm-specific tests that the
  key seems to be mathematically valid and consistent, and returns
  true if all of the tests pass.

  It does not have anything to do with the validity of the key for any
  particular use, nor does it have anything to do with certificates
  that link a key (which, after all, is just some numbers) with a user
  or other entity. If *strong* is ``true``, then it does "strong"
  checking, which includes expensive operations like primality
  checking.
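
The kind of test involved can be sketched on toy-sized integers. The
functions below are hypothetical and far simpler than what the library
does: they check the two example conditions mentioned above, and the
"strong" path adds a primality test by trial division, which is only
practical for tiny numbers.

.. code-block:: cpp

  #include <cstdint>

  // Trial division; only suitable for very small values.
  bool is_prime_trial_division(std::uint64_t n)
  {
    if(n < 2)
      return false;
    for(std::uint64_t d = 2; d <= n / d; ++d)
      if(n % d == 0)
        return false;
    return true;
  }

  // RSA requires an odd public exponent.
  bool check_toy_rsa_exponent(std::uint64_t e)
  {
    return (e % 2) == 1;
  }

  // A discrete logarithm group needs a generator g with 1 < g < p.
  // With strong == true, also run the (expensive) primality test on p.
  bool check_toy_dl_params(std::uint64_t p, std::uint64_t g, bool strong)
  {
    if(g <= 1 || g >= p)
      return false;
    if(strong && !is_prime_trial_division(p))
      return false;
    return true;
  }

Note how a composite modulus such as 21 passes the cheap check but is
rejected once *strong* checking is requested.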

Encryption
---------------------------------

Safe public key encryption requires the use of a padding scheme which
hides the underlying mathematical properties of the algorithm.
Such schemes also add randomness, so encrypting the same
plaintext twice produces two different ciphertexts.
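
To see what padding protects against, consider unpadded ("textbook")
RSA on toy numbers. The sketch below uses the well-known small example
key n = 3233, e = 17, d = 2753 and hypothetical helper names; it is
purely illustrative and must never be used for real data.

.. code-block:: cpp

  #include <cstdint>

  // Square-and-multiply; mod must be below 2^32 so products cannot overflow.
  std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exp, std::uint64_t mod)
  {
    std::uint64_t result = 1 % mod;
    base %= mod;
    while(exp > 0)
    {
      if(exp & 1)
        result = (result * base) % mod;
      base = (base * base) % mod;
      exp >>= 1;
    }
    return result;
  }

  // Unpadded RSA: c = m^e mod n. Deterministic and malleable.
  std::uint64_t textbook_rsa_encrypt(std::uint64_t m, std::uint64_t e, std::uint64_t n)
  {
    return mod_pow(m, e, n);
  }

  // Unpadded RSA decryption: m = c^d mod n.
  std::uint64_t textbook_rsa_decrypt(std::uint64_t c, std::uint64_t d, std::uint64_t n)
  {
    return mod_pow(c, d, n);
  }

Two properties stand out. Encrypting the same message twice always
gives the same ciphertext, so an attacker can recognize repeats. The
product of two ciphertexts is also a valid ciphertext of the product of
the two messages (modulo n), so ciphertexts can be manipulated without
knowing the key.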

The primary interface for encryption is ``PK_Encryptor``, which
provides the following interface:

.. cpp:function:: SecureVector<byte> PK_Encryptor::encrypt(const byte* in, size_t length, RandomNumberGenerator& rng) const

.. cpp:function:: SecureVector<byte> PK_Encryptor::encrypt(const MemoryRegion<byte>& in, RandomNumberGenerator& rng) const


.. cpp:function::  size_t PK_Encryptor::maximum_input_size() const

   This function returns the maximum size of the message that can
   be processed, in bytes. If you call ``encrypt`` with an input
   larger than this, the operation will fail with an exception.

``PK_Encryptor`` is only an interface; to use one you have to create
an implementation. There are currently two available in the library,
``PK_Encryptor_EME`` and ``DLIES_Encryptor``. DLIES is a standard
method (from IEEE 1363) that uses a key agreement technique such as DH
or ECDH to perform message encryption. Normally, public key encryption
is done using algorithms which support it directly, such as RSA or
ElGamal; these use ``PK_Encryptor_EME``. The construction method is
simple; call

.. cpp:function:: PK_Encryptor_EME::PK_Encryptor_EME(const Public_Key& key, std::string eme)

  With *key* being the key you want to encrypt messages to. The
  padding method to use is specified in *eme*.

  The recommended values for *eme* are "EME1(SHA-1)" and
  "EME1(SHA-256)". If you need compatibility with protocols using the
  PKCS #1 v1.5 standard, you can also use "EME-PKCS1-v1_5".

The DLIES encryptor is defined in the header ``dlies.h``, and
is created by the constructor:

.. cpp:function:: DLIES_Encryptor::DLIES_Encryptor(const PK_Key_Agreement_Key&, KDF* kdf, MessageAuthenticationCode* mac, size_t mac_key_len = 20)

  Where *kdf* is a :ref:`key_derivation_function` and *mac* is a
  :ref:`message_auth_code`.

The decryption classes are named ``PK_Decryptor``,
``PK_Decryptor_EME``, and ``DLIES_Decryptor``. They are created in
exactly the same way, except they take the private key, and the
processing function is named ``decrypt``.


Signatures
---------------------------------


The signature algorithms look quite a bit like the hash functions. You
can repeatedly call ``update``, giving more and more of a message you
wish to sign, and then call ``signature``, which will return a
signature for that message. If you want to do it all in one shot, call
``sign_message``, which will just call ``update`` with its argument
and then return whatever ``signature`` returns. Generating a signature
requires random numbers with some schemes, so ``signature`` and
``sign_message`` both take a ``RandomNumberGenerator&``.

You can validate a signature by updating the verifier class, and
finally seeing if the value returned from ``check_signature`` is
true (you pass the supposed signature to the ``check_signature``
function as a byte array and a length or as a
``MemoryRegion<byte>``). There is another function,
``verify_message``, which takes a pair of byte array/length pairs (or
a pair of ``MemoryRegion<byte>`` objects), the first of which is the
message, the second being the (supposed) signature. It returns true if
the signature is valid and false otherwise.

Available public key signature algorithms in Botan are RSA, DSA,
ECDSA, GOST 34.10, Nyberg-Rueppel, and Rabin-Williams. Signature
encoding methods include EMSA1, EMSA2, EMSA3, EMSA4, and Raw. All of
them, except Raw, take a parameter naming a message digest function to
hash the message with. The Raw encoding signs the input directly; if
the message is too big, the signing operation will fail. Raw is not
useful except in very specialized applications.

There are various interactions that make certain encoding schemes and
signing algorithms more or less useful.

EMSA2 is the usual method for encoding Rabin-Williams signatures, so
for compatibility with other implementations you may have to use
that. EMSA4 (also called PSS), also works with Rabin-Williams. EMSA1
and EMSA3 do *not* work with Rabin-Williams.

RSA can be used with any of the available encoding methods. EMSA4 is
by far the most secure, but is not (as of now) widely
implemented. EMSA3 (also called "EMSA-PKCS1-v1_5") is commonly used
with RSA (for example in SSL). EMSA1 signs the message digest
directly, without any extra padding or encoding. This may be useful,
but is not as secure as either EMSA3 or EMSA4. EMSA2 may be used but
is not recommended.

For DSA, ECDSA, GOST 34.10, and Nyberg-Rueppel, you should use
EMSA1. None of the other encoding methods are particularly useful for
these algorithms.

Key Agreement
---------------------------------

You can get hold of a ``PK_Key_Agreement_Scheme`` object by calling
``get_pk_kas`` with a key of a type that supports key
agreement (such as a Diffie-Hellman key stored in a ``DH_PrivateKey``
object), and the name of a key derivation function. This can be "Raw",
meaning the output of the primitive itself is returned as the key;
"KDF1(hash)" or "KDF2(hash)", where "hash" is the name of a hash
function such as "SHA-256" or "RIPEMD-160"; or
"X9.42-PRF(keywrap)", which uses the PRF specified in ANSI X9.42. The
X9.42 PRF takes the name or OID of the key wrap algorithm that will be
used to encrypt a content encryption key.

Key agreement works as follows: you trade public values with some
other party, and then each of you runs a computation with the other's
value and your own key (this should return the same result to both
parties). This computation can be called by using
``derive_key`` with either a byte array/length pair, or a
``SecureVector<byte>`` that holds the public value of the other
party. The last argument to either call is a number that specifies how
long a key you want.
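
The exchange itself can be sketched with deliberately tiny numbers
(prime 23, generator 5). The helper names are hypothetical, there is no
key derivation step, and the group is far too small to be secure; the
point is only that both parties arrive at the same value.

.. code-block:: cpp

  #include <cstdint>

  // Square-and-multiply; mod must be below 2^32 so products cannot overflow.
  std::uint64_t mod_pow(std::uint64_t base, std::uint64_t exp, std::uint64_t mod)
  {
    std::uint64_t result = 1 % mod;
    base %= mod;
    while(exp > 0)
    {
      if(exp & 1)
        result = (result * base) % mod;
      base = (base * base) % mod;
      exp >>= 1;
    }
    return result;
  }

  // The public value a party sends to the other side: g^x mod p.
  std::uint64_t dh_public_value(std::uint64_t g, std::uint64_t x, std::uint64_t p)
  {
    return mod_pow(g, x, p);
  }

  // The shared value: (other party's public value)^x mod p.
  std::uint64_t dh_shared_value(std::uint64_t other_public, std::uint64_t x,
                                std::uint64_t p)
  {
    return mod_pow(other_public, x, p);
  }

With private values 6 and 15, the two public values are 8 and 19, and
both sides compute the shared value 2. In real use the shared value is
then passed through the key derivation function rather than being used
directly.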

Depending on the KDF you're using, you *might not* get back a key
of the size you requested. In particular "Raw" will return a number
about the size of the Diffie-Hellman modulus, and KDF1 can only return
a key that is the same size as the output of the hash. KDF2, on the
other hand, will always give you a key exactly as long as you request,
regardless of the underlying hash used with it. The key returned is a
``SymmetricKey``, ready to pass to a block cipher, MAC, or other
symmetric algorithm.

The public value that should be used can be obtained by calling
``public_data``, which exists for any key that is associated with a
key agreement algorithm. It returns a ``SecureVector<byte>``.

"KDF2(SHA-256)" is by far the preferred algorithm for key derivation
in new applications. The X9.42 algorithm may be useful in some
circumstances, but unless you need X9.42 compatibility, KDF2 is easier
to use.

An example of using Diffie-Hellman:

.. literalinclude:: examples/dh.cpp