.. _pbkdf: PBKDF Algorithms ======================================== There are various procedures (usually ad-hoc) for turning a passphrase into a (mostly) arbitrary length key for a symmetric cipher. A general interface for such algorithms is presented in ``pbkdf.h``. The main function is ``derive_key``, which takes a passphrase, a salt, an iteration count, and the desired length of the output key, and returns a key of that length, deterministically produced from the passphrase and salt. If an algorithm can't produce a key of that size, it will throw an exception (most notably, PKCS #5's PBKDF1 can only produce strings between 1 and $n$ bytes, where $n$ is the output size of the underlying hash function). The purpose of the iteration count is to make the algorithm take longer to compute the final key (reducing the speed of brute-force attacks of various kinds). Most standards recommend an iteration count of at least 10000. Currently defined PBKDF algorithms are "PBKDF1(digest)", "PBKDF2(digest)", and "OpenPGP-S2K(digest)"; you can retrieve any of these using the ``get_pbkdf``, found in ``lookup.h``. As of this writing, "PBKDF2(SHA-256)" with 10000 iterations and a 16 byte salt is recommend for new applications. OpenPGP S2K ---------------------------------------- There are some oddities about OpenPGP's S2K algorithms that are documented here. For one thing, it uses the iteration count in a strange manner; instead of specifying how many times to iterate the hash, it tells how many *bytes* should be hashed in total (including the salt). So the exact iteration count will depend on the size of the salt (which is fixed at 8 bytes by the OpenPGP standard, though the implementation will allow any salt size) and the size of the passphrase. To get what OpenPGP calls "Simple S2K", set iterations to 0, and do not specify a salt. To get "Salted S2K", again leave the iteration count at 0, but give an 8-byte salt. "Salted and Iterated S2K" requires an 8-byte salt and some iteration count (this should be significantly larger than the size of the longest passphrase that might reasonably be used; somewhere from 1024 to 65536 would probably be about right). Using both a reasonably sized salt and a large iteration count is highly recommended to prevent password guessing attempts. Password Hashing ======================================== Storing passwords for user authentication purposes in plaintext is the simplest but least secure method; when an attacker compromises the database in which the passwords are stored, they immediately gain access to all of them. Often passwords are reused among multiple services or machines, meaning once a password to a single service is known an attacker has a substantial head start on attacking other machines. The general approach is to store, instead of the password, the output of a one way function of the password. Upon receiving an authentication request, the authenticator can recompute the one way function and compare the value just computed with the one that was stored. If they match, then the authentication request succeeds. But when an attacker gains access to the database, they only have the output of the one way function, not the original password. Common hash functions such as SHA-256 are one way, but used alone they have problems for this purpose. What an attacker can do, upon gaining access to such a stored password database, is hash common dictionary words and other possible passwords, storing them in a list. Then he can search through his list; if a stored hash and an entry in his list match, then he has found the password. Even worse, this can happen *offline*: an attacker can begin hashing common passwords days, months, or years before ever gaining access to the database. In addition, if two users choose the same password, the one way function output will be the same for both of them, which will be visible upon inspection of the database. There are two solutions to these problems: salting and iteration. Salting refers to including, along with the password, a randomly chosen value which perturbs the one way function. Salting can reduce the effectivness of offline dictionary generation (because for each potential password, an attacker would have to compute the one way function output for all possible salts - with a large enough salt, this can make the problem quite difficult). It also prevents the same password from producing the same output, as long as the salts do not collide. With a large salt (say 80 to 128 bits) this will be quite unlikely. Iteration refers to the general technique of forcing multiple one way function evaluations when computing the output, to slow down the operation. For instance if hashing a single password requires running SHA-256 100,000 times instead of just once, that will slow down user authentication by a factor of 100,000, but user authentication happens quite rarely, and usually there are more expensive operations that need to occur anyway (network and database I/O, etc). On the other hand, an attacker who is attempting to break a database full of stolen password hashes will be seriously inconvenienced by a factor of 100,000 slowdown; they will be able to only test at a rate of .0001% of what they would without iterations (or, equivalently, will require 100,000 times as many zombie botnet hosts). Botan provides two techniques for password hashing, bcrypt and passhash9. Bcrypt ---------------------------------------- Bcrypt is a password hashing scheme originally designed for use in OpenBSD, but numerous other implementations exist. It is made available by including ``bcrypt.h``. Bcrypt provides outputs that look like this:: "$2a$12$7KIYdyv8Bp32WAvc.7YvI.wvRlyVn0HP/EhPmmOyMQA4YKxINO0p2" .. cpp:function:: std::string generate_bcrypt(const std::string& password, RandomNumberGenerator& rng, u16bit work_factor = 10) Takes the password to hash, a rng, and a work factor. Higher values increase the amount of time the algorithm runs, increasing the cost of cracking attempts. The resulting hash is returned as a string. .. cpp:function:: bool check_bcrypt(const std::string& password, const std::string& hash) Takes a password and a bcrypt output and returns true if the password is the same as the one that was used to generate the bcrypt hash. Here is an example of using bcrypt: .. literalinclude: examples/bcrypt.cpp Passhash9 ---------------------------------------- Botan also provides a password hashing technique called passhash9, in ``passhash9.h``, which is based on PBKDF2. .. cpp:function:: std::string generate_passhash9(const std::string& password, RandomNumberGenerator& rng, u16bit work_factor = 10) Functions much like ``generate_bcrypt`` .. cpp:function:: std::string generate_passhash9(const std::string& password, byte alg_id, RandomNumberGenerator& rng, u16bit work_factor = 10) Like the other ``generate_passhash9``, but taking a parameter that specifies which PRF to use. Currently defined values are 0 ("HMAC(SHA-1)"), 1 ("HMAC(SHA-256)"), and 2 ("CMAC(Blowfish)"). .. cpp:function:: bool check_passhash9(const std::string& password, const std::string& hash) Functions much like ``check_bcrypt``