aboutsummaryrefslogtreecommitdiffstats
path: root/doc/passhash.txt
diff options
context:
space:
mode:
authorlloyd <[email protected]>2011-04-08 14:35:28 +0000
committerlloyd <[email protected]>2011-04-08 14:35:28 +0000
commit438f3eb73e494fcab82b239452d712bec06f48c9 (patch)
tree97c0c0cc826715a73688d2788045c804b832f57f /doc/passhash.txt
parent5c44330158b5cf9daaf43140c5cfd15f98c6f1e8 (diff)
Pile more doc updates
Diffstat (limited to 'doc/passhash.txt')
-rw-r--r--doc/passhash.txt152
1 files changed, 152 insertions, 0 deletions
diff --git a/doc/passhash.txt b/doc/passhash.txt
new file mode 100644
index 000000000..a7a18ebb4
--- /dev/null
+++ b/doc/passhash.txt
@@ -0,0 +1,152 @@
+
+.. _pbkdf:
+
+PBKDF Algorithms
+---------------------------------
+
+There are various procedures (usually ad-hoc) for turning a
+passphrase into a (mostly) arbitrary length key for a symmetric
+cipher. A general interface for such algorithms is presented in
+``pbkdf.h``. The main function is ``derive_key``, which
+takes a passphrase, a salt, an iteration count, and the desired length
+of the output key, and returns a key of that length, deterministically
+produced from the passphrase and salt. If an algorithm can't produce a
+key of that size, it will throw an exception (most notably, PKCS #5's
+PBKDF1 can only produce strings between 1 and $n$ bytes, where $n$ is
+the output size of the underlying hash function).
+
+The purpose of the iteration count is to make the algorithm take
+longer to compute the final key (reducing the speed of brute-force
+attacks of various kinds). Most standards recommend an iteration count
+of at least 10000. Currently defined PBKDF algorithms are
+"PBKDF1(digest)", "PBKDF2(digest)", and "OpenPGP-S2K(digest)"; you can
+retrieve any of these using the ``get_pbkdf``, found in
+``lookup.h``. As of this writing, "PBKDF2(SHA-256)" with 10000
+iterations and a 16 byte salt is recommend for new applications.
+
+OpenPGP S2K
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+There are some oddities about OpenPGP's S2K algorithms that are
+documented here. For one thing, it uses the iteration count in a
+strange manner; instead of specifying how many times to iterate the
+hash, it tells how many *bytes* should be hashed in total
+(including the salt). So the exact iteration count will depend on the
+size of the salt (which is fixed at 8 bytes by the OpenPGP standard,
+though the implementation will allow any salt size) and the size of
+the passphrase.
+
+To get what OpenPGP calls "Simple S2K", set iterations to 0, and do
+not specify a salt. To get "Salted S2K", again leave the iteration
+count at 0, but give an 8-byte salt. "Salted and Iterated S2K"
+requires an 8-byte salt and some iteration count (this should be
+significantly larger than the size of the longest passphrase that
+might reasonably be used; somewhere from 1024 to 65536 would probably
+be about right). Using both a reasonably sized salt and a large
+iteration count is highly recommended to prevent password guessing
+attempts.
+
+Password Hashing
+---------------------------------
+
+Storing passwords for user authentication purposes in plaintext is the
+simplest but least secure method; when an attacker compromises the
+database in which the passwords are stored, they immediately gain
+access to all of them. Often passwords are reused among multiple
+services or machines, meaning once a password to a single service is
+known an attacker has a substantial head start on attacking other
+machines.
+
+The general approach is to store, instead of the password, the output
+of a one way function of the password. Upon receiving an
+authentication request, the authenticator can recompute the one way
+function and compare the value just computed with the one that was
+stored. If they match, then the authentication request succeeds. But
+when an attacker gains access to the database, they only have the
+output of the one way function, not the original password.
+
+Common hash functions such as SHA-256 are one way, but used alone they
+have problems for this purpose. What an attacker can do, upon gaining
+access to such a stored password database, is hash common dictionary
+words and other possible passwords, storing them in a list. Then he
+can search through his list; if a stored hash and an entry in his list
+match, then he has found the password. Even worse, this can happen
+*offline*: an attacker can begin hashing common passwords days,
+months, or years before ever gaining access to the database. In
+addition, if two users choose the same password, the one way function
+output will be the same for both of them, which will be visible upon
+inspection of the database.
+
+There are two solutions to these problems: salting and
+iteration. Salting refers to including, along with the password, a
+randomly chosen value which perturbs the one way function. Salting can
+reduce the effectivness of offline dictionary generation (because for
+each potential password, an attacker would have to compute the one way
+function output for all possible salts - with a large enough salt,
+this can make the problem quite difficult). It also prevents the same
+password from producing the same output, as long as the salts do not
+collide. With a large salt (say 80 to 128 bits) this will be quite
+unlikely. Iteration refers to the general technique of forcing
+multiple one way function evaluations when computing the output, to
+slow down the operation. For instance if hashing a single password
+requires running SHA-256 100,000 times instead of just once, that will
+slow down user authentication by a factor of 100,000, but user
+authentication happens quite rarely, and usually there are more
+expensive operations that need to occur anyway (network and database
+I/O, etc). On the other hand, an attacker who is attempting to break a
+database full of stolen password hashes will be seriously
+inconvenienced by a factor of 100,000 slowdown; they will be able to
+only test at a rate of .0001% of what they would without iterations
+(or, equivalently, will require 100,000 times as many zombie botnet
+hosts).
+
+There are many different ways of doing this password hashing
+operation, with common ones including Unix's crypt (which is based on
+DES) and OpenBSD's bcrypt (based on Blowfish). Other variants using
+MD5 or SHA-256 are also in use on various systems.
+
+Botan provides two techniques, passhash9 and bcrypt
+
+Passhash9
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Botan provides a password hashing technique called passhash9, in
+``passhash9.h``, which is based on PBKDF2. A usage example can be
+found in ``doc/examples/passhash.cpp``. Three functions are provided
+in this header:
+
+.. cpp:function:: std::string generate_passhash9(const std::string& password, RandomNumberGenerator& rng, u16bit work_factor = 10)
+
+ Takes the password to hash, a rng, and a work factor, which tells
+ how many iterations to compute. The default work factor is 10
+ (which means 100,000 iterations), but any non-zero value is
+ accepted.
+
+.. cpp:function:: std::string generate_passhash9(const std::string& password, byte alg_id, RandomNumberGenerator& rng, u16bit work_factor = 10)
+
+ Like the other ``generate_passhash9``, but taking a parameter that
+ specifies which PRF to use. Currently defined values are 0
+ ("HMAC(SHA-1)"), 1 ("HMAC(SHA-256)"), and 2 ("CMAC(Blowfish)").
+
+.. cpp:function:: bool check_passhash9(const std::string& password, const std::string& hash)
+
+ Takes a password and a passhash9 output and checks if the password
+ is the same as the one that was used to generate the passhash9
+ output, returning a boolean true (same) or false (not same).
+
+Bcrypt
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Bcrypt is a password hashing scheme originally designed for use in
+OpenBSD, but numerous other implementations exist. It is made
+available by including ``bcrypt.h``, and provides the functions
+
+.. cpp:function:: std::string generate_bcrypt(const std::string& password, RandomNumberGenerator& rng, u16bit work_factor = 10)
+
+and
+
+.. cpp:function:: bool check_bcrypt(const std::string& password, const std::string& hash)
+
+These work in exactly the same way as the passhash9 password hashing
+functions.
+