Update side channel doc

author: Jack Lloyd <[email protected]> 2018-03-20 18:24:47 -0400
committer: Jack Lloyd <[email protected]> 2018-03-20 18:24:47 -0400
commit: eb5bc1b1ea573c44efe58b18ed07a82db88029fb (patch)
tree: b640aa4c2859b35e6fdf3f4bb2ae5d13691c99fb /doc/manual/side_channels.rst
parent: c53f4fa8686a59f2360e4609f4fc9cdc66ce00d1 (diff)
1 files changed, 34 insertions, 29 deletions
diff --git a/doc/manual/side_channels.rst b/doc/manual/side_channels.rst
index 01d20a66d..6d6bd74bb 100644
--- a/doc/manual/side_channels.rst
+++ b/doc/manual/side_channels.rst
@@ -128,16 +128,20 @@ protocols ([InvalidCurve] [InvalidCurveTLS]). See point_gfp.cpp.
 ECC scalar multiply
 ----------------------
 
-There are two implementations of scalar multiply, PointGFp::operator* and the
-class Blinded_Point_Multiply. The default scalar multiply uses the Montgomery
-ladder. However it currently leaks the size of the scalar, because the loop
-iterations are bounded by the scalar size.
+There are several different implementations of ECC scalar multiplications which
+depend on the API invoked. This include ``PointGFp::operator*``,
+``EC_Group::blinded_base_point_multiply`` and
+``EC_Group::blinded_var_point_multiply``.
 
-Blinded_Point_Multiply (used by ECDH, ECDSA, etc) applies several additional
-side channel countermeasures. The scalar is masked by a small multiple of the
-group order (this is commonly called Coron's first countermeasure [CoronDpa]),
-the size of the scalar mask is currently controlled by build.h value
-BOTAN_POINTGFP_SCALAR_BLINDING_BITS which defaults to 20 bits.
+The ``PointGFp::operator*`` implementation uses the Montgomery ladder, which is
+fairly resistant to side channels. However it leaks the size of the scalar,
+because the loop iterations are bounded by the scalar size. It should not be
+used in cases when the scalar is a secret.
+
+Both ``blinded_base_point_multiply`` and ``blinded_var_point_multiply`` apply
+side channel countermeasures. The scalar is masked by a multiple of the group
+order (this is commonly called Coron's first countermeasure [CoronDpa]),
+currently the multiplier is an 80 bit integer.
 
 Botan stores all ECC points in Jacobian representation. This form allows faster
 computation by representing points (x,y) as (X,Y,Z) where x=X/Z^2 and
@@ -147,16 +151,17 @@ attacker from mounting attacks based on the input point remaining unchanged over
 multiple executions. This is commonly called Coron's third countermeasure, see
 again [CoronDpa].
 
-Currently Blinded_Point_Multiply uses one of two different algorithms, depending
-on a build-time flag. If BOTAN_POINTGFP_BLINDED_MULTIPLY_USE_MONTGOMERY_LADDER
-is set in build.h (default is for it *not* to be set), then a randomized
-Montgomery ladder algorithm from [RandomMonty] is used. Otherwise, a simple
-fixed window exponentiation is used; the current version leaks exponent bits
-through memory index values. We rely on scalar blinding to reduce this
-leakage. It would obviously be better for Blinded_Point_Multiply to converge on
-a single side channel silent algorithm.
+The base point multiplication algorithm is a comb-like technique which
+precomputes ``P^i,(2*P)^i,(3*P)^i`` for all ``i`` in the range of valid scalars.
+This means the scalar multiplication involves only point additions and no
+doublings, which may help against attacks which rely on distinguishing between
+point doublings and point additions.
+
+The variable point multiplication algorithm uses a simple fixed-window
+exponentiation algorithm. Since this is normally invoked using untrusted points
+(eg in ECDH key exchange) it randomizes all inputs.
 
-See point_gfp.cpp.
+See point_gfp.cpp and point_mul.cpp
 
 ECDH
 ----------------------
@@ -177,7 +182,7 @@ are performed using a constant time algorithm due to Niels Möller.
 x25519
 ----------------------
 
-The x25519 code is independent of the main Weiserstrass form ECC code, instead
+The x25519 code is independent of the main Weierstrass form ECC code, instead
 based on curve25519-donna-c64.c by Adam Langley. The code seems immune to cache
 based side channels. It does make use of integer multiplications; on some old
 CPUs these multiplications take variable time and might allow a side channel
@@ -221,14 +226,14 @@ It is based on code by Mike Hamburg [VectorAes], see aes_ssse3.cpp. This same
 technique could be applied with NEON or AltiVec, and the paper suggests some
 optimizations for the AltiVec shuffle.
 
-On all other processors, a class 4K table lookup version based on the original
-Rijndael code is used. This approach relatively fast, but now known to be very
-vulnerable to side channels. The implementation does make modifications in the
-first and last rounds to reduce the cache signature, but these merely increase
-the number of observations required. See [AesCacheColl] for one paper which
-analyzes a number of implementations including Botan. Botan already follows both
-of their suggested countermeasures, which increased the number of samples
-required from 2**13 to the only slightly less pitiful 2**19 samples.
+On all other processors, a table lookup version derived from the original
+Rijndael code is used. This approach is relatively fast, but now known to be
+very vulnerable to side channels. To reduce the side channel signature, it uses
+only a 1K table (instead of 4 1K tables which is typical) and uses small tables
+in the first and last rounds. See [AesCacheColl] for one paper which analyzes a
+number of implementations including (an older version of) Botan. Botan already
+follows both of their suggested countermeasures, which increased the number of
+samples required from 2**13 to the only slightly less pitiful 2**19 samples.
 
 The Botan block cipher API already supports bitslicing implementations, so a
 const time 8x bitsliced AES could be integrated fairly easily.
@@ -236,11 +241,11 @@ const time 8x bitsliced AES could be integrated fairly easily.
 GCM
 ---------------------
 
-On platforms that support a carryless multiply instruction (recent x86 and ARM),
+On platforms that support a carryless multiply instruction (ARMv8 and recent x86),
 GCM is fast and constant time.
 
 On all other platforms, GCM uses a slow but constant time algorithm. There is
-also an SSSE3 variant of the same (still slow) algorithm.
+also an SSSE3 variant of the same (still relatively slow) algorithm.
 
 OCB
 -----------------------
author	Jack Lloyd <[email protected]>	2018-03-20 18:24:47 -0400
committer	Jack Lloyd <[email protected]>	2018-03-20 18:24:47 -0400
commit	eb5bc1b1ea573c44efe58b18ed07a82db88029fb (patch)
tree	b640aa4c2859b35e6fdf3f4bb2ae5d13691c99fb /doc/manual/side_channels.rst
parent	c53f4fa8686a59f2360e4609f4fc9cdc66ce00d1 (diff)