Update compression docs

author: lloyd <[email protected]> 2015-05-10 02:39:38 +0000
committer: lloyd <[email protected]> 2015-05-10 02:39:38 +0000
commit: 9426f6d0f4a760c555379c3af642127df7e1456e (patch)
tree: 54dd0e89752d403adbe3393c1e569ec8bfb8bc01 /doc/manual
parent: a08c16ef5f5fa85ab8b46c2fcbeca2c1b40fa339 (diff)
2 files changed, 52 insertions, 42 deletions
diff --git a/doc/manual/compression.rst b/doc/manual/compression.rst
new file mode 100644
index 000000000..c58ba58a6
--- /dev/null
+++ b/doc/manual/compression.rst
@@ -0,0 +1,52 @@
+Lossless Data Compression
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Some lossless data compression algorithms are available in Botan, currently all
+via third-party libraries. These include zlib (including the deflate and gzip
+formats), bzip2, and lzma.
+
+.. note::
+   You should always compress *before* you encrypt, because encryption seeks to
+   hide the redundancy that compression is supposed to try to find and remove.
+
+All compressors provide the `Transform` interface through a subclass
+`Compression_Transform` (defined in compression.h). The compression algorithms
+have some limitations in terms of the standard API. In particular, the
+`output_length` function simply throws an exception, since the output size
+cannot be determined from the input length alone for such an algorithm.
+
+The transformations work much like any other: calling `update` on a vector
+returns the (de)compressed result, and calling `finish` completes the
+computation. All (de)compression algorithms will accept inputs of any size
+(`update_granularity` is 1) and do not require any final data to be saved and
+passed to `finish`.
+
+On `Compression_Transform` an additional function `flush` is available
+which (in addition to always acting as equivalent to an `update`) signals the
+compression function to flush as much output as possible immediately, regardless
+of considerations of compression ratio. Any compressor or decompressor may
+ignore this and treat it as equivalent to a normal update.
+
+The easiest way to get a compressor is via the functions
+
+.. cpp:function:: Compression_Transform* make_compressor(std::string type, size_t level)
+.. cpp:function:: Compression_Transform* make_decompressor(std::string type)
+
+Supported values for `type` include `zlib` (zlib format, with a checksum),
+`deflate` (raw deflate, no checksum), `gzip`, `bz2`, and `lzma`. A null pointer
+will be returned if the algorithm is unavailable. The meaning of the `level`
+parameter varies by the algorithm but generally takes a value between 1 and 9,
+with higher values typically implying better compression at the cost of more
+memory and/or CPU time consumed by the compression process. A decompressor can
+always handle the output of the matching compressor, whatever level was used.
+
+As with any consumer of complex formats, a decompressor may throw an exception
+(from either `update` or `finish`) if the input is invalid or corrupt.
+
+To use a compression algorithm in a `Pipe` use the adaptor types
+`Compression_Filter` and `Decompression_Filter` from `comp_filter.h`. The
+constructors of both filters take a `std::string` argument (passed to
+`make_compressor` or `make_decompressor`); the compression filter also takes a
+`level` parameter. Finally, both constructors have a parameter `buf_sz` which
+specifies the size of the internal buffer that will be used; inputs will be
+broken into blocks of this size. The default is 4096.
diff --git a/doc/manual/filters.rst b/doc/manual/filters.rst
index e8016eac7..bd73739af 100644
--- a/doc/manual/filters.rst
+++ b/doc/manual/filters.rst
@@ -693,48 +693,6 @@ letters for its output.
 You can find the declarations for these types in ``hex_filt.h`` and
 ``b64_filt.h``.
 
-Compressors
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-There are two compression algorithms supported by Botan, zlib and
-bzip2. Only lossless compression algorithms are currently supported by
-Botan, because they tend to be the most useful for
-cryptography. However, it is very reasonable to consider supporting
-something like GSM speech encoding (which is lossy), for use in
-encrypted voice applications.
-
-You should always compress *before* you encrypt, because encryption seeks
-to hide the redundancy that compression is supposed to try to find and remove.
-
-To test for Bzip2, check to see if ``BOTAN_HAS_COMPRESSOR_BZIP2`` is
-defined. If so, you can include ``botan/bzip2.h``, which will declare
-a pair of ``Filter`` objects: ``Bzip2_Compression`` and
-``Bzip2_Decompression``.
-
-You should be prepared to take an exception when using the
-decompressing filter, for if the input is not valid bzip2 data, that
-is what you will receive. You can specify the desired level of
-compression to ``Bzip2_Compression``'s constructor as an integer
-between 1 and 9, 1 meaning worst compression, and 9 meaning the
-best. The default is to use 9, since small values take the same amount
-of time, just use a little less memory.
-
-Zlib compression works much like Bzip2 compression. The only
-differences in this case are that the macro is
-``BOTAN_HAS_COMPRESSOR_ZLIB``, the header you need to include is
-called ``botan/zlib.h`` (remember that you shouldn't just ``#include
-<zlib.h>``, or you'll get the regular zlib API, which is not what you
-want). The Botan classes for zlib compression/decompression are called
-``Zlib_Compression`` and ``Zlib_Decompression``.
-
-Like Bzip2, a ``Zlib_Decompression`` object will throw an exception if
-invalid (in the sense of not being in the Zlib format) data is passed
-into it.
-
-While the zlib compression library uses the same compression algorithm
-as the gzip and zip programs, the format is different. The zlib format
-is defined in RFC 1950.
-
 Writing New Filters
 ---------------------------------
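
The `update` / `flush` / `finish` calling pattern described in the new compression.rst can be illustrated with a small stand-in. This is only a sketch and does not use Botan: `Toy_RLE_Compressor` and `compress_in_chunks` are hypothetical names, the algorithm is a trivial run-length encoder, and the in-place buffer convention (the vector passed in is replaced by the output) is an assumption made for the illustration. It shows inputs of any size being accepted, `finish` draining whatever state remains, and `flush` forcing output early at some cost in compression ratio.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in, NOT a Botan class: a toy run-length encoder that
// emits (count, byte) pairs and mimics the update/flush/finish pattern.
class Toy_RLE_Compressor
   {
   public:
      // Consume buf and replace it with whatever output is ready so far.
      // The current run stays buffered, since more of it may still arrive.
      void update(std::vector<std::uint8_t>& buf) { process(buf, false); }

      // Acts like update, but also emits the pending run immediately,
      // even though that may split a run and hurt the compression ratio.
      void flush(std::vector<std::uint8_t>& buf) { process(buf, true); }

      // Consume the last input and emit everything that is still pending.
      void finish(std::vector<std::uint8_t>& buf) { process(buf, true); }

   private:
      void process(std::vector<std::uint8_t>& buf, bool drain)
         {
         std::vector<std::uint8_t> out;
         for(std::uint8_t b : buf)
            {
            // Close the current run when the byte changes or the count is full
            if(m_count > 0 && (b != m_byte || m_count == 255))
               emit(out);
            m_byte = b;
            ++m_count;
            }
         if(drain && m_count > 0)
            emit(out);
         buf.swap(out);
         }

      void emit(std::vector<std::uint8_t>& out)
         {
         out.push_back(m_count);
         out.push_back(m_byte);
         m_count = 0;
         }

      std::uint8_t m_byte = 0;
      std::uint8_t m_count = 0;
   };

// Feed the input in arbitrary chunks: update for all but the last, then finish.
std::vector<std::uint8_t>
compress_in_chunks(const std::vector<std::vector<std::uint8_t>>& chunks)
   {
   Toy_RLE_Compressor comp;
   std::vector<std::uint8_t> result;
   for(std::size_t i = 0; i != chunks.size(); ++i)
      {
      std::vector<std::uint8_t> buf = chunks[i];
      if(i + 1 == chunks.size())
         comp.finish(buf);
      else
         comp.update(buf);
      result.insert(result.end(), buf.begin(), buf.end());
      }
   return result;
   }
```

In this sketch a run that spans two chunks is encoded as a single pair when the first chunk goes through `update`, but as two pairs when it goes through `flush`, which is the ratio trade-off the documentation describes.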