contrib/python-zstandard/README.rst
changeset 31796 e0dc40530c5a
parent 30895 c32454d69b85
child 37495 b1fb341d8a61
equal deleted inserted replaced
31795:2b130e26c3a4 31796:e0dc40530c5a
    18 
    18 
    19 State of Project
    19 State of Project
    20 ================
    20 ================
    21 
    21 
    22 The project is officially in beta state. The author is reasonably satisfied
    22 The project is officially in beta state. The author is reasonably satisfied
    23 with the current API and that functionality works as advertised. There
    23 that functionality works as advertised. **There will be some backwards
    24 may be some backwards incompatible changes before 1.0. Though the author
    24 incompatible changes before 1.0, probably in the 0.9 release.** This may
    25 does not intend to make any major changes to the Python API.
    25 involve renaming the main module from *zstd* to *zstandard* and renaming
       
    26 various types and methods. Pin the package version to prevent unwanted
       
    27 breakage when this change occurs!
    26 
    28 
    27 This project is vendored and distributed with Mercurial 4.1, where it is
    29 This project is vendored and distributed with Mercurial 4.1, where it is
    28 used in a production capacity.
    30 used in a production capacity.
    29 
    31 
    30 There is continuous integration for Python versions 2.6, 2.7, and 3.3+
    32 There is continuous integration for Python versions 2.6, 2.7, and 3.3+
    31 on Linux x86_64 and Windows x86 and x86_64. The author is reasonably
    33 on Linux x86_64 and Windows x86 and x86_64. The author is reasonably
    32 confident the extension is stable and works as advertised on these
    34 confident the extension is stable and works as advertised on these
    33 platforms.
    35 platforms.
    34 
    36 
       
    37 The CFFI bindings are mostly feature complete. Where a feature is implemented
       
    38 in CFFI, unit tests run against both C extension and CFFI implementation to
       
    39 ensure behavior parity.
       
    40 
    35 Expected Changes
    41 Expected Changes
    36 ----------------
    42 ----------------
    37 
    43 
    38 The author is reasonably confident in the current state of what's
    44 The author is reasonably confident in the current state of what's
    39 implemented on the ``ZstdCompressor`` and ``ZstdDecompressor`` types.
    45 implemented on the ``ZstdCompressor`` and ``ZstdDecompressor`` types.
    45 sizes using zstd's preferred defaults).
    51 sizes using zstd's preferred defaults).
    46 
    52 
    47 There should be an API that accepts an object that conforms to the buffer
    53 There should be an API that accepts an object that conforms to the buffer
    48 interface and returns an iterator over compressed or decompressed output.
    54 interface and returns an iterator over compressed or decompressed output.
    49 
    55 
       
    56 There should be an API that exposes an ``io.RawIOBase`` interface to
       
    57 compressor and decompressor streams, like how ``gzip.GzipFile`` from
       
    58 the standard library works (issue 13).
       
    59 
    50 The author is on the fence as to whether to support the extremely
    60 The author is on the fence as to whether to support the extremely
    51 low level compression and decompression APIs. It could be useful to
    61 low level compression and decompression APIs. It could be useful to
    52 support compression without the framing headers. But the author doesn't
    62 support compression without the framing headers. But the author doesn't
    53 believe it a high priority at this time.
    63 believe it a high priority at this time.
    54 
    64 
    55 The CFFI bindings are feature complete and all tests run against both
    65 There will likely be a refactoring of the module names. Currently,
    56 the C extension and CFFI bindings to ensure behavior parity.
    66 ``zstd`` is a C extension and ``zstd_cffi`` is the CFFI interface.
       
    67 This means that all code for the C extension must be implemented in
       
    68 C. ``zstd`` may be converted to a Python module so code can be reused
       
    69 between CFFI and C and so not all code in the C extension has to be C.
    57 
    70 
    58 Requirements
    71 Requirements
    59 ============
    72 ============
    60 
    73 
    61 This extension is designed to run with Python 2.6, 2.7, 3.3, 3.4, 3.5, and
    74 This extension is designed to run with Python 2.6, 2.7, 3.3, 3.4, 3.5, and
   150 A Tox configuration is present to test against multiple Python versions::
   163 A Tox configuration is present to test against multiple Python versions::
   151 
   164 
   152    $ tox
   165    $ tox
   153 
   166 
   154 Tests use the ``hypothesis`` Python package to perform fuzzing. If you
   167 Tests use the ``hypothesis`` Python package to perform fuzzing. If you
   155 don't have it, those tests won't run.
   168 don't have it, those tests won't run. Since the fuzzing tests take longer
   156 
   169 to execute than normal tests, you'll need to opt in to running them by
   157 There is also an experimental CFFI module. You need the ``cffi`` Python
   170 setting the ``ZSTD_SLOW_TESTS`` environment variable. This is set
   158 package installed to build and test that.
   171 automatically when using ``tox``.
       
   172 
       
   173 The ``cffi`` Python package needs to be installed in order to build the CFFI
       
   174 bindings. If it isn't present, the CFFI bindings won't be built.
   159 
   175 
   160 To create a virtualenv with all development dependencies, do something
   176 To create a virtualenv with all development dependencies, do something
   161 like the following::
   177 like the following::
   162 
   178 
   163   # Python 2
   179   # Python 2
   170   $ pip install cffi hypothesis nose tox
   186   $ pip install cffi hypothesis nose tox
   171 
   187 
   172 API
   188 API
   173 ===
   189 ===
   174 
   190 
   175 The compiled C extension provides a ``zstd`` Python module. This module
   191 The compiled C extension provides a ``zstd`` Python module. The CFFI
   176 exposes the following interfaces.
   192 bindings provide a ``zstd_cffi`` module. Both provide an identical API
       
   193 interface. The types, functions, and attributes exposed by these modules
       
   194 are documented in the sections below.
       
   195 
       
   196 .. note::
       
   197 
       
   198    The documentation in this section makes references to various zstd
       
   199    concepts and functionality. The ``Concepts`` section below explains
       
   200    these concepts in more detail.
   177 
   201 
   178 ZstdCompressor
   202 ZstdCompressor
   179 --------------
   203 --------------
   180 
   204 
   181 The ``ZstdCompressor`` class provides an interface for performing
   205 The ``ZstdCompressor`` class provides an interface for performing
   207    likely not true for streaming compression.
   231    likely not true for streaming compression.
   208 write_dict_id
   232 write_dict_id
   209    Whether to write the dictionary ID into the compressed data.
   233    Whether to write the dictionary ID into the compressed data.
   210    Defaults to True. The dictionary ID is only written if a dictionary
   234    Defaults to True. The dictionary ID is only written if a dictionary
   211    is being used.
   235    is being used.
       
   236 threads
       
   237    Enables and sets the number of threads to use for multi-threaded compression
       
   238    operations. Defaults to 0, which means to use single-threaded compression.
       
   239    Negative values will resolve to the number of logical CPUs in the system.
       
   240    Read below for more info on multi-threaded compression. This argument only
       
   241    controls thread count for operations that operate on individual pieces of
       
   242    data. APIs that spawn multiple threads for working on multiple pieces of
       
   243    data have their own ``threads`` argument.
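
For example (a minimal sketch using the arguments described above), a
compressor that compresses on all detected logical CPUs could be constructed
like this::

   import zstd

   # threads=-1 resolves to the number of logical CPUs in the system.
   cctx = zstd.ZstdCompressor(level=6, threads=-1)
   compressed = cctx.compress(b'data to compress' * 1024)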
   212 
   244 
   213 Unless specified otherwise, assume that no two methods of ``ZstdCompressor``
   245 Unless specified otherwise, assume that no two methods of ``ZstdCompressor``
   214 instances can be called from multiple Python threads simultaneously. In other
   246 instances can be called from multiple Python threads simultaneously. In other
   215 words, assume instances are not thread safe unless stated otherwise.
   247 words, assume instances are not thread safe unless stated otherwise.
   216 
   248 
   219 
   251 
   220 ``compress(data)`` compresses and returns data as a one-shot operation.::
   252 ``compress(data)`` compresses and returns data as a one-shot operation.::
   221 
   253 
   222    cctx = zstd.ZstdCompressor()
   254    cctx = zstd.ZstdCompressor()
   223    compressed = cctx.compress(b'data to compress')
   255    compressed = cctx.compress(b'data to compress')
       
   256 
       
   257 The ``data`` argument can be any object that implements the *buffer protocol*.
   224 
   258 
   225 Unless ``compression_params`` or ``dict_data`` are passed to the
   259 Unless ``compression_params`` or ``dict_data`` are passed to the
   226 ``ZstdCompressor``, each invocation of ``compress()`` will calculate the
   260 ``ZstdCompressor``, each invocation of ``compress()`` will calculate the
   227 optimal compression parameters for the configured compression ``level`` and
   261 optimal compression parameters for the configured compression ``level`` and
   228 input data size (some parameters are fine-tuned for small input sizes).
   262 input data size (some parameters are fine-tuned for small input sizes).
   409    cctx = zstd.ZstdCompressor()
   443    cctx = zstd.ZstdCompressor()
   410    cobj = cctx.compressobj(size=6)
   444    cobj = cctx.compressobj(size=6)
   411    data = cobj.compress(b'foobar')
   445    data = cobj.compress(b'foobar')
   412    data = cobj.flush()
   446    data = cobj.flush()
   413 
   447 
       
   448 Batch Compression API
       
   449 ^^^^^^^^^^^^^^^^^^^^^
       
   450 
       
   451 (Experimental. Not yet supported in CFFI bindings.)
       
   452 
       
   453 ``multi_compress_to_buffer(data, [threads=0])`` performs compression of multiple
       
   454 inputs as a single operation.
       
   455 
       
   456 Data to be compressed can be passed as a ``BufferWithSegmentsCollection``, a
       
   457 ``BufferWithSegments``, or a list containing byte like objects. Each element of
       
   458 the container will be compressed individually using the configured parameters
       
   459 on the ``ZstdCompressor`` instance.
       
   460 
       
   461 The ``threads`` argument controls how many threads to use for compression. The
       
   462 default is ``0`` which means to use a single thread. Negative values use the
       
   463 number of logical CPUs in the machine.
       
   464 
       
   465 The function returns a ``BufferWithSegmentsCollection``. This type represents
       
   466 N discrete memory allocations, each holding 1 or more compressed frames.
       
   467 
       
   468 Output data is written to shared memory buffers. This means that unlike
       
   469 regular Python objects, a reference to *any* object within the collection
       
   470 keeps the shared buffer and therefore memory backing it alive. This can have
       
   471 undesirable effects on process memory usage.
       
   472 
       
   473 The API and behavior of this function is experimental and will likely change.
       
   474 Known deficiencies include:
       
   475 
       
   476 * If asked to use multiple threads, it will always spawn that many threads,
       
   477   even if the input is too small to use them. It should automatically lower
       
   478   the thread count when the extra threads would just add overhead.
       
   479 * The buffer allocation strategy is fixed. There is room to make it dynamic,
       
   480   perhaps even to allow one output buffer per input, facilitating a variation
       
   481   of the API to return a list without the adverse effects of shared memory
       
   482   buffers.
       
   483 
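A minimal usage sketch, relying only on the behavior described above::

   import zstd

   cctx = zstd.ZstdCompressor()

   # Each input is compressed into its own frame. Output frames share
   # backing buffers, so holding any result keeps that memory alive.
   result = cctx.multi_compress_to_buffer([b'chunk 0' * 100, b'chunk 1' * 100],
                                          threads=2)

   for i in range(len(result)):
       frame = result[i].tobytes()  # copy of an individual compressed frame
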
   414 ZstdDecompressor
   484 ZstdDecompressor
   415 ----------------
   485 ----------------
   416 
   486 
   417 The ``ZstdDecompressor`` class provides an interface for performing
   487 The ``ZstdDecompressor`` class provides an interface for performing
   418 decompression.
   488 decompression.
   583    dctx = zstd.ZstdDecompressor()
   653    dctx = zstd.ZstdDecompressor()
   584    dobj = dctx.decompressobj()
   654    dobj = dctx.decompressobj()
   585    data = dobj.decompress(compressed_chunk_0)
   655    data = dobj.decompress(compressed_chunk_0)
   586    data = dobj.decompress(compressed_chunk_1)
   656    data = dobj.decompress(compressed_chunk_1)
   587 
   657 
       
   658 Batch Decompression API
       
   659 ^^^^^^^^^^^^^^^^^^^^^^^
       
   660 
       
   661 (Experimental. Not yet supported in CFFI bindings.)
       
   662 
       
   663 ``multi_decompress_to_buffer()`` performs decompression of multiple
       
   664 frames as a single operation and returns a ``BufferWithSegmentsCollection``
       
   665 containing decompressed data for all inputs.
       
   666 
       
   667 Compressed frames can be passed to the function as a ``BufferWithSegments``,
       
   668 a ``BufferWithSegmentsCollection``, or as a list containing objects that
       
   669 conform to the buffer protocol. For best performance, pass a
       
   670 ``BufferWithSegmentsCollection`` or a ``BufferWithSegments``, as
       
   671 minimal input validation will be done for that type. If calling from
       
   672 Python (as opposed to C), constructing one of these instances may add
       
   673 overhead that cancels out the performance savings over plain list
       
   674 inputs.
       
   675 
       
   676 The decompressed size of each frame must be discoverable. It can either be
       
   677 embedded within the zstd frame (``write_content_size=True`` argument to
       
   678 ``ZstdCompressor``) or passed in via the ``decompressed_sizes`` argument.
       
   679 
       
   680 The ``decompressed_sizes`` argument is an object conforming to the buffer
       
   681 protocol which holds an array of 64-bit unsigned integers in the machine's
       
   682 native format defining the decompressed sizes of each frame. If this argument
       
   683 is passed, it avoids having to scan each frame for its decompressed size.
       
   684 This frame scanning can add noticeable overhead in some scenarios.
       
   685 
       
   686 The ``threads`` argument controls the number of threads to use to perform
       
   687 decompression operations. The default (``0``) or the value ``1`` means to
       
   688 use a single thread. Negative values use the number of logical CPUs in the
       
   689 machine.
       
   690 
       
   691 .. note::
       
   692 
       
   693    It is possible to pass a ``mmap.mmap()`` instance into this function by
       
   694    wrapping it with a ``BufferWithSegments`` instance (which will define the
       
   695    offsets of frames within the memory mapped region).
       
   696 
       
   697 This function is logically equivalent to performing ``dctx.decompress()``
       
   698 on each input frame and returning the result.
       
   699 
       
   700 This function exists to perform decompression on multiple frames as fast
       
   701 as possible by having as little overhead as possible. Since decompression is
       
   702 performed as a single operation and since the decompressed output is stored in
       
   703 a single buffer, extra memory allocations, Python objects, and Python function
       
   704 calls are avoided. This is ideal for scenarios where callers need to access
       
   705 decompressed data for multiple frames.
       
   706 
       
   707 Currently, the implementation always spawns multiple threads when requested,
       
   708 even if the amount of work to do is small. In the future, it will be smarter
       
   709 about avoiding threads and their associated overhead when the amount of
       
   710 work to do is small.
       
   711 
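A minimal sketch, assuming frames were produced with
``write_content_size=True`` so the ``decompressed_sizes`` argument can be
omitted::

   import zstd

   cctx = zstd.ZstdCompressor(write_content_size=True)
   frames = [cctx.compress(chunk) for chunk in (b'foo' * 64, b'bar' * 64)]

   dctx = zstd.ZstdDecompressor()

   # Decompress all frames in one call using 4 worker threads.
   result = dctx.multi_decompress_to_buffer(frames, threads=4)

   assert result[0].tobytes() == b'foo' * 64
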
   588 Content-Only Dictionary Chain Decompression
   712 Content-Only Dictionary Chain Decompression
   589 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   713 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   590 
   714 
   591 ``decompress_content_dict_chain(frames)`` performs decompression of a list of
   715 ``decompress_content_dict_chain(frames)`` performs decompression of a list of
   592 zstd frames produced using chained *content-only* dictionary compression. Such
   716 zstd frames produced using chained *content-only* dictionary compression. Such
   607 Each zstd frame **must** have the content size written.
   731 Each zstd frame **must** have the content size written.
   608 
   732 
   609 The following Python code can be used to produce a *content-only dictionary
   733 The following Python code can be used to produce a *content-only dictionary
   610 chain*::
   734 chain*::
   611 
   735 
   612 	def make_chain(inputs):
   736     def make_chain(inputs):
   613 	    frames = []
   737         frames = []
   614 
   738 
   615 		# First frame is compressed in standalone/discrete mode.
   739         # First frame is compressed in standalone/discrete mode.
   616 		zctx = zstd.ZstdCompressor(write_content_size=True)
   740         zctx = zstd.ZstdCompressor(write_content_size=True)
   617 		frames.append(zctx.compress(inputs[0]))
   741         frames.append(zctx.compress(inputs[0]))
   618 
   742 
   619 		# Subsequent frames use the previous fulltext as a content-only dictionary
   743         # Subsequent frames use the previous fulltext as a content-only dictionary
   620 		for i, raw in enumerate(inputs[1:]):
   744         for i, raw in enumerate(inputs[1:]):
   621 		    dict_data = zstd.ZstdCompressionDict(inputs[i])
   745             dict_data = zstd.ZstdCompressionDict(inputs[i])
   622 			zctx = zstd.ZstdCompressor(write_content_size=True, dict_data=dict_data)
   746             zctx = zstd.ZstdCompressor(write_content_size=True, dict_data=dict_data)
   623 			frames.append(zctx.compress(raw))
   747             frames.append(zctx.compress(raw))
   624 
   748 
   625 		return frames
   749         return frames
   626 
   750 
   627 ``decompress_content_dict_chain()`` returns the uncompressed data of the last
   751 ``decompress_content_dict_chain()`` returns the uncompressed data of the last
   628 element in the input chain.
   752 element in the input chain.
   629 
   753 
   630 It is possible to implement *content-only dictionary chain* decompression
   754 It is possible to implement *content-only dictionary chain* decompression
   631 on top of other Python APIs. However, this function will likely be significantly
   755 on top of other Python APIs. However, this function will likely be significantly
   632 faster, especially for long input chains, as it avoids the overhead of
   756 faster, especially for long input chains, as it avoids the overhead of
   633 instantiating and passing around intermediate objects between C and Python.
   757 instantiating and passing around intermediate objects between C and Python.
   634 
   758 
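For example, combined with the ``make_chain()`` helper above, decompressing
the chain recovers the last input (a minimal sketch)::

   import zstd

   inputs = [b'version 1 of the document', b'version 2 of the document']
   frames = make_chain(inputs)

   dctx = zstd.ZstdDecompressor()
   assert dctx.decompress_content_dict_chain(frames) == inputs[-1]
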
   635 Choosing an API
   759 Multi-Threaded Compression
   636 ---------------
   760 --------------------------
   637 
   761 
   638 Various forms of compression and decompression APIs are provided because each
   762 ``ZstdCompressor`` accepts a ``threads`` argument that controls the number
   639 are suitable for different use cases.
   763 of threads to use for compression. The way this works is that input is split
   640 
   764 into segments and each segment is fed into a worker pool for compression. Once
   641 The simple/one-shot APIs are useful for small data, when the decompressed
   765 a segment is compressed, it is flushed/appended to the output.
   642 data size is known (either recorded in the zstd frame header via
   766 
   643 ``write_content_size`` or known via an out-of-band mechanism, such as a file
   767 The segment size for multi-threaded compression is chosen from the window size
   644 size).
   768 of the compressor. This is derived from the ``window_log`` attribute of a
   645 
   769 ``CompressionParameters`` instance. By default, segment sizes are in the 1+MB
   646 A limitation of the simple APIs is that input or output data must fit in memory.
   770 range.
   647 And unless using advanced tricks with Python *buffer objects*, both input and
   771 
   648 output must fit in memory simultaneously.
   772 If multi-threaded compression is requested and the input is smaller than the
   649 
   773 configured segment size, only a single compression thread will be used. If the
   650 Another limitation is that compression or decompression is performed as a single
   774 input is smaller than the segment size multiplied by the thread pool size or
   651 operation. So if you feed large input, it could take a long time for the
   775 if data cannot be delivered to the compressor fast enough, not all requested
   652 function to return.
   776 compressor threads may be active simultaneously.
   653 
   777 
   654 The streaming APIs do not have the limitations of the simple API. The cost to
   778 Compared to non-multi-threaded compression, multi-threaded compression has
   655 this is they are more complex to use than a single function call.
   779 higher per-operation overhead. This includes extra memory operations,
   656 
   780 thread creation, lock acquisition, etc.
   657 The streaming APIs put the caller in control of compression and decompression
   781 
   658 behavior by allowing them to directly control either the input or output side
   782 Due to the nature of multi-threaded compression using *N* compression
   659 of the operation.
   783 *states*, the output from multi-threaded compression will likely be larger
   660 
   784 than non-multi-threaded compression. The difference is usually small. But
   661 With the streaming input APIs, the caller feeds data into the compressor or
   785 there is a CPU/wall time versus size trade off that may warrant investigation.
   662 decompressor as they see fit. Output data will only be written after the caller
   786 
   663 has explicitly written data.
   787 Output from multi-threaded compression does not require any special handling
   664 
   788 on the decompression side. In other words, any zstd decompressor should be able
   665 With the streaming output APIs, the caller consumes output from the compressor
   789 to consume data produced with multi-threaded compression.
   666 or decompressor as they see fit. The compressor or decompressor will only
       
   667 consume data from the source when the caller is ready to receive it.
       
   668 
       
   669 One end of the streaming APIs involves a file-like object that must
       
   670 ``write()`` output data or ``read()`` input data. Depending on what the
       
   671 backing storage for these objects is, those operations may not complete quickly.
       
   672 For example, when streaming compressed data to a file, the ``write()`` into
       
   673 a streaming compressor could result in a ``write()`` to the filesystem, which
       
   674 may take a long time to finish due to slow I/O on the filesystem. So, there
       
   675 may be overhead in streaming APIs beyond the compression and decompression
       
   676 operations.
       
   677 
   790 
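To illustrate (a minimal sketch): output produced with ``threads`` enabled is
an ordinary zstd frame and can be consumed by a regular decompressor::

   import zstd

   data = b'a chunk of fairly repetitive data ' * 100000

   cctx = zstd.ZstdCompressor(level=3, threads=4, write_content_size=True)
   compressed = cctx.compress(data)

   # No special handling is needed on the decompression side.
   dctx = zstd.ZstdDecompressor()
   assert dctx.decompress(compressed) == data
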
   678 Dictionary Creation and Management
   791 Dictionary Creation and Management
   679 ----------------------------------
   792 ----------------------------------
   680 
   793 
   681 Zstandard allows *dictionaries* to be used when compressing and
   794 Compression dictionaries are represented as the ``ZstdCompressionDict`` type.
   682 decompressing data. The idea is that if you are compressing a lot of similar
       
   683 data, you can precompute common properties of that data (such as recurring
       
   684 byte sequences) to achieve better compression ratios.
       
   685 
       
   686 In Python, compression dictionaries are represented as the
       
   687 ``ZstdCompressionDict`` type.
       
   688 
   795 
   689 Instances can be constructed from bytes::
   796 Instances can be constructed from bytes::
   690 
   797 
   691    dict_data = zstd.ZstdCompressionDict(data)
   798    dict_data = zstd.ZstdCompressionDict(data)
   692 
   799 
   733 You can obtain the raw data in the dict (useful for persisting and constructing
   840 You can obtain the raw data in the dict (useful for persisting and constructing
   734 a ``ZstdCompressionDict`` later) via ``as_bytes()``::
   841 a ``ZstdCompressionDict`` later) via ``as_bytes()``::
   735 
   842 
   736    dict_data = zstd.train_dictionary(size, samples)
   843    dict_data = zstd.train_dictionary(size, samples)
   737    raw_data = dict_data.as_bytes()
   844    raw_data = dict_data.as_bytes()
       
   845 
       
   846 The following named arguments to ``train_dictionary`` can also be used
       
   847 to further control dictionary generation.
       
   848 
       
   849 selectivity
       
   850    Integer selectivity level. Default is 9. Larger values yield more data in
       
   851    dictionary.
       
   852 level
       
   853    Integer compression level. Default is 6.
       
   854 dict_id
       
   855    Integer dictionary ID for the produced dictionary. Default is 0, which
       
   856    means to use a random value.
       
   857 notifications
       
   858    Controls writing of informational messages to ``stderr``. ``0`` (the
       
   859    default) means to write nothing. ``1`` writes errors. ``2`` writes
       
   860    progression info. ``3`` writes more details. And ``4`` writes all info.
       
   861 
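An illustrative sketch using these arguments (the sample data is made up)::

   import zstd

   samples = [b'{"name": "alice", "id": 1}', b'{"name": "bob", "id": 2}'] * 50

   dict_data = zstd.train_dictionary(16384, samples,
                                     selectivity=9,
                                     level=6,
                                     dict_id=0,
                                     notifications=0)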
       
   862 Cover Dictionaries
       
   863 ^^^^^^^^^^^^^^^^^^
       
   864 
       
   865 An alternate dictionary training mechanism named *cover* is also available.
       
   866 More details about this training mechanism are available in the paper
       
   867 *Effective Construction of Relative Lempel-Ziv Dictionaries* (authors:
       
   868 Liao, Petri, Moffat, Wirth).
       
   869 
       
   870 To use this mechanism, use ``zstd.train_cover_dictionary()`` instead of
       
   871 ``zstd.train_dictionary()``. The function behaves nearly the same except
       
   872 its arguments are different and the returned dictionary will contain ``k``
       
   873 and ``d`` attributes reflecting the parameters to the cover algorithm.
       
   874 
       
   875 .. note::
       
   876 
       
   877    The ``k`` and ``d`` attributes are only populated on dictionary
       
   878    instances created by this function. If a ``ZstdCompressionDict`` is
       
   879    constructed from raw bytes data, the ``k`` and ``d`` attributes will
       
   880    be ``0``.
       
   881 
       
   882 The segment and dmer size parameters to the cover algorithm can either be
       
   883 specified manually or you can ask ``train_cover_dictionary()`` to try
       
   884 multiple values and pick the best one, where *best* means the smallest
       
   885 compressed data size.
       
   886 
       
   887 In manual mode, the ``k`` and ``d`` arguments must be specified or a
       
   888 ``ZstdError`` will be raised.
       
   889 
       
   890 In automatic mode (triggered by specifying ``optimize=True``), ``k``
       
   891 and ``d`` are optional. If a value isn't specified, then default values for
       
   892 both are tested.  The ``steps`` argument can control the number of steps
       
   893 through ``k`` values. The ``level`` argument defines the compression level
       
   894 that will be used when testing the compressed size. And ``threads`` can
       
   895 specify the number of threads to use for concurrent operation.
       
   896 
       
   897 This function takes the following arguments:
       
   898 
       
   899 dict_size
       
   900    Target size in bytes of the dictionary to generate.
       
   901 samples
       
   902    A list of bytes holding samples the dictionary will be trained from.
       
   903 k
       
   904    Parameter to cover algorithm defining the segment size. A reasonable range
       
   905    is [16, 2048+].
       
   906 d
       
   907    Parameter to cover algorithm defining the dmer size. A reasonable range is
       
   908    [6, 16]. ``d`` must be less than or equal to ``k``.
       
   909 dict_id
       
   910    Integer dictionary ID for the produced dictionary. Default is 0, which uses
       
   911    a random value.
       
   912 optimize
       
   913    When true, test dictionary generation with multiple parameters.
       
   914 level
       
   915    Integer target compression level when testing compression with
       
   916    ``optimize=True``. Default is 1.
       
   917 steps
       
   918    Number of steps through ``k`` values to perform when ``optimize=True``.
       
   919    Default is 32.
       
   920 threads
       
   921    Number of threads to use when ``optimize=True``. Default is 0, which means
       
   922    to use a single thread. A negative value can be specified to use as many
       
   923    threads as there are detected logical CPUs.
       
   924 notifications
       
   925    Controls writing of informational messages to ``stderr``. See the
       
   926    documentation for ``train_dictionary()`` for more.
   738 
   927 
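A sketch of automatic mode (the sample data is made up; the keyword names
follow the argument list above)::

   import zstd

   samples = [b'a repetitive sample payload that shares structure'] * 128

   dict_data = zstd.train_cover_dictionary(16384, samples,
                                           optimize=True,
                                           level=1,
                                           steps=32,
                                           threads=-1)

   # The chosen cover parameters are exposed on the resulting dictionary.
   k, d = dict_data.k, dict_data.d
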
   739 Explicit Compression Parameters
   928 Explicit Compression Parameters
   740 -------------------------------
   929 -------------------------------
   741 
   930 
   742 Zstandard's integer compression levels along with the input size and dictionary
   931 Zstandard's integer compression levels along with the input size and dictionary
   902 example, the difference between *context* reuse and non-reuse for 100,000
  1091 example, the difference between *context* reuse and non-reuse for 100,000
    903 100 byte inputs will be significant (possibly over 10x faster to reuse contexts)
   1092 100 byte inputs will be significant (possibly over 10x faster to reuse contexts)
   904 whereas 10 1,000,000 byte inputs will be more similar in speed (because the
  1093 whereas 10 1,000,000 byte inputs will be more similar in speed (because the
   905 time spent doing compression dwarfs time spent creating new *contexts*).
  1094 time spent doing compression dwarfs time spent creating new *contexts*).
   906 
  1095 
       
  1096 Buffer Types
       
  1097 ------------
       
  1098 
       
  1099 The API exposes a handful of custom types for interfacing with memory buffers.
       
  1100 The primary goal of these types is to facilitate efficient multi-object
       
  1101 operations.
       
  1102 
       
  1103 The essential idea is to have a single memory allocation provide backing
       
  1104 storage for multiple logical objects. This has 2 main advantages: fewer
       
  1105 allocations and optimal memory access patterns. This avoids having to allocate
       
  1106 a Python object for each logical object and furthermore ensures that access of
       
  1107 data for objects can be sequential (read: fast) in memory.
       
  1108 
       
  1109 BufferWithSegments
       
  1110 ^^^^^^^^^^^^^^^^^^
       
  1111 
       
  1112 The ``BufferWithSegments`` type represents a memory buffer containing N
       
  1113 discrete items of known lengths (segments). It is essentially a fixed size
       
  1114 memory address and an array of 2-tuples of ``(offset, length)`` 64-bit
       
  1115 unsigned native endian integers defining the byte offset and length of each
       
  1116 segment within the buffer.
       
  1117 
       
  1118 Instances behave like containers.
       
  1119 
       
  1120 ``len()`` returns the number of segments within the instance.
       
  1121 
       
  1122 ``o[index]`` or ``__getitem__`` obtains a ``BufferSegment`` representing an
       
  1123 individual segment within the backing buffer. That returned object references
       
  1124 (not copies) memory. This means that iterating all objects doesn't copy
       
  1125 data within the buffer.
       
  1126 
       
  1127 The ``.size`` attribute contains the total size in bytes of the backing
       
  1128 buffer.
       
  1129 
       
  1130 Instances conform to the buffer protocol. So a reference to the backing bytes
       
  1131 can be obtained via ``memoryview(o)``. A *copy* of the backing bytes can also
       
  1132 be obtained via ``.tobytes()``.
       
  1133 
       
  1134 The ``.segments`` attribute exposes the array of ``(offset, length)`` for
       
  1135 segments within the buffer. It is a ``BufferSegments`` type.
       
  1136 
       
  1137 BufferSegment
       
  1138 ^^^^^^^^^^^^^
       
  1139 
       
  1140 The ``BufferSegment`` type represents a segment within a ``BufferWithSegments``.
       
  1141 It is essentially a reference to N bytes within a ``BufferWithSegments``.
       
  1142 
       
  1143 ``len()`` returns the length of the segment in bytes.
       
  1144 
       
  1145 ``.offset`` contains the byte offset of this segment within its parent
       
  1146 ``BufferWithSegments`` instance.
       
  1147 
       
  1148 The object conforms to the buffer protocol. ``.tobytes()`` can be called to
       
  1149 obtain a ``bytes`` instance with a copy of the backing bytes.
       
  1150 
       
  1151 BufferSegments
       
  1152 ^^^^^^^^^^^^^^
       
  1153 
       
  1154 This type represents an array of ``(offset, length)`` integers defining segments
       
  1155 within a ``BufferWithSegments``.
       
  1156 
       
  1157 The array members are 64-bit unsigned integers using host/native bit order.
       
  1158 
       
  1159 Instances conform to the buffer protocol.
       
  1160 
       
  1161 BufferWithSegmentsCollection
       
  1162 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
       
  1163 
       
  1164 The ``BufferWithSegmentsCollection`` type represents a virtual spanning view
       
  1165 of multiple ``BufferWithSegments`` instances.
       
  1166 
       
  1167 Instances are constructed from 1 or more ``BufferWithSegments`` instances. The
       
  1168 resulting object behaves like an ordered sequence whose members are the
       
  1169 segments within each ``BufferWithSegments``.
       
  1170 
       
  1171 ``len()`` returns the number of segments within all ``BufferWithSegments``
       
  1172 instances.
       
  1173 
       
  1174 ``o[index]`` and ``__getitem__(index)`` return the ``BufferSegment`` at
       
  1175 that offset as if all ``BufferWithSegments`` instances were a single
       
  1176 entity.
       
  1177 
       
  1178 If the object is composed of 2 ``BufferWithSegments`` instances with the
       
  1179 first having 2 segments and the second having 3 segments, then ``b[0]``
       
  1180 and ``b[1]`` access segments in the first object and ``b[2]``, ``b[3]``,
       
  1181 and ``b[4]`` access segments from the second.
       
  1182 
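As a sketch of how these types fit together, the collection returned by
``multi_compress_to_buffer()`` (described above) can be traversed without
copying data::

   import zstd

   cctx = zstd.ZstdCompressor()
   collection = cctx.multi_compress_to_buffer([b'foo' * 100, b'bar' * 100])

   assert len(collection) == 2

   # Each item is a BufferSegment referencing shared backing memory.
   first = collection[0]
   view = memoryview(first)   # zero-copy view of the compressed frame
   frame = first.tobytes()    # explicit copy as bytes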
       
  1183 Choosing an API
       
  1184 ===============
       
  1185 
       
  1186 There are multiple APIs for performing compression and decompression. This is
       
  1187 because different applications have different needs and the library wants to
       
  1188 facilitate optimal use in as many use cases as possible.
       
  1189 
       
  1190 From a high-level, APIs are divided into *one-shot* and *streaming*. See
       
  1191 the ``Concepts`` section for a description of how these are different at
       
  1192 the C layer.
       
  1193 
       
  1194 The *one-shot* APIs are useful for small data, where the input or output
       
  1195 size is known. (The size can come from a buffer length, file size, or
       
  1196 stored in the zstd frame header.) A limitation of the *one-shot* APIs is that
       
  1197 input and output must fit in memory simultaneously. For say a 4 GB input,
       
  1198 this is often not feasible.
       
  1199 
       
  1200 The *one-shot* APIs also perform all work as a single operation. So, if you
       
  1201 feed it large input, it could take a long time for the function to return.
       
  1202 
       
  1203 The streaming APIs do not have the limitations of the simple API. But the
       
  1204 price you pay for this flexibility is that they are more complex than a
       
  1205 single function call.
       
  1206 
       
  1207 The streaming APIs put the caller in control of compression and decompression
       
  1208 behavior by allowing them to directly control either the input or output side
       
  1209 of the operation.
       
  1210 
       
  1211 With the *streaming input*, *compressor*, and *decompressor* APIs, the caller
       
  1212 has full control over the input to the compression or decompression stream.
       
  1213 They can directly choose when new data is operated on.
       
  1214 
       
  1215 With the *streaming output* APIs, the caller has full control over the output
       
  1216 of the compression or decompression stream. It can choose when to receive
       
  1217 new data.
       
  1218 
       
  1219 When using the *streaming* APIs that operate on file-like or stream objects,
       
  1220 it is important to consider what happens in that object when I/O is requested.
       
  1221 There is potential for long pauses as data is read or written from the
       
  1222 underlying stream (say from interacting with a filesystem or network). This
       
  1223 could add considerable overhead.
       
  1224 
       
  1225 Concepts
       
  1226 ========
       
  1227 
       
  1228 It is important to have a basic understanding of how Zstandard works in order
       
  1229 to optimally use this library. In addition, there are some low-level Python
       
  1230 concepts that are worth explaining to aid understanding. This section aims to
       
  1231 provide that knowledge.
       
  1232 
       
  1233 Zstandard Frames and Compression Format
       
  1234 ---------------------------------------
       
  1235 
       
  1236 Compressed zstandard data almost always exists within a container called a
       
  1237 *frame*. (For the technically curious, see the
       
  1238 `specification <https://github.com/facebook/zstd/blob/3bee41a70eaf343fbcae3637b3f6edbe52f35ed8/doc/zstd_compression_format.md>`_.)
       
  1239 
       
  1240 The frame contains a header and optional trailer. The header contains a
       
  1241 magic number to self-identify as a zstd frame and a description of the
       
  1242 compressed data that follows.
       
  1243 
       
  1244 Among other things, the frame *optionally* contains the size of the
       
  1245 decompressed data the frame represents, a 32-bit checksum of the
       
  1246 decompressed data (to facilitate verification during decompression),
       
  1247 and the ID of the dictionary used to compress the data.
       
  1248 
       
  1249 Storing the original content size in the frame (``write_content_size=True``
       
  1250 to ``ZstdCompressor``) is important for performance in some scenarios. Having
       
  1251 the decompressed size stored there (or storing it elsewhere) allows
       
  1252 decompression to perform a single memory allocation that is exactly sized to
       
  1253 the output. This is faster than continuously growing a memory buffer to hold
       
  1254 output.
       
  1255 
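For example (a minimal sketch), writing the content size allows one-shot
decompression without supplying an output size::

   import zstd

   cctx = zstd.ZstdCompressor(write_content_size=True)
   frame = cctx.compress(b'data that should round trip' * 1024)

   dctx = zstd.ZstdDecompressor()
   original = dctx.decompress(frame)  # output size read from the frame header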
       
  1256 Compression and Decompression Contexts
       
  1257 --------------------------------------
       
  1258 
       
  1259 In order to perform a compression or decompression operation with the zstd
       
  1260 C API, you need what's called a *context*. A context essentially holds
       
  1261 configuration and state for a compression or decompression operation. For
       
  1262 example, a compression context holds the configured compression level.
       
  1263 
       
  1264 Contexts can be reused for multiple operations. Since creating and
       
  1265 destroying contexts is not free, there are performance advantages to
       
  1266 reusing contexts.
       
  1267 
       
  1268 The ``ZstdCompressor`` and ``ZstdDecompressor`` types are essentially
       
  1269 wrappers around these contexts in the zstd C API.
       
  1270 
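In this library, that means creating a ``ZstdCompressor`` or
``ZstdDecompressor`` once and reusing it across operations (a minimal
sketch)::

   import zstd

   cctx = zstd.ZstdCompressor(level=3)

   # Reusing the compressor (and its underlying context) avoids paying
   # context creation costs for every input.
   frames = [cctx.compress(chunk) for chunk in (b'a' * 100, b'b' * 100)]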
       
  1271 One-shot And Streaming Operations
       
  1272 ---------------------------------
       
  1273 
       
  1274 A compression or decompression operation can either be performed as a
       
  1275 single *one-shot* operation or as a continuous *streaming* operation.
       
  1276 
       
  1277 In one-shot mode (the *simple* APIs provided by the Python interface),
       
  1278 **all** input is handed to the compressor or decompressor as a single buffer
       
  1279 and **all** output is returned as a single buffer.
       
  1280 
       
  1281 In streaming mode, input is delivered to the compressor or decompressor as
       
  1282 a series of chunks via multiple function calls. Likewise, output is
       
  1283 obtained in chunks as well.
       
  1284 
       
  1285 Streaming operations require an additional *stream* object to be created
       
  1286 to track the operation. These are logical extensions of *context*
       
  1287 instances.
       
  1288 
       
  1289 There are advantages and disadvantages to each mode of operation. There
       
  1290 are scenarios where certain modes can't be used. See the
       
  1291 ``Choosing an API`` section for more.
       
  1292 
       
  1293 Dictionaries
       
  1294 ------------
       
  1295 
       
  1296 A compression *dictionary* is essentially data used to seed the compressor
       
  1297 state so it can achieve better compression. The idea is that if you are
       
  1298 compressing a lot of similar pieces of data (e.g. JSON documents or anything
       
  1299 sharing similar structure), then you can find common patterns across multiple
       
  1300 objects and leverage those common patterns during compression and
       
  1301 decompression operations to achieve better compression ratios.
       
  1302 
       
  1303 Dictionary compression is generally only useful for small inputs - data no
       
  1304 larger than a few kilobytes. The upper bound on this range is highly dependent
       
  1305 on the input data and the dictionary.
       
  1306 
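A minimal sketch of dictionary-based compression, assuming the decompressor
accepts the same ``dict_data`` argument as the compressor (the sample data is
made up)::

   import zstd

   samples = [b'{"user": "alice"}', b'{"user": "bob"}'] * 100
   dict_data = zstd.train_dictionary(8192, samples)

   cctx = zstd.ZstdCompressor(dict_data=dict_data, write_content_size=True)
   compressed = cctx.compress(b'{"user": "carol"}')

   dctx = zstd.ZstdDecompressor(dict_data=dict_data)
   original = dctx.decompress(compressed)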
       
  1307 Python Buffer Protocol
       
  1308 ----------------------
       
  1309 
       
  1310 Many functions in the library operate on objects that implement Python's
       
  1311 `buffer protocol <https://docs.python.org/3.6/c-api/buffer.html>`_.
       
  1312 
       
  1313 The *buffer protocol* is an internal implementation detail of a Python
       
  1314 type that allows instances of that type (objects) to be exposed as a raw
       
  1315 pointer (or buffer) in the C API. In other words, it allows objects to be
       
  1316 exposed as an array of bytes.
       
  1317 
       
  1318 From the perspective of the C API, objects implementing the *buffer protocol*
       
  1319 all look the same: they are just a pointer to a memory address of a defined
       
  1320 length. This allows the C API to be largely type agnostic when accessing their
       
  1321 data. This allows custom types to be passed in without first converting them
       
  1322 to a specific type.
       
  1323 
       
  1324 Many Python types implement the buffer protocol. These include ``bytes``
       
  1325 (``str`` on Python 2), ``bytearray``, ``array.array``, ``io.BytesIO``,
       
  1326 ``mmap.mmap``, and ``memoryview``.
       
  1327 
       
  1328 ``python-zstandard`` APIs that accept objects conforming to the buffer
       
  1329 protocol require that the buffer is *C contiguous* and has a single
       
  1330 dimension (``ndim==1``). This is usually the case. An example of where it
       
  1331 is not is a Numpy matrix type.
       
  1332 
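For example (a minimal sketch), any C contiguous, one-dimensional buffer can
be handed to the one-shot compression API::

   import array
   import zstd

   cctx = zstd.ZstdCompressor()

   compressed_bytes = cctx.compress(b'raw bytes')
   compressed_view = cctx.compress(memoryview(b'a view over bytes'))
   compressed_array = cctx.compress(array.array('B', [0] * 1024))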
       
  1333 Requiring Output Sizes for Non-Streaming Decompression APIs
       
  1334 -----------------------------------------------------------
       
  1335 
       
  1336 Non-streaming decompression APIs require that either the output size is
       
  1337 explicitly defined (either in the zstd frame header or passed into the
       
  1338 function) or that a max output size is specified. This restriction is for
       
  1339 your safety.
       
  1340 
       
  1341 The *one-shot* decompression APIs store the decompressed result in a
       
  1342 single buffer. This means that a buffer needs to be pre-allocated to hold
       
  1343 the result. If the decompressed size is not known, then there is no universal
       
  1344 good default size to use. Any default will fail or will be highly sub-optimal
       
  1345 in some scenarios (it will either be too small or will put stress on the
       
  1346 memory allocator to allocate a too large block).
       
  1347 
       
  1348 A *helpful* API may retry decompression with buffers of increasing size.
       
  1349 While useful, there are obvious performance disadvantages, namely redoing
       
  1350 decompression N times until it works. In addition, there is a security
       
  1351 concern. Say the input came from highly compressible data, like 1 GB of the
       
  1352 same byte value. The output size could be several orders of magnitude larger than the
       
  1353 input size. An input of <100KB could decompress to >1GB. Without a bounds
       
  1354 restriction on the decompressed size, certain inputs could exhaust all system
       
  1355 memory. That's not good and is why the maximum output size is limited.
       
  1356 
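As a sketch (assuming the one-shot ``decompress()`` accepts a
``max_output_size`` keyword for frames without an embedded content size)::

   import zstd

   cctx = zstd.ZstdCompressor()
   frame = cctx.compress(b'x' * 1000000)

   dctx = zstd.ZstdDecompressor()

   # Bound the output so hostile or highly compressible input cannot
   # exhaust memory.
   data = dctx.decompress(frame, max_output_size=1048576)
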
   907 Note on Zstandard's *Experimental* API
  1357 Note on Zstandard's *Experimental* API
   908 ======================================
  1358 ======================================
   909 
  1359 
   910 Many of the Zstandard APIs used by this module are marked as *experimental*
  1360 Many of the Zstandard APIs used by this module are marked as *experimental*
   911 within the Zstandard project. This includes a large number of useful
  1361 within the Zstandard project. This includes a large number of useful