mercurial/utils/compression.py
author Manuel Jacob <me@manueljacob.de>
Mon, 11 Jul 2022 01:51:20 +0200
branchstable
changeset 49378 094a5fa3cf52
parent 48946 642e31cb55f0
child 50913 93b0de7f13ca
permissions -rw-r--r--
procutil: make stream detection in make_line_buffered more correct and strict In make_line_buffered(), we don’t want to wrap the stream if we know that lines get flushed to the underlying raw stream already. Previously, the heuristic was too optimistic. It assumed that any stream which is not an instance of io.BufferedIOBase doesn’t need wrapping. However, there are buffered streams that aren’t instances of io.BufferedIOBase, like Mercurial’s own winstdout. The new logic is different in two ways: First, only for the check, if unwraps any combination of WriteAllWrapper and winstdout. Second, it skips wrapping the stream only if it is an instance of io.RawIOBase (or already wrapped). If it is an instance of io.BufferedIOBase, it gets wrapped. In any other case, the function raises an exception. This ensures that, if an unknown stream is passed or we add another wrapper in the future, we don’t wrap the stream if it’s already line buffered or not wrap the stream if it’s not line buffered. In fact, this was already helpful during development of this change. Without it, I possibly would have forgot that WriteAllWrapper needs to be ignored for the check, leading to unnecessary wrapping if stdout is unbuffered. The alternative would have been to always wrap unknown streams. However, I don’t think that anyone would benefit from being less strict. We can expect streams from the standard library to be subclassing either io.RawIOBase or io.BufferedIOBase, so running Mercurial in the standard way should not regress by this change. Py2exe might replace sys.stdout and sys.stderr, but that currently breaks Mercurial anyway and also these streams don’t claim to be interactive, so this function is not called for them.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
42041
3e47d1ec9da5 util: extract compression code in `mercurial.utils.compression`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41834
diff changeset
     1
# compression.py - Mercurial utility functions for compression
8226
8b2cd04a6e97 put license and copyright info into comment blocks
Martin Geisler <mg@lazybytes.net>
parents: 8225
diff changeset
     2
#
8b2cd04a6e97 put license and copyright info into comment blocks
Martin Geisler <mg@lazybytes.net>
parents: 8225
diff changeset
     3
# This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 9996
diff changeset
     4
# GNU General Public License version 2 or any later version.
1082
ce96e316278a Update util.py docstrings, fix walk test
mpm@selenic.com
parents: 1081
diff changeset
     5
419
28511fc21073 [PATCH] file seperator handling for the other 'OS'
mpm@selenic.com
parents:
diff changeset
     6
27358
ac839ee45b6a util: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27357
diff changeset
     7
import bz2
ac839ee45b6a util: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27357
diff changeset
     8
import collections
26266
1e042e31bd0c changegroup: move all compressions utilities in util
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 26201
diff changeset
     9
import zlib
3769
96095d9ff1f8 Add encoding detection
Matt Mackall <mpm@selenic.com>
parents: 3767
diff changeset
    10
43089
c59eb1560c44 py3: manually import getattr where it is needed
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43077
diff changeset
    11
from ..pycompat import getattr
42041
3e47d1ec9da5 util: extract compression code in `mercurial.utils.compression`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41834
diff changeset
    12
from .. import (
27358
ac839ee45b6a util: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27357
diff changeset
    13
    error,
ac839ee45b6a util: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27357
diff changeset
    14
    i18n,
28818
6041fb8f2da8 pycompat: add empty and queue to handle py3 divergence
timeless <timeless@mozdev.org>
parents: 28497
diff changeset
    15
    pycompat,
27358
ac839ee45b6a util: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27357
diff changeset
    16
)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
    17
from . import stringutil
3769
96095d9ff1f8 Add encoding detection
Matt Mackall <mpm@selenic.com>
parents: 3767
diff changeset
    18
42041
3e47d1ec9da5 util: extract compression code in `mercurial.utils.compression`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41834
diff changeset
    19
safehasattr = pycompat.safehasattr
32201
4462a981e8df base85: proxy through util module
Yuya Nishihara <yuya@tcha.org>
parents: 32154
diff changeset
    20
14926
4e7e63fc685a util: eliminate wildcard imports
Adrian Buehlmann <adrian@cadifra.com>
parents: 14918
diff changeset
    21
27358
ac839ee45b6a util: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27357
diff changeset
    22
_ = i18n._
14926
4e7e63fc685a util: eliminate wildcard imports
Adrian Buehlmann <adrian@cadifra.com>
parents: 14918
diff changeset
    23
42041
3e47d1ec9da5 util: extract compression code in `mercurial.utils.compression`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41834
diff changeset
    24
# compression code
36524
bfe38f787d5b util: add a file object proxy that can notify observers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36432
diff changeset
    25
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    26
SERVERROLE = b'server'
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    27
CLIENTROLE = b'client'
33793
bbbbd3c30bfc util: add base class for transactional context managers
Martin von Zweigbergk <martinvonz@google.com>
parents: 33737
diff changeset
    28
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
    29
compewireprotosupport = collections.namedtuple(
45942
89a2afe31e82 formating: upgrade to black 20.8b1
Augie Fackler <raf@durin42.com>
parents: 43506
diff changeset
    30
    'compenginewireprotosupport',
89a2afe31e82 formating: upgrade to black 20.8b1
Augie Fackler <raf@durin42.com>
parents: 43506
diff changeset
    31
    ('name', 'serverpriority', 'clientpriority'),
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
    32
)
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
    33
9097
431462bd8478 fix memory usage of revlog caches by limiting cache size [issue1639]
Matt Mackall <mpm@selenic.com>
parents: 9089
diff changeset
    34
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
    35
class propertycache:
8207
dd8d5be57d65 util: take propertycache from context.py
Matt Mackall <mpm@selenic.com>
parents: 8181
diff changeset
    36
    def __init__(self, func):
dd8d5be57d65 util: take propertycache from context.py
Matt Mackall <mpm@selenic.com>
parents: 8181
diff changeset
    37
        self.func = func
dd8d5be57d65 util: take propertycache from context.py
Matt Mackall <mpm@selenic.com>
parents: 8181
diff changeset
    38
        self.name = func.__name__
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
    39
8207
dd8d5be57d65 util: take propertycache from context.py
Matt Mackall <mpm@selenic.com>
parents: 8181
diff changeset
    40
    def __get__(self, obj, type=None):
dd8d5be57d65 util: take propertycache from context.py
Matt Mackall <mpm@selenic.com>
parents: 8181
diff changeset
    41
        result = self.func(obj)
18013
98c867ac1330 clfilter: add a propertycache that must be unfiltered
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents: 17962
diff changeset
    42
        self.cachevalue(obj, result)
8207
dd8d5be57d65 util: take propertycache from context.py
Matt Mackall <mpm@selenic.com>
parents: 8181
diff changeset
    43
        return result
dd8d5be57d65 util: take propertycache from context.py
Matt Mackall <mpm@selenic.com>
parents: 8181
diff changeset
    44
18013
98c867ac1330 clfilter: add a propertycache that must be unfiltered
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents: 17962
diff changeset
    45
    def cachevalue(self, obj, value):
19951
d51c4d85ec23 spelling: random spell checker fixes
Mads Kiilerich <madski@unity3d.com>
parents: 19852
diff changeset
    46
        # __dict__ assignment required to bypass __setattr__ (eg: repoview)
19845
a1237a4b437d repoview: make propertycache.setcache compatible with repoview
Pierre-Yves David <pierre-yves.david@ens-lyon.org>
parents: 19461
diff changeset
    47
        obj.__dict__[self.name] = value
18013
98c867ac1330 clfilter: add a propertycache that must be unfiltered
Pierre-Yves David <pierre-yves.david@logilab.fr>
parents: 17962
diff changeset
    48
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
    49
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
    50
class compressormanager:
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    51
    """Holds registrations of various compression engines.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    52
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    53
    This class essentially abstracts the differences between compression
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    54
    engines to allow new compression formats to be added easily, possibly from
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    55
    extensions.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    56
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    57
    Compressors are registered against the global instance by calling its
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    58
    ``register()`` method.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    59
    """
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
    60
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    61
    def __init__(self):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    62
        self._engines = {}
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    63
        # Bundle spec human name to engine name.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    64
        self._bundlenames = {}
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    65
        # Internal bundle identifier to engine name.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    66
        self._bundletypes = {}
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
    67
        # Revlog header to engine name.
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
    68
        self._revlogheaders = {}
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
    69
        # Wire proto identifier to engine name.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
    70
        self._wiretypes = {}
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    71
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    72
    def __getitem__(self, key):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    73
        return self._engines[key]
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    74
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    75
    def __contains__(self, key):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    76
        return key in self._engines
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    77
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    78
    def __iter__(self):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    79
        return iter(self._engines.keys())
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    80
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    81
    def register(self, engine):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    82
        """Register a compression engine with the manager.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    83
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    84
        The argument must be a ``compressionengine`` instance.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    85
        """
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    86
        if not isinstance(engine, compressionengine):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    87
            raise ValueError(_(b'argument must be a compressionengine'))
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    88
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    89
        name = engine.name()
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    90
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    91
        if name in self._engines:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
    92
            raise error.Abort(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    93
                _(b'compression engine %s already registered') % name
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
    94
            )
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    95
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    96
        bundleinfo = engine.bundletype()
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    97
        if bundleinfo:
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    98
            bundlename, bundletype = bundleinfo
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
    99
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   100
            if bundlename in self._bundlenames:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   101
                raise error.Abort(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   102
                    _(b'bundle name %s already registered') % bundlename
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   103
                )
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   104
            if bundletype in self._bundletypes:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   105
                raise error.Abort(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   106
                    _(b'bundle type %s already registered by %s')
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   107
                    % (bundletype, self._bundletypes[bundletype])
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   108
                )
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   109
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   110
            # No external facing name declared.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   111
            if bundlename:
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   112
                self._bundlenames[bundlename] = name
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   113
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   114
            self._bundletypes[bundletype] = name
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   115
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   116
        wiresupport = engine.wireprotosupport()
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   117
        if wiresupport:
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   118
            wiretype = wiresupport.name
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   119
            if wiretype in self._wiretypes:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   120
                raise error.Abort(
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   121
                    _(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   122
                        b'wire protocol compression %s already '
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   123
                        b'registered by %s'
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   124
                    )
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   125
                    % (wiretype, self._wiretypes[wiretype])
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   126
                )
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   127
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   128
            self._wiretypes[wiretype] = name
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   129
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   130
        revlogheader = engine.revlogheader()
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   131
        if revlogheader and revlogheader in self._revlogheaders:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   132
            raise error.Abort(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   133
                _(b'revlog header %s already registered by %s')
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   134
                % (revlogheader, self._revlogheaders[revlogheader])
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   135
            )
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   136
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   137
        if revlogheader:
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   138
            self._revlogheaders[revlogheader] = name
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   139
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   140
        self._engines[name] = engine
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   141
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   142
    @property
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   143
    def supportedbundlenames(self):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   144
        return set(self._bundlenames.keys())
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   145
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   146
    @property
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   147
    def supportedbundletypes(self):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   148
        return set(self._bundletypes.keys())
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   149
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   150
    def forbundlename(self, bundlename):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   151
        """Obtain a compression engine registered to a bundle name.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   152
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   153
        Will raise KeyError if the bundle type isn't registered.
30438
90933e4e44fd util: check for compression engine availability before returning
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30437
diff changeset
   154
90933e4e44fd util: check for compression engine availability before returning
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30437
diff changeset
   155
        Will abort if the engine is known but not available.
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   156
        """
30438
90933e4e44fd util: check for compression engine availability before returning
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30437
diff changeset
   157
        engine = self._engines[self._bundlenames[bundlename]]
90933e4e44fd util: check for compression engine availability before returning
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30437
diff changeset
   158
        if not engine.available():
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   159
            raise error.Abort(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   160
                _(b'compression engine %s could not be loaded') % engine.name()
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   161
            )
30438
90933e4e44fd util: check for compression engine availability before returning
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30437
diff changeset
   162
        return engine
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   163
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   164
    def forbundletype(self, bundletype):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   165
        """Obtain a compression engine registered to a bundle type.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   166
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   167
        Will raise KeyError if the bundle type isn't registered.
30438
90933e4e44fd util: check for compression engine availability before returning
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30437
diff changeset
   168
90933e4e44fd util: check for compression engine availability before returning
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30437
diff changeset
   169
        Will abort if the engine is known but not available.
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   170
        """
30438
90933e4e44fd util: check for compression engine availability before returning
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30437
diff changeset
   171
        engine = self._engines[self._bundletypes[bundletype]]
90933e4e44fd util: check for compression engine availability before returning
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30437
diff changeset
   172
        if not engine.available():
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   173
            raise error.Abort(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   174
                _(b'compression engine %s could not be loaded') % engine.name()
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   175
            )
30438
90933e4e44fd util: check for compression engine availability before returning
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30437
diff changeset
   176
        return engine
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   177
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   178
    def supportedwireengines(self, role, onlyavailable=True):
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   179
        """Obtain compression engines that support the wire protocol.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   180
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   181
        Returns a list of engines in prioritized order, most desired first.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   182
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   183
        If ``onlyavailable`` is set, filter out engines that can't be
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   184
        loaded.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   185
        """
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   186
        assert role in (SERVERROLE, CLIENTROLE)
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   187
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   188
        attr = b'serverpriority' if role == SERVERROLE else b'clientpriority'
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   189
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   190
        engines = [self._engines[e] for e in self._wiretypes.values()]
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   191
        if onlyavailable:
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   192
            engines = [e for e in engines if e.available()]
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   193
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   194
        def getkey(e):
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   195
            # Sort first by priority, highest first. In case of tie, sort
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   196
            # alphabetically. This is arbitrary, but ensures output is
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   197
            # stable.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   198
            w = e.wireprotosupport()
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   199
            return -1 * getattr(w, attr), w.name
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   200
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   201
        return list(sorted(engines, key=getkey))
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   202
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   203
    def forwiretype(self, wiretype):
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   204
        engine = self._engines[self._wiretypes[wiretype]]
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   205
        if not engine.available():
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   206
            raise error.Abort(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   207
                _(b'compression engine %s could not be loaded') % engine.name()
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   208
            )
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   209
        return engine
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   210
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   211
    def forrevlogheader(self, header):
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   212
        """Obtain a compression engine registered to a revlog header.
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   213
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   214
        Will raise KeyError if the revlog header value isn't registered.
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   215
        """
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   216
        return self._engines[self._revlogheaders[header]]
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   217
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   218
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   219
compengines = compressormanager()
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   220
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   221
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   222
class compressionengine:
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   223
    """Base class for compression engines.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   224
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   225
    Compression engines must implement the interface defined by this class.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   226
    """
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   227
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   228
    def name(self):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   229
        """Returns the name of the compression engine.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   230
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   231
        This is the key the engine is registered under.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   232
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   233
        This method must be implemented.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   234
        """
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   235
        raise NotImplementedError()
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   236
30437
64d7275445d0 util: expose an "available" API on compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30418
diff changeset
   237
    def available(self):
64d7275445d0 util: expose an "available" API on compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30418
diff changeset
   238
        """Whether the compression engine is available.
64d7275445d0 util: expose an "available" API on compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30418
diff changeset
   239
64d7275445d0 util: expose an "available" API on compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30418
diff changeset
   240
        The intent of this method is to allow optional compression engines
64d7275445d0 util: expose an "available" API on compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30418
diff changeset
   241
        that may not be available in all installations (such as engines relying
64d7275445d0 util: expose an "available" API on compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30418
diff changeset
   242
        on C extensions that may not be present).
64d7275445d0 util: expose an "available" API on compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30418
diff changeset
   243
        """
64d7275445d0 util: expose an "available" API on compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30418
diff changeset
   244
        return True
64d7275445d0 util: expose an "available" API on compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30418
diff changeset
   245
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   246
    def bundletype(self):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   247
        """Describes bundle identifiers for this engine.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   248
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   249
        If this compression engine isn't supported for bundles, returns None.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   250
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   251
        If this engine can be used for bundles, returns a 2-tuple of strings of
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   252
        the user-facing "bundle spec" compression name and an internal
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   253
        identifier used to denote the compression format within bundles. To
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   254
        exclude the name from external usage, set the first element to ``None``.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   255
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   256
        If bundle compression is supported, the class must also implement
30359
673f0fdc1046 util: remove compressorobj API from compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30356
diff changeset
   257
        ``compressstream`` and `decompressorreader``.
31792
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   258
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   259
        The docstring of this method is used in the help system to tell users
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   260
        about this engine.
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   261
        """
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   262
        return None
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   263
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   264
    def wireprotosupport(self):
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   265
        """Declare support for this compression format on the wire protocol.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   266
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   267
        If this compression engine isn't supported for compressing wire
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   268
        protocol payloads, returns None.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   269
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   270
        Otherwise, returns ``compenginewireprotosupport`` with the following
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   271
        fields:
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   272
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   273
        * String format identifier
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   274
        * Integer priority for the server
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   275
        * Integer priority for the client
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   276
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   277
        The integer priorities are used to order the advertisement of format
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   278
        support by server and client. The highest integer is advertised
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   279
        first. Integers with non-positive values aren't advertised.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   280
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   281
        The priority values are somewhat arbitrary and only used for default
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   282
        ordering. The relative order can be changed via config options.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   283
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   284
        If wire protocol compression is supported, the class must also implement
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   285
        ``compressstream`` and ``decompressorreader``.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   286
        """
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   287
        return None
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   288
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   289
    def revlogheader(self):
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   290
        """Header added to revlog chunks that identifies this engine.
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   291
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   292
        If this engine can be used to compress revlogs, this method should
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   293
        return the bytes used to identify chunks compressed with this engine.
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   294
        Else, the method should return ``None`` to indicate it does not
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   295
        participate in revlog compression.
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   296
        """
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   297
        return None
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   298
30356
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   299
    def compressstream(self, it, opts=None):
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   300
        """Compress an iterator of chunks.
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   301
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   302
        The method receives an iterator (ideally a generator) of chunks of
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   303
        bytes to be compressed. It returns an iterator (ideally a generator)
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   304
        of bytes of chunks representing the compressed output.
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   305
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   306
        Optionally accepts an argument defining how to perform compression.
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   307
        Each engine treats this argument differently.
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   308
        """
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   309
        raise NotImplementedError()
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   310
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   311
    def decompressorreader(self, fh):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   312
        """Perform decompression on a file object.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   313
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   314
        Argument is an object with a ``read(size)`` method that returns
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   315
        compressed data. Return value is an object with a ``read(size)`` that
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   316
        returns uncompressed data.
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   317
        """
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   318
        raise NotImplementedError()
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   319
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   320
    def revlogcompressor(self, opts=None):
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   321
        """Obtain an object that can be used to compress revlog entries.
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   322
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   323
        The object has a ``compress(data)`` method that compresses binary
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   324
        data. This method returns compressed binary data or ``None`` if
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   325
        the data could not be compressed (too small, not compressible, etc).
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   326
        The returned data should have a header uniquely identifying this
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   327
        compression format so decompression can be routed to this engine.
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   328
        This header should be identified by the ``revlogheader()`` return
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   329
        value.
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   330
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   331
        The object has a ``decompress(data)`` method that decompresses
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   332
        data. The method will only be called if ``data`` begins with
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   333
        ``revlogheader()``. The method should return the raw, uncompressed
39777
b63dee7bd0d9 global: replace most uses of RevlogError with StorageError (API)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39571
diff changeset
   334
        data or raise a ``StorageError``.
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   335
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   336
        The object is reusable but is not thread safe.
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   337
        """
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   338
        raise NotImplementedError()
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   339
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   340
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   341
class _CompressedStreamReader:
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   342
    def __init__(self, fh):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   343
        if safehasattr(fh, 'unbufferedread'):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   344
            self._reader = fh.unbufferedread
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   345
        else:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   346
            self._reader = fh.read
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   347
        self._pending = []
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   348
        self._pos = 0
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   349
        self._eof = False
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   350
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   351
    def _decompress(self, chunk):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   352
        raise NotImplementedError()
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   353
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   354
    def read(self, l):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   355
        buf = []
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   356
        while True:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   357
            while self._pending:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   358
                if len(self._pending[0]) > l + self._pos:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   359
                    newbuf = self._pending[0]
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   360
                    buf.append(newbuf[self._pos : self._pos + l])
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   361
                    self._pos += l
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   362
                    return b''.join(buf)
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   363
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   364
                newbuf = self._pending.pop(0)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   365
                if self._pos:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   366
                    buf.append(newbuf[self._pos :])
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   367
                    l -= len(newbuf) - self._pos
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   368
                else:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   369
                    buf.append(newbuf)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   370
                    l -= len(newbuf)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   371
                self._pos = 0
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   372
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   373
            if self._eof:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   374
                return b''.join(buf)
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   375
            chunk = self._reader(65536)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   376
            self._decompress(chunk)
39209
1af95139e5ec util: improve handling of truncated compressed streams
Joerg Sonnenberger <joerg@bec.de>
parents: 39060
diff changeset
   377
            if not chunk and not self._pending and not self._eof:
1af95139e5ec util: improve handling of truncated compressed streams
Joerg Sonnenberger <joerg@bec.de>
parents: 39060
diff changeset
   378
                # No progress and no new data, bail out
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   379
                return b''.join(buf)
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   380
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   381
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   382
class _GzipCompressedStreamReader(_CompressedStreamReader):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   383
    def __init__(self, fh):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   384
        super(_GzipCompressedStreamReader, self).__init__(fh)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   385
        self._decompobj = zlib.decompressobj()
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   386
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   387
    def _decompress(self, chunk):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   388
        newbuf = self._decompobj.decompress(chunk)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   389
        if newbuf:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   390
            self._pending.append(newbuf)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   391
        d = self._decompobj.copy()
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   392
        try:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   393
            d.decompress(b'x')
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   394
            d.flush()
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   395
            if d.unused_data == b'x':
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   396
                self._eof = True
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   397
        except zlib.error:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   398
            pass
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   399
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   400
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   401
class _BZ2CompressedStreamReader(_CompressedStreamReader):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   402
    def __init__(self, fh):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   403
        super(_BZ2CompressedStreamReader, self).__init__(fh)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   404
        self._decompobj = bz2.BZ2Decompressor()
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   405
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   406
    def _decompress(self, chunk):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   407
        newbuf = self._decompobj.decompress(chunk)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   408
        if newbuf:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   409
            self._pending.append(newbuf)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   410
        try:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   411
            while True:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   412
                newbuf = self._decompobj.decompress(b'')
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   413
                if newbuf:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   414
                    self._pending.append(newbuf)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   415
                else:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   416
                    break
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   417
        except EOFError:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   418
            self._eof = True
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   419
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   420
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   421
class _TruncatedBZ2CompressedStreamReader(_BZ2CompressedStreamReader):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   422
    def __init__(self, fh):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   423
        super(_TruncatedBZ2CompressedStreamReader, self).__init__(fh)
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   424
        newbuf = self._decompobj.decompress(b'BZ')
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   425
        if newbuf:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   426
            self._pending.append(newbuf)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   427
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   428
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   429
class _ZstdCompressedStreamReader(_CompressedStreamReader):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   430
    def __init__(self, fh, zstd):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   431
        super(_ZstdCompressedStreamReader, self).__init__(fh)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   432
        self._zstd = zstd
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   433
        self._decompobj = zstd.ZstdDecompressor().decompressobj()
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   434
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   435
    def _decompress(self, chunk):
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   436
        newbuf = self._decompobj.decompress(chunk)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   437
        if newbuf:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   438
            self._pending.append(newbuf)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   439
        try:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   440
            while True:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   441
                newbuf = self._decompobj.decompress(b'')
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   442
                if newbuf:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   443
                    self._pending.append(newbuf)
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   444
                else:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   445
                    break
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   446
        except self._zstd.ZstdError:
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   447
            self._eof = True
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   448
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   449
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   450
class _zlibengine(compressionengine):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   451
    def name(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   452
        return b'zlib'
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   453
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   454
    def bundletype(self):
31792
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   455
        """zlib compression using the DEFLATE algorithm.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   456
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   457
        All Mercurial clients should support this format. The compression
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   458
        algorithm strikes a reasonable balance between compression ratio
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   459
        and size.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   460
        """
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   461
        return b'gzip', b'GZ'
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   462
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   463
    def wireprotosupport(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   464
        return compewireprotosupport(b'zlib', 20, 20)
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   465
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   466
    def revlogheader(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   467
        return b'x'
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   468
30356
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   469
    def compressstream(self, it, opts=None):
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   470
        opts = opts or {}
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   471
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   472
        z = zlib.compressobj(opts.get(b'level', -1))
30356
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   473
        for chunk in it:
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   474
            data = z.compress(chunk)
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   475
            # Not all calls to compress emit data. It is cheaper to inspect
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   476
            # here than to feed empty chunks through generator.
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   477
            if data:
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   478
                yield data
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   479
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   480
        yield z.flush()
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   481
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   482
    def decompressorreader(self, fh):
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   483
        return _GzipCompressedStreamReader(fh)
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   484
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   485
    class zlibrevlogcompressor:
42042
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   486
        def __init__(self, level=None):
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   487
            self._level = level
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   488
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   489
        def compress(self, data):
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   490
            insize = len(data)
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   491
            # Caller handles empty input case.
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   492
            assert insize > 0
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   493
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   494
            if insize < 44:
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   495
                return None
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   496
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   497
            elif insize <= 1000000:
42042
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   498
                if self._level is None:
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   499
                    compressed = zlib.compress(data)
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   500
                else:
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   501
                    compressed = zlib.compress(data, self._level)
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   502
                if len(compressed) < insize:
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   503
                    return compressed
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   504
                return None
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   505
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   506
            # zlib makes an internal copy of the input buffer, doubling
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   507
            # memory usage for large inputs. So do streaming compression
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   508
            # on large inputs.
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   509
            else:
42042
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   510
                if self._level is None:
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   511
                    z = zlib.compressobj()
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   512
                else:
aaececb4b066 compression: accept level management for zlib compression
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42041
diff changeset
   513
                    z = zlib.compressobj(level=self._level)
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   514
                parts = []
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   515
                pos = 0
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   516
                while pos < insize:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   517
                    pos2 = pos + 2 ** 20
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   518
                    parts.append(z.compress(data[pos:pos2]))
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   519
                    pos = pos2
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   520
                parts.append(z.flush())
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   521
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   522
                if sum(map(len, parts)) < insize:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   523
                    return b''.join(parts)
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   524
                return None
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   525
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   526
        def decompress(self, data):
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   527
            try:
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   528
                return zlib.decompress(data)
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   529
            except zlib.error as e:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   530
                raise error.StorageError(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   531
                    _(b'revlog decompress error: %s')
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   532
                    % stringutil.forcebytestr(e)
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   533
                )
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   534
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   535
    def revlogcompressor(self, opts=None):
42043
1fac9b931d46 compression: introduce a `storage.revlog.zlib.level` configuration
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42042
diff changeset
   536
        level = None
1fac9b931d46 compression: introduce a `storage.revlog.zlib.level` configuration
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42042
diff changeset
   537
        if opts is not None:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   538
            level = opts.get(b'zlib.level')
42043
1fac9b931d46 compression: introduce a `storage.revlog.zlib.level` configuration
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42042
diff changeset
   539
        return self.zlibrevlogcompressor(level)
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   540
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   541
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   542
compengines.register(_zlibengine())
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   543
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   544
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   545
class _bz2engine(compressionengine):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   546
    def name(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   547
        return b'bz2'
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   548
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   549
    def bundletype(self):
31792
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   550
        """An algorithm that produces smaller bundles than ``gzip``.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   551
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   552
        All Mercurial clients should support this format.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   553
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   554
        This engine will likely produce smaller bundles than ``gzip`` but
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   555
        will be significantly slower, both during compression and
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   556
        decompression.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   557
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   558
        If available, the ``zstd`` engine can yield similar or better
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   559
        compression at much higher speeds.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   560
        """
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   561
        return b'bzip2', b'BZ'
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   562
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   563
    # We declare a protocol name but don't advertise by default because
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   564
    # it is slow.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   565
    def wireprotosupport(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   566
        return compewireprotosupport(b'bzip2', 0, 0)
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   567
30356
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   568
    def compressstream(self, it, opts=None):
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   569
        opts = opts or {}
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   570
        z = bz2.BZ2Compressor(opts.get(b'level', 9))
30356
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   571
        for chunk in it:
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   572
            data = z.compress(chunk)
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   573
            if data:
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   574
                yield data
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   575
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   576
        yield z.flush()
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   577
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   578
    def decompressorreader(self, fh):
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   579
        return _BZ2CompressedStreamReader(fh)
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   580
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   581
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   582
compengines.register(_bz2engine())
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   583
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   584
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   585
class _truncatedbz2engine(compressionengine):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   586
    def name(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   587
        return b'bz2truncated'
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   588
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   589
    def bundletype(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   590
        return None, b'_truncatedBZ'
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   591
30359
673f0fdc1046 util: remove compressorobj API from compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30356
diff changeset
   592
    # We don't implement compressstream because it is hackily handled elsewhere.
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   593
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   594
    def decompressorreader(self, fh):
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   595
        return _TruncatedBZ2CompressedStreamReader(fh)
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   596
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   597
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   598
compengines.register(_truncatedbz2engine())
30265
6a8aff737a17 util: put compression code next to each other
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30181
diff changeset
   599
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   600
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   601
class _noopengine(compressionengine):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   602
    def name(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   603
        return b'none'
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   604
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   605
    def bundletype(self):
31792
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   606
        """No compression is performed.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   607
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   608
        Use this compression engine to explicitly disable compression.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   609
        """
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   610
        return b'none', b'UN'
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   611
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   612
    # Clients always support uncompressed payloads. Servers don't because
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   613
    # unless you are on a fast network, uncompressed payloads can easily
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   614
    # saturate your network pipe.
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   615
    def wireprotosupport(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   616
        return compewireprotosupport(b'none', 0, 10)
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   617
46121
84130fd74a22 revlog: support none compression
Joerg Sonnenberger <joerg@bec.de>
parents: 45942
diff changeset
   618
    # revlog special cases the uncompressed case, but implementing
84130fd74a22 revlog: support none compression
Joerg Sonnenberger <joerg@bec.de>
parents: 45942
diff changeset
   619
    # revlogheader allows forcing uncompressed storage.
84130fd74a22 revlog: support none compression
Joerg Sonnenberger <joerg@bec.de>
parents: 45942
diff changeset
   620
    def revlogheader(self):
84130fd74a22 revlog: support none compression
Joerg Sonnenberger <joerg@bec.de>
parents: 45942
diff changeset
   621
        return b'\0'
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   622
30356
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   623
    def compressstream(self, it, opts=None):
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   624
        return it
c86109eface7 util: add a stream compression API to compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30355
diff changeset
   625
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   626
    def decompressorreader(self, fh):
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   627
        return fh
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   628
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   629
    class nooprevlogcompressor:
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   630
        def compress(self, data):
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   631
            return None
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   632
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   633
    def revlogcompressor(self, opts=None):
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   634
        return self.nooprevlogcompressor()
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   635
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   636
30350
358cda0af6ee util: create new abstraction for compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30332
diff changeset
   637
compengines.register(_noopengine())
26266
1e042e31bd0c changegroup: move all compressions utilities in util
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 26201
diff changeset
   638
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   639
30442
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   640
class _zstdengine(compressionengine):
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   641
    def name(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   642
        return b'zstd'
30442
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   643
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   644
    @propertycache
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   645
    def _module(self):
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   646
        # Not all installs have the zstd module available. So defer importing
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   647
        # until first access.
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   648
        try:
43475
9ce76155e0da compression: tell pytype to not sweat a missing `zstd` module
Augie Fackler <augie@google.com>
parents: 43089
diff changeset
   649
            from .. import zstd  # pytype: disable=import-error
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   650
30442
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   651
            # Force delayed import.
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   652
            zstd.__version__
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   653
            return zstd
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   654
        except ImportError:
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   655
            return None
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   656
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   657
    def available(self):
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   658
        return bool(self._module)
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   659
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   660
    def bundletype(self):
31792
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   661
        """A modern compression algorithm that is fast and highly flexible.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   662
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   663
        Only supported by Mercurial 4.1 and newer clients.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   664
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   665
        With the default settings, zstd compression is both faster and yields
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   666
        better compression than ``gzip``. It also frequently yields better
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   667
        compression than ``bzip2`` while operating at much higher speeds.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   668
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   669
        If this engine is available and backwards compatibility is not a
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   670
        concern, it is likely the best available engine.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   671
        """
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   672
        return b'zstd', b'ZS'
30442
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   673
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   674
    def wireprotosupport(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   675
        return compewireprotosupport(b'zstd', 50, 50)
30761
7283719e2bfd util: declare wire protocol support of compression engines
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30745
diff changeset
   676
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   677
    def revlogheader(self):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   678
        return b'\x28'
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   679
30442
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   680
    def compressstream(self, it, opts=None):
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   681
        opts = opts or {}
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   682
        # zstd level 3 is almost always significantly faster than zlib
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   683
        # while providing no worse compression. It strikes a good balance
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   684
        # between speed and compression.
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   685
        level = opts.get(b'level', 3)
46776
471cd86c8eb4 bundle: optional multithreaded compression, ATM zstd-only
Joerg Sonnenberger <joerg@bec.de>
parents: 46121
diff changeset
   686
        # default to single-threaded compression
471cd86c8eb4 bundle: optional multithreaded compression, ATM zstd-only
Joerg Sonnenberger <joerg@bec.de>
parents: 46121
diff changeset
   687
        threads = opts.get(b'threads', 0)
30442
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   688
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   689
        zstd = self._module
46776
471cd86c8eb4 bundle: optional multithreaded compression, ATM zstd-only
Joerg Sonnenberger <joerg@bec.de>
parents: 46121
diff changeset
   690
        z = zstd.ZstdCompressor(level=level, threads=threads).compressobj()
30442
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   691
        for chunk in it:
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   692
            data = z.compress(chunk)
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   693
            if data:
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   694
                yield data
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   695
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   696
        yield z.flush()
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   697
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   698
    def decompressorreader(self, fh):
38713
27391d74aaa2 ssh: avoid reading beyond the end of stream when using compression
Joerg Sonnenberger <joerg@bec.de>
parents: 38575
diff changeset
   699
        return _ZstdCompressedStreamReader(fh, self._module)
30442
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   700
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   701
    class zstdrevlogcompressor:
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   702
        def __init__(self, zstd, level=3):
37496
1765ed63db40 util: drop write_content_size=True
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37455
diff changeset
   703
            # TODO consider omitting frame magic to save 4 bytes.
1765ed63db40 util: drop write_content_size=True
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37455
diff changeset
   704
            # This writes content sizes into the frame header. That is
1765ed63db40 util: drop write_content_size=True
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37455
diff changeset
   705
            # extra storage. But it allows a correct size memory allocation
1765ed63db40 util: drop write_content_size=True
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37455
diff changeset
   706
            # to hold the result.
1765ed63db40 util: drop write_content_size=True
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37455
diff changeset
   707
            self._cctx = zstd.ZstdCompressor(level=level)
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   708
            self._dctx = zstd.ZstdDecompressor()
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   709
            self._compinsize = zstd.COMPRESSION_RECOMMENDED_INPUT_SIZE
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   710
            self._decompinsize = zstd.DECOMPRESSION_RECOMMENDED_INPUT_SIZE
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   711
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   712
        def compress(self, data):
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   713
            insize = len(data)
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   714
            # Caller handles empty input case.
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   715
            assert insize > 0
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   716
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   717
            if insize < 50:
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   718
                return None
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   719
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   720
            elif insize <= 1000000:
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   721
                compressed = self._cctx.compress(data)
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   722
                if len(compressed) < insize:
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   723
                    return compressed
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   724
                return None
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   725
            else:
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   726
                z = self._cctx.compressobj()
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   727
                chunks = []
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   728
                pos = 0
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   729
                while pos < insize:
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   730
                    pos2 = pos + self._compinsize
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   731
                    chunk = z.compress(data[pos:pos2])
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   732
                    if chunk:
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   733
                        chunks.append(chunk)
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   734
                    pos = pos2
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   735
                chunks.append(z.flush())
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   736
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   737
                if sum(map(len, chunks)) < insize:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   738
                    return b''.join(chunks)
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   739
                return None
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   740
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   741
        def decompress(self, data):
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   742
            insize = len(data)
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   743
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   744
            try:
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   745
                # This was measured to be faster than other streaming
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   746
                # decompressors.
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   747
                dobj = self._dctx.decompressobj()
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   748
                chunks = []
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   749
                pos = 0
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   750
                while pos < insize:
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   751
                    pos2 = pos + self._decompinsize
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   752
                    chunk = dobj.decompress(data[pos:pos2])
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   753
                    if chunk:
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   754
                        chunks.append(chunk)
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   755
                    pos = pos2
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   756
                # Frame should be exhausted, so no finish() API.
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   757
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   758
                return b''.join(chunks)
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   759
            except Exception as e:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   760
                raise error.StorageError(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   761
                    _(b'revlog decompress error: %s')
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   762
                    % stringutil.forcebytestr(e)
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   763
                )
30798
f50c0db50025 util: compression APIs to support revlog decompression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30794
diff changeset
   764
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   765
    def revlogcompressor(self, opts=None):
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   766
        opts = opts or {}
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   767
        level = opts.get(b'zstd.level')
42044
bb271ec2fbfb compression: introduce a `storage.revlog.zstd.level` configuration
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42043
diff changeset
   768
        if level is None:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   769
            level = opts.get(b'level')
42044
bb271ec2fbfb compression: introduce a `storage.revlog.zstd.level` configuration
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42043
diff changeset
   770
        if level is None:
bb271ec2fbfb compression: introduce a `storage.revlog.zstd.level` configuration
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42043
diff changeset
   771
            level = 3
bb271ec2fbfb compression: introduce a `storage.revlog.zstd.level` configuration
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42043
diff changeset
   772
        return self.zstdrevlogcompressor(self._module, level=level)
30794
31e1f0d4ab44 util: compression APIs to support revlog compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30773
diff changeset
   773
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   774
30442
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   775
compengines.register(_zstdengine())
41a8106789ca util: implement zstd compression engine
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30438
diff changeset
   776
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   777
31792
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   778
def bundlecompressiontopics():
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   779
    """Obtains a list of available bundle compressions for use in help."""
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   780
    # help.makeitemsdocs() expects a dict of names to items with a .__doc__.
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   781
    items = {}
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   782
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   783
    # We need to format the docstring. So use a dummy object/type to hold it
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   784
    # rather than mutating the original.
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   785
    class docobject:
31792
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   786
        pass
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   787
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   788
    for name in compengines:
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   789
        engine = compengines[name]
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   790
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   791
        if not engine.available():
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   792
            continue
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   793
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   794
        bt = engine.bundletype()
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   795
        if not bt or not bt[0]:
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   796
            continue
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   797
40251
3c89227788a2 py3: build help of compression engines in bytes
Yuya Nishihara <yuya@tcha.org>
parents: 40029
diff changeset
   798
        doc = b'``%s``\n    %s' % (bt[0], pycompat.getdoc(engine.bundletype))
31792
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   799
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   800
        value = docobject()
40251
3c89227788a2 py3: build help of compression engines in bytes
Yuya Nishihara <yuya@tcha.org>
parents: 40029
diff changeset
   801
        value.__doc__ = pycompat.sysstr(doc)
33818
ed04d7254a91 i18n: use saved object to get actual function information if available
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 33799
diff changeset
   802
        value._origdoc = engine.bundletype.__doc__
ed04d7254a91 i18n: use saved object to get actual function information if available
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 33799
diff changeset
   803
        value._origfunc = engine.bundletype
31792
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   804
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   805
        items[bt[0]] = value
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   806
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   807
    return items
55c0c91f55e6 util: document bundle compression
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31777
diff changeset
   808
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42044
diff changeset
   809
33820
fa7e30efe05a i18n: get translation entries for description of each compression engines
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 33818
diff changeset
   810
i18nfunctions = bundlecompressiontopics().values()