mercurial/revlogutils/sidedata.py
author Joerg Sonnenberger <joerg@bec.de>
Fri, 30 Apr 2021 02:11:58 +0200
changeset 47043 12450fbea288
parent 46709 3d740058b467
child 47085 3aab2330b7d3
permissions -rw-r--r--
manifests: push down expected node length into the parser This strictly enforces the node length in the manifest lines according to what the repository expects. One test case moves large hash testing into the non-treemanifest part as treemanifests don't provide an interface for overriding just the node length for now. Differential Revision: https://phab.mercurial-scm.org/D10533
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
43033
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     1
# sidedata.py - Logic around store extra data alongside revlog revisions
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     2
#
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     3
# Copyright 2019 Pierre-Yves David <pierre-yves.david@octobus.net)
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     4
#
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     5
# This software may be used and distributed according to the terms of the
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     6
# GNU General Public License version 2 or any later version.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     7
"""core code for "sidedata" support
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     8
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     9
The "sidedata" are stored alongside the revision without actually being part of
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    10
its content and not affecting its hash. It's main use cases is to cache
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    11
important information related to a changesets.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    12
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    13
The current implementation is experimental and subject to changes. Do not rely
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    14
on it in production.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    15
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    16
Sidedata are stored in the revlog itself, thanks to a new version of the
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    17
revlog. The following format is currently used::
43033
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    18
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    19
    initial header:
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    20
        <number of sidedata; 2 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    21
    sidedata (repeated N times):
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    22
        <sidedata-key; 2 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    23
        <sidedata-entry-length: 4 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    24
        <sidedata-content-sha1-digest: 20 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    25
        <sidedata-content; X bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    26
    normal raw text:
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    27
        <all bytes remaining in the rawtext>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    28
46195
d6a9e690d620 comments: fix typos
Joerg Sonnenberger <joerg@bec.de>
parents: 45634
diff changeset
    29
This is a simple and effective format. It should be enough to experiment with
43033
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    30
the concept.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    31
"""
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    32
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    33
from __future__ import absolute_import
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    34
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    35
import struct
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    36
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    37
from .. import error
44060
a61287a95dc3 core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents: 43506
diff changeset
    38
from ..utils import hashutil
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    39
43040
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    40
## sidedata type constant
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    41
# reserve a block for testing purposes.
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    42
SD_TEST1 = 1
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    43
SD_TEST2 = 2
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    44
SD_TEST3 = 3
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    45
SD_TEST4 = 4
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    46
SD_TEST5 = 5
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    47
SD_TEST6 = 6
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    48
SD_TEST7 = 7
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    49
43142
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    50
# key to store copies related information
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    51
SD_P1COPIES = 8
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    52
SD_P2COPIES = 9
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    53
SD_FILESADDED = 10
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    54
SD_FILESREMOVED = 11
45634
9a6b409b8ebc changing-files: rework the way we store changed files in side-data
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 44060
diff changeset
    55
SD_FILES = 12
43142
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    56
43040
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    57
# internal format constant
43506
9f70512ae2cf cleanup: remove pointless r-prefixes on single-quoted strings
Augie Fackler <augie@google.com>
parents: 43142
diff changeset
    58
SIDEDATA_HEADER = struct.Struct('>H')
9f70512ae2cf cleanup: remove pointless r-prefixes on single-quoted strings
Augie Fackler <augie@google.com>
parents: 43142
diff changeset
    59
SIDEDATA_ENTRY = struct.Struct('>HL20s')
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    60
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43048
diff changeset
    61
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    62
def serialize_sidedata(sidedata):
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    63
    sidedata = list(sidedata.items())
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    64
    sidedata.sort()
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    65
    buf = [SIDEDATA_HEADER.pack(len(sidedata))]
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    66
    for key, value in sidedata:
44060
a61287a95dc3 core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents: 43506
diff changeset
    67
        digest = hashutil.sha1(value).digest()
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    68
        buf.append(SIDEDATA_ENTRY.pack(key, len(value), digest))
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    69
    for key, value in sidedata:
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    70
        buf.append(value)
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    71
    buf = b''.join(buf)
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    72
    return buf
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    73
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43048
diff changeset
    74
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    75
def deserialize_sidedata(blob):
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    76
    sidedata = {}
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    77
    offset = 0
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    78
    (nbentry,) = SIDEDATA_HEADER.unpack(blob[: SIDEDATA_HEADER.size])
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    79
    offset += SIDEDATA_HEADER.size
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    80
    dataoffset = SIDEDATA_HEADER.size + (SIDEDATA_ENTRY.size * nbentry)
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    81
    for i in range(nbentry):
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    82
        nextoffset = offset + SIDEDATA_ENTRY.size
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    83
        key, size, storeddigest = SIDEDATA_ENTRY.unpack(blob[offset:nextoffset])
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    84
        offset = nextoffset
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    85
        # read the data associated with that entry
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    86
        nextdataoffset = dataoffset + size
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    87
        entrytext = bytes(blob[dataoffset:nextdataoffset])
44060
a61287a95dc3 core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents: 43506
diff changeset
    88
        readdigest = hashutil.sha1(entrytext).digest()
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    89
        if storeddigest != readdigest:
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    90
            raise error.SidedataHashError(key, storeddigest, readdigest)
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    91
        sidedata[key] = entrytext
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    92
        dataoffset = nextdataoffset
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
    93
    return sidedata