mercurial/revlogutils/sidedata.py
author Augie Fackler <augie@google.com>
Mon, 13 Jan 2020 17:15:14 -0500
changeset 44060 a61287a95dc3
parent 43506 9f70512ae2cf
child 45634 9a6b409b8ebc
permissions -rw-r--r--
core: migrate uses of hashlib.sha1 to hashutil.sha1 Differential Revision: https://phab.mercurial-scm.org/D7849
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
43033
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     1
# sidedata.py - Logic around store extra data alongside revlog revisions
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     2
#
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     3
# Copyright 2019 Pierre-Yves David <pierre-yves.david@octobus.net)
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     4
#
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     5
# This software may be used and distributed according to the terms of the
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     6
# GNU General Public License version 2 or any later version.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     7
"""core code for "sidedata" support
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     8
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
     9
The "sidedata" are stored alongside the revision without actually being part of
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    10
its content and not affecting its hash. It's main use cases is to cache
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    11
important information related to a changesets.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    12
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    13
The current implementation is experimental and subject to changes. Do not rely
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    14
on it in production.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    15
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    16
Sidedata are stored in the revlog itself, withing the revision rawtext. They
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    17
are inserted, removed from it using the flagprocessors mechanism. The following
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    18
format is currently used::
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    19
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    20
    initial header:
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    21
        <number of sidedata; 2 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    22
    sidedata (repeated N times):
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    23
        <sidedata-key; 2 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    24
        <sidedata-entry-length: 4 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    25
        <sidedata-content-sha1-digest: 20 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    26
        <sidedata-content; X bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    27
    normal raw text:
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    28
        <all bytes remaining in the rawtext>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    29
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    30
This is a simple and effective format. It should be enought to experiment with
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    31
the concept.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    32
"""
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    33
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
    34
from __future__ import absolute_import
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    35
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    36
import struct
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    37
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    38
from .. import error
44060
a61287a95dc3 core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents: 43506
diff changeset
    39
from ..utils import hashutil
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    40
43040
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    41
## sidedata type constant
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    42
# reserve a block for testing purposes.
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    43
SD_TEST1 = 1
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    44
SD_TEST2 = 2
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    45
SD_TEST3 = 3
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    46
SD_TEST4 = 4
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    47
SD_TEST5 = 5
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    48
SD_TEST6 = 6
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    49
SD_TEST7 = 7
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    50
43142
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    51
# key to store copies related information
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    52
SD_P1COPIES = 8
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    53
SD_P2COPIES = 9
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    54
SD_FILESADDED = 10
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    55
SD_FILESREMOVED = 11
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
    56
43040
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
    57
# internal format constant
43506
9f70512ae2cf cleanup: remove pointless r-prefixes on single-quoted strings
Augie Fackler <augie@google.com>
parents: 43142
diff changeset
    58
SIDEDATA_HEADER = struct.Struct('>H')
9f70512ae2cf cleanup: remove pointless r-prefixes on single-quoted strings
Augie Fackler <augie@google.com>
parents: 43142
diff changeset
    59
SIDEDATA_ENTRY = struct.Struct('>HL20s')
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    60
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43048
diff changeset
    61
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    62
def sidedatawriteprocessor(rl, text, sidedata):
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    63
    sidedata = list(sidedata.items())
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    64
    sidedata.sort()
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    65
    rawtext = [SIDEDATA_HEADER.pack(len(sidedata))]
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    66
    for key, value in sidedata:
44060
a61287a95dc3 core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents: 43506
diff changeset
    67
        digest = hashutil.sha1(value).digest()
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    68
        rawtext.append(SIDEDATA_ENTRY.pack(key, len(value), digest))
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    69
    for key, value in sidedata:
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    70
        rawtext.append(value)
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    71
    rawtext.append(bytes(text))
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    72
    return b''.join(rawtext), False
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
    73
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43048
diff changeset
    74
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    75
def sidedatareadprocessor(rl, text):
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    76
    sidedata = {}
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    77
    offset = 0
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43048
diff changeset
    78
    (nbentry,) = SIDEDATA_HEADER.unpack(text[: SIDEDATA_HEADER.size])
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    79
    offset += SIDEDATA_HEADER.size
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    80
    dataoffset = SIDEDATA_HEADER.size + (SIDEDATA_ENTRY.size * nbentry)
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    81
    for i in range(nbentry):
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    82
        nextoffset = offset + SIDEDATA_ENTRY.size
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    83
        key, size, storeddigest = SIDEDATA_ENTRY.unpack(text[offset:nextoffset])
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    84
        offset = nextoffset
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    85
        # read the data associated with that entry
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    86
        nextdataoffset = dataoffset + size
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    87
        entrytext = text[dataoffset:nextdataoffset]
44060
a61287a95dc3 core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents: 43506
diff changeset
    88
        readdigest = hashutil.sha1(entrytext).digest()
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    89
        if storeddigest != readdigest:
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    90
            raise error.SidedataHashError(key, storeddigest, readdigest)
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    91
        sidedata[key] = entrytext
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    92
        dataoffset = nextdataoffset
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    93
    text = text[dataoffset:]
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
    94
    return text, True, sidedata
43036
e8bc4c3d9a0b sidedata: add a function for _processflagsraw usage
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43035
diff changeset
    95
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43048
diff changeset
    96
43036
e8bc4c3d9a0b sidedata: add a function for _processflagsraw usage
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43035
diff changeset
    97
def sidedatarawprocessor(rl, text):
e8bc4c3d9a0b sidedata: add a function for _processflagsraw usage
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43035
diff changeset
    98
    # side data modifies rawtext and prevent rawtext hash validation
e8bc4c3d9a0b sidedata: add a function for _processflagsraw usage
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43035
diff changeset
    99
    return False
43037
142deb539ccf sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43036
diff changeset
   100
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43048
diff changeset
   101
43037
142deb539ccf sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43036
diff changeset
   102
processors = (
142deb539ccf sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43036
diff changeset
   103
    sidedatareadprocessor,
142deb539ccf sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43036
diff changeset
   104
    sidedatawriteprocessor,
142deb539ccf sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43036
diff changeset
   105
    sidedatarawprocessor,
142deb539ccf sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43036
diff changeset
   106
)