mercurial/setdiscovery.py
author Pierre-Yves David <pierre-yves.david@fb.com>
Sat, 01 Nov 2014 23:52:53 +0000
branchstable
changeset 23130 ced632394371
parent 23084 3ef893520a85
child 23191 86c35b7ae300
permissions -rw-r--r--
setdiscovery: limit the size of all sample (issue4411) Further digging on this issue show that the limit on the sample size used in discovery never works for heads. Here is a quote from the code itself: desiredlen = size - len(always) if desiredlen <= 0: # This could be bad if there are very many heads, all unknown to the # server. We're counting on long request support here. The long request support never landed and evolution make the "very many heads, all unknown to the server" case quite common. We implement a simple and stupid hard limit of sample size for all query. This should prevent HTTP 414 error with the current state of the code.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     1
# setdiscovery.py - improved discovery of common nodeset for mercurial
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     2
#
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     3
# Copyright 2010 Benoit Boissinot <bboissin@gmail.com>
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     4
# and Peter Arrenbrecht <peter@arrenbrecht.ch>
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     5
#
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     6
# This software may be used and distributed according to the terms of the
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     7
# GNU General Public License version 2 or any later version.
20656
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
     8
"""
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
     9
Algorithm works in the following way. You have two repository: local and
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    10
remote. They both contains a DAG of changelists.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    11
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    12
The goal of the discovery protocol is to find one set of node *common*,
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    13
the set of nodes shared by local and remote.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    14
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    15
One of the issue with the original protocol was latency, it could
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    16
potentially require lots of roundtrips to discover that the local repo was a
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    17
subset of remote (which is a very common case, you usually have few changes
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    18
compared to upstream, while upstream probably had lots of development).
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    19
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    20
The new protocol only requires one interface for the remote repo: `known()`,
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    21
which given a set of changelists tells you if they are present in the DAG.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    22
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    23
The algorithm then works as follow:
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    24
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    25
 - We will be using three sets, `common`, `missing`, `unknown`. Originally
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    26
 all nodes are in `unknown`.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    27
 - Take a sample from `unknown`, call `remote.known(sample)`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    28
   - For each node that remote knows, move it and all its ancestors to `common`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    29
   - For each node that remote doesn't know, move it and all its descendants
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    30
   to `missing`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    31
 - Iterate until `unknown` is empty
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    32
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    33
There are a couple optimizations, first is instead of starting with a random
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    34
sample of missing, start by sending all heads, in the case where the local
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    35
repo is a subset, you computed the answer in one round trip.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    36
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    37
Then you can do something similar to the bisecting strategy used when
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    38
finding faulty changesets. Instead of random samples, you can try picking
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    39
nodes that will maximize the number of nodes that will be
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    40
classified with it (since all ancestors or descendants will be marked as well).
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    41
"""
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    42
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    43
from node import nullid
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    44
from i18n import _
20034
1e5b38a919dd cleanup: move stdlib imports to their own import statement
Augie Fackler <raf@durin42.com>
parents: 17426
diff changeset
    45
import random
1e5b38a919dd cleanup: move stdlib imports to their own import statement
Augie Fackler <raf@durin42.com>
parents: 17426
diff changeset
    46
import util, dagutil
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    47
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    48
def _updatesample(dag, nodes, sample, always, quicksamplesize=0):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    49
    # if nodes is empty we scan the entire graph
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    50
    if nodes:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    51
        heads = dag.headsetofconnecteds(nodes)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    52
    else:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    53
        heads = dag.heads()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    54
    dist = {}
16834
cafd8a8fb713 util: subclass deque for Python 2.4 backwards compatibility
Bryan O'Sullivan <bryano@fb.com>
parents: 16683
diff changeset
    55
    visit = util.deque(heads)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    56
    seen = set()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    57
    factor = 1
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    58
    while visit:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    59
        curr = visit.popleft()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    60
        if curr in seen:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    61
            continue
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    62
        d = dist.setdefault(curr, 1)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    63
        if d > factor:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    64
            factor *= 2
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    65
        if d == factor:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    66
            if curr not in always: # need this check for the early exit below
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    67
                sample.add(curr)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    68
                if quicksamplesize and (len(sample) >= quicksamplesize):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    69
                    return
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    70
        seen.add(curr)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    71
        for p in dag.parents(curr):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    72
            if not nodes or p in nodes:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    73
                dist.setdefault(p, d + 1)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    74
                visit.append(p)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    75
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    76
def _setupsample(dag, nodes, size):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    77
    if len(nodes) <= size:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    78
        return set(nodes), None, 0
15063
c20688b7c061 setdiscovery: fix hang when #heads>200 (issue2971)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14981
diff changeset
    79
    always = dag.headsetofconnecteds(nodes)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    80
    desiredlen = size - len(always)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    81
    if desiredlen <= 0:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    82
        # This could be bad if there are very many heads, all unknown to the
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    83
        # server. We're counting on long request support here.
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    84
        return always, None, desiredlen
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    85
    return always, set(), desiredlen
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    86
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    87
def _takequicksample(dag, nodes, size, initial):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    88
    always, sample, desiredlen = _setupsample(dag, nodes, size)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    89
    if sample is None:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    90
        return always
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    91
    if initial:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    92
        fromset = None
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    93
    else:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    94
        fromset = nodes
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    95
    _updatesample(dag, fromset, sample, always, quicksamplesize=desiredlen)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    96
    sample.update(always)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    97
    return sample
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    98
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    99
def _takefullsample(dag, nodes, size):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   100
    always, sample, desiredlen = _setupsample(dag, nodes, size)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   101
    if sample is None:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   102
        return always
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   103
    # update from heads
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   104
    _updatesample(dag, nodes, sample, always)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   105
    # update from roots
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   106
    _updatesample(dag.inverse(), nodes, sample, always)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   107
    assert sample
23083
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   108
    sample = _limitsample(sample, desiredlen)
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   109
    if len(sample) < desiredlen:
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   110
        more = desiredlen - len(sample)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   111
        sample.update(random.sample(list(nodes - sample - always), more))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   112
    sample.update(always)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   113
    return sample
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   114
23083
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   115
def _limitsample(sample, desiredlen):
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   116
    """return a random subset of sample of at most desiredlen item"""
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   117
    if len(sample) > desiredlen:
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   118
        sample = set(random.sample(sample, desiredlen))
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   119
    return sample
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   120
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   121
def findcommonheads(ui, local, remote,
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   122
                    initialsamplesize=100,
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   123
                    fullsamplesize=200,
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   124
                    abortwhenunrelated=True):
14206
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   125
    '''Return a tuple (common, anyincoming, remoteheads) used to identify
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   126
    missing nodes from or in remote.
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   127
    '''
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   128
    roundtrips = 0
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   129
    cl = local.changelog
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   130
    dag = dagutil.revlogdag(cl)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   131
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   132
    # early exit if we know all the specified remote heads already
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   133
    ui.debug("query 1; heads\n")
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   134
    roundtrips += 1
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   135
    ownheads = dag.heads()
23084
3ef893520a85 setdiscovery: limit the size of the initial sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23083
diff changeset
   136
    sample = _limitsample(ownheads, initialsamplesize)
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   137
    if remote.local():
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   138
        # stopgap until we have a proper localpeer that supports batch()
17204
4feb55e6931f localpeer: return only visible heads and branchmap
Pierre-Yves David <pierre-yves.david@ens-lyon.org>
parents: 17191
diff changeset
   139
        srvheadhashes = remote.heads()
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   140
        yesno = remote.known(dag.externalizeall(sample))
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   141
    elif remote.capable('batch'):
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   142
        batch = remote.batch()
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   143
        srvheadhashesref = batch.heads()
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   144
        yesnoref = batch.known(dag.externalizeall(sample))
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   145
        batch.submit()
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   146
        srvheadhashes = srvheadhashesref.value
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   147
        yesno = yesnoref.value
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   148
    else:
17424
e7cfe3587ea4 fix trivial spelling errors
Mads Kiilerich <mads@kiilerich.com>
parents: 17204
diff changeset
   149
        # compatibility with pre-batch, but post-known remotes during 1.9
e7cfe3587ea4 fix trivial spelling errors
Mads Kiilerich <mads@kiilerich.com>
parents: 17204
diff changeset
   150
        # development
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   151
        srvheadhashes = remote.heads()
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   152
        sample = []
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   153
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   154
    if cl.tip() == nullid:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   155
        if srvheadhashes != [nullid]:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   156
            return [nullid], True, srvheadhashes
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   157
        return [nullid], False, []
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   158
14206
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   159
    # start actual discovery (we note this before the next "if" for
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   160
    # compatibility reasons)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   161
    ui.status(_("searching for changes\n"))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   162
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   163
    srvheads = dag.internalizeall(srvheadhashes, filterunknown=True)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   164
    if len(srvheads) == len(srvheadhashes):
14833
308e1b5acc87 discovery: quiet note about heads
Matt Mackall <mpm@selenic.com>
parents: 14624
diff changeset
   165
        ui.debug("all remote heads known locally\n")
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   166
        return (srvheadhashes, False, srvheadhashes,)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   167
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   168
    if sample and util.all(yesno):
15497
9bea3aed6ee1 add missing localization markup
Mads Kiilerich <mads@kiilerich.com>
parents: 15063
diff changeset
   169
        ui.note(_("all local heads known remotely\n"))
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   170
        ownheadhashes = dag.externalizeall(ownheads)
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   171
        return (ownheadhashes, True, srvheadhashes,)
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   172
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   173
    # full blown discovery
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   174
16683
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   175
    # own nodes where I don't know if remote knows them
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   176
    undecided = dag.nodeset()
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   177
    # own nodes I know we both know
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   178
    common = set()
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   179
    # own nodes I know remote lacks
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   180
    missing = set()
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   181
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   182
    # treat remote heads (and maybe own heads) as a first implicit sample
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   183
    # response
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   184
    common.update(dag.ancestorset(srvheads))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   185
    undecided.difference_update(common)
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   186
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   187
    full = False
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   188
    while undecided:
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   189
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   190
        if sample:
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   191
            commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   192
            common.update(dag.ancestorset(commoninsample, common))
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   193
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   194
            missinginsample = [n for i, n in enumerate(sample) if not yesno[i]]
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   195
            missing.update(dag.descendantset(missinginsample, missing))
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   196
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   197
            undecided.difference_update(missing)
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   198
            undecided.difference_update(common)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   199
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   200
        if not undecided:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   201
            break
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   202
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   203
        if full:
15497
9bea3aed6ee1 add missing localization markup
Mads Kiilerich <mads@kiilerich.com>
parents: 15063
diff changeset
   204
            ui.note(_("sampling from both directions\n"))
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   205
            sample = _takefullsample(dag, undecided, size=fullsamplesize)
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
   206
            targetsize = fullsamplesize
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   207
        elif common:
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   208
            # use cheapish initial sample
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   209
            ui.debug("taking initial sample\n")
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   210
            sample = _takefullsample(dag, undecided, size=fullsamplesize)
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
   211
            targetsize = fullsamplesize
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   212
        else:
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   213
            # use even cheaper initial sample
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   214
            ui.debug("taking quick initial sample\n")
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   215
            sample = _takequicksample(dag, undecided, size=initialsamplesize,
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   216
                                      initial=True)
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
   217
            targetsize = initialsamplesize
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
   218
        sample = _limitsample(sample, targetsize)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   219
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   220
        roundtrips += 1
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   221
        ui.progress(_('searching'), roundtrips, unit=_('queries'))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   222
        ui.debug("query %i; still undecided: %i, sample size is: %i\n"
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   223
                 % (roundtrips, len(undecided), len(sample)))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   224
        # indices between sample and externalized version must match
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   225
        sample = list(sample)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   226
        yesno = remote.known(dag.externalizeall(sample))
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   227
        full = True
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   228
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   229
    result = dag.headsetofconnecteds(common)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   230
    ui.progress(_('searching'), None)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   231
    ui.debug("%d total queries\n" % roundtrips)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   232
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   233
    if not result and srvheadhashes != [nullid]:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   234
        if abortwhenunrelated:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   235
            raise util.Abort(_("repository is unrelated"))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   236
        else:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   237
            ui.warn(_("warning: repository is unrelated\n"))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   238
        return (set([nullid]), True, srvheadhashes,)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   239
14981
192e02680d09 setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents: 14833
diff changeset
   240
    anyincoming = (srvheadhashes != [nullid])
192e02680d09 setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents: 14833
diff changeset
   241
    return dag.externalizeall(result), anyincoming, srvheadhashes