mercurial/setdiscovery.py
author Gregory Szorc <gregory.szorc@gmail.com>
Fri, 17 Aug 2018 19:35:24 +0000
changeset 39178 274acf379dbb
parent 39176 fec01c69b0f0
child 41110 71b0db4fa027
permissions -rw-r--r--
setdiscovery: precompute children revisions to avoid quadratic lookup Moving away from dagutil a few commits ago introduced quadratic behavior when resolving children revisions during discovery. This commit introduces a precompute step of the children revisions to avoid the bad behavior. I believe the new code should have near identical performance to what dagutil was doing before. Behavior is still slightly different because we take into account filtered revisions. But this change was made when we moved off dagutil. I added a comment about multiple invocations of this function redundantly calculating the children revisions. I believe this potentially undesirable behavior was present when we used dagutil, as the call to inverse() previously in this function created a new object and required computing children on every invocation. I thought we should document the potential for a performance issue rather than let it go undocumented. Differential Revision: https://phab.mercurial-scm.org/D4326
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     1
# setdiscovery.py - improved discovery of common nodeset for mercurial
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     2
#
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     3
# Copyright 2010 Benoit Boissinot <bboissin@gmail.com>
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     4
# and Peter Arrenbrecht <peter@arrenbrecht.ch>
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     5
#
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     6
# This software may be used and distributed according to the terms of the
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     7
# GNU General Public License version 2 or any later version.
20656
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
     8
"""
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
     9
Algorithm works in the following way. You have two repository: local and
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    10
remote. They both contains a DAG of changelists.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    11
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    12
The goal of the discovery protocol is to find one set of node *common*,
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    13
the set of nodes shared by local and remote.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    14
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    15
One of the issue with the original protocol was latency, it could
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    16
potentially require lots of roundtrips to discover that the local repo was a
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    17
subset of remote (which is a very common case, you usually have few changes
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    18
compared to upstream, while upstream probably had lots of development).
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    19
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    20
The new protocol only requires one interface for the remote repo: `known()`,
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    21
which given a set of changelists tells you if they are present in the DAG.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    22
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    23
The algorithm then works as follow:
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    24
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    25
 - We will be using three sets, `common`, `missing`, `unknown`. Originally
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    26
 all nodes are in `unknown`.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    27
 - Take a sample from `unknown`, call `remote.known(sample)`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    28
   - For each node that remote knows, move it and all its ancestors to `common`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    29
   - For each node that remote doesn't know, move it and all its descendants
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    30
   to `missing`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    31
 - Iterate until `unknown` is empty
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    32
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    33
There are a couple optimizations, first is instead of starting with a random
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    34
sample of missing, start by sending all heads, in the case where the local
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    35
repo is a subset, you computed the answer in one round trip.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    36
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    37
Then you can do something similar to the bisecting strategy used when
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    38
finding faulty changesets. Instead of random samples, you can try picking
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    39
nodes that will maximize the number of nodes that will be
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    40
classified with it (since all ancestors or descendants will be marked as well).
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    41
"""
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    42
25973
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    43
from __future__ import absolute_import
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    44
25113
0ca8410ea345 util: drop alias for collections.deque
Martin von Zweigbergk <martinvonz@google.com>
parents: 23817
diff changeset
    45
import collections
20034
1e5b38a919dd cleanup: move stdlib imports to their own import statement
Augie Fackler <raf@durin42.com>
parents: 17426
diff changeset
    46
import random
25973
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    47
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    48
from .i18n import _
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    49
from .node import (
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    50
    nullid,
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    51
    nullrev,
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    52
)
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    53
from . import (
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25973
diff changeset
    54
    error,
32712
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32291
diff changeset
    55
    util,
25973
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    56
)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    57
39174
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
    58
def _updatesample(revs, heads, sample, parentfn, quicksamplesize=0):
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    59
    """update an existing sample to match the expected size
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    60
39168
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39166
diff changeset
    61
    The sample is updated with revs exponentially distant from each head of the
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39166
diff changeset
    62
    <revs> set. (H~1, H~2, H~4, H~8, etc).
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    63
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    64
    If a target size is specified, the sampling will stop once this size is
39168
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39166
diff changeset
    65
    reached. Otherwise sampling will happen until roots of the <revs> set are
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    66
    reached.
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    67
39168
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39166
diff changeset
    68
    :revs:  set of revs we want to discover (if None, assume the whole dag)
39170
754f389b87f2 setdiscovery: pass heads into _updatesample()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39169
diff changeset
    69
    :heads: set of DAG head revs
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    70
    :sample: a sample to update
39174
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
    71
    :parentfn: a callable to resolve parents for a revision
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    72
    :quicksamplesize: optional target size of the sample"""
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    73
    dist = {}
25113
0ca8410ea345 util: drop alias for collections.deque
Martin von Zweigbergk <martinvonz@google.com>
parents: 23817
diff changeset
    74
    visit = collections.deque(heads)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    75
    seen = set()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    76
    factor = 1
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    77
    while visit:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    78
        curr = visit.popleft()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    79
        if curr in seen:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    80
            continue
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    81
        d = dist.setdefault(curr, 1)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    82
        if d > factor:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    83
            factor *= 2
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    84
        if d == factor:
23814
6a5877a73141 setdiscovery: drop the 'always' argument to '_updatesample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23813
diff changeset
    85
            sample.add(curr)
6a5877a73141 setdiscovery: drop the 'always' argument to '_updatesample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23813
diff changeset
    86
            if quicksamplesize and (len(sample) >= quicksamplesize):
6a5877a73141 setdiscovery: drop the 'always' argument to '_updatesample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23813
diff changeset
    87
                return
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    88
        seen.add(curr)
39174
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
    89
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
    90
        for p in parentfn(curr):
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
    91
            if p != nullrev and (not revs or p in revs):
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    92
                dist.setdefault(p, d + 1)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    93
                visit.append(p)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    94
39174
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
    95
def _takequicksample(repo, headrevs, revs, size):
23816
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
    96
    """takes a quick sample of size <size>
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
    97
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
    98
    It is meant for initial sampling and focuses on querying heads and close
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
    99
    ancestors of heads.
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
   100
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
   101
    :dag: a dag object
39171
abce899c985f setdiscovery: pass head revisions into sample functions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39170
diff changeset
   102
    :headrevs: set of head revisions in local DAG to consider
39168
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39166
diff changeset
   103
    :revs: set of revs to discover
23816
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
   104
    :size: the maximum size of the sample"""
39169
140992750187 setdiscovery: use a revset for finding DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39168
diff changeset
   105
    sample = set(repo.revs('heads(%ld)', revs))
140992750187 setdiscovery: use a revset for finding DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39168
diff changeset
   106
36718
59802fa590db setdiscovery: avoid a Yoda condition
Martin von Zweigbergk <martinvonz@google.com>
parents: 36717
diff changeset
   107
    if len(sample) >= size:
23815
31e75a362d44 setdiscovery: drop '_setupsample' usage in '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23814
diff changeset
   108
        return _limitsample(sample, size)
39170
754f389b87f2 setdiscovery: pass heads into _updatesample()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39169
diff changeset
   109
39174
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
   110
    _updatesample(None, headrevs, sample, repo.changelog.parentrevs,
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
   111
                  quicksamplesize=size)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   112
    return sample
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   113
39174
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
   114
def _takefullsample(repo, headrevs, revs, size):
39169
140992750187 setdiscovery: use a revset for finding DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39168
diff changeset
   115
    sample = set(repo.revs('heads(%ld)', revs))
140992750187 setdiscovery: use a revset for finding DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39168
diff changeset
   116
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   117
    # update from heads
39173
56279660d264 setdiscovery: use revsets for computing a subset's heads and roots
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39172
diff changeset
   118
    revsheads = set(repo.revs('heads(%ld)', revs))
39174
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
   119
    _updatesample(revs, revsheads, sample, repo.changelog.parentrevs)
39178
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   120
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   121
    # update from roots
39173
56279660d264 setdiscovery: use revsets for computing a subset's heads and roots
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39172
diff changeset
   122
    revsroots = set(repo.revs('roots(%ld)', revs))
39174
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
   123
39178
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   124
    # _updatesample() essentially does interaction over revisions to look up
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   125
    # their children. This lookup is expensive and doing it in a loop is
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   126
    # quadratic. We precompute the children for all relevant revisions and
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   127
    # make the lookup in _updatesample() a simple dict lookup.
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   128
    #
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   129
    # Because this function can be called multiple times during discovery, we
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   130
    # may still perform redundant work and there is room to optimize this by
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   131
    # keeping a persistent cache of children across invocations.
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   132
    children = {}
39174
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
   133
39178
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   134
    parentrevs = repo.changelog.parentrevs
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   135
    for rev in repo.changelog.revs(start=min(revsroots)):
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   136
        # Always ensure revision has an entry so we don't need to worry about
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   137
        # missing keys.
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   138
        children.setdefault(rev, [])
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   139
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   140
        for prev in parentrevs(rev):
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   141
            if prev == nullrev:
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   142
                continue
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   143
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   144
            children.setdefault(prev, []).append(rev)
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   145
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39176
diff changeset
   146
    _updatesample(revs, revsroots, sample, children.__getitem__)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   147
    assert sample
23810
b681d3a2bf04 setdiscovery: randomly pick between heads and sample when taking full sample
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23809
diff changeset
   148
    sample = _limitsample(sample, size)
b681d3a2bf04 setdiscovery: randomly pick between heads and sample when taking full sample
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23809
diff changeset
   149
    if len(sample) < size:
b681d3a2bf04 setdiscovery: randomly pick between heads and sample when taking full sample
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23809
diff changeset
   150
        more = size - len(sample)
39168
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39166
diff changeset
   151
        sample.update(random.sample(list(revs - sample), more))
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   152
    return sample
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   153
23083
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   154
def _limitsample(sample, desiredlen):
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   155
    """return a random subset of sample of at most desiredlen item"""
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   156
    if len(sample) > desiredlen:
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   157
        sample = set(random.sample(sample, desiredlen))
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   158
    return sample
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   159
36715
613954a17a25 setdiscovery: back out changeset 5cfdf6137af8 (issue5809)
Martin von Zweigbergk <martinvonz@google.com>
parents: 35849
diff changeset
   160
def findcommonheads(ui, local, remote,
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   161
                    initialsamplesize=100,
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   162
                    fullsamplesize=200,
35304
f77121b6bf1b setdiscover: allow to ignore part of the local graph
Boris Feld <boris.feld@octobus.net>
parents: 32768
diff changeset
   163
                    abortwhenunrelated=True,
f77121b6bf1b setdiscover: allow to ignore part of the local graph
Boris Feld <boris.feld@octobus.net>
parents: 32768
diff changeset
   164
                    ancestorsof=None):
14206
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   165
    '''Return a tuple (common, anyincoming, remoteheads) used to identify
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   166
    missing nodes from or in remote.
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   167
    '''
32712
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32291
diff changeset
   168
    start = util.timer()
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32291
diff changeset
   169
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   170
    roundtrips = 0
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   171
    cl = local.changelog
39159
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38373
diff changeset
   172
    clnode = cl.node
39161
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   173
    clrev = cl.rev
39159
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38373
diff changeset
   174
35304
f77121b6bf1b setdiscover: allow to ignore part of the local graph
Boris Feld <boris.feld@octobus.net>
parents: 32768
diff changeset
   175
    if ancestorsof is not None:
39165
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39161
diff changeset
   176
        ownheads = [clrev(n) for n in ancestorsof]
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39161
diff changeset
   177
    else:
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39161
diff changeset
   178
        ownheads = [rev for rev in cl.headrevs() if rev != nullrev]
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39161
diff changeset
   179
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   180
    # early exit if we know all the specified remote heads already
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   181
    ui.debug("query 1; heads\n")
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   182
    roundtrips += 1
23084
3ef893520a85 setdiscovery: limit the size of the initial sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23083
diff changeset
   183
    sample = _limitsample(ownheads, initialsamplesize)
23192
73cfaa348650 discovery: indices between sample and yesno must match (issue4438)
Mads Kiilerich <madski@unity3d.com>
parents: 23191
diff changeset
   184
    # indices between sample and externalized version must match
73cfaa348650 discovery: indices between sample and yesno must match (issue4438)
Mads Kiilerich <madski@unity3d.com>
parents: 23191
diff changeset
   185
    sample = list(sample)
37631
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   186
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   187
    with remote.commandexecutor() as e:
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   188
        fheads = e.callcommand('heads', {})
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   189
        fknown = e.callcommand('known', {
39159
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38373
diff changeset
   190
            'nodes': [clnode(r) for r in sample],
37631
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   191
        })
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   192
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   193
    srvheadhashes, yesno = fheads.result(), fknown.result()
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   194
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   195
    if cl.tip() == nullid:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   196
        if srvheadhashes != [nullid]:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   197
            return [nullid], True, srvheadhashes
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   198
        return [nullid], False, []
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   199
14206
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   200
    # start actual discovery (we note this before the next "if" for
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   201
    # compatibility reasons)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   202
    ui.status(_("searching for changes\n"))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   203
39161
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   204
    srvheads = []
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   205
    for node in srvheadhashes:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   206
        if node == nullid:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   207
            continue
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   208
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   209
        try:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   210
            srvheads.append(clrev(node))
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   211
        # Catches unknown and filtered nodes.
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   212
        except error.LookupError:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   213
            continue
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39159
diff changeset
   214
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   215
    if len(srvheads) == len(srvheadhashes):
14833
308e1b5acc87 discovery: quiet note about heads
Matt Mackall <mpm@selenic.com>
parents: 14624
diff changeset
   216
        ui.debug("all remote heads known locally\n")
39159
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38373
diff changeset
   217
        return srvheadhashes, False, srvheadhashes
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   218
36716
bf485b70d0ae setdiscovery: remove initialsamplesize from a condition
Martin von Zweigbergk <martinvonz@google.com>
parents: 36715
diff changeset
   219
    if len(sample) == len(ownheads) and all(yesno):
15497
9bea3aed6ee1 add missing localization markup
Mads Kiilerich <mads@kiilerich.com>
parents: 15063
diff changeset
   220
        ui.note(_("all local heads known remotely\n"))
39159
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38373
diff changeset
   221
        ownheadhashes = [clnode(r) for r in ownheads]
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38373
diff changeset
   222
        return ownheadhashes, True, srvheadhashes
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   223
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   224
    # full blown discovery
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   225
16683
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   226
    # own nodes I know we both know
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   227
    # treat remote heads (and maybe own heads) as a first implicit sample
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   228
    # response
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   229
    common = cl.incrementalmissingrevs(srvheads)
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   230
    commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   231
    common.addbases(commoninsample)
23746
4ef2f2fa8b8b setdiscovery: drop shadowed 'undecided' assignment
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23343
diff changeset
   232
    # own nodes where I don't know if remote knows them
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   233
    undecided = set(common.missingancestors(ownheads))
16683
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   234
    # own nodes I know remote lacks
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   235
    missing = set()
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   236
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   237
    full = False
38350
9e70690a21ac setdiscovery: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents: 37631
diff changeset
   238
    progress = ui.makeprogress(_('searching'), unit=_('queries'))
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   239
    while undecided:
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   240
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   241
        if sample:
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   242
            missinginsample = [n for i, n in enumerate(sample) if not yesno[i]]
39166
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39165
diff changeset
   243
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39165
diff changeset
   244
            if missing:
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39165
diff changeset
   245
                missing.update(local.revs('descendants(%ld) - descendants(%ld)',
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39165
diff changeset
   246
                                          missinginsample, missing))
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39165
diff changeset
   247
            else:
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39165
diff changeset
   248
                missing.update(local.revs('descendants(%ld)', missinginsample))
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   249
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   250
            undecided.difference_update(missing)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   251
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   252
        if not undecided:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   253
            break
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   254
23747
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   255
        if full or common.hasbases():
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   256
            if full:
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   257
                ui.note(_("sampling from both directions\n"))
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   258
            else:
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   259
                ui.debug("taking initial sample\n")
23807
e97e363a7000 setdiscovery: delay sample building calls to gather them in a single place
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23806
diff changeset
   260
            samplefunc = _takefullsample
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
   261
            targetsize = fullsamplesize
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   262
        else:
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   263
            # use even cheaper initial sample
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   264
            ui.debug("taking quick initial sample\n")
23807
e97e363a7000 setdiscovery: delay sample building calls to gather them in a single place
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23806
diff changeset
   265
            samplefunc = _takequicksample
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
   266
            targetsize = initialsamplesize
23808
07d0f59e0ba7 setdiscovery: avoid calling any sample building if the undecided set is small
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23807
diff changeset
   267
        if len(undecided) < targetsize:
07d0f59e0ba7 setdiscovery: avoid calling any sample building if the undecided set is small
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23807
diff changeset
   268
            sample = list(undecided)
07d0f59e0ba7 setdiscovery: avoid calling any sample building if the undecided set is small
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23807
diff changeset
   269
        else:
39174
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39173
diff changeset
   270
            sample = samplefunc(local, ownheads, undecided, targetsize)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   271
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   272
        roundtrips += 1
38350
9e70690a21ac setdiscovery: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents: 37631
diff changeset
   273
        progress.update(roundtrips)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   274
        ui.debug("query %i; still undecided: %i, sample size is: %i\n"
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   275
                 % (roundtrips, len(undecided), len(sample)))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   276
        # indices between sample and externalized version must match
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   277
        sample = list(sample)
37630
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36718
diff changeset
   278
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36718
diff changeset
   279
        with remote.commandexecutor() as e:
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36718
diff changeset
   280
            yesno = e.callcommand('known', {
39159
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38373
diff changeset
   281
                'nodes': [clnode(r) for r in sample],
37630
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36718
diff changeset
   282
            }).result()
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36718
diff changeset
   283
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   284
        full = True
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   285
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   286
        if sample:
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   287
            commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   288
            common.addbases(commoninsample)
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   289
            common.removeancestorsfrom(undecided)
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   290
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   291
    # heads(common) == heads(common.bases) since common represents common.bases
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   292
    # and all its ancestors
39176
fec01c69b0f0 setdiscovery: use revset for resolving DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39174
diff changeset
   293
    # The presence of nullrev will confuse heads(). So filter it out.
fec01c69b0f0 setdiscovery: use revset for resolving DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39174
diff changeset
   294
    result = set(local.revs('heads(%ld)', common.bases - {nullrev}))
32712
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32291
diff changeset
   295
    elapsed = util.timer() - start
38373
ef692614e601 progress: hide update(None) in a new complete() method
Martin von Zweigbergk <martinvonz@google.com>
parents: 38350
diff changeset
   296
    progress.complete()
32712
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32291
diff changeset
   297
    ui.debug("%d total queries in %.4fs\n" % (roundtrips, elapsed))
32768
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32713
diff changeset
   298
    msg = ('found %d common and %d unknown server heads,'
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32713
diff changeset
   299
           ' %d roundtrips in %.4fs\n')
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32713
diff changeset
   300
    missing = set(result) - set(srvheads)
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32713
diff changeset
   301
    ui.log('discovery', msg, len(result), len(missing), roundtrips,
32713
28240b75e880 discovery: log discovery result in non-trivial cases
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32712
diff changeset
   302
           elapsed)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   303
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   304
    if not result and srvheadhashes != [nullid]:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   305
        if abortwhenunrelated:
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25973
diff changeset
   306
            raise error.Abort(_("repository is unrelated"))
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   307
        else:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   308
            ui.warn(_("warning: repository is unrelated\n"))
32291
bd872f64a8ba cleanup: use set literals
Martin von Zweigbergk <martinvonz@google.com>
parents: 28437
diff changeset
   309
        return ({nullid}, True, srvheadhashes,)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   310
14981
192e02680d09 setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents: 14833
diff changeset
   311
    anyincoming = (srvheadhashes != [nullid])
39159
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38373
diff changeset
   312
    result = {clnode(r) for r in result}
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38373
diff changeset
   313
    return result, anyincoming, srvheadhashes