setdiscovery: don't use dagutil for parent resolution
authorGregory Szorc <gregory.szorc@gmail.com>
Fri, 17 Aug 2018 18:22:10 +0000
changeset 39174 71d83b315778
parent 39173 56279660d264
child 39175 bbb855c412c6
setdiscovery: don't use dagutil for parent resolution _updatesample()'s one remaining use of revlogdag is for resolving the parents of a revision. In 2 cases, we actually resolve parents. In 1, we operate on the inverted DAG and resolve children. This commit teaches _updatesample() to receive an argument defining the function to resolve "parent" revisions. Call sites pass in changelog.parentrevs() or a wrapper around changelog.children() accordingly. The use of children() is semantically correct. But it is quadratic, since revlog.children() does a range scan over all revisions starting at its input and effectively calls parentrevs() to build up the list of children. So calling it repeatedly in a loop is a recipe for bad performance. I will be implementing something better in a subsequent commit. I wanted to get the porting off of dagutil done in a way that was simple and correct. Like other patches in this series, this change is potentially impacted but revlogdag's ignorance of filtered revisions. The new code is filtering aware, since changelog's revs() (used by children() will skip filtered revisions and therefore hidden children won't appear. This is potentially backwards incompatible. But no tests fail and I think this code should respect visibility. Differential Revision: https://phab.mercurial-scm.org/D4322
mercurial/setdiscovery.py
--- a/mercurial/setdiscovery.py	Fri Aug 17 18:05:36 2018 +0000
+++ b/mercurial/setdiscovery.py	Fri Aug 17 18:22:10 2018 +0000
@@ -56,7 +56,7 @@
     util,
 )
 
-def _updatesample(dag, revs, heads, sample, quicksamplesize=0):
+def _updatesample(revs, heads, sample, parentfn, quicksamplesize=0):
     """update an existing sample to match the expected size
 
     The sample is updated with revs exponentially distant from each head of the
@@ -66,10 +66,10 @@
     reached. Otherwise sampling will happen until roots of the <revs> set are
     reached.
 
-    :dag: a dag object from dagutil
     :revs:  set of revs we want to discover (if None, assume the whole dag)
     :heads: set of DAG head revs
     :sample: a sample to update
+    :parentfn: a callable to resolve parents for a revision
     :quicksamplesize: optional target size of the sample"""
     dist = {}
     visit = collections.deque(heads)
@@ -87,12 +87,13 @@
             if quicksamplesize and (len(sample) >= quicksamplesize):
                 return
         seen.add(curr)
-        for p in dag.parents(curr):
-            if not revs or p in revs:
+
+        for p in parentfn(curr):
+            if p != nullrev and (not revs or p in revs):
                 dist.setdefault(p, d + 1)
                 visit.append(p)
 
-def _takequicksample(repo, dag, headrevs, revs, size):
+def _takequicksample(repo, headrevs, revs, size):
     """takes a quick sample of size <size>
 
     It is meant for initial sampling and focuses on querying heads and close
@@ -107,18 +108,23 @@
     if len(sample) >= size:
         return _limitsample(sample, size)
 
-    _updatesample(dag, None, headrevs, sample, quicksamplesize=size)
+    _updatesample(None, headrevs, sample, repo.changelog.parentrevs,
+                  quicksamplesize=size)
     return sample
 
-def _takefullsample(repo, dag, headrevs, revs, size):
+def _takefullsample(repo, headrevs, revs, size):
     sample = set(repo.revs('heads(%ld)', revs))
 
     # update from heads
     revsheads = set(repo.revs('heads(%ld)', revs))
-    _updatesample(dag, revs, revsheads, sample)
+    _updatesample(revs, revsheads, sample, repo.changelog.parentrevs)
     # update from roots
     revsroots = set(repo.revs('roots(%ld)', revs))
-    _updatesample(dag.inverse(), revs, revsroots, sample)
+
+    # TODO this is quadratic
+    parentfn = lambda rev: repo.changelog.children(repo.changelog.node(rev))
+
+    _updatesample(revs, revsroots, sample, parentfn)
     assert sample
     sample = _limitsample(sample, size)
     if len(sample) < size:
@@ -244,7 +250,7 @@
         if len(undecided) < targetsize:
             sample = list(undecided)
         else:
-            sample = samplefunc(local, dag, ownheads, undecided, targetsize)
+            sample = samplefunc(local, ownheads, undecided, targetsize)
 
         roundtrips += 1
         progress.update(roundtrips)