outgoing: rework the handling of the `missingroots` case to be faster
authorPierre-Yves David <pierre-yves.david@octobus.net>
Tue, 09 Apr 2024 22:36:35 +0200
changeset 51594 e3a5ec2d236a
parent 51593 b5500857e173
child 51595 3a6fae3bef35
outgoing: rework the handling of the `missingroots` case to be faster The previous implementation was slow, to the point it was taking a significant amount of `hg bundle --type none-streamv2` call. We rework the code to compute the same value much faster, making the operation disappear from the `hg bundle --type none-streamv2` profile. Someone would remark that producing a streamclone does not requires an `outgoing` object. However that is a matter for another day. There is other user of `missingroots` (non stream `hg bundle` call for example), and they will also benefit from this rework. We implement an old TODO in the process, directly computing the missing and common attribute as we have most element at hand already. ### benchmark.name = hg.command.bundle # bin-env-vars.hg.flavor = default # bin-env-vars.hg.py-re2-module = default # benchmark.variants.revs = all # benchmark.variants.type = none-streamv2 ## data-env-vars.name = heptapod-public-2024-03-25-zstd-sparse-revlog before: 7.750458 after: 6.665565 (-14.00%, -1.08) ## data-env-vars.name = mercurial-public-2024-03-22-zstd-sparse-revlog before: 0.700229 after: 0.496050 (-29.16%, -0.20) ## data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog before: 346.508952 after: 316.749699 (-8.59%, -29.76) ## data-env-vars.name = pypy-2024-03-22-zstd-sparse-revlog before: 3.401700 after: 2.915810 (-14.28%, -0.49) ## data-env-vars.name = tryton-public-2024-03-22-zstd-sparse-revlog before: 1.870798 after: 1.461583 (-21.87%, -0.41) note: this whole `missingroots` of outgoing has a limited number of callers and could likely be replace by something simpler (like taking an explicit "missing_revs" set for example). However this is a wider change and we focus on a small impact, quick rework that does not change the API for now.
mercurial/discovery.py
--- a/mercurial/discovery.py	Sun Apr 14 02:27:10 2024 +0200
+++ b/mercurial/discovery.py	Tue Apr 09 22:36:35 2024 +0200
@@ -18,6 +18,7 @@
     bookmarks,
     branchmap,
     error,
+    node as nodemod,
     obsolete,
     phases,
     pycompat,
@@ -98,29 +99,50 @@
     def __init__(
         self, repo, commonheads=None, ancestorsof=None, missingroots=None
     ):
-        # at least one of them must not be set
-        assert None in (commonheads, missingroots)
+        # at most one of them must not be set
+        if commonheads is not None and missingroots is not None:
+            m = 'commonheads and missingroots arguments are mutually exclusive'
+            raise error.ProgrammingError(m)
         cl = repo.changelog
+        missing = None
+        common = None
         if ancestorsof is None:
             ancestorsof = cl.heads()
         if missingroots:
             # TODO remove call to nodesbetween.
-            # TODO populate attributes on outgoing instance instead of setting
-            # discbases.
-            csets, roots, heads = cl.nodesbetween(missingroots, ancestorsof)
-            included = set(csets)
-            discbases = []
-            for n in csets:
-                discbases.extend([p for p in cl.parents(n) if p != repo.nullid])
-            ancestorsof = heads
-            commonheads = [n for n in discbases if n not in included]
+            missing_rev = repo.revs('%ln::%ln', missingroots, ancestorsof)
+            unfi = repo.unfiltered()
+            ucl = unfi.changelog
+            to_node = ucl.node
+            ancestorsof = [to_node(r) for r in ucl.headrevs(missing_rev)]
+            parent_revs = ucl.parentrevs
+            common_legs = set()
+            for r in missing_rev:
+                p1, p2 = parent_revs(r)
+                if p1 not in missing_rev:
+                    common_legs.add(p1)
+                if p2 not in missing_rev:
+                    common_legs.add(p2)
+            common_legs.discard(nodemod.nullrev)
+            if not common_legs:
+                commonheads = [repo.nullid]
+                common = set()
+            else:
+                commonheads_revs = unfi.revs(
+                    'heads(%ld::%ld)',
+                    common_legs,
+                    common_legs,
+                )
+                commonheads = [to_node(r) for r in commonheads_revs]
+                common = ucl.ancestors(commonheads_revs, inclusive=True)
+            missing = [to_node(r) for r in missing_rev]
         elif not commonheads:
             commonheads = [repo.nullid]
         self.commonheads = commonheads
         self.ancestorsof = ancestorsof
         self._revlog = cl
-        self._common = None
-        self._missing = None
+        self._common = common
+        self._missing = missing
         self.excluded = []
 
     def _computecommonmissing(self):