branchmap: avoid ancestor computations in absence of non-continous branches
authorJoerg Sonnenberger <joerg@bec.de>
Fri, 18 Dec 2020 14:45:28 +0100
changeset 46254 c4b792fa109e
parent 46253 1cebad969d88
child 46255 cce05a6c271f
branchmap: avoid ancestor computations in absence of non-continous branches The branchhead computation is one of the more heavy operations for bigger repositories as it has to scan all changesets and potentially involves the expensive computation of the ancestor sets. Redo the computation to handle the common cases directly and use tighter conditions for when the ancestor scan is necessary. Most importantly, avoid it completely if the non-continous branches are processed in one update as seen in the initial computation after a clone. For the Mercurial repository, it gives a small 2-3% performance boost. For the NetBSD test repository, it cuts the time in half. Differential Revision: https://phab.mercurial-scm.org/D9631
mercurial/branchmap.py
relnotes/next
tests/test-branches.t
--- a/mercurial/branchmap.py	Tue Jan 12 19:49:18 2021 +0100
+++ b/mercurial/branchmap.py	Fri Dec 18 14:45:28 2020 +0100
@@ -443,33 +443,82 @@
             if closesbranch:
                 self._closednodes.add(cl.node(r))
 
-        # fetch current topological heads to speed up filtering
-        topoheads = set(cl.headrevs())
-
         # new tip revision which we found after iterating items from new
         # branches
         ntiprev = self.tiprev
 
-        # if older branchheads are reachable from new ones, they aren't
-        # really branchheads. Note checking parents is insufficient:
-        # 1 (branch a) -> 2 (branch b) -> 3 (branch a)
+        # Delay fetching the topological heads until they are needed.
+        # A repository without non-continous branches can skip this part.
+        topoheads = None
+
+        # If a changeset is visible, its parents must be visible too, so
+        # use the faster unfiltered parent accessor.
+        parentrevs = repo.unfiltered().changelog.parentrevs
+
         for branch, newheadrevs in pycompat.iteritems(newbranches):
+            # For every branch, compute the new branchheads.
+            # A branchhead is a revision such that no descendant is on
+            # the same branch.
+            #
+            # The branchheads are computed iteratively in revision order.
+            # This ensures topological order, i.e. parents are processed
+            # before their children. Ancestors are inclusive here, i.e.
+            # any revision is an ancestor of itself.
+            #
+            # Core observations:
+            # - The current revision is always a branchhead for the
+            #   repository up to that point.
+            # - It is the first revision of the branch if and only if
+            #   there was no branchhead before. In that case, it is the
+            #   only branchhead as there are no possible ancestors on
+            #   the same branch.
+            # - If a parent is on the same branch, a branchhead can
+            #   only be an ancestor of that parent, if it is parent
+            #   itself. Otherwise it would have been removed as ancestor
+            #   of that parent before.
+            # - Therefore, if all parents are on the same branch, they
+            #   can just be removed from the branchhead set.
+            # - If one parent is on the same branch and the other is not
+            #   and there was exactly one branchhead known, the existing
+            #   branchhead can only be an ancestor if it is the parent.
+            #   Otherwise it would have been removed as ancestor of
+            #   the parent before. The other parent therefore can't have
+            #   a branchhead as ancestor.
+            # - In all other cases, the parents on different branches
+            #   could have a branchhead as ancestor. Those parents are
+            #   kept in the "uncertain" set. If all branchheads are also
+            #   topological heads, they can't have descendants and further
+            #   checks can be skipped. Otherwise, the ancestors of the
+            #   "uncertain" set are removed from branchheads.
+            #   This computation is heavy and avoided if at all possible.
             bheads = self._entries.setdefault(branch, [])
             bheadset = {cl.rev(node) for node in bheads}
-
-            # This have been tested True on all internal usage of this function.
-            # run it again in case of doubt
-            # assert not (set(bheadrevs) & set(newheadrevs))
-            bheadset.update(newheadrevs)
+            uncertain = set()
+            for newrev in sorted(newheadrevs):
+                if not bheadset:
+                    bheadset.add(newrev)
+                    continue
 
-            # This prunes out two kinds of heads - heads that are superseded by
-            # a head in newheadrevs, and newheadrevs that are not heads because
-            # an existing head is their descendant.
-            uncertain = bheadset - topoheads
+                parents = [p for p in parentrevs(newrev) if p != nullrev]
+                samebranch = set()
+                otherbranch = set()
+                for p in parents:
+                    if p in bheadset or getbranchinfo(p)[0] == branch:
+                        samebranch.add(p)
+                    else:
+                        otherbranch.add(p)
+                if otherbranch and not (len(bheadset) == len(samebranch) == 1):
+                    uncertain.update(otherbranch)
+                bheadset.difference_update(samebranch)
+                bheadset.add(newrev)
+
             if uncertain:
-                floorrev = min(uncertain)
-                ancestors = set(cl.ancestors(newheadrevs, floorrev))
-                bheadset -= ancestors
+                if topoheads is None:
+                    topoheads = set(cl.headrevs())
+                if bheadset - topoheads:
+                    floorrev = min(bheadset)
+                    ancestors = set(cl.ancestors(newheadrevs, floorrev))
+                    bheadset -= ancestors
             bheadrevs = sorted(bheadset)
             self[branch] = [cl.node(rev) for rev in bheadrevs]
             tiprev = bheadrevs[-1]
--- a/relnotes/next	Tue Jan 12 19:49:18 2021 +0100
+++ b/relnotes/next	Fri Dec 18 14:45:28 2020 +0100
@@ -39,6 +39,10 @@
 
  * External hooks are now called with `HGPLAIN=1` preset.
 
+ * The `branchmap` cache is updated more intelligently and can be
+   significantly faster for repositories with many branches and changesets.
+
+
 == New Experimental Features ==
 
 * `experimental.single-head-per-branch:public-changes-only` can be used
--- a/tests/test-branches.t	Tue Jan 12 19:49:18 2021 +0100
+++ b/tests/test-branches.t	Fri Dec 18 14:45:28 2020 +0100
@@ -988,3 +988,257 @@
 
   $ hg ci -m "branch closed" --force-close-branch
   created new head
+  $ cd ..
+
+Test various special cases for the branchmap
+--------------------------------------------
+
+Basic fork of the same branch
+
+  $ hg init branchmap-testing1
+  $ cd branchmap-testing1
+  $ hg debugbuild '@A . :base . :p1 *base /p1'
+  $ hg log -G
+  o    changeset:   3:71ca9a6d524e
+  |\   branch:      A
+  | |  tag:         tip
+  | |  parent:      2:a3b807b3ff0b
+  | |  parent:      1:99ba08759bc7
+  | |  user:        debugbuilddag
+  | |  date:        Thu Jan 01 00:00:03 1970 +0000
+  | |  summary:     r3
+  | |
+  | o  changeset:   2:a3b807b3ff0b
+  | |  branch:      A
+  | |  parent:      0:2ab8003a1750
+  | |  user:        debugbuilddag
+  | |  date:        Thu Jan 01 00:00:02 1970 +0000
+  | |  summary:     r2
+  | |
+  o |  changeset:   1:99ba08759bc7
+  |/   branch:      A
+  |    tag:         p1
+  |    user:        debugbuilddag
+  |    date:        Thu Jan 01 00:00:01 1970 +0000
+  |    summary:     r1
+  |
+  o  changeset:   0:2ab8003a1750
+     branch:      A
+     tag:         base
+     user:        debugbuilddag
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     r0
+  
+  $ hg branches
+  A                              3:71ca9a6d524e
+  $ hg clone -r 1 -r 2 . ../branchmap-testing1-clone
+  adding changesets
+  adding manifests
+  adding file changes
+  added 3 changesets with 0 changes to 0 files (+1 heads)
+  new changesets 2ab8003a1750:a3b807b3ff0b
+  updating to branch A
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ cd ../branchmap-testing1-clone
+  $ hg pull ../branchmap-testing1
+  pulling from ../branchmap-testing1
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 0 changes to 0 files (-1 heads)
+  new changesets 71ca9a6d524e
+  (run 'hg update' to get a working copy)
+  $ hg branches
+  A                              3:71ca9a6d524e
+  $ cd ..
+
+Switching to a different branch and back
+
+  $ hg init branchmap-testing2
+  $ cd branchmap-testing2
+  $ hg debugbuild '@A . @B . @A .'
+  $ hg log -G
+  o  changeset:   2:9699e9f260b5
+  |  branch:      A
+  |  tag:         tip
+  |  user:        debugbuilddag
+  |  date:        Thu Jan 01 00:00:02 1970 +0000
+  |  summary:     r2
+  |
+  o  changeset:   1:0bc7d348d965
+  |  branch:      B
+  |  user:        debugbuilddag
+  |  date:        Thu Jan 01 00:00:01 1970 +0000
+  |  summary:     r1
+  |
+  o  changeset:   0:2ab8003a1750
+     branch:      A
+     user:        debugbuilddag
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     r0
+  
+  $ hg branches
+  A                              2:9699e9f260b5
+  B                              1:0bc7d348d965 (inactive)
+  $ hg clone -r 1 . ../branchmap-testing2-clone
+  adding changesets
+  adding manifests
+  adding file changes
+  added 2 changesets with 0 changes to 0 files
+  new changesets 2ab8003a1750:0bc7d348d965
+  updating to branch B
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ cd ../branchmap-testing2-clone
+  $ hg pull ../branchmap-testing2
+  pulling from ../branchmap-testing2
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 0 changes to 0 files
+  new changesets 9699e9f260b5
+  (run 'hg update' to get a working copy)
+  $ hg branches
+  A                              2:9699e9f260b5
+  B                              1:0bc7d348d965 (inactive)
+  $ cd ..
+
+A fork on a branch switching to a different branch and back
+is still collecting the fork.
+
+  $ hg init branchmap-testing3
+  $ cd branchmap-testing3
+  $ hg debugbuild '@A . :base . :p1 *base @B . @A /p1'
+  $ hg log -G
+  o    changeset:   4:3614a1711d23
+  |\   branch:      A
+  | |  tag:         tip
+  | |  parent:      3:e9c8abcf65aa
+  | |  parent:      1:99ba08759bc7
+  | |  user:        debugbuilddag
+  | |  date:        Thu Jan 01 00:00:04 1970 +0000
+  | |  summary:     r4
+  | |
+  | o  changeset:   3:e9c8abcf65aa
+  | |  branch:      B
+  | |  user:        debugbuilddag
+  | |  date:        Thu Jan 01 00:00:03 1970 +0000
+  | |  summary:     r3
+  | |
+  | o  changeset:   2:a3b807b3ff0b
+  | |  branch:      A
+  | |  parent:      0:2ab8003a1750
+  | |  user:        debugbuilddag
+  | |  date:        Thu Jan 01 00:00:02 1970 +0000
+  | |  summary:     r2
+  | |
+  o |  changeset:   1:99ba08759bc7
+  |/   branch:      A
+  |    tag:         p1
+  |    user:        debugbuilddag
+  |    date:        Thu Jan 01 00:00:01 1970 +0000
+  |    summary:     r1
+  |
+  o  changeset:   0:2ab8003a1750
+     branch:      A
+     tag:         base
+     user:        debugbuilddag
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     r0
+  
+  $ hg branches
+  A                              4:3614a1711d23
+  B                              3:e9c8abcf65aa (inactive)
+  $ hg clone -r 1 -r 3 . ../branchmap-testing3-clone
+  adding changesets
+  adding manifests
+  adding file changes
+  added 4 changesets with 0 changes to 0 files (+1 heads)
+  new changesets 2ab8003a1750:e9c8abcf65aa
+  updating to branch A
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ cd ../branchmap-testing3-clone
+  $ hg pull ../branchmap-testing3
+  pulling from ../branchmap-testing3
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 0 changes to 0 files (-1 heads)
+  new changesets 3614a1711d23
+  (run 'hg update' to get a working copy)
+  $ hg branches
+  A                              4:3614a1711d23
+  B                              3:e9c8abcf65aa (inactive)
+  $ cd ..
+
+Intermediary parents are on different branches.
+
+  $ hg init branchmap-testing4
+  $ cd branchmap-testing4
+  $ hg debugbuild '@A . @B :base . @A :p1 *base @C . @A /p1'
+  $ hg log -G
+  o    changeset:   4:4bf67499b70a
+  |\   branch:      A
+  | |  tag:         tip
+  | |  parent:      3:4a546028fa8f
+  | |  parent:      1:0bc7d348d965
+  | |  user:        debugbuilddag
+  | |  date:        Thu Jan 01 00:00:04 1970 +0000
+  | |  summary:     r4
+  | |
+  | o  changeset:   3:4a546028fa8f
+  | |  branch:      C
+  | |  user:        debugbuilddag
+  | |  date:        Thu Jan 01 00:00:03 1970 +0000
+  | |  summary:     r3
+  | |
+  | o  changeset:   2:a3b807b3ff0b
+  | |  branch:      A
+  | |  parent:      0:2ab8003a1750
+  | |  user:        debugbuilddag
+  | |  date:        Thu Jan 01 00:00:02 1970 +0000
+  | |  summary:     r2
+  | |
+  o |  changeset:   1:0bc7d348d965
+  |/   branch:      B
+  |    tag:         p1
+  |    user:        debugbuilddag
+  |    date:        Thu Jan 01 00:00:01 1970 +0000
+  |    summary:     r1
+  |
+  o  changeset:   0:2ab8003a1750
+     branch:      A
+     tag:         base
+     user:        debugbuilddag
+     date:        Thu Jan 01 00:00:00 1970 +0000
+     summary:     r0
+  
+  $ hg branches
+  A                              4:4bf67499b70a
+  C                              3:4a546028fa8f (inactive)
+  B                              1:0bc7d348d965 (inactive)
+  $ hg clone -r 1 -r 3 . ../branchmap-testing4-clone
+  adding changesets
+  adding manifests
+  adding file changes
+  added 4 changesets with 0 changes to 0 files (+1 heads)
+  new changesets 2ab8003a1750:4a546028fa8f
+  updating to branch B
+  0 files updated, 0 files merged, 0 files removed, 0 files unresolved
+  $ cd ../branchmap-testing4-clone
+  $ hg pull ../branchmap-testing4
+  pulling from ../branchmap-testing4
+  searching for changes
+  adding changesets
+  adding manifests
+  adding file changes
+  added 1 changesets with 0 changes to 0 files (-1 heads)
+  new changesets 4bf67499b70a
+  (run 'hg update' to get a working copy)
+  $ hg branches
+  A                              4:4bf67499b70a
+  C                              3:4a546028fa8f (inactive)
+  B                              1:0bc7d348d965 (inactive)
+  $ cd ..