emitrevision: consider ancestors revision to emit as available base
author Pierre-Yves David <pierre-yves.david@octobus.net>
Sat, 12 Nov 2022 00:18:41 +0100
changeset 49672 e92de86cf4f8
parent 49671 92c65bd0c4d6
child 49673 e33776297d1f
emitrevision: consider ancestors revision to emit as available base

This should make more delta bases valid. This notably affects:

* cases where we skipped some parent with an empty delta to delta directly
  against an ancestor
* cases where an intermediate snapshot is stored.

This change means we could send largish intermediate snapshots over the wire.
However, this is actually a sub-goal here. Sending snapshots over the wire means
the client has a high chance of simply storing the pre-computed delta instead of
doing a lengthy process that will… end up producing the same intermediate
snapshot.

In addition, the overall size of snapshots (of any level) is "only" a fraction
of the overall delta size (0.17% for my mercurial clone, 20% for my clone of
Mozilla try). So sending them over the wire is unlikely to have a large impact
on the bandwidth used.

If we decide that minimising the bandwidth is an explicit goal, we should
introduce new logic to filter out snapshots as deltas. The current code has no
explicit notion of snapshots so far; they just tended to fall into the wobbly
filtering options.

In some cases, this patch can yield a large improvement to the bundling time:

    ### data-env-vars.name     = mozilla-try-2019-02-18-zstd-sparse-revlog
    # benchmark.name           = perf-bundle
    # benchmark.variants.revs  = last-100000
    before: 68.787066 seconds
    after:  47.552677 seconds (-30.87%)

That translates to a large improvement to the pull time:

    ### data-env-vars.name          = mozilla-try-2019-02-18-zstd-sparse-revlog
    # benchmark.name                = pull
    # benchmark.variants.issue6528  = disabled
    # benchmark.variants.revs       = last-100000
    before: 142.186625 seconds
    after:   75.897745 seconds (-46.62%)

No significant negative impact has been observed.
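As a quick sanity check on the percentages quoted above, the relative change can be recomputed from the before/after timings. This is only an illustrative sketch; the helper name `relative_change` is made up here and is not part of the benchmark tooling.

```python
def relative_change(before, after):
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# Timings (in seconds) copied from the benchmark output in the commit message.
bundle = relative_change(68.787066, 47.552677)
pull = relative_change(142.186625, 75.897745)

print(f"perf-bundle: {bundle:+.2f}%")
print(f"pull:        {pull:+.2f}%")
```

Both values round to the figures reported in the message (-30.87% and -46.62%).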
mercurial/utils/storageutil.py
--- a/mercurial/utils/storageutil.py	Wed Nov 09 13:54:15 2022 -0500
+++ b/mercurial/utils/storageutil.py	Sat Nov 12 00:18:41 2022 +0100
@@ -379,6 +379,7 @@
 
     fnode = store.node
     frev = store.rev
+    parents = store.parentrevs
 
     if nodesorder == b'nodes':
         revs = [frev(n) for n in nodes]
@@ -391,23 +392,21 @@
     prevrev = None
 
     if deltamode == repository.CG_DELTAMODE_PREV or assumehaveparentrevisions:
-        prevrev = store.parentrevs(revs[0])[0]
+        prevrev = parents(revs[0])[0]
 
-    # Set of revs available to delta against.
+    # Sets of revs available to delta against.
+    emitted = set()
     available = set()
-    parents = []
+    if assumehaveparentrevisions:
+        common_heads = set(p for r in revs for p in parents(r))
+        common_heads.difference_update(revs)
+        available = store.ancestors(common_heads, inclusive=True)
 
     def is_usable_base(rev):
         """Is a delta against this revision usable over the wire"""
         if rev == nullrev:
             return False
-        # Base revision was already emitted in this group.
-        if rev in available:
-            return True
-        # Base revision is a parent that hasn't been emitted already.
-        if assumehaveparentrevisions and rev in parents:
-            return True
-        return False
+        return rev in emitted or rev in available
 
     for rev in revs:
         if rev == nullrev:
@@ -418,7 +417,7 @@
             debug_info['revision-total'] += 1
 
         node = fnode(rev)
-        parents[:] = p1rev, p2rev = store.parentrevs(rev)
+        p1rev, p2rev = parents(rev)
 
         if debug_info is not None:
             if p1rev != p2rev and p1rev != nullrev and p2rev != nullrev:
@@ -531,7 +530,7 @@
                     debug_info['computed-delta'] += 1  # close enough
                     debug_info['delta-full'] += 1
                 revision = store.rawdata(node)
-                available.add(rev)
+                emitted.add(rev)
             else:
                 if revdifffn:
                     if debug_info is not None:
@@ -571,7 +570,7 @@
                         store.rawdata(baserev), store.rawdata(rev)
                     )
 
-                available.add(rev)
+                emitted.add(rev)
 
         serialized_sidedata = None
         sidedata_flags = (0, 0)
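The effect of the change on `is_usable_base` can be illustrated with a small standalone model. This is a hedged sketch, not Mercurial code: the store is replaced by a plain dict mapping each revision to its parents, and `ancestors`, `usable_before` and `usable_after` are hypothetical helpers that only mimic the old and new rules under the assumption that `assumehaveparentrevisions` is set.

```python
NULLREV = -1  # stand-in for mercurial's nullrev


def ancestors(parentrevs, heads):
    """Return `heads` and all their ancestors (inclusive), skipping NULLREV."""
    seen = set()
    stack = [h for h in heads if h != NULLREV]
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        stack.extend(p for p in parentrevs[rev] if p != NULLREV)
    return seen


def usable_before(rev_parents, base, emitted):
    """Old rule: the base must be already emitted or a direct parent."""
    return base != NULLREV and (base in emitted or base in rev_parents)


def usable_after(base, emitted, available):
    """New rule: any ancestor of the emitted revisions is also acceptable."""
    return base != NULLREV and (base in emitted or base in available)


# Toy linear history 0 <- 1 <- 2 <- 3 <- 4 (assumed data, for illustration).
parentrevs = {
    0: (NULLREV, NULLREV),
    1: (0, NULLREV),
    2: (1, NULLREV),
    3: (2, NULLREV),
    4: (3, NULLREV),
}
revs = [3, 4]  # revisions about to be emitted

# Mirrors the patch: parents of the emitted revs that are not emitted
# themselves are assumed to be known to the receiver, with their ancestors.
common_heads = {p for r in revs for p in parentrevs[r]}
common_heads.difference_update(revs)
available = ancestors(parentrevs, common_heads)

# Suppose rev 3 is stored as a delta against rev 1 (its parent 2 was skipped).
emitted = set()
print(usable_before(parentrevs[3], 1, emitted))
print(usable_after(1, emitted, available))
```

In this toy history, the stored delta of revision 3 against revision 1 is rejected by the old rule, because 1 is neither emitted nor a direct parent, and accepted by the new one, so the stored delta could be reused instead of being recomputed.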