scmutil: avoid quadratic membership testing (issue5969)
author Gregory Szorc <gregory.szorc@gmail.com>
date Fri, 24 Aug 2018 18:21:55 -0700
branch stable
changeset 39264 d750a6c9248d
parent 39200 bd63ada7e1f8
child 39265 f98d3c57906f
child 39416 ede3bf31fe63
tr.changes['revs'] is an xrange, which has an O(n) __contains__
implementation. The `rev not in newrevs` lookup a few lines below
will therefore be O(n^2) if all incoming changesets are public.

This issue isn't present on @ because 45e05d39d9ce introduced a custom
type implementing an xrange primitive with O(1) contains and switched
tr.changes['revs'] to be an instance of that type.

We work around the problem on the stable branch by casting the
xrange to a set. This is a bit hacky because it requires allocating
memory to hold each integer in the range. But we are already
holding the full set of pulled revision numbers in memory
multiple times (such as in `tr.changes['phases']`). So this is a
relatively minor problem.

This issue has been present since the phases reporting code was
introduced in the 4.7 cycle by eb9835014d20.

This change should be reverted/ignored when stable is merged into
default.

On the mozilla-unified repository with 483492 changesets, `hg clone`
time improves substantially:

before: 1843.700s user; 29.810s sys
after:   461.170s user; 29.360s sys
mercurial/scmutil.py
--- a/mercurial/scmutil.py	Sat Aug 18 10:24:57 2018 +0200
+++ b/mercurial/scmutil.py	Fri Aug 24 18:21:55 2018 -0700
@@ -1565,7 +1565,10 @@
             """Report statistics of phase changes for changesets pre-existing
             pull/unbundle.
             """
-            newrevs = tr.changes.get('revs', xrange(0, 0))
+            # TODO set() is only appropriate for 4.7 since revs post
+            # 45e05d39d9ce is a pycompat.membershiprange, which has O(1)
+            # membership testing.
+            newrevs = set(tr.changes.get('revs', xrange(0, 0)))
             phasetracking = tr.changes.get('phases', {})
             if not phasetracking:
                 return
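
The cost difference described in the commit message can be illustrated with a small standalone sketch. This is not Mercurial code: `LinearRange` and `count_preexisting` are hypothetical names invented for the illustration. Because Python 3's built-in `range` already tests integer membership in constant time, the sketch emulates the linear scan of Python 2's `xrange` by hand and counts the comparisons it performs.

```python
class LinearRange:
    """Hypothetical stand-in for Python 2's xrange.

    Membership is tested by scanning the range element by element,
    which is the O(n) behaviour the commit message describes.
    """

    def __init__(self, start, stop):
        self.start = start
        self.stop = stop
        self.comparisons = 0  # instrumentation only, not part of xrange

    def __iter__(self):
        return iter(range(self.start, self.stop))

    def __len__(self):
        return max(0, self.stop - self.start)

    def __contains__(self, value):
        # Linear scan: up to len(self) comparisons per lookup.
        for item in range(self.start, self.stop):
            self.comparisons += 1
            if item == value:
                return True
        return False


def count_preexisting(revs, newrevs):
    """Count revisions that are not in newrevs.

    Mirrors the shape of the `rev not in newrevs` test; the real
    reporting code does more than count.
    """
    return sum(1 for rev in revs if rev not in newrevs)


n = 1000
newrevs = LinearRange(0, n)
phase_revs = range(n)  # assumed worst case: every incoming rev is checked

# One linear scan per revision: n * (n + 1) / 2 comparisons in total.
slow_result = count_preexisting(phase_revs, newrevs)
comparisons_with_range = newrevs.comparisons

# Casting to a set iterates the range once; lookups are then hash-based
# and never call LinearRange.__contains__.
fast_result = count_preexisting(phase_revs, set(newrevs))
comparisons_after_set = newrevs.comparisons - comparisons_with_range

print(comparisons_with_range, comparisons_after_set)
```

Both calls return the same count, but only the first one pays for a scan on every lookup. The set costs one integer of storage per revision, which is the memory trade-off the commit message accepts for the stable branch.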