treemanifest: introduce lazy loading of subdirs
author spectral <spectral@google.com>
Thu, 16 Aug 2018 12:31:52 -0700
changeset 39515 93486cc46125
parent 39514 b29ec19748a7
child 39516 079d7bfa463d
An earlier patch series made it so that what to load was up to the calling
code, which works fine until manifests are copied - when they're copied,
they're loaded completely and thus we lose the entire benefit. By lazy loading
everything, we can avoid having to pass in the matcher to ~every manifest
function, and handle copies correctly as well.

This changeset doesn't go as far as it could with loading only the necessary
subsets; that will happen in later changes in this series. At the moment,
except in a few situations, we just load everything the moment we want to
interact with treemanifest._dirs. This is thus most likely to be a small
slowdown if treemanifests is in use regardless of whether narrow is in use, but
hopefully easier to verify correctness and review.

This is part of a series of speedups; it is not expected to produce any real
speed improvements itself, but the numbers show that it doesn't produce a large
speed penalty in any common case, and for the cases it does provide a penalty
in, it is not a large absolute amount (even if it is a large percentage
amount).

Timing numbers according to command:
  hyperfine --prepare <preparation_script> 'hg status'

HGRCPATH points to a file with the following contents:
  [extensions]
  narrow =
  strip =
  rebase =

mozilla-unified (called m-u below) was at revision #468856.
  regular hash: eb39298e432d
  treemanifests hash: 0553b7f29eaf

large-dir-repo (called l-d-r below) was generated with the following script:
  #!/bin/bash
  hg init large-dir-repo
  mkdir -p large-dir-repo/third_party/rust/log
  touch large-dir-repo/third_party/rust/log/foo.txt
  for i in $(seq 1 30000); do
      d=$(mktemp -d large-dir-repo/third_party/XXXXXXXXX)
      touch $d/file.txt
  done
  hg -R large-dir-repo ci -Am 'rev0' --user test --date '0 0'
  echo hi > large-dir-repo/third_party/rust/log/bar.txt
  hg -R large-dir-repo ci -Am 'rev1' --user test --date '0 0'
  echo hi > large-dir-repo/third_party/rust/log/baz.txt
  hg -R large-dir-repo ci -Am 'rev2' --user test --date '0 0'

For the repos that use narrow, the narrowspec was this:
  [include]
  rootfilesin:accessible/jsat
  rootfilesin:accessible/tests/mochitest/jsat
  rootfilesin:mobile/android/chrome/content
  rootfilesin:mobile/android/modules/geckoview
  rootfilesin:third_party/rust/log
  [exclude]

This narrowspec was chosen due to the size of the third_party/rust directory
(this directory was *not* modified in revision #468856 in mozilla-unified),
plus all the directories that *were* modified in revision #468856 of
mozilla-unified.

Importantly, when using narrow, these repos had everything checked out (in the
case of large-dir-repo, that means all 30,001 directories), *before* adding the
narrowspec. This is to simulate the behavior when using a virtual filesystem
that shows everything for the user even if they haven't added it to the
narrowspec yet. This is not a supported configuration, and `hg update` and
`hg rebase` will not really do the "correct" thing if there are mutations
outside of the narrowspec (which is not the case in these tests, due to a
carefully crafted narrowspec), but non-mutating commands should behave
correctly.

I'm not claiming anything less than a 5% speed win as improvements due to this
change; these are probably either measurement artifacts or constant time
improvements. The numbers that aren't changing are shown primarily to prove
that this doesn't make anything worse in any case I plan on testing during this
series.

'before' is hg from commit 6268fed3
'N' indicates narrow in use
'T' indicates treemanifest in use

Please note that these commands and the narrowspec are a little different than
the ones in a similar table that I made in a3cabe9415e1.

Important: it is my understanding that these numbers below are *not super
reliable*; the large slowdowns may be artifacts of some odd interaction between
GC and python module/code complexity. Another changeset of mine (D4351) had
shown large timing differences when ~empty, uncalled functions were added to
match.py, though only when using --color=never or redirecting to /dev/null. We
seem to be on some cusp of complexity or code size that is causing, at my best
guess (according to linux `perf` benchmarks), GC to alter behavior and cause a
200-400ms difference in timings. I haven't had a chance to replicate these
results on another machine.

diff --git:
repo  | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u   |   |   | 1.580 s +- 0.034 s     | 1.576 s +- 0.022 s    |  99.7%
m-u   |   | x | 1.568 s +- 0.025 s     | 1.584 s +- 0.044 s    | 101.0%
m-u   | x |   | 1.569 s +- 0.031 s     | 1.554 s +- 0.025 s    |  99.0%
m-u   | x | x | 107.3 ms +- 1.6 ms     | 106.3 ms +- 1.5 ms    |  99.1%
l-d-r |   |   | 232.5 ms +- 5.9 ms     | 233.5 ms +- 5.3 ms    | 100.4%
l-d-r |   | x | 236.6 ms +- 6.3 ms     | 233.6 ms +- 7.0 ms    |  98.7%
l-d-r | x |   | 118.4 ms +- 2.1 ms     | 118.4 ms +- 1.4 ms    | 100.0%
l-d-r | x | x | 116.8 ms +- 1.5 ms     | 118.9 ms +- 1.6 ms    | 101.8%

diff -c . --git:
repo  | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u   |   |   | 354.4 ms +- 16.6 ms    | 351.0 ms +- 6.9 ms    |  99.0%
m-u   |   | x | 207.2 ms +- 3.0 ms     | 206.2 ms +- 2.7 ms    |  99.5%
m-u   | x |   | 422.0 ms +- 26.0 ms    | 351.2 ms +- 6.4 ms    |  83.2% <--
m-u   | x | x | 166.7 ms +- 2.1 ms     | 169.5 ms +- 4.1 ms    | 101.7%
l-d-r |   |   | 98.4 ms +- 4.5 ms      | 98.5 ms +- 2.1 ms     | 100.1%
l-d-r |   | x | 5.519 s +- 0.060 s     | 5.149 s +- 0.042 s    |  93.3% <--
l-d-r | x |   | 99.1 ms +- 3.2 ms      | 102.6 ms +- 9.7 ms    | 103.5% <--?
l-d-r | x | x | 994.9 ms +- 10.7 ms    | 1.026 s +- 0.012 s    | 103.1% <--?

rebase -r . --keep -d .^^:
repo  | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u   |   |   | 6.639 s +- 0.168 s     | 6.559 s +- 0.097 s    |  98.8%
m-u   |   | x | 6.601 s +- 0.143 s     | 6.640 s +- 0.207 s    | 100.6%
m-u   | x |   | 6.582 s +- 0.098 s     | 6.543 s +- 0.098 s    |  99.4%
m-u   | x | x | 678.4 ms +- 57.7 ms    | 703.7 ms +- 52.4 ms   | 103.7% <--?
l-d-r |   |   | 780.0 ms +- 23.9 ms    | 776.0 ms +- 12.6 ms   |  99.5%
l-d-r |   | x | 7.520 s +- 0.255 s     | 7.395 s +- 0.044 s    |  98.3%
l-d-r | x |   | 331.9 ms +- 16.5 ms    | 327.0 ms +- 3.4 ms    |  98.5%
l-d-r | x | x | 6.228 s +- 0.113 s     | 5.924 s +- 0.044 s    |  95.1%

status --change . --copies:
repo  | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u   |   |   | 330.8 ms +- 7.2 ms     | 329.0 ms +- 7.1 ms    |  99.5%
m-u   |   | x | 182.9 ms +- 2.7 ms     | 183.5 ms +- 2.7 ms    | 100.3%
m-u   | x |   | 330.0 ms +- 7.6 ms     | 327.1 ms +- 5.4 ms    |  99.1%
m-u   | x | x | 146.2 ms +- 2.4 ms     | 147.1 ms +- 1.3 ms    | 100.6%
l-d-r |   |   | 95.3 ms +- 1.4 ms      | 95.9 ms +- 1.5 ms     | 100.6%
l-d-r |   | x | 5.157 s +- 0.035 s     | 5.166 s +- 0.058 s    | 100.2%
l-d-r | x |   | 99.7 ms +- 3.0 ms      | 100.2 ms +- 4.4 ms    | 100.5%
l-d-r | x | x | 993.6 ms +- 13.1 ms    | 1.025 s +- 0.015 s    | 103.2% <--?

status --copies:
repo  | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u   |   |   | 2.348 s +- 0.031 s     | 2.329 s +- 0.019 s    |  99.2%
m-u   |   | x | 2.337 s +- 0.026 s     | 2.346 s +- 0.034 s    | 100.4%
m-u   | x |   | 2.354 s +- 0.015 s     | 2.342 s +- 0.021 s    |  99.5%
m-u   | x | x | 120.6 ms +- 4.3 ms     | 119.2 ms +- 2.1 ms    |  98.8%
l-d-r |   |   | 731.5 ms +- 11.1 ms    | 719.6 ms +- 9.8 ms    |  98.4%
l-d-r |   | x | 729.0 ms +- 15.5 ms    | 725.7 ms +- 10.6 ms   |  99.5%
l-d-r | x |   | 211.0 ms +- 3.9 ms     | 212.8 ms +- 3.7 ms    | 100.9%
l-d-r | x | x | 211.5 ms +- 4.2 ms     | 211.0 ms +- 3.3 ms    |  99.8%

update $rev^; ~/src/hg/hg{hg}/hg update $rev:
repo  | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u   |   |   | 3.910 s +- 0.055 s     | 3.920 s +- 0.075 s    | 100.3%
m-u   |   | x | 3.613 s +- 0.056 s     | 3.630 s +- 0.056 s    | 100.5%
m-u   | x |   | 3.873 s +- 0.055 s     | 3.864 s +- 0.049 s    |  99.8%
m-u   | x | x | 400.4 ms +- 7.4 ms     | 403.6 ms +- 5.0 ms    | 100.8%
l-d-r |   |   | 531.6 ms +- 10.0 ms    | 528.8 ms +- 9.6 ms    |  99.5%
l-d-r |   | x | 10.377 s +- 0.049 s    | 9.955 s +- 0.046 s    |  95.9%
l-d-r | x |   | 308.3 ms +- 4.4 ms     | 306.8 ms +- 3.7 ms    |  99.5%
l-d-r | x | x | 1.805 s +- 0.015 s     | 1.834 s +- 0.020 s    | 101.6%

Differential Revision: https://phab.mercurial-scm.org/D4366
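
The lazy-loading idea described in the message can be illustrated with a small standalone sketch. This is not Mercurial's treemanifest: the class name LazyTree, the addlazy helper, and the readsubtree callback used here are hypothetical stand-ins, written for Python 3, and only mirror the shape of the _lazydirs / _loadlazy / _loadalllazy additions in the diff. It shows that a lookup under one directory loads only that directory, while an operation that needs every subtree (here, len) loads the rest.

```python
class LazyTree:
    """Toy stand-in for a tree manifest (hypothetical, not Mercurial's class)."""

    def __init__(self, files=None):
        self._files = dict(files or {})
        self._dirs = {}      # loaded subtrees, keyed by 'name/'
        self._lazydirs = {}  # not yet loaded: 'name/' -> (path, node, readsubtree)

    def addlazy(self, name, path, node, readsubtree):
        # Record how to load the subtree later instead of loading it now.
        self._lazydirs[name] = (path, node, readsubtree)

    def _loadlazy(self, d):
        # Load one deferred subtree and move it into _dirs.
        path, node, readsubtree = self._lazydirs.pop(d)
        self._dirs[d] = readsubtree(path, node)

    def _loadalllazy(self):
        # Load every deferred subtree (iterate over a copy of the keys).
        for d in list(self._lazydirs):
            self._loadlazy(d)

    def __contains__(self, f):
        top, sep, rest = f.partition('/')
        if not sep:
            return f in self._files
        d = top + '/'
        if d in self._lazydirs:
            self._loadlazy(d)  # only the directory being asked about
        return d in self._dirs and rest in self._dirs[d]

    def __len__(self):
        self._loadalllazy()  # counting files needs every subtree
        return len(self._files) + sum(len(m) for m in self._dirs.values())


loads = []  # records which subtrees were actually read


def readsubtree(path, node):
    loads.append(path)
    return LazyTree({'file.txt': node})


root = LazyTree({'README': 'n0'})
for name in ('a/', 'b/', 'c/'):
    root.addlazy(name, name, 'node-' + name, readsubtree)

found = 'b/file.txt' in root     # triggers a load of 'b/' only
loads_after_lookup = list(loads)
total = len(root)                # triggers loads of the remaining subtrees
```

In this sketch the single-path accessors pay for one subtree, which is the behavior the changeset gives __contains__, get, flags and friends; the bulk operations still load everything, matching the "OPT" comments left in the diff.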
mercurial/manifest.py
--- a/mercurial/manifest.py	Fri Sep 07 17:54:55 2018 -0400
+++ b/mercurial/manifest.py	Thu Aug 16 12:31:52 2018 -0700
@@ -686,6 +686,7 @@
         self._copyfunc = _noop
         self._dirty = False
         self._dirs = {}
+        self._lazydirs = {}
         # Using _lazymanifest here is a little slower than plain old dicts
         self._files = {}
         self._flags = {}
@@ -699,9 +700,20 @@
     def _subpath(self, path):
         return self._dir + path
 
+    def _loadalllazy(self):
+        for k, (path, node, readsubtree) in self._lazydirs.iteritems():
+            self._dirs[k] = readsubtree(path, node)
+        self._lazydirs = {}
+
+    def _loadlazy(self, d):
+        path, node, readsubtree = self._lazydirs[d]
+        self._dirs[d] = readsubtree(path, node)
+        del self._lazydirs[d]
+
     def __len__(self):
         self._load()
         size = len(self._files)
+        self._loadalllazy()
         for m in self._dirs.values():
             size += m.__len__()
         return size
@@ -714,6 +726,7 @@
 
     def _isempty(self):
         self._load() # for consistency; already loaded by all callers
+        self._loadalllazy()
         return (not self._files and (not self._dirs or
                 all(m._isempty() for m in self._dirs.values())))
 
@@ -741,6 +754,7 @@
 
     def iterentries(self):
         self._load()
+        self._loadalllazy()
         for p, n in sorted(itertools.chain(self._dirs.items(),
                                            self._files.items())):
             if p in self._files:
@@ -751,6 +765,7 @@
 
     def items(self):
         self._load()
+        self._loadalllazy()
         for p, n in sorted(itertools.chain(self._dirs.items(),
                                            self._files.items())):
             if p in self._files:
@@ -763,6 +778,7 @@
 
     def iterkeys(self):
         self._load()
+        self._loadalllazy()
         for p in sorted(itertools.chain(self._dirs, self._files)):
             if p in self._files:
                 yield self._subpath(p)
@@ -782,8 +798,12 @@
         self._load()
         dir, subpath = _splittopdir(f)
         if dir:
+            if dir in self._lazydirs:
+                self._loadlazy(dir)
+
             if dir not in self._dirs:
                 return False
+
             return self._dirs[dir].__contains__(subpath)
         else:
             return f in self._files
@@ -792,6 +812,9 @@
         self._load()
         dir, subpath = _splittopdir(f)
         if dir:
+            if dir in self._lazydirs:
+                self._loadlazy(dir)
+
             if dir not in self._dirs:
                 return default
             return self._dirs[dir].get(subpath, default)
@@ -802,6 +825,9 @@
         self._load()
         dir, subpath = _splittopdir(f)
         if dir:
+            if dir in self._lazydirs:
+                self._loadlazy(dir)
+
             return self._dirs[dir].__getitem__(subpath)
         else:
             return self._files[f]
@@ -810,11 +836,14 @@
         self._load()
         dir, subpath = _splittopdir(f)
         if dir:
+            if dir in self._lazydirs:
+                self._loadlazy(dir)
+
             if dir not in self._dirs:
                 return ''
             return self._dirs[dir].flags(subpath)
         else:
-            if f in self._dirs:
+            if f in self._lazydirs or f in self._dirs:
                 return ''
             return self._flags.get(f, '')
 
@@ -822,6 +851,9 @@
         self._load()
         dir, subpath = _splittopdir(f)
         if dir:
+            if dir in self._lazydirs:
+                self._loadlazy(dir)
+
             return self._dirs[dir].find(subpath)
         else:
             return self._files[f], self._flags.get(f, '')
@@ -830,6 +862,9 @@
         self._load()
         dir, subpath = _splittopdir(f)
         if dir:
+            if dir in self._lazydirs:
+                self._loadlazy(dir)
+
             self._dirs[dir].__delitem__(subpath)
             # If the directory is now empty, remove it
             if self._dirs[dir]._isempty():
@@ -845,6 +880,8 @@
         self._load()
         dir, subpath = _splittopdir(f)
         if dir:
+            if dir in self._lazydirs:
+                self._loadlazy(dir)
             if dir not in self._dirs:
                 self._dirs[dir] = treemanifest(self._subpath(dir))
             self._dirs[dir].__setitem__(subpath, n)
@@ -865,6 +902,8 @@
         self._load()
         dir, subpath = _splittopdir(f)
         if dir:
+            if dir in self._lazydirs:
+                self._loadlazy(dir)
             if dir not in self._dirs:
                 self._dirs[dir] = treemanifest(self._subpath(dir))
             self._dirs[dir].setflag(subpath, flags)
@@ -879,8 +918,12 @@
         if self._copyfunc is _noop:
             def _copyfunc(s):
                 self._load()
-                for d in self._dirs:
-                    s._dirs[d] = self._dirs[d].copy()
+                # OPT: it'd be nice to not load everything here. Unfortunately
+                # this makes a mess of the "dirty" state tracking if we don't.
+                self._loadalllazy()
+                sdirs = s._dirs
+                for d, v in self._dirs.iteritems():
+                    sdirs[d] = v.copy()
                 s._files = dict.copy(self._files)
                 s._flags = dict.copy(self._flags)
             if self._loadfunc is _noop:
@@ -904,6 +947,8 @@
                 return
             t1._load()
             t2._load()
+            t1._loadalllazy()
+            t2._loadalllazy()
             for d, m1 in t1._dirs.iteritems():
                 if d in t2._dirs:
                     m2 = t2._dirs[d]
@@ -929,10 +974,13 @@
         self._load()
         topdir, subdir = _splittopdir(dir)
         if topdir:
+            if topdir in self._lazydirs:
+                self._loadlazy(topdir)
             if topdir in self._dirs:
                 return self._dirs[topdir].hasdir(subdir)
             return False
-        return (dir + '/') in self._dirs
+        dirslash = dir + '/'
+        return dirslash in self._dirs or dirslash in self._lazydirs
 
     def walk(self, match):
         '''Generates matching file names.
@@ -970,6 +1018,7 @@
 
         # yield this dir's files and walk its submanifests
         self._load()
+        self._loadalllazy()
         for p in sorted(list(self._dirs) + list(self._files)):
             if p in self._files:
                 fullp = self._subpath(p)
@@ -1006,6 +1055,8 @@
             if fn in self._flags:
                 ret._flags[fn] = self._flags[fn]
 
+        # OPT: use visitchildrenset to avoid loading everything
+        self._loadalllazy()
         for dir, subm in self._dirs.iteritems():
             m = subm._matches(match)
             if not m._isempty():
@@ -1041,6 +1092,9 @@
                 return
             t1._load()
             t2._load()
+            # OPT: do we need to load everything?
+            t1._loadalllazy()
+            t2._loadalllazy()
             for d, m1 in t1._dirs.iteritems():
                 m2 = t2._dirs.get(d, emptytree)
                 _diff(m1, m2)
@@ -1070,10 +1124,12 @@
         return not self._dirty and not m2._dirty and self._node == m2._node
 
     def parse(self, text, readsubtree):
+        selflazy = self._lazydirs
+        subpath = self._subpath
         for f, n, fl in _parse(text):
             if fl == 't':
                 f = f + '/'
-                self._dirs[f] = readsubtree(self._subpath(f), n)
+                selflazy[f] = (subpath(f), n, readsubtree)
             elif '/' in f:
                 # This is a flat manifest, so use __setitem__ and setflag rather
                 # than assigning directly to _files and _flags, so we can
@@ -1100,9 +1156,11 @@
         """
         self._load()
         flags = self.flags
+        lazydirs = [(d[:-1], node, 't') for
+                    d, (path, node, readsubtree) in self._lazydirs.iteritems()]
         dirs = [(d[:-1], self._dirs[d]._node, 't') for d in self._dirs]
         files = [(f, self._files[f], flags(f)) for f in self._files]
-        return _text(sorted(dirs + files))
+        return _text(sorted(dirs + files + lazydirs))
 
     def read(self, gettext, readsubtree):
         def _load_for_read(s):
@@ -1115,6 +1173,11 @@
         m1._load()
         m2._load()
         emptytree = treemanifest()
+        # OPT: Do we really need to load everything? Presumably things in lazy
+        # aren't dirty and don't need to be written.
+        self._loadalllazy()
+        m1._loadalllazy()
+        m2._loadalllazy()
         for d, subm in self._dirs.iteritems():
             subp1 = m1._dirs.get(d, emptytree)._node
             subp2 = m2._dirs.get(d, emptytree)._node
@@ -1134,6 +1197,8 @@
             yield self
 
         self._load()
+        # OPT: use visitchildrenset to avoid loading everything.
+        self._loadalllazy()
         for d, subm in self._dirs.iteritems():
             for subtree in subm.walksubtrees(matcher=matcher):
                 yield subtree
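
One detail of the text() hunk above is that entries still sitting in _lazydirs are serialized from the node they were parsed with, so no subtree has to be read just to write the manifest text back out. A minimal sketch of that idea follows, in Python 3. The function manifest_entries and its simplified argument shapes (dirs mapping straight to a node rather than to a subtree object) are assumptions made for illustration, not Mercurial's API.

```python
def manifest_entries(files, dirs, lazydirs):
    """Return sorted (name, node, flag) tuples for one directory level.

    files:    name -> (node, flag)
    dirs:     'name/' -> node of an already-loaded subtree (simplified)
    lazydirs: 'name/' -> (path, node, readsubtree), not loaded yet
    """
    loaded = [(d[:-1], node, 't') for d, node in dirs.items()]
    # The node is already known for lazy entries, so readsubtree is never called.
    lazy = [(d[:-1], node, 't')
            for d, (path, node, readsubtree) in lazydirs.items()]
    plain = [(f, node, flag) for f, (node, flag) in files.items()]
    return sorted(loaded + plain + lazy)


def must_not_read(path, node):
    # Fails loudly if serialization ever tries to load a subtree.
    raise AssertionError('subtree %s should not be loaded' % path)


entries = manifest_entries(
    files={'README': ('n0', ''), 'run.sh': ('n1', 'x')},
    dirs={'src/': 'n2'},
    lazydirs={'docs/': ('docs/', 'n3', must_not_read)},
)
```

Here the callback raises if it is ever invoked, so the example only completes because the lazy entry is written out without being loaded.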