bisect: avoid copying ancestor list for non-merge commits
During a bisection, hg needs to compute a list of all ancestors for every
candidate commit. This is accomplished via a bottom-up traversal of the set of
candidates, during which each revision's ancestor list is populated using the
ancestor list of its parent(s). Previously, this involved copying the entire
list, which could be very long in if the bisection range was large.
To help improve this, we can observe that each candidate commit is visited
exactly once, at which point its ancestor list is copied into its children's
lists and then dropped. In the case of non-merge commits, a commit's ancestor
list consists exactly of its parent's list plus itself. This means that we can
trivially reuse the parent's existing list for one of its non-merge children,
which avoids copying entirely if that commit is the parent's only child. This
makes bisections over linear ranges of commits much faster.
During some informal testing in the large publicly-available `mozilla-central`
repository, this noticeably sped up bisections over large ranges of history:
Setup:
$ cd mozilla-central
$ hg bisect --reset
$ hg bisect --good 0
$ hg log -r tip -T '{rev}\n'
628417
Test:
$ time hg bisect --bad tip --noupdate
Before:
real 3m35.927s
user 3m35.553s
sys 0m0.319s
After:
real 1m41.142s
user 1m40.810s
sys 0m0.285s
#require rhg
$ NO_FALLBACK="env RHG_ON_UNSUPPORTED=abort"
Rhg works well when sparse working copy is enabled.
$ cd "$TESTTMP"
$ hg init repo-sparse
$ cd repo-sparse
$ cat > .hg/hgrc <<EOF
> [extensions]
> sparse=
> EOF
$ echo a > show
$ echo x > hide
$ mkdir dir1 dir2
$ echo x > dir1/x
$ echo y > dir1/y
$ echo z > dir2/z
$ hg ci -Aqm 'initial'
$ hg debugsparse --include 'show'
$ ls -A
.hg
show
$ tip=$(hg log -r . --template '{node}')
$ $NO_FALLBACK rhg files -r "$tip"
dir1/x
dir1/y
dir2/z
hide
show
$ $NO_FALLBACK rhg files
show
$ $NO_FALLBACK rhg cat -r "$tip" hide
x
$ cd ..
We support most things when narrow is enabled, too, with a couple of caveats.
$ . "$TESTDIR/narrow-library.sh"
$ real_hg=$RHG_FALLBACK_EXECUTABLE
$ cat >> $HGRCPATH <<EOF
> [extensions]
> narrow=
> EOF
$ hg clone --narrow ./repo-sparse repo-narrow --include dir1
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 2 changes to 2 files
new changesets 6d714a4a2998
updating to branch default
2 files updated, 0 files merged, 0 files removed, 0 files unresolved
$ cd repo-narrow
$ $NO_FALLBACK rhg cat -r "$tip" dir1/x
x
$ "$real_hg" cat -r "$tip" dir1/x
x
TODO: bad error message
$ $NO_FALLBACK rhg cat -r "$tip" hide
abort: invalid revision identifier: 6d714a4a2998cbfd0620db44da58b749f6565d63
[255]
$ "$real_hg" cat -r "$tip" hide
[1]
A naive implementation of [rhg files] leaks the paths that are supposed to be
hidden by narrow, so we just fall back to hg.
$ $NO_FALLBACK rhg files -r "$tip"
unsupported feature: rhg files -r <rev> is not supported in narrow clones
[252]
$ "$real_hg" files -r "$tip"
dir1/x
dir1/y
Hg status needs to do some filtering based on narrow spec
$ mkdir dir2
$ touch dir2/q
$ "$real_hg" status
$ $NO_FALLBACK rhg --config rhg.status=true status
Adding "orphaned" index files:
$ (cd ..; cp repo-sparse/.hg/store/data/hide.i repo-narrow/.hg/store/data/hide.i)
$ (cd ..; mkdir repo-narrow/.hg/store/data/dir2; cp repo-sparse/.hg/store/data/dir2/z.i repo-narrow/.hg/store/data/dir2/z.i)
$ "$real_hg" verify
checking changesets
checking manifests
crosschecking files in changesets and manifests
checking files
checked 1 changesets with 2 changes to 2 files
$ "$real_hg" files -r "$tip"
dir1/x
dir1/y
# TODO: even though [hg files] hides the orphaned dir2/z, [hg cat] still shows it.
# rhg has the same issue, but at least it's not specific to rhg.
# This is despite [hg verify] succeeding above.
$ $NO_FALLBACK rhg cat -r "$tip" dir2/z
z
$ "$real_hg" cat -r "$tip" dir2/z
z