outgoing: rework the handling of the `missingroots` case to be faster
The previous implementation was slow, to the point it was taking a significant
amount of `hg bundle --type none-streamv2` call. We rework the code to compute
the same value much faster, making the operation disappear from the `hg bundle
--type none-streamv2` profile. Someone would remark that producing a streamclone
does not requires an `outgoing` object. However that is a matter for another
day. There is other user of `missingroots` (non stream `hg bundle` call for
example), and they will also benefit from this rework.
We implement an old TODO in the process, directly computing the missing and
common attribute as we have most element at hand already.
### benchmark.name = hg.command.bundle
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.revs = all
# benchmark.variants.type = none-streamv2
## data-env-vars.name = heptapod-public-2024-03-25-zstd-sparse-revlog
before: 7.750458
after: 6.665565 (-14.00%, -1.08)
## data-env-vars.name = mercurial-public-2024-03-22-zstd-sparse-revlog
before: 0.700229
after: 0.496050 (-29.16%, -0.20)
## data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
before: 346.508952
after: 316.749699 (-8.59%, -29.76)
## data-env-vars.name = pypy-2024-03-22-zstd-sparse-revlog
before: 3.401700
after: 2.915810 (-14.28%, -0.49)
## data-env-vars.name = tryton-public-2024-03-22-zstd-sparse-revlog
before: 1.870798
after: 1.461583 (-21.87%, -0.41)
note: this whole `missingroots` of outgoing has a limited number of callers and
could likely be replace by something simpler (like taking an explicit
"missing_revs" set for example). However this is a wider change and we focus on
a small impact, quick rework that does not change the API for now.
# reproduce issue2264, issue2516
create test repo
$ cat <<EOF >> $HGRCPATH
> [extensions]
> transplant =
> EOF
$ hg init repo
$ cd repo
$ template="{rev} {desc|firstline} [{branch}]\n"
# we need to start out with two changesets on the default branch
# in order to avoid the cute little optimization where transplant
# pulls rather than transplants
add initial changesets
$ echo feature1 > file1
$ hg ci -Am"feature 1"
adding file1
$ echo feature2 >> file2
$ hg ci -Am"feature 2"
adding file2
# The changes to 'bugfix' are enough to show the bug: in fact, with only
# those changes, it's a very noisy crash ("RuntimeError: nothing
# committed after transplant"). But if we modify a second file in the
# transplanted changesets, the bug is much more subtle: transplant
# silently drops the second change to 'bugfix' on the floor, and we only
# see it when we run 'hg status' after transplanting. Subtle data loss
# bugs are worse than crashes, so reproduce the subtle case here.
commit bug fixes on bug fix branch
$ hg branch fixes
marked working directory as branch fixes
(branches are permanent and global, did you want a bookmark?)
$ echo fix1 > bugfix
$ echo fix1 >> file1
$ hg ci -Am"fix 1"
adding bugfix
$ echo fix2 > bugfix
$ echo fix2 >> file1
$ hg ci -Am"fix 2"
$ hg log -G --template="$template"
@ 3 fix 2 [fixes]
|
o 2 fix 1 [fixes]
|
o 1 feature 2 [default]
|
o 0 feature 1 [default]
transplant bug fixes onto release branch
$ hg update 0
1 files updated, 0 files merged, 2 files removed, 0 files unresolved
$ hg branch release
marked working directory as branch release
$ hg transplant 2 3
applying [0-9a-f]{12} (re)
[0-9a-f]{12} transplanted to [0-9a-f]{12} (re)
applying [0-9a-f]{12} (re)
[0-9a-f]{12} transplanted to [0-9a-f]{12} (re)
$ hg log -G --template="$template"
@ 5 fix 2 [release]
|
o 4 fix 1 [release]
|
| o 3 fix 2 [fixes]
| |
| o 2 fix 1 [fixes]
| |
| o 1 feature 2 [default]
|/
o 0 feature 1 [default]
$ hg status
$ hg status --rev 0:4
M file1
A bugfix
$ hg status --rev 4:5
M bugfix
M file1
now test that we fixed the bug for all scripts/extensions
$ cat > $TESTTMP/committwice.py <<__EOF__
> import time
> from mercurial import hg, match, node, ui as uimod
>
> def replacebyte(fn, b):
> f = open(fn, "rb+")
> f.seek(0, 0)
> f.write(b)
> f.close()
>
> def printfiles(repo, rev):
> repo.ui.status(b"revision %d files: [%s]\n"
> % (rev, b', '.join(b"'%s'" % f
> for f in repo[rev].files())))
>
> repo = hg.repository(uimod.ui.load(), b'.')
> assert len(repo) == 6, "initial: len(repo): %d, expected: 6" % len(repo)
>
> replacebyte(b"bugfix", b"u")
> time.sleep(2)
> try:
> repo.ui.status(b"PRE: len(repo): %d\n" % len(repo))
> wlock = repo.wlock()
> lock = repo.lock()
> replacebyte(b"file1", b"x")
> repo.commit(text=b"x", user=b"test", date=(0, 0))
> replacebyte(b"file1", b"y")
> repo.commit(text=b"y", user=b"test", date=(0, 0))
> repo.ui.status(b"POST: len(repo): %d\n" % len(repo))
> finally:
> lock.release()
> wlock.release()
> printfiles(repo, 6)
> printfiles(repo, 7)
> __EOF__
$ "$PYTHON" $TESTTMP/committwice.py
PRE: len(repo): 6
POST: len(repo): 8
revision 6 files: ['bugfix', 'file1']
revision 7 files: ['file1']
Do a size-preserving modification outside of that process
$ echo abcd > bugfix
$ hg status
M bugfix
$ hg log --template "{rev} {desc} {files}\n" -r5:
5 fix 2 bugfix file1
6 x bugfix file1
7 y file1
$ cd ..