cext: fix memory leak in phases computation
Without this a buffer whose size in bytes is the number of
changesets in the repository is leaked each time the repository is
opened and changeset phases are computed.
Impact: the current code in hgwebdir creates a new `localrepository`
instance for each HTTP request. Since any pull or push is made of several
requests, a team of 100 people can easily produce thousands of such
requests per day.
Being a low-level malloc, this leak can't be seen with the gc module and
tools relying on that, but was spotted by valgrind immediately.
Reproduction
------------
for i in range(cl_args.iterations):
repo = hg.repository(baseui, repo_path)
rev = repo.revs(rev).first()
ctx = repo[rev]
del ctx
del repo
# avoid any pollution by other type of leak
# (that should be fixed in 5.8)
repoview._filteredrepotypes.clear()
gc.collect()
Measurements
------------
Resident Set Size (RSS), taken on a clone of
mozilla-central for performance analysis (440 000
changesets).
before:
5.8+hg19.5ac0f2a8ba72 1000 iterations: 1606MB
5.8+hg19.5ac0f2a8ba72 10000 iterations: 5723MB
after:
5.8+hg20.e2084d39e145 1000 iterations: 555MB
5.8+hg20.e2084d39e145 10000 iterations: 555MB
(double checked, not a copy/paste error)
(e2084d39e14 is the present changeset, before amendment
of the message to add the measurements)
$ echo '[extensions]' >> $HGRCPATH
$ echo 'strip =' >> $HGRCPATH
$ cat >findbranch.py <<EOF
> from __future__ import absolute_import
> import re
> import sys
>
> head_re = re.compile(r'^#(?:(?:\\s+([A-Za-z][A-Za-z0-9_]*)(?:\\s.*)?)|(?:\\s*))$')
>
> for line in sys.stdin:
> hmatch = head_re.match(line)
> if not hmatch:
> sys.exit(1)
> if hmatch.group(1) == 'Branch':
> sys.exit(0)
> sys.exit(1)
> EOF
$ hg init a
$ cd a
$ echo "Rev 1" >rev
$ hg add rev
$ hg commit -m "No branch."
$ hg branch abranch
marked working directory as branch abranch
(branches are permanent and global, did you want a bookmark?)
$ echo "Rev 2" >rev
$ hg commit -m "With branch."
$ hg export 0 > ../r0.patch
$ hg export 1 > ../r1.patch
$ cd ..
$ if "$PYTHON" findbranch.py < r0.patch; then
> echo "Export of default branch revision has Branch header" 1>&2
> exit 1
> fi
$ if "$PYTHON" findbranch.py < r1.patch; then
> : # Do nothing
> else
> echo "Export of branch revision is missing Branch header" 1>&2
> exit 1
> fi
Make sure import still works with branch information in patches.
$ hg init b
$ cd b
$ hg import ../r0.patch
applying ../r0.patch
$ hg import ../r1.patch
applying ../r1.patch
$ cd ..
$ hg init c
$ cd c
$ hg import --exact --no-commit ../r0.patch
applying ../r0.patch
warning: can't check exact import with --no-commit
$ hg st
A rev
$ hg revert -a
forgetting rev
$ rm rev
$ hg import --exact ../r0.patch
applying ../r0.patch
$ hg import --exact ../r1.patch
applying ../r1.patch
Test --exact and patch header separators (issue3356)
$ hg strip --no-backup .
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
>>> import re
>>> p = open('../r1.patch', 'rb').read()
>>> p = re.sub(br'Parent\s+', b'Parent ', p)
>>> open('../r1-ws.patch', 'wb').write(p) and None
$ hg import --exact ../r1-ws.patch
applying ../r1-ws.patch
$ cd ..