perf: clear vfs audit_cache before each run
When generating a stream clone, we spend a large amount of time auditing paths.
Before this change, the first run warmed the vfs cache for the other
runs, leading to a large runtime difference and a "faulty" reported timing for
the operation.
We now clear this important cache between runs to get a more realistic timing.
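As an illustration of why clearing matters, here is a minimal sketch of a timing loop with and without a cache reset before each run. It is a toy model, not the code of the perf extension: `PathAuditor`, `audit_cache`, `expensive_checks` and `timed_runs` are names made up for this example.

```python
import statistics
import time


class PathAuditor:
    """Toy stand-in for a path auditor that remembers paths it has checked."""

    def __init__(self):
        self.audit_cache = set()  # hypothetical cache of already-audited paths
        self.expensive_checks = 0

    def audit(self, path):
        if path in self.audit_cache:
            return  # cache hit: the costly work is skipped
        self.expensive_checks += 1  # stands in for the costly filesystem checks
        self.audit_cache.add(path)


def timed_runs(auditor, paths, runs, clear_cache):
    """Time `runs` passes over `paths`, optionally clearing the cache first."""
    timings = []
    for _ in range(runs):
        if clear_cache:
            # Setup step, kept outside the timed region.
            auditor.audit_cache.clear()
        start = time.perf_counter()
        for path in paths:
            auditor.audit(path)
        timings.append(time.perf_counter() - start)
    return timings


paths = ["data/file-%d.i" % i for i in range(1000)]

# Without clearing, only the first run pays for the audit.
warm = PathAuditor()
timed_runs(warm, paths, runs=3, clear_cache=False)

# With clearing, every run does the full work, so the timings are comparable.
cold = PathAuditor()
cold_timings = timed_runs(cold, paths, runs=3, clear_cache=True)
median_cold = statistics.median(cold_timings)
```

In the uncleared case the second and third runs only measure cache hits, which is the "faulty" timing described above.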
Below are some examples of median time changes when clearing this cache. The
maximum time for a run did not change significantly.
### data-env-vars.name = mozilla-central-2018-08-01-zstd-sparse-revlog
# benchmark.name = hg.perf.exchange.stream.generate
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.version = latest
no-clearing: 17.289905
cache-clearing: 21.587965 (+24.86%, +4.30)
## data-env-vars.name = mozilla-central-2024-03-22-zstd-sparse-revlog
no-clearing: 32.670748
cache-clearing: 40.467095 (+23.86%, +7.80)
## data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
no-clearing: 37.838858
cache-clearing: 46.072749 (+21.76%, +8.23)
## data-env-vars.name = mozilla-unified-2024-03-22-zstd-sparse-revlog
no-clearing: 32.969395
cache-clearing: 39.646209 (+20.25%, +6.68)
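The figures in parentheses are the relative and absolute change against the first line of each group. A small sketch of that arithmetic follows; `describe_change` is a hypothetical helper written for this example, not part of the benchmark tooling.

```python
def describe_change(baseline, new):
    """Format `new` with its relative and absolute change against `baseline`."""
    delta = new - baseline
    percent = delta / baseline * 100
    return "%f (%+.2f%%, %+.2f)" % (new, percent, delta)


# mozilla-central-2018-08-01: no-clearing vs cache-clearing
central_2018 = describe_change(17.289905, 21.587965)
# mozilla-central-2024-03-22: no-clearing vs cache-clearing
central_2024 = describe_change(32.670748, 40.467095)
print(central_2018)
print(central_2024)
```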
In addition, this significantly reduces the timing difference between the
performance command from the perf extension and a real `hg bundle` call
producing a stream bundle. Some significant differences remain, especially on
the "mozilla-try" repositories, but they are now smaller.
Note that some of that difference is not actually
attributable to the stream generation (it may come from phases or branch map
computation, for example).
Below are some benchmarks done on a changeset, currently in draft, that fixes
some unrelated slowness in `hg bundle` (34a78972af409d1ff37c29e60f6ca811ad1a457d):
### data-env-vars.name = mozilla-central-2018-08-01-zstd-sparse-revlog
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
hg.perf.exchange.stream.generate: 21.587965
hg.command.bundle: 24.301799 (+12.57%, +2.71)
## data-env-vars.name = mozilla-central-2024-03-22-zstd-sparse-revlog
hg.perf.exchange.stream.generate: 40.467095
hg.command.bundle: 44.831317 (+10.78%, +4.36)
## data-env-vars.name = mozilla-unified-2024-03-22-zstd-sparse-revlog
hg.perf.exchange.stream.generate: 39.646209
hg.command.bundle: 45.395258 (+14.50%, +5.75)
## data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
hg.perf.exchange.stream.generate: 46.072749
hg.command.bundle: 55.882608 (+21.29%, +9.81)
## data-env-vars.name = mozilla-try-2023-03-22-zlib-general-delta
hg.perf.exchange.stream.generate: 334.716708
hg.command.bundle: 377.856767 (+12.89%, +43.14)
## data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
hg.perf.exchange.stream.generate: 302.972301
hg.command.bundle: 326.098755 (+7.63%, +23.13)
$ hg init repo
$ cd repo
$ for n in 0 1 2 3 4 5 6 7 8 9 10 11; do
> echo $n > $n
> hg ci -qAm $n
> done
test revset support
$ cat <<'EOF' >> .hg/hgrc
> [extdata]
> filedata = file:extdata.txt
> notes = notes.txt
> shelldata = shell:cat extdata.txt | grep 2
> emptygrep = shell:cat extdata.txt | grep empty
> badparse = shell:cat badparse.txt
> EOF
$ cat <<'EOF' > extdata.txt
> 2 another comment on 2
> 3
> EOF
$ cat <<'EOF' > notes.txt
> f6ed this change is great!
> e834 this is buggy :(
> 0625 first post
> bogusnode gives no error
> a ambiguous node gives no error
> EOF
$ hg log -qr "extdata(filedata)"
2:f6ed99a58333
3:9de260b1e88e
$ hg log -qr "extdata(shelldata)"
2:f6ed99a58333
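The record format exercised above (a revision identifier, optionally followed by a space and free text) can be sketched as follows. This is an illustration only, not Mercurial's implementation: `parse_extdata`, `resolve` and the `NODES` table are made up for the example, and the resolver is a toy that only knows the four short hashes shown in this test.

```python
# Short hashes of revisions 0..3 as they appear in this test's output.
NODES = {
    "06254b906311": 0,
    "e8342c9a2ed1": 1,
    "f6ed99a58333": 2,
    "9de260b1e88e": 3,
}


def resolve(key):
    """Toy resolver: revision number first, then unique hash prefix."""
    if key.isdigit() and int(key) in NODES.values():
        return int(key)
    matches = [rev for node, rev in NODES.items() if node.startswith(key)]
    if len(matches) != 1:
        raise LookupError(key)  # unknown or ambiguous identifier
    return matches[0]


def parse_extdata(lines, resolve):
    """Map revision numbers to the text attached to them."""
    data = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Everything after the first space is the value; it may be empty.
        key, _, value = line.partition(" ")
        try:
            rev = resolve(key)
        except LookupError:
            continue  # assumption: bad identifiers are skipped without error
        data[rev] = value
    return data


filedata = parse_extdata(["2 another comment on 2", "3"], resolve)
notes = parse_extdata(
    [
        "f6ed this change is great!",
        "e834 this is buggy :(",
        "0625 first post",
        "bogusnode gives no error",
    ],
    resolve,
)
```

The keys of the resulting mapping are what the `extdata()` revset selects; the values are what the template function prints.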
test weight of extdata() revset
$ hg debugrevspec -p optimized "extdata(filedata) & 3"
* optimized:
(andsmally
(func
(symbol 'extdata')
(symbol 'filedata'))
(symbol '3'))
3
test non-zero exit of shell command
$ hg log -qr "extdata(emptygrep)"
abort: extdata command 'cat extdata.txt | grep empty' failed: exited with status 1
[255]
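The failure above comes from `grep` exiting with status 1 when nothing matches, which becomes the exit status of the whole pipeline. A minimal sketch of that check, assuming a POSIX shell; `run_source` is a hypothetical helper, and `printf` replaces `cat extdata.txt` to keep the example self-contained:

```python
import subprocess


def run_source(command):
    """Run a shell source and return its output lines, failing on non-zero exit."""
    proc = subprocess.run(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    if proc.returncode != 0:
        raise RuntimeError(
            "extdata command %r failed: exited with status %d"
            % (command, proc.returncode)
        )
    return proc.stdout.splitlines()


SOURCE = "printf '2 another comment on 2\\n3\\n'"

# grep finds a match: exit status 0, one line of output.
matched = run_source(SOURCE + " | grep 2")

# grep finds nothing: exit status 1, reported as an error.
try:
    run_source(SOURCE + " | grep empty")
    failure = None
except RuntimeError as exc:
    failure = str(exc)
```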
test bad extdata() revset source
$ hg log -qr "extdata()"
hg: parse error: extdata takes at least 1 string argument
[10]
$ hg log -qr "extdata(unknown)"
abort: unknown extdata source 'unknown'
[255]
test a zero-exiting source that emits garbage to confuse the revset parser
$ cat > badparse.txt <<'EOF'
> +---------------------------------------+
> 9de260b1e88e
> EOF
It might be nice if this error message mentioned where the bad string
came from (e.g. line X of extdata source S), but the important thing is
that we don't crash before we can print the parse error.
$ hg log -qr "extdata(badparse)"
hg: parse error at 0: not a prefix: +
(+---------------------------------------+
^ here)
[10]
test template support:
$ hg log -r:3 -T "{node|short}{if(extdata('notes'), ' # {extdata('notes')}')}\n"
06254b906311 # first post
e8342c9a2ed1 # this is buggy :(
f6ed99a58333 # this change is great!
9de260b1e88e
test template cache:
$ hg log -r:3 -T '{rev} "{extdata("notes")}" "{extdata("shelldata")}"\n'
0 "first post" ""
1 "this is buggy :(" ""
2 "this change is great!" "another comment on 2"
3 "" ""
test bad extdata() template source
$ hg log -T "{extdata()}\n"
hg: parse error: extdata expects one argument
[10]
$ hg log -T "{extdata('unknown')}\n"
abort: unknown extdata source 'unknown'
[255]
$ hg log -T "{extdata(unknown)}\n"
hg: parse error: empty data source specified
(did you mean extdata('unknown')?)
[10]
$ hg log -T "{extdata('{unknown}')}\n"
hg: parse error: empty data source specified
[10]
we don't fix up relative file URLs, but we do run shell commands in repo root
$ mkdir sub
$ cd sub
$ hg log -qr "extdata(filedata)"
abort: error: $ENOENT$
[100]
$ hg log -qr "extdata(shelldata)"
2:f6ed99a58333
$ cd ..
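The asymmetry between the two source types can be sketched as follows: a relative file path is looked up from wherever the caller happens to be, while a shell command is started with its working directory set to the repository root. This only models the behaviour described above in a temporary directory, assuming a POSIX shell; it does not use any Mercurial API.

```python
import os
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Lay out a fake repository root with a data file and a subdirectory.
    with open(os.path.join(root, "extdata.txt"), "w") as fh:
        fh.write("2 another comment on 2\n3\n")
    sub = os.path.join(root, "sub")
    os.mkdir(sub)

    # A relative file path is not fixed up: seen from "sub", the file is missing.
    found_from_sub = os.path.exists(os.path.join(sub, "extdata.txt"))

    # A shell source is run from the root, so the same relative name works.
    shell_output = subprocess.run(
        "cat extdata.txt | grep 2",
        shell=True,
        cwd=root,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    ).stdout
```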