revset: stop serializing node when using "%ln"
Turning hundreds of thousands of nodes from binary to hex and back can be slow…
how about we stop doing it?
In many cases where we are using node ids we should be using revision ids.
However, this is not a good reason to have a stupidly slow implementation of
"%ln".
This caught my attention again because the phase discovery during push makes
extensive use of "%ln" on huge sets. In absolute terms, that phase discovery
probably should use "%ld" and needs to improve its algorithmic complexity, but
improving "%ln" seems simple and long overdue. This greatly speeds up `hg push`
on repositories with many drafts.
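To give a feel for the overhead described above, here is a minimal standalone sketch of the node-to-hex-and-back round trip compared with handing the binary nodes over untouched. It is not Mercurial's revset code: the function names `via_hex` and `direct` are hypothetical, and the "nodes" are just random 20-byte strings standing in for SHA-1 sized node ids.

```python
import binascii
import os
import timeit

# Hypothetical stand-ins for changeset node ids: 20 random bytes each,
# the size of a SHA-1 based node.
nodes = [os.urandom(20) for _ in range(100_000)]


def via_hex(nodes):
    """Serialize every node to hex, then parse it back (the round trip)."""
    hexed = [binascii.hexlify(n) for n in nodes]
    return [binascii.unhexlify(h) for h in hexed]


def direct(nodes):
    """Hand the binary nodes over as-is, with no serialization step."""
    return list(nodes)


# Both paths yield the same list of binary nodes.
assert via_hex(nodes) == direct(nodes)

for fn in (via_hex, direct):
    elapsed = timeit.timeit(lambda: fn(nodes), number=5)
    print(f"{fn.__name__}: {elapsed:.3f}s for 5 runs")
```

The absolute timings depend on the machine; the point is only that the round trip does extra per-node work that produces nothing new.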
Here are some relevant poulpe benchmarks:
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.command.push
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.explicit-rev = all-out-heads
# benchmark.variants.issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = default
## benchmark.variants.revs = any-1-extra-rev
before: 44.235070
after: 20.416329 (-53.85%, -23.82)
## benchmark.variants.revs = any-100-extra-rev
before: 49.234697
after: 26.519829 (-46.14%, -22.71)
### benchmark.name = hg.command.bundle
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.revs = all
# benchmark.variants.type = none-streamv2
## data-env-vars.name = heptapod-public-2024-03-25-zstd-sparse-revlog
before: 10.138396
after: 7.750458 (-23.55%, -2.39)
## data-env-vars.name = mercurial-public-2024-03-22-zstd-sparse-revlog
before: 1.263859
after: 0.700229 (-44.60%, -0.56)
## data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
before: 399.484481
after: 346.5089 (-13.26%, -52.98)
## data-env-vars.name = pypy-2024-03-22-zstd-sparse-revlog
before: 4.540080
after: 3.401700 (-25.07%, -1.14)
## data-env-vars.name = tryton-public-2024-03-22-zstd-sparse-revlog
before: 2.975765
after: 1.870798 (-37.13%, -1.10)
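The parenthesized figures on each "after" line appear to be the relative and absolute change against the matching "before" timing. A minimal sketch of that arithmetic, assuming this reading of the output (the helper name `describe_change` is hypothetical and not part of poulpe):

```python
def describe_change(before, after):
    """Format the relative and absolute change between two timings."""
    delta = after - before
    percent = delta / before * 100
    return f"({percent:+.2f}%, {delta:+.2f})"


# hg.command.push, any-1-extra-rev, mozilla-try
print(describe_change(44.235070, 20.416329))  # prints "(-53.85%, -23.82)"
```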
$ cat >> $HGRCPATH << EOF
> [extensions]
> fastannotate=
> [fastannotate]
> mainbranch=@
> EOF
set up the server repo
$ hg init repo-server
$ cd repo-server
$ cat >> .hg/hgrc << EOF
> [fastannotate]
> server=1
> EOF
$ for i in 1 2 3 4; do
> echo $i >> a
> hg commit -A -m $i a
> done
$ [ -d .hg/fastannotate ]
[1]
$ hg bookmark @
$ cd ..
set up the local repo
$ hg clone 'ssh://user@dummy/repo-server' repo-local -q
$ cd repo-local
$ cat >> .hg/hgrc << EOF
> [fastannotate]
> client=1
> clientfetchthreshold=0
> EOF
$ [ -d .hg/fastannotate ]
[1]
$ hg fastannotate a --debug
running * (glob)
sending hello command
sending between command
remote: * (glob) (?)
remote: capabilities: * (glob)
remote: * (glob) (?)
sending protocaps command
fastannotate: requesting 1 files
sending getannotate command
fastannotate: writing 112 bytes to fastannotate/default/a.l
fastannotate: writing 94 bytes to fastannotate/default/a.m
fastannotate: a: using fast path (resolved fctx: True)
0: 1
1: 2
2: 3
3: 4
the cache could be reused and no download is necessary
$ hg fastannotate a --debug
fastannotate: a: using fast path (resolved fctx: True)
0: 1
1: 2
2: 3
3: 4
if the client agrees where the head of the master branch is, no re-download
happens even if the client has more commits
$ echo 5 >> a
$ hg commit -m 5
$ hg bookmark -r 3 @ -f
$ hg fastannotate a --debug
0: 1
1: 2
2: 3
3: 4
4: 5
if the client has a different "@" (head of the master branch) and "@" is ahead
of the server, the server can detect things are unchanged and does not return
full contents (note that there is no "writing ... to fastannotate"), but the
client can also build things up on its own (causing divergence)
$ hg bookmark -r 4 @ -f
$ hg fastannotate a --debug
running * (glob)
sending hello command
sending between command
remote: * (glob) (?)
remote: capabilities: * (glob)
remote: * (glob) (?)
sending protocaps command
fastannotate: requesting 1 files
sending getannotate command
fastannotate: a: 1 new changesets in the main branch
0: 1
1: 2
2: 3
3: 4
4: 5
if the client has a different "@" which is behind the server, no download is
necessary
$ hg fastannotate a --debug --config fastannotate.mainbranch=2
fastannotate: a: using fast path (resolved fctx: True)
0: 1
1: 2
2: 3
3: 4
4: 5
define fastannotate on-disk paths
$ p1=.hg/fastannotate/default
$ p2=../repo-server/.hg/fastannotate/default
revert bookmark change so the client is behind the server
$ hg bookmark -r 2 @ -f
in the "fctx" mode with the "annotate" command, the client also downloads the
cache, but not in the (default) "fastannotate" mode.
$ rm $p1/a.l $p1/a.m
$ hg annotate a --debug | grep 'fastannotate: writing'
[1]
$ hg annotate a --config fastannotate.modes=fctx --debug | grep 'fastannotate: writing' | sort
fastannotate: writing 112 bytes to fastannotate/default/a.l
fastannotate: writing 94 bytes to fastannotate/default/a.m
the fastannotate cache (built server-side, downloaded client-side) has the
same content in both repos (because the client downloads from the server)
$ diff $p1/a.l $p2/a.l
$ diff $p1/a.m $p2/a.m
in the "fctx" mode, the client could also build the cache locally
$ hg annotate a --config fastannotate.modes=fctx --debug --config fastannotate.mainbranch=4 | grep fastannotate
fastannotate: requesting 1 files
fastannotate: a: 1 new changesets in the main branch
the server rebuilds a broken cache automatically
$ cp $p2/a.m $p2/a.m.bak
$ echo BROKEN1 > $p1/a.m
$ echo BROKEN2 > $p2/a.m
$ hg fastannotate a --debug | grep 'fastannotate: writing' | sort
fastannotate: writing 112 bytes to fastannotate/default/a.l
fastannotate: writing 94 bytes to fastannotate/default/a.m
$ diff $p1/a.m $p2/a.m
$ diff $p2/a.m $p2/a.m.bak
use the "debugbuildannotatecache" command to build annotate cache
$ rm -rf $p1 $p2
$ hg --cwd ../repo-server debugbuildannotatecache a --debug
fastannotate: a: 4 new changesets in the main branch
$ hg --cwd ../repo-local debugbuildannotatecache a --debug
running * (glob)
sending hello command
sending between command
remote: * (glob) (?)
remote: capabilities: * (glob)
remote: * (glob) (?)
sending protocaps command
fastannotate: requesting 1 files
sending getannotate command
fastannotate: writing * (glob)
fastannotate: writing * (glob)
$ diff $p1/a.l $p2/a.l
$ diff $p1/a.m $p2/a.m
with the clientfetchthreshold config option, the client can build up the cache
without downloading from the server
$ rm -rf $p1
$ hg fastannotate a --debug --config fastannotate.clientfetchthreshold=10
fastannotate: a: 3 new changesets in the main branch
0: 1
1: 2
2: 3
3: 4
4: 5
if the fastannotate directory is not writable, the fctx mode still works
$ rm -rf $p1
$ touch $p1
$ hg annotate a --debug --traceback --config fastannotate.modes=fctx
fastannotate: a: cache broken and deleted
fastannotate: prefetch failed: * (glob)
fastannotate: a: cache broken and deleted
fastannotate: falling back to the vanilla annotate: * (glob)
0: 1
1: 2
2: 3
3: 4
4: 5
with serverbuildondemand=False, the server will not build anything
$ cat >> ../repo-server/.hg/hgrc <<EOF
> [fastannotate]
> serverbuildondemand=False
> EOF
$ rm -rf $p1 $p2
$ hg fastannotate a --debug | grep 'fastannotate: writing'
[1]