Wed, 26 Feb 2020 17:16:25 +0100 revlog-compression: release note entry for update the config to be a list
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 26 Feb 2020 17:16:25 +0100] rev 44391
revlog-compression: release note entry for update the config to be a list I updated the changeset, but forgot to phabsend apparently. Differential Revision: https://phab.mercurial-scm.org/D8165
Tue, 18 Feb 2020 19:11:18 +0100 rust-nodemap: a method for full invalidation
Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:18 +0100] rev 44390
rust-nodemap: a method for full invalidation This will be used for exceptional operations, such as a `__delitem__` on the `MixedIndex` with Rust nodemap. In principle, `NodeTree` should also be able to forget an entry in an efficient way, by accepting to insert `Element::None` instead of only `Element::Rev(r)`, but that seems really overkill at this point. We need to support exceptional operations such as `__delitem__` only for completeness of the revlog index as seen from Python. The Python callers don't even seem to really need it, deciding to drop the nodemap unconditionally at a higher level when calling `hg strip`. Also, `hg strip` is very costly for reasons that are unrelated to nodemap aspects. Differential Revision: https://phab.mercurial-scm.org/D8098
Tue, 18 Feb 2020 19:11:17 +0100 rust-nodemap: accounting for dead blocks
Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:17 +0100] rev 44389
rust-nodemap: accounting for dead blocks By the very append-only nature of the `NodeTree`, inserting new blocks has the effect of making some of the older ones useless as they become unreachable. Therefore some automatic housekeeping will need to be provided. This is standard procedure in the world of databases, under names such as "repack" or "vacuum". The new `masked_readonly_blocks()` will provide callers with useful information to decide if the nodetree is ripe for repacking, but all the `NodeTree` can provide is how many blocks have been masked in the currently mutable part. Analysing the readonly part would take far too long to do for each transaction and would defeat the whole purpose of nodemap persistence. Serializing callers (from the Python layer) will get this figure before each extraction and maintain an aggregate counter of unreachable blocks separately. Note: at this point, the most efficient repacking is just to restart afresh with a full rescan. Differential Revision: https://phab.mercurial-scm.org/D8097
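The accounting idea can be illustrated with a toy structure. The sketch below is not the Rust `NodeTree`: the class `AppendOnlyTrie`, its `freeze()` method and its `masked` counter are hypothetical names, blocks are plain Python dicts rather than serialized 16-slot arrays, and keys are assumed to be hex strings of equal length. It only shows why inserting into an append-only tree leaves frozen blocks unreachable, and why counting them is cheap for the mutable part.

```python
class AppendOnlyTrie:
    """Toy append-only trie over hex strings (hypothetical, for illustration)."""

    def __init__(self):
        self.blocks = [{}]   # block storage; blocks are only ever appended
        self.root = 0        # index of the current root block
        self.n_frozen = 0    # blocks below this index are treated as read-only
        self.masked = 0      # frozen blocks made unreachable since freeze()

    def freeze(self):
        """Pretend everything so far was written to disk and is now read-only."""
        self.n_frozen = len(self.blocks)
        self.masked = 0

    def _writable(self, idx):
        """Return the index of a writable version of block idx."""
        if idx >= self.n_frozen:
            return idx
        # Frozen block: append a copy; the original becomes a dead block.
        self.blocks.append(dict(self.blocks[idx]))
        self.masked += 1
        return len(self.blocks) - 1

    def insert(self, key):
        self.root = idx = self._writable(self.root)
        for depth, nybble in enumerate(key):
            block = self.blocks[idx]
            child = block.get(nybble)
            if child is None:            # free slot: store the key as a leaf
                block[nybble] = key
                return
            if isinstance(child, str):   # existing leaf
                if child == key:
                    return
                # Push the existing leaf one level down into a fresh block.
                self.blocks.append({child[depth + 1]: child})
                child = len(self.blocks) - 1
            else:                        # inner block: make it writable
                child = self._writable(child)
            block[nybble] = child
            idx = child


tree = AppendOnlyTrie()
for key in ("ab", "ac", "b0"):
    tree.insert(key)
tree.freeze()
tree.insert("ad")   # copies the root block and the block under "a"
print(tree.masked, "of", tree.n_frozen, "frozen blocks are now unreachable")
```

A caller that persists the tree repeatedly could add `masked` to a running total before each `freeze()`, which mirrors the aggregate counter described above without ever scanning the frozen part.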
Tue, 18 Feb 2020 19:11:17 +0100 rust-nodemap: core implementation for shortest
Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:17 +0100] rev 44388
rust-nodemap: core implementation for shortest In this implementation, we just make `lookup()` also return the number of steps that were needed to reach a conclusion from the nodetree data, and `validate_candidate()` takes care of the special cases related to `NULL_NODE`. This approach minimizes code duplication, but it means that the comparatively slower search for the first nonzero nybble will also run for all calls to `find()`, where it is not needed. Still running on the file generated for the mozilla-central repository, it does seem that we now get 320 ns more often than 310. The odds that this could have a significant impact on real-life Mercurial performance still look low. Let's wait for actual benchmark runs to see if an optimization is needed here. Differential Revision: https://phab.mercurial-scm.org/D7819
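As a rough illustration of what "shortest" has to compute, here is a brute-force sketch. It does not walk a nodetree: `shortest_prefix_len` and `first_nonzero_nybble` are hypothetical helper names, node ids are shortened to 8 hex digits, and the number of "steps" is found by comparing strings. The point is only the final `max()`, which accounts for prefixes that would also match `NULL_NODE`.

```python
def first_nonzero_nybble(hex_node):
    """Index of the first hex digit that differs from NULL_NODE (all zeros)."""
    for i, digit in enumerate(hex_node):
        if digit != "0":
            return i
    return None  # the node is NULL_NODE itself


def shortest_prefix_len(nodes, target):
    """Length of the shortest prefix of target that is unambiguous."""
    others = [n for n in nodes if n != target]
    steps = 1
    while steps < len(target) and any(
        n.startswith(target[:steps]) for n in others
    ):
        steps += 1
    nonzero = first_nonzero_nybble(target)
    if nonzero is None:
        raise ValueError("NULL_NODE is not handled in this sketch")
    # An all-zero prefix would also match NULL_NODE, so go past the zeros.
    return max(steps, nonzero + 1)


nodes = ["0001beef", "1234abcd", "abcd1234", "abce5678"]
for node in nodes:
    print(node, shortest_prefix_len(nodes, node))
```

For `0001beef` a single digit already separates it from the other stored nodes, yet the answer is 4, because `0`, `00` and `000` are also prefixes of the null id.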
Tue, 18 Feb 2020 19:11:16 +0100 rust-nodemap: special case for prefixes of NULL_NODE
Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:16 +0100] rev 44387
rust-nodemap: special case for prefixes of NULL_NODE We have to behave as though NULL_NODE was stored in the node tree, although we don't store it. Differential Revision: https://phab.mercurial-scm.org/D7798
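A minimal sketch of that behavior, assuming a plain list of hex node ids in place of the node tree (`lookup`, `AmbiguousPrefix` and the 8-digit ids are illustrative, not the Rust API): a prefix made only of zeros is treated as a match for the null revision even though the null node is never stored.

```python
NULL_REVISION = -1


class AmbiguousPrefix(Exception):
    """Raised when a prefix matches more than one node."""


def lookup(nodes, prefix):
    """Resolve a hex prefix to a revision number, or None if nothing matches.

    nodes[rev] is the hex node of revision rev; NULL_NODE is not in the list.
    """
    matches = [rev for rev, node in enumerate(nodes) if node.startswith(prefix)]
    if prefix.strip("0") == "":
        # A prefix made only of zeros also designates NULL_NODE.
        matches.append(NULL_REVISION)
    if len(matches) > 1:
        raise AmbiguousPrefix(prefix)
    return matches[0] if matches else None


nodes = ["0001beef", "abcd1234"]
print(lookup(nodes, "ab"))     # a stored node
print(lookup(nodes, "0000"))   # only NULL_NODE has this prefix
try:
    lookup(nodes, "00")        # both revision 0 and NULL_NODE match
except AmbiguousPrefix as exc:
    print("ambiguous:", exc)
```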
Tue, 18 Feb 2020 19:11:15 +0100 rust-nodemap: pure Rust example
Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:15 +0100] rev 44386
rust-nodemap: pure Rust example To run, use `cargo run --release --example nodemap`

This demonstrates that simple scenarios entirely written in Rust can content themselves with `NodeTree<T>`. The example mmaps both the nodemap file and the changelog index. We had of course to include an implementation of `RevlogIndex` directly, which isn't much at this stage. It felt a bit premature to include it in the lib.

Here are some first performance measurements, obtained with this example, on a clone of mozilla-central with 440000 changesets:

    (create) Nodemap constructed in RAM in 153.638305ms
    (query CAE63161B68962) found in 22.362us: Ok(Some(269489))
    (bench) Did 3 queries in 36.418µs (mean 12.139µs)
    (bench) Did 50 queries in 184.318µs (mean 3.686µs)
    (bench) Did 100000 queries in 31.053461ms (mean 310ns)

To be fair, even between bench runs, results tend to depend on whether the file is still in kernel caches, and it's not so easy to get back to a real cold start. The worst we've seen was in the 50us ballpark. In any busy server setting, the pages would always be in RAM. We hope it's good enough not to be significantly slower on any concrete Mercurial operation than the C nodetree when fully in RAM, and of course this implementation has the serious headstart advantage of persistence. Differential Revision: https://phab.mercurial-scm.org/D7797
Tue, 18 Feb 2020 19:11:15 +0100 rust-nodemap: input/output primitives
Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:15 +0100] rev 44385
rust-nodemap: input/output primitives These allow instantiating a `NodeTree` from an immutable opaque sequence of bytes, which could be passed over from Python (extracted from a `PyBuffer`) or directly mmapped from a file. Conversely, we can consume a `NodeTree`, extracting the bytes that express what has been added to the immutable part, together with the original immutable part. This gives callers the choice of how to start a new `NodeTree`. After writing to disk, some would prefer to reread for best guarantees (very cheap if mmapping), while others will find it more convenient to grow the memory that was considered immutable in the `NodeTree` and continue from there. This is enough to build examples running on real data and start gathering performance hints. Differential Revision: https://phab.mercurial-scm.org/D7796
Thu, 13 Feb 2020 15:33:36 -0800 pyoxidizer: allow extensions to be loaded from the file system
Martin von Zweigbergk <martinvonz@google.com> [Thu, 13 Feb 2020 15:33:36 -0800] rev 44384
pyoxidizer: allow extensions to be loaded from the file system It seems that setting this config is all that's needed to be able to load extensions from the file system (which we clearly want). Thanks for making this work, Gregory Szorc! Differential Revision: https://phab.mercurial-scm.org/D8122
Mon, 17 Feb 2020 20:30:03 -0500 graft: always allow hg graft --base . (issue6248)
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Mon, 17 Feb 2020 20:30:03 -0500] rev 44383
graft: always allow hg graft --base . (issue6248) `hg graft --base . -r abc` is rejected before this change with a "nothing to merge" error, if `abc` does not descend from `.`. This looks like an artifact of the implementation rather than intended behavior. It makes perfect sense to apply the diff between `.` and `abc` to the working copy (i.e. degenerate into `hg revert`), regardless of what `abc` is. Differential Revision: https://phab.mercurial-scm.org/D8127
Wed, 19 Feb 2020 17:30:04 +0100 revlog-compression: update the config to be a list
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 19 Feb 2020 17:30:04 +0100] rev 44382
revlog-compression: update the config to be a list format.revlog-compression is now a list of engines; the first supported one is used. Doing this has several benefits: 1) This is fully backward compatible: a config using a single entry will be read as a single-item list, not changing any behavior. 2) This opens the way to using zstd by default without impacting platforms where it is not available. This will be done in a later changeset. Using zstd provides a significant performance boost, explained in bb271ec2fbfb. However, zstd is not available in some cases. A notable example is the `--pure` version of Mercurial, which doesn't come with zstd support. Differential Revision: https://phab.mercurial-scm.org/D8148
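The selection rule can be sketched in a few lines. This is an illustration, not Mercurial's code: `parse_config_list` and `pick_engine` are hypothetical helpers, and the list parsing is simplified to splitting on commas and whitespace.

```python
def parse_config_list(value):
    """Split a config value on commas and whitespace (simplified)."""
    return value.replace(",", " ").split()


def pick_engine(config_value, available):
    """Return the first configured engine that this installation supports."""
    for name in parse_config_list(config_value):
        if name in available:
            return name
    raise ValueError("no supported compression engine in %r" % config_value)


# Installation with zstd: the first entry wins.
print(pick_engine("zstd, zlib", available={"zlib", "zstd"}))
# Installation without zstd (for example a pure-Python build): fall back.
print(pick_engine("zstd, zlib", available={"zlib"}))
# Old-style single value: read as a one-item list, same behavior as before.
print(pick_engine("zlib", available={"zlib", "zstd"}))
```

Because a single value parses as a one-item list, existing configurations keep selecting the same engine, which is the backward-compatibility point made above.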
Wed, 19 Feb 2020 13:39:00 +0530 remotefilelog: add 'changelog' arg to shallowcg1packer.generate (issue6269)
Pulkit Goyal <7895pulkit@gmail.com> [Wed, 19 Feb 2020 13:39:00 +0530] rev 44381
remotefilelog: add 'changelog' arg to shallowcg1packer.generate (issue6269) Its absence causes a traceback when widening with the narrow extension while remotefilelog is enabled. Differential Revision: https://phab.mercurial-scm.org/D8134
Tue, 25 Feb 2020 12:41:35 -0800 drawdag: abide by new createmarkers() API
Martin von Zweigbergk <martinvonz@google.com> [Tue, 25 Feb 2020 12:41:35 -0800] rev 44380
drawdag: abide by new createmarkers() API The `obsolete.createmarkers()` API was changed in 6335c0de80fa (obsolete: allow multiple predecessors in createmarkers, 2018-09-22) to prefer its precursors input to be a tuple instead of a single precursor. Let's fix `drawdag.py` to comply. Differential Revision: https://phab.mercurial-scm.org/D8149
Mon, 24 Feb 2020 14:52:46 -0500 lfutil: provide a hint if the largefiles/lfs cache path cannot be determined
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 14:52:46 -0500] rev 44379
lfutil: provide a hint if the largefiles/lfs cache path cannot be determined A coworker hit this error using an LFS repo in a stripped down environment, and didn't know how to resolve it. The final conditional is a bit fast and loose, but there is currently no 'posix' test in hghave, and it doesn't seem like it's worth adding for this since I think Windows is the only non-POSIX platform we run tests on. Differential Revision: https://phab.mercurial-scm.org/D8145
Mon, 24 Feb 2020 00:20:47 -0500 setup: exclude the __index__ module from itself when generating stable
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 00:20:47 -0500] rev 44378
setup: exclude the __index__ module from itself when generating This module is generated on Windows to hold all of the extension names and the help summaries, so that they are discoverable inside the py2exe zipfile. The problem is that this file is generated by dumping the disabled list, and that list comes from walking the filesystem. So once an install from source into a virtualenv has created this module, the next build from source from that virtualenv would also see __index__.py in the filesystem, and include it. Clearly that's wrong because this isn't a real extension, so just filter it from the list when generating it. The Mercurial installer was unaffected by this, but the TortoiseHg package was. In the final package, `hg help -v extensions` and the panel of extensions both showed it. Differential Revision: https://phab.mercurial-scm.org/D8142
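The kind of filtering described can be sketched as follows. This is not the code from `setup.py`: `extension_names` is a hypothetical helper that takes a directory listing and returns the module names that should be treated as extensions.

```python
import os


def extension_names(filenames):
    """Derive extension names from the file names found in a directory."""
    names = set()
    for filename in filenames:
        name, ext = os.path.splitext(filename)
        if ext != ".py":
            continue
        # Package plumbing and the generated index are not extensions.
        if name in ("__init__", "__index__"):
            continue
        names.add(name)
    return sorted(names)


listing = ["__init__.py", "__index__.py", "rebase.py", "strip.py", "README"]
print(extension_names(listing))
```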
Mon, 24 Feb 2020 16:33:10 -0500 tests: stabilize test-amend.t on Windows stable
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 16:33:10 -0500] rev 44377
tests: stabilize test-amend.t on Windows If $TESTTMP isn't quoted in this context, it ends up like `C:Temphgtests.pikkoxchild1test-amend.t-obsstore-off`. Differential Revision: https://phab.mercurial-scm.org/D8144
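The mangling can be reproduced with Python's `shlex`, which approximates POSIX shell word splitting. The input path below is an assumption: it is a plausible reconstruction of what `$TESTTMP` expanded to, chosen so that removing the backslashes yields the string quoted above.

```python
import shlex

# Hypothetical expansion of $TESTTMP on the Windows test machine.
path = r"C:\Temp\hgtests.pikkox\child1\test-amend.t-obsstore-off"

# Unquoted: each backslash escapes the next character and disappears.
unquoted = shlex.split("echo " + path)[1]

# Double-quoted: backslashes before ordinary characters are kept.
quoted = shlex.split('echo "%s"' % path)[1]

print(unquoted)
print(quoted)
```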
Mon, 24 Feb 2020 17:43:34 -0500 tests: replace truncate(1) with inline python
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 17:43:34 -0500] rev 44376
tests: replace truncate(1) with inline python MSYS doesn't have truncate(1) installed by default, and FreeBSD looked unhappy with the arguments provided. Differential Revision: https://phab.mercurial-scm.org/D8147
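For reference, cutting a file to a given size needs only the standard library. The snippet is a sketch of the general technique, not the exact inline code used in the test; `truncate_file` is a hypothetical helper name.

```python
import os
import tempfile


def truncate_file(path, size):
    """Cut the file at path down to size bytes."""
    with open(path, "r+b") as fh:
        fh.truncate(size)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")
    with open(path, "wb") as fh:
        fh.write(b"x" * 100)
    truncate_file(path, 10)
    size_after = os.path.getsize(path)

print(size_after)
```

This behaves the same on MSYS, FreeBSD and Linux, since it does not depend on an external `truncate` binary or its argument syntax.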
Mon, 24 Feb 2020 16:59:35 -0500 tests: stabilize test-rename-merge2.t on Windows
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 16:59:35 -0500] rev 44375
tests: stabilize test-rename-merge2.t on Windows I have no idea why, but this shifted in b4057d001760. Differential Revision: https://phab.mercurial-scm.org/D8146
Mon, 24 Feb 2020 13:50:55 -0500 merge with stable
Augie Fackler <augie@google.com> [Mon, 24 Feb 2020 13:50:55 -0500] rev 44374
merge with stable
Mon, 24 Feb 2020 13:28:49 +0900 py3: fix EOL detection in commandserver.channeledinput stable
Yuya Nishihara <yuya@tcha.org> [Mon, 24 Feb 2020 13:28:49 +0900] rev 44373
py3: fix EOL detection in commandserver.channeledinput This breaks TortoiseHg's email preview, which sends b'\n' while a readline request is issued, so the loop never ends. Spotted by Matt Harbison.
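The message does not spell out the root cause. One common Python 3 pitfall that produces exactly this kind of never-ending read loop is indexing a bytes object, which yields an int rather than a one-byte string. The sketch below shows the pitfall and two safe alternatives; `ends_with_newline` is a hypothetical helper, not the function changed by this commit.

```python
def ends_with_newline(data):
    """Return True if the bytes object ends with a newline."""
    # Slicing keeps a bytes object, so the comparison works on Python 3.
    return data[-1:] == b"\n"


data = b"line\n"
print(data[-1] == b"\n")        # False on Python 3: data[-1] is the int 10
print(ends_with_newline(data))  # True
print(data.endswith(b"\n"))     # True, and also safe on empty input
```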
Thu, 13 Feb 2020 22:51:17 -0500 bookmarks: prevent pushes of divergent bookmarks (foo@remote)
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Thu, 13 Feb 2020 22:51:17 -0500] rev 44372
bookmarks: prevent pushes of divergent bookmarks (foo@remote) Before this change, such bookmarks are write-only: a client can push them but not pull/read them. And because these bookmarks can't be read, even pushes are limited (for instance, trying to delete such a bookmark fails with a vanilla client because the client thinks the bookmark is neither on the local nor the remote). This change makes the server refuse such bookmarks and, for earlier errors, makes the client refuse to send them. I think the change of behavior is acceptable because I think this is a bug in push/pull, and I don't think we change the behavior of `hg unbundle`, because it doesn't seem that `hg bundle` ever stores bookmarks (and even if it did, it would seem weird anyway to try to send divergent bookmarks). Differential Revision: https://phab.mercurial-scm.org/D8117
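A check of this kind could look like the sketch below. The naming rule is an assumption derived from the `foo@remote` form in the title (a name containing `@` that does not end with it, so that a bookmark literally named `@` stays valid); `is_divergent` and `check_pushed_bookmarks` are hypothetical names, not Mercurial's API.

```python
def is_divergent(name):
    """Heuristic: divergent bookmarks look like b'foo@remote'."""
    return b"@" in name and not name.endswith(b"@")


def check_pushed_bookmarks(names):
    """Refuse a push that contains divergent bookmark names."""
    bad = sorted(n for n in names if is_divergent(n))
    if bad:
        raise ValueError(
            "cannot push divergent bookmarks: %s"
            % ", ".join(n.decode("ascii", "replace") for n in bad)
        )


check_pushed_bookmarks([b"feature", b"@"])   # accepted
try:
    check_pushed_bookmarks([b"feature", b"foo@remote"])
except ValueError as exc:
    print(exc)
```

Running the same check on both the client and the server matches the split described above: the client fails early with a clearer error, and the server still protects itself against older clients.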
Thu, 13 Feb 2020 22:06:57 -0500 bookmarks: refactor in preparation for next commit
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Thu, 13 Feb 2020 22:06:57 -0500] rev 44371
bookmarks: refactor in preparation for next commit Differential Revision: https://phab.mercurial-scm.org/D8116
Sat, 15 Feb 2020 14:51:33 -0500 bookmarks: avoid traceback when two pushes race to delete the same bookmark
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sat, 15 Feb 2020 14:51:33 -0500] rev 44370
bookmarks: avoid traceback when two pushes race to delete the same bookmark `hg push -f -B remote-only-bookmark` can raise server-side in `bookmarks._del` (specifically in `self._refmap.pop(mark)`), if the remote-only bookmark got deleted concurrently. Fix this by simply not deleting the non-existent bookmark in that case. For the avoidance of doubt, refusing to delete a bookmark that doesn't exist when the push starts is taken care of elsewhere; no change of behavior there. Differential Revision: https://phab.mercurial-scm.org/D8124
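The failure mode and one way to tolerate it can be shown with a plain dict. `BookmarkStore` is a hypothetical stand-in for the bookmark store, and the actual fix may be structured differently (for instance, by checking membership before deleting).

```python
class BookmarkStore:
    """Toy bookmark store keyed by bookmark name (illustrative only)."""

    def __init__(self, marks):
        self._refmap = dict(marks)

    def delete_strict(self, mark):
        # Raises KeyError if another push already removed the bookmark.
        self._refmap.pop(mark)

    def delete_tolerant(self, mark):
        # A default value makes pop() a no-op for a missing bookmark.
        return self._refmap.pop(mark, None) is not None


store = BookmarkStore({b"remote-only-bookmark": b"abc123"})
print(store.delete_tolerant(b"remote-only-bookmark"))  # first push deletes it
print(store.delete_tolerant(b"remote-only-bookmark"))  # second push: nothing to do
try:
    store.delete_strict(b"remote-only-bookmark")
except KeyError:
    print("strict deletion raises once the bookmark is gone")
```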
Sat, 15 Feb 2020 15:06:41 -0500 relnotes: add entry about previous `hg recover` change
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sat, 15 Feb 2020 15:06:41 -0500] rev 44369
relnotes: add entry about previous `hg recover` change Differential Revision: https://phab.mercurial-scm.org/D8123
Thu, 20 Feb 2020 15:15:23 -0800 darwin: add another preemptive gui() call when using chg
Kyle Lippincott <spectral@google.com> [Thu, 20 Feb 2020 15:15:23 -0800] rev 44368
darwin: add another preemptive gui() call when using chg Changeset a89381e04c58 added this gui() call before background forks, and Google's extensions do background forks on essentially every invocation for logging purposes. The crash is reliably (though not 100%) reproducible without this change when running `HGPLAIN=1 chg status` in one of our repos. With this fix, I haven't been able to trigger the crash anymore. Differential Revision: https://phab.mercurial-scm.org/D8141
Fri, 20 Dec 2019 13:24:46 -0800 copy: add experimental support for marking committed copies
Martin von Zweigbergk <martinvonz@google.com> [Fri, 20 Dec 2019 13:24:46 -0800] rev 44367
copy: add experimental support for marking committed copies The simplest way I'm aware of to mark a file as copied/moved after committing is this:

    hg uncommit --keep <src> <dest>  # <src> needed for move, but not copy
    hg mv --after <src> <dest>
    hg amend

This patch teaches `hg copy` a `--at-rev` argument to simplify that into:

    hg copy --after --at-rev . <src> <dest>

In addition to being simpler, it doesn't touch the working copy, so it can easily be used even if the destination file has been modified in the working copy. Differential Revision: https://phab.mercurial-scm.org/D8035
Thu, 26 Dec 2019 14:02:50 -0800 copy: move argument validation a little earlier
Martin von Zweigbergk <martinvonz@google.com> [Thu, 26 Dec 2019 14:02:50 -0800] rev 44366
copy: move argument validation a little earlier Argument validation is usually done early and I will want it done before some code that I'm about to add. Differential Revision: https://phab.mercurial-scm.org/D8033
Tue, 28 Jan 2020 14:07:57 -0800 copy: add experimental support for unmarking committed copies
Martin von Zweigbergk <martinvonz@google.com> [Tue, 28 Jan 2020 14:07:57 -0800] rev 44365
copy: add experimental support for unmarking committed copies The simplest way I'm aware of to unmark a file as copied after committing is this:

    hg uncommit --keep <dest>
    hg forget <dest>
    hg add <dest>
    hg amend

This patch teaches `hg copy --forget` an `--at-rev` argument to simplify that into:

    hg copy --forget --at-rev . <dest>

In addition to being simpler, it doesn't touch the working copy, so it can easily be used even if the destination file has been modified in the working copy. I'll teach `hg copy` without `--forget` to work with `--at-rev` next. Differential Revision: https://phab.mercurial-scm.org/D8030
Fri, 20 Dec 2019 15:50:13 -0800 copy: add option to unmark file as copied
Martin von Zweigbergk <martinvonz@google.com> [Fri, 20 Dec 2019 15:50:13 -0800] rev 44364
copy: add option to unmark file as copied To unmark a file as copied, the user currently has to do this:

    hg forget <dest>
    hg add <dest>

The new command simplifies that to:

    hg copy --forget <dest>

That's not a very big improvement, but I'm planning to also teach `hg copy [--forget]` a `--at-rev` argument for marking/unmarking copies after commit (usually with `--at-rev .`). Differential Revision: https://phab.mercurial-scm.org/D8029
Tue, 11 Feb 2020 11:18:52 +0100 nodemap: introduce an option to use mmap to read the nodemap mapping
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 11 Feb 2020 11:18:52 +0100] rev 44363
nodemap: introduce an option to use mmap to read the nodemap mapping The performance and memory benefit is much greater if we don't have to copy all the data in memory for each piece of information. So we introduce an option (on by default) to read the data using mmap. This changeset is the last one defining the API for indexes supporting nodemap data (they have to be able to use the mmapping).

Below are some benchmarks comparing the best we currently have in 5.3 with the final step of this series (using the persistent nodemap implementation in Rust). The benchmark runs `hg perfindex` with various revsets and the following variants:

Before:
* do not use the persistent nodemap
* use the CPython implementation of the index for nodemap
* use mmapping of the changelog index

After:
* use the MixedIndex Rust code, with the NodeTree object for nodemap access (still in review)
* use the persistent nodemap data from disk
* access the persistent nodemap data through mmap
* use mmapping of the changelog index

The persistent nodemap greatly speeds up most operations on very large repositories. Some of the previously very fast lookups end up a bit slower because the persistent nodemap has to be set up. However, the absolute slowdown is very small and won't matter in the big picture.

Here are some numbers (in seconds) for the reference copy of mozilla-try:

    Revset              Before    After     abs-change  speedup
    -10000:             0.004622  0.005532    0.000910  ×   0.83
    -10:                0.000050  0.000132    0.000082  ×   0.37
    tip                 0.000052  0.000085    0.000033  ×   0.61
    0 + (-10000:)       0.028222  0.005337   -0.022885  ×   5.29
    0                   0.023521  0.000084   -0.023437  × 280.01
    (-10000:) + 0       0.235539  0.005308   -0.230231  ×  44.37
    (-10:) + :9         0.232883  0.000180   -0.232703  ×1293.79
    (-10000:) + (:99)   0.238735  0.005358   -0.233377  ×  44.55
    :99 + (-10000:)     0.317942  0.005593   -0.312349  ×  56.84
    :9 + (-10:)         0.313372  0.000179   -0.313193  ×1750.68
    :9                  0.316450  0.000143   -0.316307  ×2212.93

On smaller repositories, the cost of nodemap-related operations is not as big, so the win is much more modest. Yet it helps shave a handful of milliseconds here and there.

Here are some numbers (in seconds) for the reference copy of mercurial:

    Revset              Before    After     abs-change  speedup
    -10:                0.000065  0.000097    0.000032  ×   0.67
    tip                 0.000063  0.000078    0.000015  ×   0.80
    0                   0.000561  0.000079   -0.000482  ×   7.10
    -10000:             0.004609  0.003648   -0.000961  ×   1.26
    0 + (-10000:)       0.005023  0.003715   -0.001307  ×   1.35
    (-10:) + :9         0.002187  0.000108   -0.002079  ×  20.25
    (-10000:) + 0       0.006252  0.003716   -0.002536  ×   1.68
    (-10000:) + (:99)   0.006367  0.003707   -0.002660  ×   1.71
    :9 + (-10:)         0.003846  0.000110   -0.003736  ×  34.96
    :9                  0.003854  0.000099   -0.003755  ×  38.92
    :99 + (-10000:)     0.007644  0.003778   -0.003866  ×   2.02

Differential Revision: https://phab.mercurial-scm.org/D7894
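To read the two derived columns: the absolute change is the "After" time minus the "Before" time (negative means faster), and the speedup is "Before" divided by "After" (below 1 means slower). The helper below, a hypothetical `compare` function, reproduces the mozilla-try row for revset `0`. For some other rows the printed speedup differs in the last digit, presumably because it was computed from unrounded timings.

```python
def compare(before, after):
    """Return (absolute change, speedup factor) for two timings in seconds."""
    return after - before, before / after


# Row "0" of the mozilla-try measurements.
abs_change, speedup = compare(before=0.023521, after=0.000084)
print("abs-change %.6f  speedup x%.2f" % (abs_change, speedup))

# Row "tip": a small absolute slowdown shows up as a factor below 1.
tip_change, tip_speedup = compare(before=0.000052, after=0.000085)
print("abs-change %.6f  speedup x%.2f" % (tip_change, tip_speedup))
```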
Fri, 14 Feb 2020 15:03:26 +0100 rust-dirstatemap: directly return `non_normal` and `other_entries`
Raphaël Gomès <rgomes@octobus.net> [Fri, 14 Feb 2020 15:03:26 +0100] rev 44362
rust-dirstatemap: directly return `non_normal` and `other_entries` This cleans up the interface which I previously thought needed to be uglier than in reality. No performance difference, simple refactoring. Differential Revision: https://phab.mercurial-scm.org/D8121