Thu, 26 Oct 2023 15:26:19 +0200 rust-index: add Sync bound to all relevant mmap-derived values
Raphaël Gomès <rgomes@octobus.net> [Thu, 26 Oct 2023 15:26:19 +0200] rev 51233
rust-index: add Sync bound to all relevant mmap-derived values All readonly mmaps are Sync as far as Rust is concerned. Integrity of the mmap'ed file is a concern separate to Rust's memory model, since it requires out-of-program handling via locks, etc. This will help when we start sharing the Rust Index with Python.
Tue, 31 Oct 2023 18:09:43 +0100 debugindexstats: handle the lack of Rust support better
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 18:09:43 +0100] rev 51232
debugindexstats: handle the lack of Rust support better We don't have any stats in the Rust index. Currently it is not known which stats would be interesting to get, so if they end up being important, we can add them later.
Tue, 31 Oct 2023 17:36:59 +0100 rust-python-index: don't panic on a corrupted index when calling from Python
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:36:59 +0100] rev 51231
rust-python-index: don't panic on a corrupted index when calling from Python This makes `test-verify.t` pass again. In an ideal world, we would find the exact commit where this test breaks and amend part of this change there, but this is a long enough series.
Tue, 31 Oct 2023 17:34:31 +0100 tests: ignore test-storage when using Rust
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:34:31 +0100] rev 51230
tests: ignore test-storage when using Rust This is only relevant for Python code and the SQLite backend, which is in a half-abandoned state.
Fri, 20 Oct 2023 09:12:22 +0200 rust-index: optimize find_gca_candidates() on less than 8 revisions
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 09:12:22 +0200] rev 51229
rust-index: optimize find_gca_candidates() on less than 8 revisions This is expected to be by far the most common case, given that, e.g., merging involves using it on two revisions. Using a `u8` as support for the bitset obviously divides the amount of RAM needed by 8. To state the obvious, on a repository with 10 million changesets, this spares 70MB. It is also possible that it'd be slightly faster, because it is easier to allocate and provides better cache locality. It is possible that some exhaustive listing of the traits implemented by `u8` and `u64` would avoid the added duplication, but that can be done later and would need a replacement for the `MAX` consts.
Fri, 20 Oct 2023 08:54:49 +0200 rust-index: simplification in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:54:49 +0200] rev 51228
rust-index: simplification in find_gca_candidates() `parent_seen` can be made a mutable ref, making this part more obvious, not needing to be commented so much. The micro-optimization of avoiding the union if `parent_seen` and `current_seen` agree is pushed down in the `union()` method of the fast, `u64` based bit set implementation (in case it matters).
Fri, 20 Oct 2023 08:43:00 +0200 rust-index: avoid double negation in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:43:00 +0200] rev 51227
rust-index: avoid double negation in find_gca_candidates()
Fri, 20 Oct 2023 08:17:00 +0200 rust-index: avoid some cloning in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:17:00 +0200] rev 51226
rust-index: avoid some cloning in find_gca_candidates() Instead of keeping the information whether the current revision is poisoned on `current_seen`, we extract it as a boolean. This also allows us to simplify the explanation of `seen[r].is_poisoned()`, as the exceptional case where it is poisoned right after `r` has been determined to be a solution does no longer exist.
Wed, 18 Oct 2023 15:35:38 +0200 rust-index: implement common_ancestors_heads() and ancestors()
Georges Racinet <georges.racinet@octobus.net> [Wed, 18 Oct 2023 15:35:38 +0200] rev 51225
rust-index: implement common_ancestors_heads() and ancestors() The only differences betwwen `common_ancestors_heads()` and `find_gca_candidates()` seems to be that: - the former accepts "overlapping" input revisions (meaning with duplicates). - limitation to 24 inputs (in the C code), that we translate to using the arbitrary size bit sets in the Rust code because we cannot bail to Python. Given that the input is expected to be small in most cases, we take the heavy handed approach of going through a HashSet and wait for perfomance assessment In case this is used via `hg-cpython`, we can anyway absorb the overhead by having `commonancestorheads` build a vector of unique values directly, and introduce a thin wrapper over `find_gca_candidates`, to take care of bit set type dispatching only. As far as `ancestors` is concerneed, this is just chaining `common_ancestors_heads()` with `find_deepest_revs`.
Tue, 17 Oct 2023 22:42:40 +0200 rust-index: find_gca_candidates bit sets genericization
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Tue, 17 Oct 2023 22:42:40 +0200] rev 51224
rust-index: find_gca_candidates bit sets genericization This allows to use arbitratry size of inputs in `find_gca_candidates()`. We're genericizing so that the common case of up to 63 inputs can be treated with the efficient implementation backed by `u64`. Some complications with the borrow checker came, because arbitrary sized bit sets will not be `Copy`, hence mutating them keeps a mut ref on the `seen` vector. This is solved by some cloning, most of which can be avoided, preferably in a follow-up after proof that this works (hence after exposition to Python layer). As far as performance is concerned, calling `clone()` on a `Copy` object (good case when number of revs is less than 64) should end up just doing a copy, according to this excerpt of the `Clone` trait documentation: Types that are Copy should have a trivial implementation of Clone. More formally: if T: Copy, x: T, and y: &T, then let x = y.clone(); is equivalent to let x = *y;. Manual implementations should be careful to uphold this invariant; however, unsafe code must not rely on it to ensure memory safety. We kept the general structure, hence why there are some double negations. This also could be made nicer in a follow-up. The `NonStaticPoisonableBitSet` is included to ensure that the `PoisonableBitSet` trait is general enough (had to correct `vec_of_empty()` for instance). Moving the genericization one level to encompass the `seen` vector and not its elements would be better for performance, if worth it.
(0) -30000 -10000 -3000 -1000 -300 -100 -10 +10 +100 +300 tip