Wed, 07 Jan 2015 17:28:51 -0800 setdiscovery: always add exponential sample to the heads
Pierre-Yves David <pierre-yves.david@fb.com> [Wed, 07 Jan 2015 17:28:51 -0800] rev 23813
setdiscovery: always add exponential sample to the heads As explained in a previous changeset, prioritizing heads too much behaves pathologically when there are more heads than the sample size. To counter this, we always inject exponential samples before reducing to the sample size limit. This already shows some benefit in the tests themselves, and on a real-world example it moves my discovery for a push to a pathologically headed repo from 45 rounds to 17. We should maybe ensure that at least 25% of the resulting sample is heads, but I think the random sampling will be fine in practice.
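The change in ordering described above can be illustrated with a small sketch. This is not the Mercurial implementation: the function names, the `exponential` argument (a precomputed set of exponentially spaced nodes) and the `rng` parameter are all assumptions made for illustration, and nodes are assumed to be sortable values such as integers.

```python
import random

def limit_sample(sample, size, rng=random):
    """Randomly reduce `sample` to at most `size` elements."""
    if len(sample) <= size:
        return set(sample)
    return set(rng.sample(sorted(sample), size))

def heads_first_sample(heads, exponential, size, rng=random):
    """Hypothetical heads-first strategy: heads alone can fill the sample."""
    sample = set(heads)
    if len(sample) >= size:
        # More heads than room: no exponential sample is ever included.
        return limit_sample(sample, size, rng)
    # Otherwise fill the remaining room with exponential samples.
    room = size - len(sample)
    extra = sorted(set(exponential) - sample)
    sample.update(rng.sample(extra, min(room, len(extra))))
    return sample

def mixed_sample(heads, exponential, size, rng=random):
    """Hypothetical mixed strategy: merge exponential samples, then limit."""
    return limit_sample(set(heads) | set(exponential), size, rng)
```

With ten heads and a sample size of four, `heads_first_sample` can only ever return heads, while `mixed_sample` draws from heads and exponential samples alike.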
Wed, 07 Jan 2015 17:23:21 -0800 setdiscovery: directly run '_updatesample'
Pierre-Yves David <pierre-yves.david@fb.com> [Wed, 07 Jan 2015 17:23:21 -0800] rev 23812
setdiscovery: directly run '_updatesample' The heads and exponential sample are going to end up in the same set before any extra processing happens. We simplify the code by directly updating a set with heads. Changes in the order the set is built lead to small changes in the random sampling output. But after double checking, I can confirm the input data to the random sampling is consistent.
Wed, 07 Jan 2015 17:17:56 -0800 setdiscovery: stop using '_setupsample' in '_takefullsample'
Pierre-Yves David <pierre-yves.david@fb.com> [Wed, 07 Jan 2015 17:17:56 -0800] rev 23811
setdiscovery: stop using '_setupsample' in '_takefullsample' Very few of the return values of '_setupsample' remain in use, so we directly retrieve the value we care about and drop the '_setupsample' call.
Wed, 07 Jan 2015 12:09:51 -0800 setdiscovery: randomly pick between heads and sample when taking full sample
Pierre-Yves David <pierre-yves.david@fb.com> [Wed, 07 Jan 2015 12:09:51 -0800] rev 23810
setdiscovery: randomly pick between heads and sample when taking full sample Before this changeset, the discovery protocol was too heads-centric. Heads of the undiscovered set were always sent for discovery, and any room remaining in the sample was filled with exponential samples (and random ones if any room still remained). This behaved extremely poorly when the number of heads exceeded the sample size, because we kept asking only about the existence of heads, then their direct parents, and so on. As a result, the 'O(log(len(repo)))' discovery turned into a 'O(len(repo))' one. As a solution, we take a random sample of the heads plus exponential samples. This ensures some exponential sampling takes place, bringing back some logarithmic convergence to the discovery. This patch only applies this principle in one place. More places will be updated in future patches. One test is impacted because the random sample happens to be different. By chance, it helps a bit in this case.
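As a rough illustration of the idea, the sketch below computes an exponential sample over a toy DAG and then picks randomly among heads and exponential samples. It is a simplified model under stated assumptions, not the code from setdiscovery: the `parents` mapping (node to list of parent nodes), the function names and the `rng` parameter are hypothetical, and "exponential" is taken here to mean nodes whose distance from a head is a power of two, counting the head itself as distance 1.

```python
import random
from collections import deque

def exponential_sample(parents, heads):
    """Collect nodes whose distance from a head is a power of two."""
    sample = set()
    dist = {h: 1 for h in heads}  # heads count as distance 1
    queue = deque(heads)
    while queue:
        node = queue.popleft()
        d = dist[node]
        if d & (d - 1) == 0:  # true when d is a power of two
            sample.add(node)
        for p in parents.get(node, ()):
            if p not in dist:  # breadth-first: first visit is the shortest distance
                dist[p] = d + 1
                queue.append(p)
    return sample

def take_full_sample(parents, heads, size, rng=random):
    """Pick `size` nodes at random among heads plus exponential samples."""
    candidates = set(heads) | exponential_sample(parents, heads)
    if len(candidates) <= size:
        return candidates
    return set(rng.sample(sorted(candidates), size))
```

On a linear history of 16 revisions with a single head, the exponential sample contains only five nodes, which is what gives the discovery its logarithmic flavour.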
Tue, 06 Jan 2015 17:02:32 -0800 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com> [Tue, 06 Jan 2015 17:02:32 -0800] rev 23809
setdiscovery: document the '_updatesample' function This function is central to the sample building process; having it documented helps code readability a lot.
Tue, 06 Jan 2015 16:40:33 -0800 setdiscovery: avoid calling any sample building if the undecided set is small
Pierre-Yves David <pierre-yves.david@fb.com> [Tue, 06 Jan 2015 16:40:33 -0800] rev 23808
setdiscovery: avoid calling any sample building if the undecided set is small If the length of undecided is smaller than the sample size, we can just request information for all of them. This conditional was previously handled by '_setupsample'. But '_setupsample' is, in my opinion, a problematic function with blurry semantics. Handling this conditional explicitly and earlier makes the code clearer and moves us closer to removing the '_setupsample' function.
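The early exit described above can be sketched as follows. The names `choose_sample` and `build_sample` are hypothetical stand-ins, not functions from setdiscovery; `build_sample` represents whichever sample builder would otherwise be called.

```python
def choose_sample(undecided, size, build_sample):
    """Return the nodes to ask about in one discovery round."""
    undecided = set(undecided)
    if len(undecided) < size:
        # Few enough nodes: ask about all of them, no sampling needed.
        return undecided
    # Otherwise delegate to the (hypothetical) sample builder.
    return build_sample(undecided, size)
```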
Wed, 07 Jan 2015 09:30:06 -0800 setdiscovery: delay sample building calls to gather them in a single place
Pierre-Yves David <pierre-yves.david@fb.com> [Wed, 07 Jan 2015 09:30:06 -0800] rev 23807
setdiscovery: delay sample building calls to gather them in a single place Some of the logic around sample building is duplicated in the sample builders. It would clean things up to extract it into the top function, but this requires all of that code to be in the same place. This changeset mostly exists to make the next one clearer.
Tue, 06 Jan 2015 16:32:23 -0800 setdiscovery: drop unused 'initial' argument for '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com> [Tue, 06 Jan 2015 16:32:23 -0800] rev 23806
setdiscovery: drop unused 'initial' argument for '_takequicksample' There is a single call site, and it is always using 'initial=True'. So we just drop the argument and the associated condition.
Sun, 11 Jan 2015 16:46:13 -0600 readmarkers: add a SHA256 fixme note
Matt Mackall <mpm@selenic.com> [Sun, 11 Jan 2015 16:46:13 -0600] rev 23805
readmarkers: add a SHA256 fixme note
Sun, 11 Jan 2015 16:37:57 -0600 readmarkers: fast-path single successors and parents
Matt Mackall <mpm@selenic.com> [Sun, 11 Jan 2015 16:37:57 -0600] rev 23804
readmarkers: fast-path single successors and parents This gives about a 5% performance bump.