revlog: avoid wrongly updating the data file location on "divert"
author Pierre-Yves David <pierre-yves.david@octobus.net>
Wed, 06 Dec 2023 16:29:43 +0100
branch stable
changeset 51166 849745d7da89
parent 51165 66417f55ea33
child 51167 b79f13d6ef25
revlog: avoid wrongly updating the data file location on "divert"

If we are in the inline case, we need to align the location of the "data" file with the temporary location of the file (i.e. "00changelog.i.a"). However, we should not do that for the non-inline case… and before this changeset we had been doing it. In addition, `index_file` is already a property that takes care of updating the "segment file" filename when needed, so we can simply remove all that code.

As a result of the bug, code trying to read the diverted data before it was committed ended up deeply confused, as the "00changelog.i.a" file is nothing like the "00changelog.d" file. However, nothing corrupted data, as all writes were properly handled outside of the "segment file".

In the "best" cases, this small in-memory corruption of the filename went unnoticed until the transaction was committed or rolled back. In the worst case, some data reads failed during the transaction, causing the transaction to be rolled back. However, wrong data never reached the disk, so this bug should not have corrupted any repository.

This is not caught by tests because most tests use a small repository and therefore an inline revlog. In addition, the bug only triggers when a changelog read is done in the following "rare" situation:

- after some delayed write
- after that data has been written to a "divert" file (i.e. `00changelog.i.a`)
- before transaction commit
- outside of a "writing" context

The issue was introduced in d83d788590a8
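To illustrate the failure mode described above, here is a minimal, self-contained sketch. It is not the Mercurial implementation: `ToyRevlog`, `SegmentFile`, `divert_buggy` and `divert_fixed` are hypothetical names, and the property setter only assumes the behaviour the message describes (the "segment file" filename follows `index_file` in the inline case only).

```python
class SegmentFile:
    """Hypothetical stand-in for the object that reads revision data."""

    def __init__(self, filename):
        self.filename = filename


class ToyRevlog:
    """Toy model of the index/data file bookkeeping (an assumption,
    not the real mercurial.revlog.revlog class)."""

    def __init__(self, inline):
        self._inline = inline
        self._index_file = "00changelog.i"
        # Inline revlogs keep data in the index file; others use a ".d" file.
        data_file = self._index_file if inline else "00changelog.d"
        self._segmentfile = SegmentFile(data_file)

    @property
    def index_file(self):
        return self._index_file

    @index_file.setter
    def index_file(self, new_index_file):
        self._index_file = new_index_file
        # Only the inline case needs the data location to follow the index.
        if self._inline:
            self._segmentfile.filename = new_index_file

    def divert_buggy(self):
        self.index_file = "00changelog.i.a"
        # The kind of unconditional assignment this changeset removes:
        # it also redirects data reads for non-inline revlogs.
        self._segmentfile.filename = self.index_file

    def divert_fixed(self):
        # The property setter alone does the right thing in both cases.
        self.index_file = "00changelog.i.a"
```

In this sketch, a non-inline `ToyRevlog` keeps reading data from `00changelog.d` after `divert_fixed()`, whereas `divert_buggy()` points it at `00changelog.i.a`, which mirrors the confusion the message reports for reads made before the transaction is committed.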
mercurial/revlog.py
--- a/mercurial/revlog.py	Thu Dec 07 00:34:08 2023 +0100
+++ b/mercurial/revlog.py	Wed Dec 06 16:29:43 2023 +0100
@@ -1166,7 +1166,6 @@
         elif len(self.index) == 0:
             self._orig_index_file = self.index_file
             self.index_file = self._divert_index()
-            self._segmentfile.filename = self.index_file
             assert self._orig_index_file is not None
             assert self.index_file is not None
             if self.opener.exists(self.index_file):
@@ -1202,7 +1201,6 @@
             assert self._segmentfile._delay_buffer is None
         self._orig_index_file = self.index_file
         self.index_file = pending_index_file
-        self._segmentfile.filename = self.index_file
         return self.index_file, any_pending
 
     def finalize_pending(self):
@@ -1228,7 +1226,6 @@
                 )
             self.index_file = self._orig_index_file
             self._orig_index_file = None
-            self._segmentfile.filename = self.index_file
         else:
             msg = b"not delay or divert found on this revlog"
             raise error.ProgrammingError(msg)