verify: allow the storage to signal when renames can be tested on `skipread`
authorMatt Harbison <matt_harbison@yahoo.com>
Mon, 23 Dec 2019 01:12:20 -0500
changeset 44073 b9e174d4ed11
parent 44072 1a6dd50cd0db
child 44074 806d14efec8d
verify: allow the storage to signal when renames can be tested on `skipread` This applies the new marker in the lfs handler to show it in action, and adds the test mentioned at the beginning of the series to show that fulltext isn't necessary in the LFS case. The existing `skipread` isn't enough, because it is also set if an error occurs reading the revlog data, or the data is censored. It could probably be cleared, but then it technically violates the interface contract. That wouldn't matter for the existing verify algorithm, but it isn't clear how that will change as alternate storage support is added. The flag is probably pretty revlog specific, given the comments in verify.py. But there's already filelog specific stuff in there and I'm not sure what future storage will bring, so I don't want to over-engineer this. Likewise, I'm not sure that we want the verify method for each storage type to completely drive the bus when it comes to detecting renames, so I don't want to go down the rabbithole of having verifyintegrity() return metadata hints at this point. Differential Revision: https://phab.mercurial-scm.org/D7713
hgext/lfs/wrapper.py
mercurial/interfaces/repository.py
mercurial/revlog.py
mercurial/verify.py
tests/test-lfs.t
--- a/hgext/lfs/wrapper.py	Sun Dec 22 23:50:19 2019 -0500
+++ b/hgext/lfs/wrapper.py	Mon Dec 23 01:12:20 2019 -0500
@@ -236,6 +236,10 @@
         # the revlog.
         if rl.opener.lfslocalblobstore.has(metadata.oid()):
             skipflags &= ~revlog.REVIDX_EXTSTORED
+        elif skipflags & revlog.REVIDX_EXTSTORED:
+            # The wrapped method will set `skipread`, but there's enough local
+            # info to check renames.
+            state[b'safe_renamed'].add(node)
 
     orig(rl, skipflags, state, node)
 
--- a/mercurial/interfaces/repository.py	Sun Dec 22 23:50:19 2019 -0500
+++ b/mercurial/interfaces/repository.py	Mon Dec 23 01:12:20 2019 -0500
@@ -878,7 +878,9 @@
 
         If individual revisions cannot have their revision content resolved,
         the method is expected to set the ``skipread`` key to a set of nodes
-        that encountered problems.
+        that encountered problems.  If set, the method can also add the node(s)
+        to ``safe_renamed`` in order to indicate nodes that may perform the
+        rename checks with currently accessible data.
 
         The method yields objects conforming to the ``iverifyproblem``
         interface.
--- a/mercurial/revlog.py	Sun Dec 22 23:50:19 2019 -0500
+++ b/mercurial/revlog.py	Mon Dec 23 01:12:20 2019 -0500
@@ -2874,6 +2874,7 @@
             )
 
         state[b'skipread'] = set()
+        state[b'safe_renamed'] = set()
 
         for rev in self:
             node = self.node(rev)
--- a/mercurial/verify.py	Sun Dec 22 23:50:19 2019 -0500
+++ b/mercurial/verify.py	Mon Dec 23 01:12:20 2019 -0500
@@ -529,6 +529,8 @@
             else:
                 # Guard against implementations not setting this.
                 state[b'skipread'] = set()
+                state[b'safe_renamed'] = set()
+
                 for problem in fl.verifyintegrity(state):
                     if problem.node is not None:
                         linkrev = fl.linkrev(fl.rev(problem.node))
@@ -560,7 +562,7 @@
                     else:
                         del filenodes[f][n]
 
-                if n in state[b'skipread']:
+                if n in state[b'skipread'] and n not in state[b'safe_renamed']:
                     continue
 
                 # check renames
--- a/tests/test-lfs.t	Sun Dec 22 23:50:19 2019 -0500
+++ b/tests/test-lfs.t	Mon Dec 23 01:12:20 2019 -0500
@@ -218,6 +218,15 @@
   R large
   $ hg commit -m 'renames'
 
+  $ hg cat -r . l -T '{rawdata}\n'
+  version https://git-lfs.github.com/spec/v1
+  oid sha256:66100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e
+  size 39
+  x-hg-copy large
+  x-hg-copyrev 2c531e0992ff3107c511b53cb82a91b6436de8b2
+  x-is-binary 0
+  
+
   $ hg files -r . 'set:copied()'
   l
   s
@@ -796,27 +805,57 @@
   $ test -f fromcorrupt/.hg/store/lfs/objects/66/100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e
   [1]
 
-Verify will not try to download lfs blobs, if told not to process lfs content
+Verify will not try to download lfs blobs, if told not to process lfs content.
+The extension makes sure that the filelog.renamed() path is taken on a missing
+blob, and the output shows that it isn't fetched.
 
-  $ hg -R fromcorrupt --config lfs.usercache=emptycache verify -v --no-lfs
+  $ cat > $TESTTMP/lfsrename.py <<EOF
+  > from mercurial import (
+  >     exthelper,
+  > )
+  > 
+  > from hgext.lfs import (
+  >     pointer,
+  >     wrapper,
+  > )
+  > 
+  > eh = exthelper.exthelper()
+  > uisetup = eh.finaluisetup
+  > 
+  > @eh.wrapfunction(wrapper, b'filelogrenamed')
+  > def filelogrenamed(orig, orig1, self, node):
+  >     ret = orig(orig1, self, node)
+  >     if wrapper._islfs(self._revlog, node) and ret:
+  >         rawtext = self._revlog.rawdata(node)
+  >         metadata = pointer.deserialize(rawtext)
+  >         print('lfs blob %s renamed %s -> %s'
+  >               % (metadata[b'oid'], ret[0], self._revlog.filename))
+  >     return ret
+  > EOF
+
+  $ hg -R fromcorrupt --config lfs.usercache=emptycache verify -v --no-lfs \
+  >                   --config extensions.x=$TESTTMP/lfsrename.py
   repository uses revlog format 1
   checking changesets
   checking manifests
   crosschecking files in changesets and manifests
   checking files
   lfs: found 22f66a3fc0b9bf3f012c814303995ec07099b3a9ce02a7af84b5970811074a3b in the local lfs store
+  lfs blob sha256:66100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e renamed large -> l
   checked 5 changesets with 10 changes to 4 files
 
 Verify will not try to download lfs blobs, if told not to by the config option
 
   $ hg -R fromcorrupt --config lfs.usercache=emptycache verify -v \
-  >                   --config verify.skipflags=8192
+  >                   --config verify.skipflags=8192 \
+  >                   --config extensions.x=$TESTTMP/lfsrename.py
   repository uses revlog format 1
   checking changesets
   checking manifests
   crosschecking files in changesets and manifests
   checking files
   lfs: found 22f66a3fc0b9bf3f012c814303995ec07099b3a9ce02a7af84b5970811074a3b in the local lfs store
+  lfs blob sha256:66100b384bf761271b407d79fc30cdd0554f3b2c5d944836e936d584b88ce88e renamed large -> l
   checked 5 changesets with 10 changes to 4 files
 
 Verify will copy/link all lfs objects into the local store that aren't already