arbitraryfilecontext: skip the cmp fast path if any side is a symlink
authorPhil Cohen <phillco@fb.com>
Tue, 17 Oct 2017 12:41:24 -0700
changeset 34835 14c87708f432
parent 34834 2e8477059d4f
child 34836 537de0b14030
arbitraryfilecontext: skip the cmp fast path if any side is a symlink `filecmp` follows symlinks by default, which a `filectx.cmp()` call should not be doing as it should only compare the requested entry. After this patch, only the contexts' data are compared, which is the correct contract. This is a corrected version of D1122. Differential Revision: https://phab.mercurial-scm.org/D1165
mercurial/context.py
tests/test-arbitraryfilectx.t
--- a/mercurial/context.py	Mon Sep 14 14:17:27 2015 -0400
+++ b/mercurial/context.py	Tue Oct 17 12:41:24 2017 -0700
@@ -2570,9 +2570,13 @@
         self._path = path
 
     def cmp(self, fctx):
-        if isinstance(fctx, workingfilectx) and self._repo:
+        # filecmp follows symlinks whereas `cmp` should not, so skip the fast
+        # path if either side is a symlink.
+        symlinks = ('l' in self.flags() or 'l' in fctx.flags())
+        if not symlinks and isinstance(fctx, workingfilectx) and self._repo:
             # Add a fast-path for merge if both sides are disk-backed.
-            # Note that filecmp uses the opposite return values as cmp.
+            # Note that filecmp uses the opposite return values (True if same)
+            # from our cmp functions (True if different).
             return not filecmp.cmp(self.path(), self._repo.wjoin(fctx.path()))
         return self.data() != fctx.data()
 
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tests/test-arbitraryfilectx.t	Tue Oct 17 12:41:24 2017 -0700
@@ -0,0 +1,57 @@
+Setup:
+  $ cat > eval.py <<EOF
+  > from __future__ import absolute_import
+  > import filecmp
+  > from mercurial import commands, context, registrar
+  > cmdtable = {}
+  > command = registrar.command(cmdtable)
+  > @command(b'eval', [], 'hg eval CMD')
+  > def eval_(ui, repo, *cmds, **opts):
+  >     cmd = " ".join(cmds)
+  >     res = str(eval(cmd, globals(), locals()))
+  >     ui.warn("%s" % res)
+  > EOF
+
+  $ echo "[extensions]" >> $HGRCPATH
+  $ echo "eval=`pwd`/eval.py" >> $HGRCPATH
+
+Arbitraryfilectx.cmp does not follow symlinks:
+  $ mkdir case1
+  $ cd case1
+  $ hg init
+  $ printf "A" > real_A
+  $ printf "foo" > A
+  $ printf "foo" > B
+  $ ln -s A sym_A
+  $ hg add .
+  adding A
+  adding B
+  adding real_A
+  adding sym_A
+  $ hg commit -m "base"
+
+These files are different and should return True (different):
+(Note that filecmp.cmp's return semantics are inverted from ours, so we invert
+for simplicity):
+  $ hg eval "context.arbitraryfilectx('A', repo).cmp(repo[None]['real_A'])"
+  True (no-eol)
+  $ hg eval "not filecmp.cmp('A', 'real_A')"
+  True (no-eol)
+
+These files are identical and should return False (same):
+  $ hg eval "context.arbitraryfilectx('A', repo).cmp(repo[None]['A'])"
+  False (no-eol)
+  $ hg eval "context.arbitraryfilectx('A', repo).cmp(repo[None]['B'])"
+  False (no-eol)
+  $ hg eval "not filecmp.cmp('A', 'B')"
+  False (no-eol)
+
+This comparison should also return False, since A and sym_A are substantially
+the same in the eyes of ``filectx.cmp``, which looks at data only.
+  $ hg eval "context.arbitraryfilectx('real_A', repo).cmp(repo[None]['sym_A'])"
+  False (no-eol)
+
+A naive use of filecmp on those two would wrongly return True, since it follows
+the symlink to "A", which has different contents.
+  $ hg eval "not filecmp.cmp('real_A', 'sym_A')"
+  True (no-eol)