tests/test-largefiles-small-disk.t
author Martin von Zweigbergk <martinvonz@google.com>
Fri, 05 May 2017 08:49:07 -0700
changeset 32176 cf042543afa2
parent 31207 1ef37b16b8e8
child 33962 991134261057
permissions -rw-r--r--
match: optimize visitdir() for patterns matching only root directory

_rootsanddirs() returns a list of directories to visit recursively and a list of directories to visit non-recursively. For patterns such as 'rootfilesin:foo/bar', we clearly need to visit the directory foo/bar, but we also need to visit its parents. The method therefore uses util.dirs() to find the parent directories of 'foo/bar'. That method does not include the root directory, but since we obviously need to visit the root directory, we always added '.' to the set of directories to visit non-recursively.

The visitdir() method had special handling to consider set(['.']) to mean that no includes had been specified and would thus visit all directories. However, when the pattern is 'rootfilesin:.', set(['.']) is actually the real set of directories to visit, and the special handling of that set meant that all directories got visited instead of just the root directory.

The fix is simple: add '.' to the set of parent directories in _rootsanddirs() and stop treating set(['.']) specially.

This makes hg files -r . -I rootfilesin:. in a treemanifest version of the Firefox repo go from 1.5s to 0.26s on warm disk (and a *much* bigger improvement on cold disk). Note that the -I is necessary for no good reason. We just haven't optimized visitdir() for regular (non-include, non-exclude) patterns yet.
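
The effect of the special case described in the changeset message can be illustrated with a small standalone sketch. It does not use Mercurial's matcher classes: `parent_dirs`, `dirs_to_visit`, `visitdir_old`, and `visitdir_new` are hypothetical names, and only the non-recursive directory set is modelled.

```python
def parent_dirs(path):
    """Proper parent directories of path, excluding the root.

    Hypothetical stand-in for what the message says util.dirs() provides.
    """
    parts = path.split('/')
    return {'/'.join(parts[:i]) for i in range(1, len(parts))}


def dirs_to_visit(root):
    """Directories to visit non-recursively for a 'rootfilesin:<root>' pattern.

    The root directory '.' is always added, as the message describes.
    """
    return {root} | parent_dirs(root) | {'.'}


def visitdir_old(dirs, d):
    # Old behaviour (simplified): {'.'} was read as "no includes given",
    # so every directory was visited.
    if dirs == {'.'}:
        return True
    return d in dirs


def visitdir_new(dirs, d):
    # New behaviour (simplified): no special case for {'.'}.
    return d in dirs
```

With `dirs = dirs_to_visit('.')`, the old check would still descend into a directory such as `foo`, while the new check visits only `.`. The real visitdir() handles more cases than this sketch covers.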

Test how largefiles abort in case the disk runs full

  $ cat > criple.py <<EOF
  > import os, errno, shutil
  > from mercurial import util
  > #
  > # this makes the original largefiles code abort:
  > _origcopyfileobj = shutil.copyfileobj
  > def copyfileobj(fsrc, fdst, length=16*1024):
  >     # allow journal files (used by transaction) to be written
  >     if 'journal.' in fdst.name:
  >         return _origcopyfileobj(fsrc, fdst, length)
  >     fdst.write(fsrc.read(4))
  >     raise IOError(errno.ENOSPC, os.strerror(errno.ENOSPC))
  > shutil.copyfileobj = copyfileobj
  > #
  > # this makes the rewritten code abort:
  > def filechunkiter(f, size=131072, limit=None):
  >     yield f.read(4)
  >     raise IOError(errno.ENOSPC, os.strerror(errno.ENOSPC))
  > util.filechunkiter = filechunkiter
  > #
  > def oslink(src, dest):
  >     raise OSError("no hardlinks, try copying instead")
  > util.oslink = oslink
  > EOF

  $ echo "[extensions]" >> $HGRCPATH
  $ echo "largefiles =" >> $HGRCPATH

  $ hg init alice
  $ cd alice
  $ echo "this is a very big file" > big
  $ hg add --large big
  $ hg commit --config extensions.criple=$TESTTMP/criple.py -m big
  abort: No space left on device
  [255]

The largefile is not created in .hg/largefiles:

  $ ls .hg/largefiles
  dirstate

The user cache is not even created:

  >>> import os; os.path.exists("$HOME/.cache/largefiles/")
  False

Make the commit with space on the device:

  $ hg commit -m big

Now make a clone with a full disk, and make sure the lfutil.link function
makes copies instead of hardlinks:

  $ cd ..
  $ hg --config extensions.criple=$TESTTMP/criple.py clone --pull alice bob
  requesting all changes
  adding changesets
  adding manifests
  adding file changes
  added 1 changesets with 1 changes to 1 files
  updating to branch default
  getting changed largefiles
  abort: No space left on device
  [255]

The largefile is not created in .hg/largefiles:

  $ ls bob/.hg/largefiles
  dirstate
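
The hardlink-to-copy fallback that the clone step exercises can be sketched as follows. This is not the code of lfutil.link, only an illustration of the general pattern under that assumption; `link_or_copy`, `no_hardlinks`, and `demo` are hypothetical names.

```python
import os
import shutil
import tempfile


def link_or_copy(src, dest, link=os.link):
    """Try to hardlink src to dest; fall back to a plain copy on OSError."""
    try:
        link(src, dest)
    except OSError:
        shutil.copyfile(src, dest)


def no_hardlinks(src, dest):
    # Mirrors the oslink replacement in criple.py: hardlinking always fails.
    raise OSError("no hardlinks, try copying instead")


def demo():
    """Return (content of dest, whether dest is the same file as src)."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "big")
        dest = os.path.join(tmp, "big.copy")
        with open(src, "w") as f:
            f.write("this is a very big file\n")
        link_or_copy(src, dest, link=no_hardlinks)
        with open(dest) as f:
            content = f.read()
        return content, os.path.samefile(src, dest)
```

Because the copy path is forced here, the copy step is the one that hits the simulated ENOSPC error in the test, and the test then checks that no partial largefile is left behind.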