match: strengthen visit_children_set invariant, Recursive means "all files"
My previous interpretation of "Recursive" was too relaxed: I thought it
instructed the caller to do something like this:
> you can stop calling `visit_children_set` because you'll need to descend into
> every directory recursively, but you should still check whether each file
> matches
Whereas the real instruction seems to be:
> I guarantee that everything in this subtree matches, you can stop
> querying the matcher for all files and dirs altogether.
The evidence to support this:
- the test actually passes with the stronger invariant, revealing no
  exceptions to this rule
- the implementation of `visit_children_set` for `DifferenceMatcher`
  clearly relies on this requirement, so it must hold or that
  implementation would be buggy.
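
To illustrate the stronger reading, here is a minimal Python sketch. It is
not the actual implementation: `Visit`, `PrefixMatcher`, `walk` and the
nested-dict tree are all hypothetical stand-ins, and only the name
`visit_children_set` and the "Recursive" answer come from the change itself.
The point is that once a directory answers "Recursive", the walker stops
querying the matcher for anything below it, files included.

```python
from enum import Enum


class Visit(Enum):
    # Hypothetical stand-ins for the answers a matcher can give.
    EMPTY = "empty"          # nothing below this directory matches
    THIS = "this"            # keep asking the matcher about each child
    RECURSIVE = "recursive"  # everything below this directory matches


class PrefixMatcher:
    """Toy matcher (an assumption, not Mercurial's): matches one subtree."""

    def __init__(self, prefix):
        self.prefix = prefix
        self.file_queries = 0  # counts calls to matches()

    def visit_children_set(self, directory):
        if directory == self.prefix or directory.startswith(self.prefix + "/"):
            return Visit.RECURSIVE
        if directory == "" or self.prefix.startswith(directory + "/"):
            return Visit.THIS
        return Visit.EMPTY

    def matches(self, path):
        self.file_queries += 1
        return path.startswith(self.prefix + "/")


def walk(tree, matcher, directory="", trusted=False):
    """Yield matching file paths from a nested dict (None marks a file)."""
    visit = Visit.RECURSIVE if trusted else matcher.visit_children_set(directory)
    if visit is Visit.EMPTY:
        return
    trusted = visit is Visit.RECURSIVE
    for name, child in sorted(tree.items()):
        path = f"{directory}/{name}" if directory else name
        if child is None:
            # Under RECURSIVE the file is known to match: no matcher query.
            if trusted or matcher.matches(path):
                yield path
        else:
            yield from walk(child, matcher, path, trusted)


tree = {
    "README": None,
    "src": {"a.rs": None, "sub": {"b.rs": None}},
    "doc": {"x": None},
}
matcher = PrefixMatcher("src")
found = list(walk(tree, matcher))
```

In this sketch the matcher is asked about exactly one file (`README`, at the
root); nothing under `src` triggers a `matches` call. That shortcut is only
safe if "Recursive" really guarantees that the whole subtree matches.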
#require test-repo hg10
$ . "$TESTDIR/helpers-testrepo.sh"
$ cat > $TESTTMP/check_ascii.py <<EOF
> import sys
> for file_path in sys.argv[1:]:
>     with open(file_path, 'br') as f:
>         try:
>             f.read().decode('ascii', 'strict')
>         except UnicodeDecodeError as exc:
>             print('%s: %s' % (file_path, exc))
> EOF
There are some web servers in the wild that can serve static files with an
incorrect encoding (e.g. https://bz.mercurial-scm.org/show_bug.cgi?id=6559).
One way to prevent any issues is to avoid non-ASCII characters entirely, e.g.
by URL-encoding them or using HTML entities.
check charset of all tracked files ending in .js
$ cd "`dirname "$TESTDIR"`"
$ testrepohg locate 'set:**.js' \
> 2>/dev/null \
> | xargs "$PYTHON" $TESTTMP/check_ascii.py