highlight: add option to prevent content-only based fallback
When Mozilla enabled Pygments on hg.mozilla.org, we got a lot of weirdly
colorized files. Upon further investigation, the hightlight extension
is first attempting a filename+content based match then falling back to a
purely content-driven detection mode in Pygments. Sounds good in theory.
Unfortunately, Pygments' content-driven detection establishes no minimum
threshold for returning a lexer. Furthermore, the detection code for
a number of languages is very liberal. For example, ActionScript 3 will
return a confidence of 0.3 (out of 1.0) if the first 1k of the file
we pass in matches the regex "\w+\s*:\s*\w"! Python matches on
"import ". It's no coincidence that a number of our extension-less files
were getting highlighted improperly.
This patch adds an option to have the highlighter not fall back to
purely content-based detection when filename+content detection failed.
This can be enabled to render unlighted text instead of taking the risk
that unknown file types are highlighted incorrectly. The old behavior is
still the default.
#!/bin/sh
#
# This is an example of using HGEDITOR to create of diff to review the
# changes while commiting.
# If you want to pass your favourite editor some other parameters
# only for Mercurial, modify this:
case "${EDITOR}" in
"")
EDITOR="vi"
;;
emacs)
EDITOR="$EDITOR -nw"
;;
gvim|vim)
EDITOR="$EDITOR -f -o"
;;
esac
HGTMP=""
cleanup_exit() {
rm -rf "$HGTMP"
}
# Remove temporary files even if we get interrupted
trap "cleanup_exit" 0 # normal exit
trap "exit 255" HUP INT QUIT ABRT TERM
HGTMP=$(mktemp -d ${TMPDIR-/tmp}/hgeditor.XXXXXX)
[ x$HGTMP != x -a -d $HGTMP ] || {
echo "Could not create temporary directory! Exiting." 1>&2
exit 1
}
(
grep '^HG: changed' "$1" | cut -b 13- | while read changed; do
"$HG" diff "$changed" >> "$HGTMP/diff"
done
)
cat "$1" > "$HGTMP/msg"
MD5=$(which md5sum 2>/dev/null) || \
MD5=$(which md5 2>/dev/null)
[ -x "${MD5}" ] && CHECKSUM=`${MD5} "$HGTMP/msg"`
if [ -s "$HGTMP/diff" ]; then
$EDITOR "$HGTMP/msg" "$HGTMP/diff" || exit $?
else
$EDITOR "$HGTMP/msg" || exit $?
fi
[ -x "${MD5}" ] && (echo "$CHECKSUM" | ${MD5} -c >/dev/null 2>&1 && exit 13)
mv "$HGTMP/msg" "$1"
exit $?