parsers: fail fast if Python has wrong minor version (issue4110)
author Chris Jerdonek <chris.jerdonek@gmail.com>
Wed, 04 Dec 2013 20:38:27 -0800
changeset 20742 3681de20b0a7
parent 20741 f1dfef0a9352
child 20743 05267e6e94dd
This change causes an informative ImportError to be raised when importing the parsers extension module if the minor version of the currently-running Python interpreter doesn't match that of the Python used when compiling the extension module.

This change also exposes a parsers.versionerrortext constant in the C implementation of the module. Its presence can be used to determine whether this behavior is present in a version of the module. The value of the constant is the leading text of the ImportError raised and is set to "Python minor version mismatch".

Here is an example of what the new error looks like:

    Traceback (most recent call last):
      File "test.py", line 1, in <module>
        import mercurial.parsers
    ImportError: Python minor version mismatch: The Mercurial extension
    modules were compiled with Python 2.7.6, but Mercurial is currently using
    Python with sys.hexversion=33883888: Python 2.5.6 (r256:88840, Nov 18
    2012, 05:37:10)
    [GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))]
     at: /opt/local/Library/Frameworks/Python.framework/Versions/2.5/Resources/
    Python.app/Contents/MacOS/Python

The reason for raising an error in this scenario is that Python's C API is known not to be compatible from minor version to minor version, even if sys.api_version is the same. See for example this Python bug report about incompatibilities between 2.5 and 2.6+:

    http://bugs.python.org/issue8118

These incompatibilities can cause Mercurial to break in mysterious, unforeseen ways. For example, when Mercurial compiled with Python 2.7 was run with 2.5, the following crash occurred when running "hg status":

    http://bz.selenic.com/show_bug.cgi?id=4110

After this crash was fixed, running with Python 2.5 no longer crashes, but the following puzzling behavior still occurs:

    $ hg status
      ...
      File ".../mercurial/changelog.py", line 123, in __init__
        revlog.revlog.__init__(self, opener, "00changelog.i")
      File ".../mercurial/revlog.py", line 251, in __init__
        d = self._io.parseindex(i, self._inline)
      File ".../mercurial/revlog.py", line 158, in parseindex
        index, cache = parsers.parse_index2(data, inline)
    TypeError: data is not a string

which can be reproduced more simply with:

    import mercurial.parsers as parsers
    parsers.parse_index2("", True)

Both the crash and the TypeError occurred because the Python C API's PyString_Check() returns the wrong value when the C header files from Python 2.7 are run with Python 2.5. This is an example of an incompatibility of the sort mentioned in the Python bug report above.

Failing fast with an informative error message results in a better user experience in cases like the above. The information in the ImportError also simplifies troubleshooting for those on Mercurial mailing lists, the bug tracker, etc.

This patch only adds the version check to parsers.c, which is sufficient to affect command-line commands like "hg status" and "hg summary". An idea for a future improvement is to move the version-checking C code to a more central location, and have it run when importing all Mercurial extension modules and not just parsers.c.
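As a rough illustration of the comparison that the new check performs, the following Python sketch decodes a sys.hexversion-style integer and compares two values after dropping the low 16 bits, which is what the C expression `hexversion >> 16 != PY_VERSION_HEX >> 16` does. The helper names `decode_hexversion` and `same_minor_version` are hypothetical and not part of Mercurial, and the `compiled` value is an assumed stand-in for PY_VERSION_HEX of a Python 2.7.6 build.

```python
def decode_hexversion(hexversion):
    """Split a sys.hexversion-style integer into (major, minor, micro)."""
    major = (hexversion >> 24) & 0xFF
    minor = (hexversion >> 16) & 0xFF
    micro = (hexversion >> 8) & 0xFF
    return major, minor, micro


def same_minor_version(running, compiled):
    """Compare only the major and minor bytes, ignoring the low 16 bits
    (micro version, release level and serial)."""
    return (running >> 16) == (compiled >> 16)


# 33883888 is the sys.hexversion value quoted in the example error above.
running = 33883888
# Assumed value standing in for PY_VERSION_HEX (Python 2.7.6 final).
compiled = 0x020706F0

print(decode_hexversion(running))
print(decode_hexversion(compiled))
print(same_minor_version(running, compiled))
```

Because the micro version sits in the discarded bits, a difference such as 2.7.3 versus 2.7.6 would pass this comparison, while 2.5 versus 2.7 would not.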
mercurial/parsers.c
tests/test-parseindex2.py
--- a/mercurial/parsers.c	Fri Mar 14 16:00:11 2014 -0500
+++ b/mercurial/parsers.c	Wed Dec 04 20:38:27 2013 -0800
@@ -14,6 +14,8 @@
 
 #include "util.h"
 
+static char *versionerrortext = "Python minor version mismatch";
+
 static int8_t hextable[256] = {
 	-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
 	-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
@@ -1911,6 +1913,16 @@
 
 static void module_init(PyObject *mod)
 {
+	/* This module constant has two purposes.  First, it lets us unit test
+	 * the ImportError raised without hard-coding any error text.  This
+	 * means we can change the text in the future without breaking tests,
+	 * even across changesets without a recompile.  Second, its presence
+	 * can be used to determine whether the version-checking logic is
+	 * present, which also helps in testing across changesets without a
+	 * recompile.  Note that this means the pure-Python version of parsers
+	 * should not have this module constant. */
+	PyModule_AddStringConstant(mod, "versionerrortext", versionerrortext);
+
 	dirs_module_init(mod);
 
 	indexType.tp_new = PyType_GenericNew;
@@ -1928,6 +1940,24 @@
 	dirstate_unset = Py_BuildValue("ciii", 'n', 0, -1, -1);
 }
 
+static int check_python_version(void)
+{
+	PyObject *sys = PyImport_ImportModule("sys");
+	long hexversion = PyInt_AsLong(PyObject_GetAttrString(sys, "hexversion"));
+	/* sys.hexversion is a 32-bit number by default, so the -1 case
+	 * should only occur in unusual circumstances (e.g. if sys.hexversion
+	 * is manually set to an invalid value). */
+	if ((hexversion == -1) || (hexversion >> 16 != PY_VERSION_HEX >> 16)) {
+		PyErr_Format(PyExc_ImportError, "%s: The Mercurial extension "
+			"modules were compiled with Python " PY_VERSION ", but "
+			"Mercurial is currently using Python with sys.hexversion=%ld: "
+			"Python %s\n at: %s", versionerrortext, hexversion,
+			Py_GetVersion(), Py_GetProgramFullPath());
+		return -1;
+	}
+	return 0;
+}
+
 #ifdef IS_PY3K
 static struct PyModuleDef parsers_module = {
 	PyModuleDef_HEAD_INIT,
@@ -1939,6 +1969,8 @@
 
 PyMODINIT_FUNC PyInit_parsers(void)
 {
+	if (check_python_version() == -1)
+		return NULL;
 	PyObject *mod = PyModule_Create(&parsers_module);
 	module_init(mod);
 	return mod;
@@ -1946,6 +1978,8 @@
 #else
 PyMODINIT_FUNC initparsers(void)
 {
+	if (check_python_version() == -1)
+		return;
 	PyObject *mod = Py_InitModule3("parsers", methods, parsers_doc);
 	module_init(mod);
 }
--- a/tests/test-parseindex2.py	Fri Mar 14 16:00:11 2014 -0500
+++ b/tests/test-parseindex2.py	Wed Dec 04 20:38:27 2013 -0800
@@ -1,8 +1,13 @@
-"""This unit test tests parsers.parse_index2()."""
+"""This unit test primarily tests parsers.parse_index2().
+
+It also checks certain aspects of the parsers module as a whole.
+"""
 
 from mercurial import parsers
 from mercurial.node import nullid, nullrev
 import struct
+import subprocess
+import sys
 
 # original python implementation
 def gettype(q):
@@ -95,7 +100,70 @@
     index, chunkcache = parsers.parse_index2(data, inline)
     return list(index), chunkcache
 
+def importparsers(hexversion):
+    """Import mercurial.parsers with the given sys.hexversion."""
+    # The file parsers.c inspects sys.hexversion to determine the version
+    # of the currently-running Python interpreter, so we monkey-patch
+    # sys.hexversion to simulate using different versions.
+    code = ("import sys; sys.hexversion=%s; "
+            "import mercurial.parsers" % hexversion)
+    cmd = "python -c \"%s\"" % code
+    # We need to do these tests inside a subprocess because parsers.c's
+    # version-checking code happens inside the module init function, and
+    # when using reload() to reimport an extension module, "The init function
+    # of extension modules is not called a second time"
+    # (from http://docs.python.org/2/library/functions.html?#reload).
+    p = subprocess.Popen(cmd, shell=True,
+                         stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
+    return p.communicate()  # returns (stdout, None); stderr is merged into stdout
+
+def printhexfail(testnumber, hexversion, stdout, expected):
+    try:
+        hexstring = hex(hexversion)
+    except TypeError:
+        hexstring = None
+    print ("FAILED: version test #%s with Python %s and patched "
+           "sys.hexversion %r (%r):\n Expected %s but got:\n-->'%s'\n" %
+           (testnumber, sys.version_info, hexversion, hexstring, expected,
+            stdout))
+
+def testversionokay(testnumber, hexversion):
+    stdout, stderr = importparsers(hexversion)
+    if stdout:
+        printhexfail(testnumber, hexversion, stdout, expected="no stdout")
+
+def testversionfail(testnumber, hexversion):
+    stdout, stderr = importparsers(hexversion)
+    # We include versionerrortext to distinguish from other ImportErrors.
+    errtext = "ImportError: %s" % parsers.versionerrortext
+    if errtext not in stdout:
+        printhexfail(testnumber, hexversion, stdout,
+                     expected="stdout to contain %r" % errtext)
+
+def makehex(major, minor, micro):
+    return int("%x%02x%02x00" % (major, minor, micro), 16)
+
+def runversiontests():
+    """Check the version-detection logic when importing parsers."""
+    info = sys.version_info
+    major, minor, micro = info[0], info[1], info[2]
+    # Test same major-minor versions.
+    testversionokay(1, makehex(major, minor, micro))
+    testversionokay(2, makehex(major, minor, micro + 1))
+    # Test different major-minor versions.
+    testversionfail(3, makehex(major + 1, minor, micro))
+    testversionfail(4, makehex(major, minor + 1, micro))
+    testversionfail(5, "'foo'")
+
 def runtest() :
+    # Only test the version-detection logic if it is present.
+    try:
+        parsers.versionerrortext
+    except AttributeError:
+        pass
+    else:
+        runversiontests()
+
     # Check that parse_index2() raises TypeError on bad arguments.
     try:
         parse_index2(0, True)