fsmonitor: handle unicode keys in tuples
author Gregory Szorc <gregory.szorc@gmail.com>
date Sat, 02 Nov 2019 14:17:48 -0700
branch stable
changeset 43394 d359dfc15aca
parent 43393 bdebc7b54dca
child 43395 2b8be670dcb6
fsmonitor: handle unicode keys in tuples

In Python 3, keys in the bser tuple are typically str, not bytes. PyBytes_AsString() would return NULL for such keys. But we weren't checking the return value, and this would lead to a segfault.

This commit makes the code type and Python version aware. The Python version specific code is to allow us to utilize a modern API for converting str -> char* without having to allocate an extra PyObject.

FWIW I wanted to assume that keys were always str. However, there appear to be some bytes keys in some cases. I haven't debugged this further.

Differential Revision: https://phab.mercurial-scm.org/D7210
hgext/fsmonitor/pywatchman/bser.c
--- a/hgext/fsmonitor/pywatchman/bser.c	Sat Nov 02 13:39:23 2019 -0700
+++ b/hgext/fsmonitor/pywatchman/bser.c	Sat Nov 02 14:17:48 2019 -0700
@@ -175,7 +175,22 @@
     const char* item_name = NULL;
     PyObject* key = PyTuple_GET_ITEM(obj->keys, i);
 
-    item_name = PyBytes_AsString(key);
+    if (PyUnicode_Check(key)) {
+#if PY_MAJOR_VERSION >= 3
+      item_name = PyUnicode_AsUTF8(key);
+#else
+      PyObject* utf = PyUnicode_AsEncodedString(key, "utf-8", "ignore");
+      if (utf == NULL) {
+        goto bail;
+      }
+      item_name = PyBytes_AsString(utf);
+#endif
+    } else {
+      item_name = PyBytes_AsString(key);
+    }
+    if (item_name == NULL) {
+      goto bail;
+    }
     if (!strcmp(item_name, namestr)) {
       ret = PySequence_GetItem(obj->values, i);
       goto bail;
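
For illustration only, the pattern this change follows (dispatch on the key type, then check the conversion result before calling strcmp()) can be modelled in plain C without the Python headers. This is a minimal sketch, not the pywatchman code: model_key, model_bytes_as_string(), model_unicode_as_utf8() and find_key() are hypothetical stand-ins that only mimic the "return NULL on failure" behaviour of PyBytes_AsString() and PyUnicode_AsUTF8().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Python key object that is bytes or str. */
typedef enum { KEY_BYTES, KEY_UNICODE } key_kind;

typedef struct {
  key_kind kind;
  const char* data; /* NUL-terminated; NULL models a failed conversion */
} model_key;

/* Models PyBytes_AsString(): NULL when the key is not bytes. */
static const char* model_bytes_as_string(const model_key* key) {
  return key->kind == KEY_BYTES ? key->data : NULL;
}

/* Models PyUnicode_AsUTF8(): NULL when the key is not str. */
static const char* model_unicode_as_utf8(const model_key* key) {
  return key->kind == KEY_UNICODE ? key->data : NULL;
}

/* Returns the index of the key named `name`, or -1 if absent or on error. */
static int find_key(const model_key* keys, size_t nkeys, const char* name) {
  for (size_t i = 0; i < nkeys; i++) {
    /* Pick the conversion that matches the key type. */
    const char* item_name = keys[i].kind == KEY_UNICODE
                                ? model_unicode_as_utf8(&keys[i])
                                : model_bytes_as_string(&keys[i]);
    if (item_name == NULL) {
      return -1; /* bail out instead of handing NULL to strcmp() */
    }
    if (strcmp(item_name, name) == 0) {
      return (int)i;
    }
  }
  return -1;
}
```

Calling model_bytes_as_string() on a KEY_UNICODE key yields NULL, which is the situation the old code passed straight to strcmp(). With the check in place, find_key() reports an error instead of dereferencing a null pointer.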