worker: Use buffered input from the pickle stream stable
authorJan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Thu, 30 Jan 2020 19:16:12 +0100
branchstable
changeset 44226 cb52e619c99e
parent 44225 4aedef6d51ad
child 44227 7f5410dfc8a6
worker: Use buffered input from the pickle stream On Python 3, "pickle.load" will raise an exception ("_pickle.UnpicklingError: pickle data was truncated") when it gets a short read, i.e. it receives fewer bytes than it requested. On our build machine, Mercurial seems to frequently hit this problem while updating a mozilla-central clone iff it gets scheduled in batch mode. It is easy to trigger with: #wipe the workdir rm -rf * hg update null chrt -b 0 hg update default I've also written the following program, which demonstrates the core problem: from __future__ import print_function import io import os import pickle import time obj = {"a": 1, "b": 2} obj_data = pickle.dumps(obj) assert len(obj_data) > 10 rfd, wfd = os.pipe() pid = os.fork() if pid == 0: os.close(rfd) for _ in range(4): time.sleep(0.5) print("First write") os.write(wfd, obj_data[:10]) time.sleep(0.5) print("Second write") os.write(wfd, obj_data[10:]) os._exit(0) try: os.close(wfd) rfile = os.fdopen(rfd, "rb", 0) print("Reading") while True: try: obj_copy = pickle.load(rfile) assert obj == obj_copy except EOFError: break print("Success") finally: os.kill(pid, 15) The program reliably fails with Python 3.8 and succeeds with Python 2.7. Providing the unpickler with a buffered reader fixes the issue, so let "os.fdopen" create one. https://bugzilla.mozilla.org/show_bug.cgi?id=1604486 Differential Revision: https://phab.mercurial-scm.org/D8051
mercurial/worker.py
--- a/mercurial/worker.py	Sat Feb 01 01:32:28 2020 -0500
+++ b/mercurial/worker.py	Thu Jan 30 19:16:12 2020 +0100
@@ -226,7 +226,7 @@
     selector = selectors.DefaultSelector()
     for rfd, wfd in pipes:
         os.close(wfd)
-        selector.register(os.fdopen(rfd, 'rb', 0), selectors.EVENT_READ)
+        selector.register(os.fdopen(rfd, 'rb'), selectors.EVENT_READ)
 
     def cleanup():
         signal.signal(signal.SIGINT, oldhandler)