mail: always fall back to iso-8859-1 if us-ascii won't work (BC)
author Augie Fackler <augie@google.com>
Thu, 09 Aug 2018 21:04:15 -0400
changeset 39037 ede768cfe83e
parent 39036 598467614c15
child 39038 2161faf0d24b
It looks like this was a well-intentioned backwards compat hack for previewing the output of `hg email` in a stable way. Unfortunately I think this hack's time has come, because Python 3 does a much better job of ensuring it actually emits *valid* email messages. In particular, Python 2 would blindly trust us that the bytes we handed it were valid for the encoding we claimed, but Python 3 has some more sniff-tests that we end up failing.

As a result, if we're going to print an email to the terminal, try us-ascii first, but if that fails go straight to iso-8859-1, which should be reasonably readable for ascii-compatible patch bodies. This *will* be a breaking change for ascii-incompatible textual patch content, but I don't think that's avoidable if we want to continue using the email library from the stdlib.

.. bc::

   Emails from the patchbomb extension will always be printed as though they are iso-8859-1 if they're not valid us-ascii. Previously, previewed emails were always claimed to be us-ascii and might contain invalid byte sequences.

Differential Revision: https://phab.mercurial-scm.org/D4231
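
The fallback order described in the commit message can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in mercurial/mail.py: the function name `pick_display_charset` and the shortened candidate list are assumptions made for the example (the real list also consults Mercurial's configured encodings).

```python
def pick_display_charset(body: bytes, display: bool) -> str:
    """Return the charset label to claim for `body` (hypothetical helper)."""
    # Assumption: a shortened candidate list. When the mail is only being
    # displayed, us-ascii is the only candidate tried before the fallback.
    candidates = ["us-ascii"] if display else ["us-ascii", "utf-8"]
    for charset in candidates:
        try:
            body.decode(charset)
            return charset
        except UnicodeDecodeError:
            continue
    # Every byte sequence decodes as iso-8859-1, so this label is always
    # valid for the bytes, even if the rendered text is not what was meant.
    return "iso-8859-1"


print(pick_display_charset(b"plain ascii", display=True))
print(pick_display_charset("h\u00f6mma!".encode("utf-8"), display=True))
```

Because iso-8859-1 assigns a character to every byte value, the claimed charset can never be invalid for the payload, which is the property the stricter Python 3 email library appears to need.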
mercurial/mail.py
tests/test-patchbomb.t
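
The test expectations in tests/test-patchbomb.t move from raw 8-bit bytes to quoted-printable escapes. The sketch below, which uses only the Python 3 standard library and is not taken from Mercurial, shows how the same bytes look once they are labelled iso-8859-1. The helper name `preview_as_latin1` is hypothetical.

```python
import quopri
from email.mime.text import MIMEText


def preview_as_latin1(body: bytes) -> MIMEText:
    """Build a text/plain message that claims iso-8859-1 (hypothetical helper)."""
    # Decoding as iso-8859-1 never fails; UTF-8 input is simply treated as
    # one Latin-1 character per byte.
    text = body.decode("iso-8859-1")
    return MIMEText(text, "plain", "iso-8859-1")


utf8_body = b"h\xc3\xb6mma!"   # UTF-8 bytes from the "utf-8 content" patch
latin1_body = b"h\xf6mma!"     # bytes from the "isolatin 8-bit encoding" patch

msg = preview_as_latin1(utf8_body)
print(msg["Content-Transfer-Encoding"])
print(quopri.encodestring(utf8_body))
print(quopri.encodestring(latin1_body))
```

Each non-ASCII byte becomes an `=XX` escape, so the previewed message is pure ASCII on the wire whatever the original encoding of the patch was.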
--- a/mercurial/mail.py	Thu Aug 09 20:57:27 2018 -0400
+++ b/mercurial/mail.py	Thu Aug 09 21:04:15 2018 -0400
@@ -220,7 +220,7 @@
 
     cs = ['us-ascii', 'utf-8', encoding.encoding, encoding.fallbackencoding]
     if display:
-        return mimetextqp(s, subtype, 'us-ascii')
+        cs = ['us-ascii']
     for charset in cs:
         try:
             s.decode(pycompat.sysstr(charset))
--- a/tests/test-patchbomb.t	Thu Aug 09 20:57:27 2018 -0400
+++ b/tests/test-patchbomb.t	Thu Aug 09 21:04:15 2018 -0400
@@ -454,8 +454,8 @@
   
   displaying [PATCH] utf-8 content ...
   MIME-Version: 1.0
-  Content-Type: text/plain; charset="us-ascii"
-  Content-Transfer-Encoding: 8bit
+  Content-Type: text/plain; charset="iso-8859-1"
+  Content-Transfer-Encoding: quoted-printable
   Subject: [PATCH] utf-8 content
   X-Mercurial-Node: 909a00e13e9d78b575aeee23dddbada46d5a143f
   X-Mercurial-Series-Index: 1
@@ -487,7 +487,7 @@
   --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
   +++ b/utf	Thu Jan 01 00:00:04 1970 +0000
   @@ -0,0 +1,1 @@
-  +h\xc3\xb6mma! (esc)
+  +h=C3=B6mma!
   
 
 mime encoded mbox (base64):
@@ -2673,8 +2673,8 @@
   
   displaying [PATCH 2 of 6] utf-8 content ...
   MIME-Version: 1.0
-  Content-Type: text/plain; charset="us-ascii"
-  Content-Transfer-Encoding: 8bit
+  Content-Type: text/plain; charset="iso-8859-1"
+  Content-Transfer-Encoding: quoted-printable
   Subject: [PATCH 2 of 6] utf-8 content
   X-Mercurial-Node: 909a00e13e9d78b575aeee23dddbada46d5a143f
   X-Mercurial-Series-Index: 2
@@ -2707,7 +2707,7 @@
   --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
   +++ b/utf	Thu Jan 01 00:00:04 1970 +0000
   @@ -0,0 +1,1 @@
-  +h\xc3\xb6mma! (esc)
+  +h=C3=B6mma!
   
   displaying [PATCH 3 of 6] long line ...
   MIME-Version: 1.0
@@ -2758,8 +2758,8 @@
   
   displaying [PATCH 4 of 6] isolatin 8-bit encoding ...
   MIME-Version: 1.0
-  Content-Type: text/plain; charset="us-ascii"
-  Content-Transfer-Encoding: 8bit
+  Content-Type: text/plain; charset="iso-8859-1"
+  Content-Transfer-Encoding: quoted-printable
   Subject: [PATCH 4 of 6] isolatin 8-bit encoding
   X-Mercurial-Node: 240fb913fc1b7ff15ddb9f33e73d82bf5277c720
   X-Mercurial-Series-Index: 4
@@ -2785,7 +2785,7 @@
   --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
   +++ b/isolatin	Thu Jan 01 00:00:05 1970 +0000
   @@ -0,0 +1,1 @@
-  +h\xf6mma! (esc)
+  +h=F6mma!
   
   displaying [PATCH 5 of 6] Added tag zero, zero.foo for changeset 8580ff50825a ...
   MIME-Version: 1.0