contrib/byteify-strings.py
author Yuya Nishihara <yuya@tcha.org>
Thu, 31 May 2018 22:23:30 +0900
changeset 38385 a2976c27dac4
parent 38384 1d9c97db465f
child 38386 9f42e4a83676
permissions -rwxr-xr-x
byteify-strings: add basic command interface
#!/usr/bin/env python3
#
# byteify-strings.py - transform string literals to be Python 3 safe
#
# Copyright 2015 Gregory Szorc <gregory.szorc@gmail.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

import argparse
import io
import sys
import token
import tokenize

if True:
    def replacetokens(tokens, fullname):
        """Transform a stream of tokens from raw to Python 3.

        Returns a generator of possibly rewritten tokens.

        The input token list may be mutated as part of processing. However,
        its changes do not necessarily match the output token stream.
        """
        futureimpline = False

        # The following utility functions access the tokens list and i index of
        # the ``for i, t in enumerate(tokens)`` loop below.
        def _isop(j, *o):
            """Test whether tokens[j] is an OP with one of the given values"""
            try:
                return tokens[j].type == token.OP and tokens[j].string in o
            except IndexError:
                return False

        def _findargnofcall(n):
            """Find arg n of a call expression (start at 0)

            Returns the index of the first token of that argument, or None if
            there are not that many arguments.

            Assumes that tokens[i + 1] is '('.

            """
            nested = 0
            for j in range(i + 2, len(tokens)):
                if _isop(j, ')', ']', '}'):
                    # end of call, tuple, subscription or dict / set
                    nested -= 1
                    if nested < 0:
                        return None
                elif n == 0:
                    # this is the starting position of arg
                    return j
                elif _isop(j, '(', '[', '{'):
                    nested += 1
                elif _isop(j, ',') and nested == 0:
                    n -= 1

            return None

        def _ensureunicode(j):
            """Make sure the token at j is a unicode string

            This rewrites a string token to include the unicode literal prefix
            so the string transformer won't add the byte prefix.

            Ignores tokens that are not strings. Assumes bounds checking has
            already been done.

            """
            st = tokens[j]
            if st.type == token.STRING and st.string.startswith(("'", '"')):
                tokens[j] = st._replace(string='u%s' % st.string)

        for i, t in enumerate(tokens):
            # Convert most string literals to byte literals. String literals
            # in Python 2 are bytes. String literals in Python 3 are unicode.
            # Most strings in Mercurial are bytes and unicode strings are rare.
            # Rather than rewrite all string literals to use ``b''`` to indicate
            # byte strings, we apply this token transformer to insert the ``b``
            # prefix nearly everywhere.
            if t.type == token.STRING:
                s = t.string

                # Preserve docstrings as string literals. This is inconsistent
                # with regular unprefixed strings. However, the
                # "from __future__" parsing (which allows a module docstring to
                # exist before it) doesn't properly handle the docstring if it
                # is b''' prefixed, leading to a SyntaxError. We leave all
                # docstrings as unprefixed to avoid this. This means Mercurial
                # components touching docstrings need to handle unicode,
                # unfortunately.
                if s[0:3] in ("'''", '"""'):
                    yield t
                    continue

                # If the first character isn't a quote, it is likely a string
                # prefixing character (such as 'b', 'u', or 'r'). Ignore.
                if s[0] not in ("'", '"'):
                    yield t
                    continue

                # String literal. Prefix to make a b'' string.
                yield t._replace(string='b%s' % t.string)
                continue

            # Insert compatibility imports at "from __future__ import" line.
            # No '\n' should be added to preserve line numbers.
            if (t.type == token.NAME and t.string == 'import' and
                all(u.type == token.NAME for u in tokens[i - 2:i]) and
                [u.string for u in tokens[i - 2:i]] == ['from', '__future__']):
                futureimpline = True
            if t.type == token.NEWLINE and futureimpline:
                futureimpline = False
                if fullname == 'mercurial.pycompat':
                    yield t
                    continue
                r, c = t.start
                l = (b'; from mercurial.pycompat import '
                     b'delattr, getattr, hasattr, setattr, xrange, '
                     b'open, unicode\n')
                for u in tokenize.tokenize(io.BytesIO(l).readline):
                    if u.type in (tokenize.ENCODING, token.ENDMARKER):
                        continue
                    yield u._replace(
                        start=(r, c + u.start[1]), end=(r, c + u.end[1]))
                continue

            # This looks like a function call.
            if t.type == token.NAME and _isop(i + 1, '('):
                fn = t.string

                # *attr() builtins reject byte strings as the 2nd argument.
                if (fn in ('getattr', 'setattr', 'hasattr', 'safehasattr') and
                        not _isop(i - 1, '.')):
                    arg1idx = _findargnofcall(1)
                    if arg1idx is not None:
                        _ensureunicode(arg1idx)

                # .encode() and .decode() on str/bytes/unicode don't accept
                # byte strings on Python 3.
                elif fn in ('encode', 'decode') and _isop(i - 1, '.'):
                    for argn in range(2):
                        argidx = _findargnofcall(argn)
                        if argidx is not None:
                            _ensureunicode(argidx)

                # Rewrite iteritems/itervalues as items/values, since the
                # iter* methods do not exist in Python 3.
                elif fn in ('iteritems', 'itervalues'):
                    yield t._replace(string=fn[4:])
                    continue

            # Emit unmodified token.
            yield t

def process(fin, fout):
    tokens = tokenize.tokenize(fin.readline)
    tokens = replacetokens(list(tokens), fullname='<dummy>')
    fout.write(tokenize.untokenize(tokens))

def main():
    ap = argparse.ArgumentParser()
    ap.add_argument('files', metavar='FILE', nargs='+', help='source file')
    args = ap.parse_args()
    for fname in args.files:
        with open(fname, 'rb') as fin:
            fout = sys.stdout.buffer
            process(fin, fout)

if __name__ == '__main__':
    main()
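
The core of the script is the tokenize / rewrite / untokenize round trip used by process(). The sketch below is a reduced illustration of that round trip only, not the script's own code: the helper name byteify is hypothetical, and it covers just the "add a b prefix to unprefixed, non-triple-quoted string literals" rule, leaving out the pycompat import injection and the getattr/encode/iteritems handling.

```python
import io
import token
import tokenize


def byteify(source):
    """Return `source` (bytes) with a b prefix on plain string literals.

    Hypothetical helper for illustration; mirrors only the STRING branch
    of replacetokens().
    """
    out = []
    for t in tokenize.tokenize(io.BytesIO(source).readline):
        if (t.type == token.STRING
                and t.string[0] in ("'", '"')              # no existing prefix
                and t.string[:3] not in ("'''", '"""')):   # keep docstrings
            # Token positions are left as they were; untokenize() derives
            # the spacing between tokens from those original positions.
            t = t._replace(string='b' + t.string)
        out.append(t)
    # The leading ENCODING token makes untokenize() return bytes.
    return tokenize.untokenize(out)


if __name__ == '__main__':
    print(byteify(b"x = 'a'\n"))
```

Strings that already carry a prefix (such as u'...') and triple-quoted strings pass through unchanged, which matches the two early "yield t" exits in the loop above.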