contrib/hgfixes/fix_bytes.py
author Renato Cunha <renatoc@gmail.com>
Tue, 03 Aug 2010 13:52:48 -0300
changeset 11748 37a70a784397
parent 11747 40d5633889bb
child 17299 e51d4aedace9
permissions -rw-r--r--
py3kcompat: added a "compatibility layer" for py3k This patch adds some ugly constructs. The first of them is bytesformatter, a function that formats strings like when '%' is called. The main motivation for this function is py3k's strange behavior: >>> 'foo %s' % b'bar' "foo b'bar'" >>> b'foo %s' % b'bar' Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unsupported operand type(s) for %: 'bytes' and 'bytes' >>> b'foo %s' % 'bar' Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unsupported operand type(s) for %: 'bytes' and 'str' In other words, if we can't format bytes with bytes, and recall that all mercurial strings will be converted by a fixer, then things will break badly if we don't take a similar approach. The other addition with this patch is that the os.environ dictionary is monkeypatched to have bytes items. Hopefully this won't be needed in the future, as python 3.2 might get a os.environb dictionary that holds bytes items.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
11747
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
     1
"""Fixer that changes plain strings to bytes strings."""
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
     2
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
     3
import re
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
     4
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
     5
from lib2to3 import fixer_base
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
     6
from lib2to3.pgen2 import token
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
     7
from lib2to3.fixer_util import Name
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
     8
from lib2to3.pygram import python_symbols as syms
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
     9
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    10
_re = re.compile(r'[rR]?[\'\"]')
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    11
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    12
# XXX: Implementing a blacklist in 2to3 turned out to be more troublesome than
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    13
# blacklisting some modules inside the fixers. So, this is what I came with.
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    14
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    15
blacklist = ['mercurial/demandimport.py',
11748
37a70a784397 py3kcompat: added a "compatibility layer" for py3k
Renato Cunha <renatoc@gmail.com>
parents: 11747
diff changeset
    16
             'mercurial/py3kcompat.py', # valid python 3 already
11747
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    17
             'mercurial/i18n.py',
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    18
            ]
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    19
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    20
def isdocstring(node):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    21
    def isclassorfunction(ancestor):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    22
        symbols = (syms.funcdef, syms.classdef)
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    23
        # if the current node is a child of a function definition, a class
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    24
        # definition or a file, then it is a docstring
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    25
        if ancestor.type == syms.simple_stmt:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    26
            try:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    27
                while True:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    28
                    if ancestor.type in symbols:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    29
                        return True
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    30
                    ancestor = ancestor.parent
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    31
            except AttributeError:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    32
                return False
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    33
        return False
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    34
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    35
    def ismodule(ancestor):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    36
        # Our child is a docstring if we are a simple statement, and our
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    37
        # ancestor is file_input. In other words, our child is a lone string in
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    38
        # the source file.
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    39
        try:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    40
            if (ancestor.type == syms.simple_stmt and
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    41
                ancestor.parent.type == syms.file_input):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    42
                    return True
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    43
        except AttributeError:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    44
            return False
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    45
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    46
    def isdocassignment(ancestor):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    47
        # Assigning to __doc__, definitely a string
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    48
        try:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    49
            while True:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    50
                if (ancestor.type == syms.expr_stmt and
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    51
                    Name('__doc__') in ancestor.children):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    52
                        return True
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    53
                ancestor = ancestor.parent
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    54
        except AttributeError:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    55
            return False
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    56
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    57
    if ismodule(node.parent) or \
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    58
       isdocassignment(node.parent) or \
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    59
       isclassorfunction(node.parent):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    60
        return True
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    61
    return False
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    62
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    63
def shouldtransform(node):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    64
    specialnames = ['__main__']
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    65
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    66
    if node.value in specialnames:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    67
        return False
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    68
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    69
    ggparent = node.parent.parent.parent
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    70
    sggparent = str(ggparent)
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    71
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    72
    if 'getattr' in sggparent or \
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    73
       'hasattr' in sggparent or \
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    74
       'setattr' in sggparent or \
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    75
       'encode' in sggparent or \
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    76
       'decode' in sggparent:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    77
           return False
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    78
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    79
    return True
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    80
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    81
class FixBytes(fixer_base.BaseFix):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    82
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    83
    PATTERN = 'STRING'
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    84
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    85
    def transform(self, node, results):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    86
        if self.filename in blacklist:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    87
            return
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    88
        if node.type == token.STRING:
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    89
            if _re.match(node.value):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    90
                if isdocstring(node):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    91
                    return
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    92
                if not shouldtransform(node):
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    93
                    return
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    94
                new = node.clone()
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    95
                new.value = 'b' + new.value
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    96
                return new
40d5633889bb hgfixes: add a fixer to convert plain strings to bytestrings
Renato Cunha <renatoc@gmail.com>
parents:
diff changeset
    97