contrib/hgfixes/fix_bytes.py
author Gregory Szorc <gregory.szorc@gmail.com>
Thu, 03 Dec 2015 21:37:01 -0800
changeset 27220 4374d819ccd5
parent 21637 48ef68004ec9
permissions -rw-r--r--
mercurial: implement import hook for handling C/Python modules

There are a handful of modules that have both pure Python and C extension implementations. Currently, setup.py copies files from mercurial/pure/*.py to mercurial/ during the install process if C extensions are not available. This way, "import mercurial.X" will work whether C extensions are available or not.

This approach has a few drawbacks. First, there aren't run-time checks verifying the C extensions are loaded when they should be. This could lead to accidental use of the slower pure Python modules. Second, the C extensions aren't compatible with PyPy and running Mercurial with PyPy requires installing Mercurial - you can't run ./hg from a source checkout. This makes developing while running PyPy somewhat difficult.

This patch implements a PEP-302 import hook for finding and loading the modules with both C and Python implementations. When a module with dual implementations is requested for import, its import is handled by our import hook.

The importer has a mechanism that controls what types of modules we allow to load. We call this loading behavior the "module load policy." There are 3 settings:

* Only load C extensions
* Only load pure Python
* Try to load C and fall back to Python

An environment variable allows overriding this policy at run time. This is mainly useful for developers and for performing actions against the source checkout (such as installing), which require overriding the default (strict) policy about requiring C extensions.

The default mode for now is to allow both. This isn't proper and is technically backwards incompatible. However, it is necessary to implement a sane patch series that doesn't break the world during future bisections. The behavior will be corrected in a future patch.

We choose the main mercurial/__init__.py module for this code out of necessity: in a future world, if the custom module importer isn't registered, we'll fail to find/import certain modules when running from a pure installation. Without the magical import-time side-effects, *any* importer of mercurial.* modules would be required to call a function to register our importer. I'm not a fan of import time side effects and I initially attempted to do this. However, I was foiled by our own test harness, which has numerous `python`-invoked scripts that "import mercurial" and fail because the importer isn't registered. Realizing this problem is probably present in random Python scripts that have been written over the years, I decided that sacrificing purity for backwards compatibility is necessary. Plus, if you are programming Python, "import" should probably "just work."

It's worth noting that now that we have a custom module loader, it would be possible to hook up demand module proxies at this level instead of replacing __import__. We leave this work for another time, if it's even desired.

This patch breaks importing in environments where Mercurial modules are loaded from a zip file (such as py2exe distributions). This will be addressed in a subsequent patch.
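The three "module load policy" settings described above can be sketched as a small fallback routine. This is only an illustration of the decision logic, not the PEP-302 hook from the patch: the environment variable name, the policy strings, and the function names are all hypothetical, and the real importer registers itself with the import system rather than being called directly.

```python
import importlib
import os

# Hypothetical names for this sketch; they are not Mercurial's identifiers.
POLICY_ENV_VAR = 'EXAMPLE_MODULE_POLICY'
VALID_POLICIES = ('c', 'py', 'allow')
# The commit message says the default "for now is to allow both".
DEFAULT_POLICY = 'allow'


def current_policy(environ=None):
    """Return the policy, letting an environment variable override it."""
    environ = os.environ if environ is None else environ
    policy = environ.get(POLICY_ENV_VAR, DEFAULT_POLICY)
    if policy not in VALID_POLICIES:
        raise ValueError('unknown module load policy: %r' % policy)
    return policy


def load_dual_module(name, policy, c_package, pure_package):
    """Import `name` from the C package, the pure package, or either.

    'c'     -> only the C implementation; failure is an error
    'py'    -> only the pure Python implementation
    'allow' -> try C first, fall back to pure Python
    """
    if policy in ('c', 'allow'):
        try:
            return importlib.import_module('%s.%s' % (c_package, name))
        except ImportError:
            if policy == 'c':
                raise
    return importlib.import_module('%s.%s' % (pure_package, name))
```

With the strict 'c' policy a missing extension fails loudly instead of silently using the slower pure Python module, which is the run-time check the commit message says was missing.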
"""Fixer that changes plain strings to bytes strings."""

import re

from lib2to3 import fixer_base
from lib2to3.pgen2 import token
from lib2to3.fixer_util import Name
from lib2to3.pygram import python_symbols as syms

_re = re.compile(r'[rR]?[\'\"]')

# XXX: Implementing a blacklist in 2to3 turned out to be more troublesome than
# blacklisting some modules inside the fixers. So, this is what I came up with.

blacklist = ('mercurial/demandimport.py',
             'mercurial/py3kcompat.py', # valid python 3 already
             'mercurial/i18n.py',
            )

def isdocstring(node):
    def isclassorfunction(ancestor):
        symbols = (syms.funcdef, syms.classdef)
        # if the current node is a child of a function definition, a class
        # definition or a file, then it is a docstring
        if ancestor.type == syms.simple_stmt:
            try:
                while True:
                    if ancestor.type in symbols:
                        return True
                    ancestor = ancestor.parent
            except AttributeError:
                return False
        return False

    def ismodule(ancestor):
        # Our child is a docstring if we are a simple statement, and our
        # ancestor is file_input. In other words, our child is a lone string in
        # the source file.
        try:
            if (ancestor.type == syms.simple_stmt and
                ancestor.parent.type == syms.file_input):
                return True
        except AttributeError:
            return False
        return False

    def isdocassignment(ancestor):
        # Assigning to __doc__, definitely a string
        try:
            while True:
                if (ancestor.type == syms.expr_stmt and
                    Name('__doc__') in ancestor.children):
                    return True
                ancestor = ancestor.parent
        except AttributeError:
            return False

    if ismodule(node.parent) or \
       isdocassignment(node.parent) or \
       isclassorfunction(node.parent):
        return True
    return False

def shouldtransform(node):
    specialnames = ['__main__']

    # node.value is the literal as written in the source, including any
    # r/R prefix and the surrounding quotes, so strip those before comparing.
    if node.value.lstrip('rR').strip('\'"') in specialnames:
        return False

    ggparent = node.parent.parent.parent
    sggparent = str(ggparent)

    if 'getattr' in sggparent or \
       'hasattr' in sggparent or \
       'setattr' in sggparent or \
       'encode' in sggparent or \
       'decode' in sggparent:
        return False

    return True

class FixBytes(fixer_base.BaseFix):

    PATTERN = 'STRING'

    def transform(self, node, results):
        # The filename may be prefixed with a build directory.
        if self.filename.endswith(blacklist):
            return
        if node.type == token.STRING:
            if _re.match(node.value):
                if isdocstring(node):
                    return
                if not shouldtransform(node):
                    return
                new = node.clone()
                new.value = 'b' + new.value
                return new
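
The two mechanical pieces of the fixer, the suffix-based blacklist check and the prefix regex that decides which literals receive a `b`, can be tried without lib2to3. The sketch below is an assumption-laden stand-in: it uses the standard `tokenize` module to find string tokens instead of the 2to3 parse tree, and the helper names (`isblacklisted`, `addbytesprefix`, `rewriteliterals`) are hypothetical. It does not reproduce the docstring or getattr/encode exclusions.

```python
import io
import re
import tokenize

# Same pattern and blacklist as in the fixer above.
_re = re.compile(r'[rR]?[\'\"]')
blacklist = ('mercurial/demandimport.py',
             'mercurial/py3kcompat.py',
             'mercurial/i18n.py')


def isblacklisted(filename):
    # str.endswith accepts a tuple, so a build-directory prefix is ignored.
    return filename.endswith(blacklist)


def addbytesprefix(literal):
    # Only literals that start with a quote, or with r/R and a quote, match;
    # literals that already carry a u or b prefix are left unchanged.
    if _re.match(literal):
        return 'b' + literal
    return literal


def rewriteliterals(source):
    """Return (original, rewritten) pairs for each string token in source."""
    readline = io.StringIO(source).readline
    return [(tok.string, addbytesprefix(tok.string))
            for tok in tokenize.generate_tokens(readline)
            if tok.type == tokenize.STRING]


pairs = rewriteliterals("x = 'a'\ny = u'b'\nz = r\"c\"\n")
```

Here `pairs` holds one entry per literal, so the effect of the regex on plain, `u`-prefixed, and raw strings can be compared side by side.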