hgext/win32mbcs.py
author Cédric Duval <cedricduval@free.fr>
Mon, 22 Jun 2009 15:48:08 +0200
changeset 8894 868670dbc237
parent 8866 87c30fb7e8df
child 8932 f87884329419
permissions -rw-r--r--
extensions: improve the consistency of synopses Trying as much as possible to consistently: - use a present tense predicate followed by a direct object - verb referring directly to the functionality provided (ie. not "add command that does this" but simple "do that") - keep simple and to the point, leaving details for the long help (width is tight, possibly even more so for translations) Thanks to timeless, Martin Geisler, Rafael Villar Burke, Dan Villiom Podlaski Christiansen and others for the helpful suggestions.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
     1
# win32mbcs.py -- MBCS filename support for Mercurial
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     2
#
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     3
# Copyright (c) 2008 Shun-ichi Goto <shunichi.goto@gmail.com>
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     4
#
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
     5
# Version: 0.2
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     6
# Author:  Shun-ichi Goto <shunichi.goto@gmail.com>
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
     7
#
8225
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 8001
diff changeset
     8
# This software may be used and distributed according to the terms of the
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 8001
diff changeset
     9
# GNU General Public License version 2, incorporated herein by reference.
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    10
#
8228
eee2319c5895 add blank line after copyright notices and after header
Martin Geisler <mg@lazybytes.net>
parents: 8225
diff changeset
    11
8894
868670dbc237 extensions: improve the consistency of synopses
Cédric Duval <cedricduval@free.fr>
parents: 8866
diff changeset
    12
'''allow the use of MBCS paths with problematic encoding
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    13
8001
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
    14
Some MBCS encodings are not good for some path operations (i.e.
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
    15
splitting path, case conversion, etc.) with its encoded bytes. We call
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
    16
such a encoding (i.e. shift_jis and big5) as "problematic encoding".
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
    17
This extension can be used to fix the issue with those encodings by
8665
e4ad46f9a004 win32mbcs: capitalize Unicode
Martin Geisler <mg@lazybytes.net>
parents: 8491
diff changeset
    18
wrapping some functions to convert to Unicode string before path
8001
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
    19
operation.
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    20
8668
aea3a23151bd fixed typos found in translatable strings
Martin Geisler <mg@lazybytes.net>
parents: 8667
diff changeset
    21
This extension is useful for:
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    22
 * Japanese Windows users using shift_jis encoding.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    23
 * Chinese Windows users using big5 encoding.
8001
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
    24
 * All users who use a repository with one of problematic encodings on
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
    25
   case-insensitive file system.
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    26
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    27
This extension is not needed for:
8667
594507755800 graphlog, win32mbcs: capitalize ASCII
Martin Geisler <mg@lazybytes.net>
parents: 8665
diff changeset
    28
 * Any user who use only ASCII chars in path.
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    29
 * Any user who do not use any of problematic encodings.
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    30
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    31
Note that there are some limitations on using this extension:
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    32
 * You should use single encoding in one repository.
8001
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
    33
 * You should set same encoding for the repository by locale or
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
    34
   HGENCODING.
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    35
8665
e4ad46f9a004 win32mbcs: capitalize Unicode
Martin Geisler <mg@lazybytes.net>
parents: 8491
diff changeset
    36
Path encoding conversion are done between Unicode and
8760
bf17aeafb869 Spell Mercurial as a proper noun
timeless <timeless@gmail.com>
parents: 8714
diff changeset
    37
encoding.encoding which is decided by Mercurial from current locale
8001
c0e3aca616de win32mbcs: word-wrap help texts at 70 characters
Martin Geisler <mg@daimi.au.dk>
parents: 7983
diff changeset
    38
setting or HGENCODING.
8894
868670dbc237 extensions: improve the consistency of synopses
Cédric Duval <cedricduval@free.fr>
parents: 8866
diff changeset
    39
'''
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    40
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    41
import os
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    42
from mercurial.i18n import _
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
    43
from mercurial import util, encoding
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    44
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    45
def decode(arg):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    46
    if isinstance(arg, str):
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
    47
        uarg = arg.decode(encoding.encoding)
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
    48
        if arg == uarg.encode(encoding.encoding):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    49
            return uarg
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    50
        raise UnicodeError("Not local encoding")
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    51
    elif isinstance(arg, tuple):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    52
        return tuple(map(decode, arg))
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    53
    elif isinstance(arg, list):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    54
        return map(decode, arg)
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    55
    return arg
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    56
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    57
def encode(arg):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    58
    if isinstance(arg, unicode):
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
    59
        return arg.encode(encoding.encoding)
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    60
    elif isinstance(arg, tuple):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    61
        return tuple(map(encode, arg))
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    62
    elif isinstance(arg, list):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    63
        return map(encode, arg)
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    64
    return arg
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    65
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    66
def wrapper(func, args):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    67
    # check argument is unicode, then call original
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    68
    for arg in args:
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    69
        if isinstance(arg, unicode):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    70
            return func(*args)
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
    71
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    72
    try:
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    73
        # convert arguments to unicode, call func, then convert back
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    74
        return encode(func(*decode(args)))
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    75
    except UnicodeError:
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
    76
        # If not encoded with encoding.encoding, report it then
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    77
        # continue with calling original function.
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    78
        raise util.Abort(_("[win32mbcs] filename conversion fail with"
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
    79
                         " %s encoding\n") % (encoding.encoding))
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    80
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    81
def wrapname(name):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    82
    idx = name.rfind('.')
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    83
    module = name[:idx]
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    84
    name = name[idx+1:]
8491
bd45047afaeb win32mbcs: remove silly eval
Matt Mackall <mpm@selenic.com>
parents: 8228
diff changeset
    85
    module = globals()[module]
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    86
    func = getattr(module, name)
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    87
    def f(*args):
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    88
        return wrapper(func, args)
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    89
    try:
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    90
        f.__name__ = func.__name__                # fail with python23
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    91
    except Exception:
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    92
        pass
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
    93
    setattr(module, name, f)
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    94
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    95
# List of functions to be wrapped.
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    96
# NOTE: os.path.dirname() and os.path.basename() are safe because
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    97
#       they use result of os.path.split()
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    98
funcs = '''os.path.join os.path.split os.path.splitext
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
    99
 os.path.splitunc os.path.normpath os.path.normcase os.makedirs
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   100
 util.endswithsep util.splitpath util.checkcase util.fspath'''
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
   101
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
   102
# codec and alias names of sjis and big5 to be faked.
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   103
problematic_encodings = '''big5 big5-tw csbig5 big5hkscs big5-hkscs
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   104
 hkscs cp932 932 ms932 mskanji ms-kanji shift_jis csshiftjis shiftjis
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   105
 sjis s_jis shift_jis_2004 shiftjis2004 sjis_2004 sjis2004
8714
505a96cbc923 Add cp950 as problematic encoding which is used in chinese windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents: 8668
diff changeset
   106
 shift_jisx0213 shiftjisx0213 sjisx0213 s_jisx0213 950 cp950 ms950 '''
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
   107
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
   108
def reposetup(ui, repo):
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
   109
    # TODO: decide use of config section for this extension
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
   110
    if not os.path.supports_unicode_filenames:
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
   111
        ui.warn(_("[win32mbcs] cannot activate on this platform.\n"))
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
   112
        return
5846
02884e56c217 New extension to support problematic MBCS on Windows.
Shun-ichi GOTO <shunichi.goto@gmail.com>
parents:
diff changeset
   113
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
   114
    # fake is only for relevant environment.
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
   115
    if encoding.encoding.lower() in problematic_encodings.split():
7877
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
   116
        for f in funcs.split():
eba7f12b0c51 cleanup: whitespace cleanup
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 7598
diff changeset
   117
            wrapname(f)
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
   118
        ui.debug(_("[win32mbcs] activated with encoding: %s\n")
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7877
diff changeset
   119
                 % encoding.encoding)
6887
304484c7e0ba Update win32mbcs extension
Shun-ichi Goto <shunichi.goto@gmail.com>
parents: 6210
diff changeset
   120