i18n/hggettext
author Martin Geisler <mg@lazybytes.net>
Thu, 21 May 2009 23:43:13 +0200
changeset 8542 de150a942ec8
child 8626 1fc1c77d4863
permissions -rwxr-xr-x
i18n: accurately generate hg.pot
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
8542
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     1
#!/usr/bin/env python
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     2
#
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     3
# hggettext - carefully extract docstrings for Mercurial
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     4
#
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     5
# Copyright 2009 Matt Mackall <mpm@selenic.com> and others
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     6
#
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     7
# This software may be used and distributed according to the terms of the
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     8
# GNU General Public License version 2, incorporated herein by reference.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
     9
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    10
# The normalize function is taken from pygettext which is distributed
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    11
# with Python under the Python License, which is GPL compatible.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    12
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    13
"""Extract docstrings from Mercurial commands.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    14
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    15
Compared to pygettext, this script knows about the cmdtable and table
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    16
dictionaries used by Mercurial, and will only extract docstrings from
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    17
functions mentioned therein.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    18
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    19
Use xgettext like normal to extract strings marked as translatable and
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    20
join the message cataloges to get the final catalog.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    21
"""
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    22
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    23
from mercurial import demandimport; demandimport.enable()
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    24
import sys, inspect
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    25
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    26
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    27
def escape(s):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    28
    # The order is important, the backslash must be escaped first
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    29
    # since the other replacements introduce new backslashes
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    30
    # themselves.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    31
    s = s.replace('\\', '\\\\')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    32
    s = s.replace('\n', '\\n')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    33
    s = s.replace('\r', '\\r')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    34
    s = s.replace('\t', '\\t')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    35
    s = s.replace('"', '\\"')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    36
    return s
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    37
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    38
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    39
def normalize(s):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    40
    # This converts the various Python string types into a format that
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    41
    # is appropriate for .po files, namely much closer to C style.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    42
    lines = s.split('\n')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    43
    if len(lines) == 1:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    44
        s = '"' + escape(s) + '"'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    45
    else:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    46
        if not lines[-1]:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    47
            del lines[-1]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    48
            lines[-1] = lines[-1] + '\n'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    49
        lines = map(escape, lines)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    50
        lineterm = '\\n"\n"'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    51
        s = '""\n"' + lineterm.join(lines) + '"'
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    52
    return s
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    53
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    54
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    55
def poentry(path, lineno, s):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    56
    return ('#: %s:%d\n' % (path, lineno) +
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    57
            'msgid %s\n' % normalize(s) +
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    58
            'msgstr ""\n')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    59
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    60
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    61
def offset(src, doc, name, default):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    62
    """Compute offset or issue a warning on stdout."""
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    63
    # Backslashes in doc appear doubled in src.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    64
    end = src.find(doc.replace('\\', '\\\\'))
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    65
    if end == -1:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    66
        # This can happen if the docstring contains unnecessary escape
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    67
        # sequences such as \" in a triple-quoted string. The problem
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    68
        # is that \" is turned into " and so doc wont appear in src.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    69
        sys.stderr.write("warning: unknown offset in %s, assuming %d lines\n"
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    70
                         % (name, default))
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    71
        return default
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    72
    else:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    73
        return src.count('\n', 0, end)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    74
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    75
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    76
def importpath(path):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    77
    """Import a path like foo/bar/baz.py and return the baz module."""
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    78
    if path.endswith('.py'):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    79
        path = path[:-3]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    80
    if path.endswith('/__init__'):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    81
        path = path[:-9]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    82
    path = path.replace('/', '.')
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    83
    mod = __import__(path)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    84
    for comp in path.split('.')[1:]:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    85
        mod = getattr(mod, comp)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    86
    return mod
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    87
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    88
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    89
def docstrings(path):
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    90
    """Extract docstrings from path.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    91
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    92
    This respects the Mercurial cmdtable/table convention and will
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    93
    only extract docstrings from functions mentioned in these tables.
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    94
    """
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    95
    mod = importpath(path)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    96
    if mod.__doc__:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    97
        src = open(path).read()
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    98
        lineno = 1 + offset(src, mod.__doc__, path, 7)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
    99
        print poentry(path, lineno, mod.__doc__)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   100
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   101
    cmdtable = getattr(mod, 'cmdtable', {})
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   102
    if not cmdtable:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   103
        # Maybe we are processing mercurial.commands?
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   104
        cmdtable = getattr(mod, 'table', {})
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   105
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   106
    for entry in cmdtable.itervalues():
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   107
        func = entry[0]
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   108
        if func.__doc__:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   109
            src = inspect.getsource(func)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   110
            name = "%s.%s" % (path, func.__name__)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   111
            lineno = func.func_code.co_firstlineno
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   112
            lineno += offset(src, func.__doc__, name, 1)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   113
            print poentry(path, lineno, func.__doc__)
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   114
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   115
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   116
if __name__ == "__main__":
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   117
    for path in sys.argv[1:]:
de150a942ec8 i18n: accurately generate hg.pot
Martin Geisler <mg@lazybytes.net>
parents:
diff changeset
   118
        docstrings(path)