hgext/highlight/highlight.py
author Anton Shestakov <av6@dwimlabs.net>
Wed, 22 Jul 2015 10:19:17 +0800
branchstable
changeset 25867 a74e9806d17d
parent 23613 7b8ff3fd11d3
child 25899 c35ee1bbbbdc
permissions -rw-r--r--
highlight: produce correct markup when there's a blank line just before EOF Due to how the colorized output from pygments was stripped of <pre> elements, when there was an empty line at the end of a file, highlight extension produced an incorrect markup (no closing tags from the fileline/annotateline template). It wasn't usually noticeable, because browsers were smart enough to see where the missing tags should've been, but in monoblue style it resulted in the last line having twice the normal height. Instead of awkwardly trying to strip outer <pre></pre> tags, let's make the formatter with nowrap=True, which should do what we need in pygments since at least 0.5 (2006-10-30). Example from monoblue style: Before: <div class="source"> <div style="font-family:monospace" class="parity0"> <pre><a class="linenr" href="#l1" id="l1"> 1</a> </pre> </div> <div style="font-family:monospace" class="parity1"> <pre><a class="linenr" href="#l2" id="l2"> 2</a> </div> Now: <div class="source"> <div style="font-family:monospace" class="parity0"> <pre><a class="linenr" href="#l1" id="l1"> 1</a> </pre> </div> <div style="font-family:monospace" class="parity1"> <pre><a class="linenr" href="#l2" id="l2"> 2</a> </pre> </div> </div> (Notice the missing </pre></div> now in place)
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
8251
7fc30044b514 highlight: add copyright and license header
Martin Geisler <mg@lazybytes.net>
parents: 7948
diff changeset
     1
# highlight.py - highlight extension implementation file
7fc30044b514 highlight: add copyright and license header
Martin Geisler <mg@lazybytes.net>
parents: 7948
diff changeset
     2
#
7fc30044b514 highlight: add copyright and license header
Martin Geisler <mg@lazybytes.net>
parents: 7948
diff changeset
     3
#  Copyright 2007-2009 Adam Hupp <adam@hupp.org> and others
7fc30044b514 highlight: add copyright and license header
Martin Geisler <mg@lazybytes.net>
parents: 7948
diff changeset
     4
#
7fc30044b514 highlight: add copyright and license header
Martin Geisler <mg@lazybytes.net>
parents: 7948
diff changeset
     5
# This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 9424
diff changeset
     6
# GNU General Public License version 2 or any later version.
6938
ce94b3236ea4 highlight: split code to improve startup times
Patrick Mezard <pmezard@gmail.com>
parents: 6666
diff changeset
     7
#
ce94b3236ea4 highlight: split code to improve startup times
Patrick Mezard <pmezard@gmail.com>
parents: 6666
diff changeset
     8
# The original module was split in an interface and an implementation
ce94b3236ea4 highlight: split code to improve startup times
Patrick Mezard <pmezard@gmail.com>
parents: 6666
diff changeset
     9
# file to defer pygments loading and speedup extension setup.
5532
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    10
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    11
from mercurial import demandimport
10394
4612cded5176 fix coding style (reported by pylint)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 10263
diff changeset
    12
demandimport.ignore.extend(['pkgutil', 'pkg_resources', '__main__'])
7948
de377b1a9a84 move encoding bits from util to encoding
Matt Mackall <mpm@selenic.com>
parents: 7120
diff changeset
    13
from mercurial import util, encoding
5532
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    14
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    15
from pygments import highlight
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    16
from pygments.util import ClassNotFound
6199
0068809347d7 highlight: forgot import
Brendan Cully <brendan@kublai.com>
parents: 6198
diff changeset
    17
from pygments.lexers import guess_lexer, guess_lexer_for_filename, TextLexer
5532
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    18
from pygments.formatters import HtmlFormatter
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    19
6485
938319418d8c highlight: Generate pygments style sheet dynamically
Isaac Jurado <diptongo@gmail.com>
parents: 6394
diff changeset
    20
SYNTAX_CSS = ('\n<link rel="stylesheet" href="{url}highlightcss" '
5533
6cf7d7fe7d3d highlight: clean up coding style a little
Bryan O'Sullivan <bos@serpentine.com>
parents: 5532
diff changeset
    21
              'type="text/css" />')
5532
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    22
6394
55bc0a035e1f highlight: some small cleanups
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6393
diff changeset
    23
def pygmentize(field, fctx, style, tmpl):
55bc0a035e1f highlight: some small cleanups
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 6393
diff changeset
    24
5532
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    25
    # append a <link ...> to the syntax highlighting css
10959
d1f4657f55e4 highlight: fix to work with caching templater
Matt Mackall <mpm@selenic.com>
parents: 10394
diff changeset
    26
    old_header = tmpl.load('header')
5532
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    27
    if SYNTAX_CSS not in old_header:
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    28
        new_header =  old_header + SYNTAX_CSS
5616
88ca3e0fb6e5 highlight: adapt to hgweb_mode refactoring
Christian Ebert <blacktrash@gmx.net>
parents: 5533
diff changeset
    29
        tmpl.cache['header'] = new_header
5532
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    30
6194
fe54e7501de1 highlight: bail out if file is binary
Brendan Cully <brendan@kublai.com>
parents: 6193
diff changeset
    31
    text = fctx.data()
fe54e7501de1 highlight: bail out if file is binary
Brendan Cully <brendan@kublai.com>
parents: 6193
diff changeset
    32
    if util.binary(text):
fe54e7501de1 highlight: bail out if file is binary
Brendan Cully <brendan@kublai.com>
parents: 6193
diff changeset
    33
        return
fe54e7501de1 highlight: bail out if file is binary
Brendan Cully <brendan@kublai.com>
parents: 6193
diff changeset
    34
23613
7b8ff3fd11d3 highlight: ignore Unicode's extra linebreaks (issue4291)
Matt Mackall <mpm@selenic.com>
parents: 19169
diff changeset
    35
    # str.splitlines() != unicode.splitlines() because "reasons"
7b8ff3fd11d3 highlight: ignore Unicode's extra linebreaks (issue4291)
Matt Mackall <mpm@selenic.com>
parents: 19169
diff changeset
    36
    for c in "\x0c\x1c\x1d\x1e":
7b8ff3fd11d3 highlight: ignore Unicode's extra linebreaks (issue4291)
Matt Mackall <mpm@selenic.com>
parents: 19169
diff changeset
    37
        if c in text:
7b8ff3fd11d3 highlight: ignore Unicode's extra linebreaks (issue4291)
Matt Mackall <mpm@selenic.com>
parents: 19169
diff changeset
    38
            text = text.replace(c, '')
7b8ff3fd11d3 highlight: ignore Unicode's extra linebreaks (issue4291)
Matt Mackall <mpm@selenic.com>
parents: 19169
diff changeset
    39
9424
799373ff2554 highlight: fixes garbled text in non-UTF-8 environment
Yuya Nishihara <yuya@tcha.org>
parents: 8360
diff changeset
    40
    # Pygments is best used with Unicode strings:
799373ff2554 highlight: fixes garbled text in non-UTF-8 environment
Yuya Nishihara <yuya@tcha.org>
parents: 8360
diff changeset
    41
    # <http://pygments.org/docs/unicode/>
799373ff2554 highlight: fixes garbled text in non-UTF-8 environment
Yuya Nishihara <yuya@tcha.org>
parents: 8360
diff changeset
    42
    text = text.decode(encoding.encoding, 'replace')
7120
db7557359636 highlight: convert text to local before passing to pygmentize (issue1341)
Christian Ebert <blacktrash@gmx.net>
parents: 6938
diff changeset
    43
6193
2344da8eb9b4 highlight: support annotate, and reduce layering violations.
Brendan Cully <brendan@kublai.com>
parents: 5991
diff changeset
    44
    # To get multi-line strings right, we can't format line-by-line
2344da8eb9b4 highlight: support annotate, and reduce layering violations.
Brendan Cully <brendan@kublai.com>
parents: 5991
diff changeset
    45
    try:
19169
bcdfb6078b9f highlight: fix page layout with empty first and last lines
Alexander Plavin <me@aplavin.ru>
parents: 18054
diff changeset
    46
        lexer = guess_lexer_for_filename(fctx.path(), text[:1024],
bcdfb6078b9f highlight: fix page layout with empty first and last lines
Alexander Plavin <me@aplavin.ru>
parents: 18054
diff changeset
    47
                                         stripnl=False)
6494
c30849d4c8ba highlight: backward compatibility with pygments 0.5.1
Benoit Allard <benoit@aeteurope.nl>
parents: 6212
diff changeset
    48
    except (ClassNotFound, ValueError):
6198
358cc9cf54db highlight: guess by text when path name is ambiguous
Brendan Cully <brendan@kublai.com>
parents: 6197
diff changeset
    49
        try:
19169
bcdfb6078b9f highlight: fix page layout with empty first and last lines
Alexander Plavin <me@aplavin.ru>
parents: 18054
diff changeset
    50
            lexer = guess_lexer(text[:1024], stripnl=False)
6494
c30849d4c8ba highlight: backward compatibility with pygments 0.5.1
Benoit Allard <benoit@aeteurope.nl>
parents: 6212
diff changeset
    51
        except (ClassNotFound, ValueError):
19169
bcdfb6078b9f highlight: fix page layout with empty first and last lines
Alexander Plavin <me@aplavin.ru>
parents: 18054
diff changeset
    52
            lexer = TextLexer(stripnl=False)
6193
2344da8eb9b4 highlight: support annotate, and reduce layering violations.
Brendan Cully <brendan@kublai.com>
parents: 5991
diff changeset
    53
25867
a74e9806d17d highlight: produce correct markup when there's a blank line just before EOF
Anton Shestakov <av6@dwimlabs.net>
parents: 23613
diff changeset
    54
    formatter = HtmlFormatter(nowrap=True, style=style)
5532
40a06e39f010 extension for synax highlighting in the hgweb file revision view
Adam Hupp <adam@hupp.org>
parents:
diff changeset
    55
6193
2344da8eb9b4 highlight: support annotate, and reduce layering violations.
Brendan Cully <brendan@kublai.com>
parents: 5991
diff changeset
    56
    colorized = highlight(text, lexer, formatter)
9424
799373ff2554 highlight: fixes garbled text in non-UTF-8 environment
Yuya Nishihara <yuya@tcha.org>
parents: 8360
diff changeset
    57
    coloriter = (s.encode(encoding.encoding, 'replace')
799373ff2554 highlight: fixes garbled text in non-UTF-8 environment
Yuya Nishihara <yuya@tcha.org>
parents: 8360
diff changeset
    58
                 for s in colorized.splitlines())
6193
2344da8eb9b4 highlight: support annotate, and reduce layering violations.
Brendan Cully <brendan@kublai.com>
parents: 5991
diff changeset
    59
8360
acc202b71619 templater: provide the standard template filters by default
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 8251
diff changeset
    60
    tmpl.filters['colorize'] = lambda x: coloriter.next()
6193
2344da8eb9b4 highlight: support annotate, and reduce layering violations.
Brendan Cully <brendan@kublai.com>
parents: 5991
diff changeset
    61
2344da8eb9b4 highlight: support annotate, and reduce layering violations.
Brendan Cully <brendan@kublai.com>
parents: 5991
diff changeset
    62
    oldl = tmpl.cache[field]
2344da8eb9b4 highlight: support annotate, and reduce layering violations.
Brendan Cully <brendan@kublai.com>
parents: 5991
diff changeset
    63
    newl = oldl.replace('line|escape', 'line|colorize')
2344da8eb9b4 highlight: support annotate, and reduce layering violations.
Brendan Cully <brendan@kublai.com>
parents: 5991
diff changeset
    64
    tmpl.cache[field] = newl