doc/docchecker
author Simon Sapin <simon.sapin@octobus.net>
Fri, 21 Jan 2022 17:54:03 +0100
changeset 48745 94e36b230990
parent 45830 c102b704edb5
child 48875 6000f5b25c9b
permissions -rwxr-xr-x
status: prefer relative paths in Rust code … when the repository root is under the current directory, so the kernel needs to traverse fewer directory in every call to `read_dir` or `symlink_metadata`. Better yet would be to use libc functions like `openat` and `fstatat` to remove such repeated traversals entirely, but the standard library does not provide APIs based on those. Maybe with a crate like https://crates.io/crates/openat instead? Benchmarks of `rhg status` show that this patch is neutral in some configurations, and makes the command up to ~20% faster in others. Below is semi-arbitrary subset of results. The four numeric columns are: time (in seconds) with this changeset’s parent, time with this changeset, time difference (negative is better), time ratio (less than 1 is better). ``` mercurial-dirstate-v1 | default-plain-clean.no-iu.pbr | 0.0061 -> 0.0059: -0.0002 (0.97) mercurial-dirstate-v2 | default-plain-clean.no-iu.pbr | 0.0029 -> 0.0028: -0.0001 (0.97) mozilla-dirstate-v1 | default-plain-clean.no-iu.pbr | 0.2110 -> 0.2102: -0.0007 (1.00) mozilla-dirstate-v2 | default-copies-clean.ignored.pbr | 0.0489 -> 0.0401: -0.0088 (0.82) mozilla-dirstate-v2 | default-copies-clean.no-iu.pbr | 0.0479 -> 0.0393: -0.0085 (0.82) mozilla-dirstate-v2 | default-copies-large.all.pbr | 0.1262 -> 0.1210: -0.0051 (0.96) mozilla-dirstate-v2 | default-copies-small.ignored-unknown.pbr | 0.1262 -> 0.1200: -0.0062 (0.95) mozilla-dirstate-v2 | default-copies-small.ignored.pbr | 0.0536 -> 0.0417: -0.0119 (0.78) mozilla-dirstate-v2 | default-copies-small.no-iu.pbr | 0.0482 -> 0.0393: -0.0089 (0.81) mozilla-dirstate-v2 | default-plain-clean.ignored.pbr | 0.0518 -> 0.0402: -0.0116 (0.78) mozilla-dirstate-v2 | default-plain-clean.no-iu.pbr | 0.0481 -> 0.0392: -0.0088 (0.82) mozilla-dirstate-v2 | default-plain-large.all.pbr | 0.1271 -> 0.1218: -0.0052 (0.96) mozilla-dirstate-v2 | default-plain-small.ignored-unknown.pbr | 0.1225 -> 0.1202: -0.0022 (0.98) mozilla-dirstate-v2 | default-plain-small.ignored.pbr | 0.0510 -> 0.0418: -0.0092 (0.82) mozilla-dirstate-v2 | default-plain-small.no-iu.pbr | 0.0480 -> 0.0394: -0.0086 (0.82) netbeans-dirstate-v1 | default-plain-clean.no-iu.pbr | 0.1442 -> 0.1422: -0.0020 (0.99) netbeans-dirstate-v2 | default-plain-clean.no-iu.pbr | 0.0325 -> 0.0282: -0.0043 (0.87) ``` Differential Revision: https://phab.mercurial-scm.org/D12175
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
45830
c102b704edb5 global: use python3 in shebangs
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43691
diff changeset
     1
#!/usr/bin/env python3
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     2
#
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     3
# docchecker - look for problematic markup
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     4
#
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     5
# Copyright 2016 timeless <timeless@mozdev.org> and others
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     6
#
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     7
# This software may be used and distributed according to the terms of the
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     8
# GNU General Public License version 2 or any later version.
29168
8f2805ce93d9 py3: make doc/docchecker use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28811
diff changeset
     9
29169
c9ab5a0bc7c5 py3: make doc/docchecker use print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 29168
diff changeset
    10
from __future__ import absolute_import, print_function
29168
8f2805ce93d9 py3: make doc/docchecker use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28811
diff changeset
    11
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    12
import os
29168
8f2805ce93d9 py3: make doc/docchecker use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28811
diff changeset
    13
import re
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    14
import sys
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    15
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    16
try:
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    17
    import msvcrt
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    18
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    19
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    20
    msvcrt.setmode(sys.stderr.fileno(), os.O_BINARY)
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    21
except ImportError:
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    22
    pass
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    23
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    24
stdout = getattr(sys.stdout, 'buffer', sys.stdout)
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    25
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    26
leadingline = re.compile(br'(^\s*)(\S.*)$')
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    27
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    28
checks = [
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    29
    (
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    30
        br""":hg:`[^`]*'[^`]*`""",
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    31
        b"""warning: please avoid nesting ' in :hg:`...`""",
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    32
    ),
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    33
    (br'\w:hg:`', b'warning: please have a space before :hg:'),
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    34
    (
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    35
        br"""(?:[^a-z][^'.])hg ([^,;"`]*'(?!hg)){2}""",
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    36
        b'''warning: please use " instead of ' for hg ... "..."''',
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    37
    ),
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    38
]
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    39
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    40
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    41
def check(line):
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    42
    messages = []
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    43
    for match, msg in checks:
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    44
        if re.search(match, line):
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    45
            messages.append(msg)
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    46
    if messages:
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    47
        stdout.write(b'%s\n' % line)
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    48
        for msg in messages:
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    49
            stdout.write(b'%s\n' % msg)
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    50
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    51
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    52
def work(file):
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    53
    (llead, lline) = (b'', b'')
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    54
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    55
    for line in file:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    56
        # this section unwraps lines
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    57
        match = leadingline.match(line)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    58
        if not match:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    59
            check(lline)
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    60
            (llead, lline) = (b'', b'')
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    61
            continue
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    62
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    63
        lead, line = match.group(1), match.group(2)
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    64
        if lead == llead:
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    65
            if lline != b'':
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    66
                lline += b' ' + line
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    67
            else:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    68
                lline = line
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    69
        else:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    70
            check(lline)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    71
            (llead, lline) = (lead, line)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    72
    check(lline)
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    73
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    74
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    75
def main():
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    76
    for f in sys.argv[1:]:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    77
        try:
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    78
            with open(f, 'rb') as file:
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    79
                work(file)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    80
        except BaseException as e:
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    81
            sys.stdout.write(r"failed to process %s: %s\n" % (f, e))
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    82
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    83
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    84
main()