mercurial/hgweb/request.py
author Pierre-Yves David <pierre-yves.david@octobus.net>
Fri, 15 Mar 2024 01:31:57 +0100
branchstable
changeset 51505 c9ceb4f60256
parent 50820 9ed281bbf864
permissions -rw-r--r--
phases: avoid N² behavior in `advanceboundary` We allowed duplicated entries in the deque, which each entry could potentially insert all its ancestors. So advancing boundary for the full repository would mean each revision would walk all its ancestors, resulting in O(N²) iteration. For repository of any decent size, N² is quickly insane. We introduce a simple set to avoid this and get back to reasonable performance.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
2391
d351a3be3371 Fixing up comment headers for split up code.
Eric Hopper <hopper@omnifarious.org>
parents: 2355
diff changeset
     1
# hgweb/request.py - An http request from either CGI or the standalone server.
131
c9d51742471c moving hgweb to mercurial subdir
jake@edge2.net
parents:
diff changeset
     2
#
238
3b92f8fe47ae hgweb.py: kill #! line, clean up copyright notice
mpm@selenic.com
parents: 222
diff changeset
     3
# Copyright 21 May 2005 - (c) 2005 Jake Edge <jake@edge2.net>
46819
d4ba4d51f85f contributor: change mentions of mpm to olivia
Raphaël Gomès <rgomes@octobus.net>
parents: 45942
diff changeset
     4
# Copyright 2005, 2006 Olivia Mackall <olivia@selenic.com>
131
c9d51742471c moving hgweb to mercurial subdir
jake@edge2.net
parents:
diff changeset
     5
#
8225
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 7742
diff changeset
     6
# This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 10261
diff changeset
     7
# GNU General Public License version 2 or any later version.
131
c9d51742471c moving hgweb to mercurial subdir
jake@edge2.net
parents:
diff changeset
     8
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
     9
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
    10
# import wsgiref.validate
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
    11
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
    12
from ..thirdparty import attr
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
    13
from .. import (
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
    14
    error,
34514
528b21b853aa request: coerce content-type to native str
Augie Fackler <augie@google.com>
parents: 34513
diff changeset
    15
    pycompat,
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
    16
    util,
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
    17
)
46907
ffd3e823a7e5 urlutil: extract `url` related code from `util` into the new module
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
    18
from ..utils import (
ffd3e823a7e5 urlutil: extract `url` related code from `util` into the new module
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
    19
    urlutil,
ffd3e823a7e5 urlutil: extract `url` related code from `util` into the new module
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
    20
)
138
c77a679e9cfa Revamped templated hgweb
mpm@selenic.com
parents: 137
diff changeset
    21
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
    22
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48913
diff changeset
    23
class multidict:
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    24
    """A dict like object that can store multiple values for a key.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    25
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    26
    Used to store parsed request parameters.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    27
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    28
    This is inspired by WebOb's class of the same name.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    29
    """
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
    30
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    31
    def __init__(self):
36997
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36911
diff changeset
    32
        self._items = {}
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    33
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    34
    def __getitem__(self, key):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    35
        """Returns the last set value for a key."""
36997
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36911
diff changeset
    36
        return self._items[key][-1]
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    37
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    38
    def __setitem__(self, key, value):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    39
        """Replace a values for a key with a new value."""
36997
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36911
diff changeset
    40
        self._items[key] = [value]
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    41
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    42
    def __delitem__(self, key):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    43
        """Delete all values for a key."""
36997
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36911
diff changeset
    44
        del self._items[key]
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    45
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    46
    def __contains__(self, key):
36997
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36911
diff changeset
    47
        return key in self._items
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    48
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    49
    def __len__(self):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    50
        return len(self._items)
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    51
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    52
    def get(self, key, default=None):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    53
        try:
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    54
            return self.__getitem__(key)
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    55
        except KeyError:
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    56
            return default
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    57
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    58
    def add(self, key, value):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    59
        """Add a new value for a key. Does not replace existing values."""
36997
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36911
diff changeset
    60
        self._items.setdefault(key, []).append(value)
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    61
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    62
    def getall(self, key):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    63
        """Obtains all values for a key."""
36997
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36911
diff changeset
    64
        return self._items.get(key, [])
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    65
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    66
    def getone(self, key):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    67
        """Obtain a single value for a key.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    68
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    69
        Raises KeyError if key not defined or it has multiple values set.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    70
        """
36997
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36911
diff changeset
    71
        vals = self._items[key]
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    72
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    73
        if len(vals) > 1:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    74
            raise KeyError(b'multiple values for %r' % key)
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    75
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    76
        return vals[0]
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    77
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    78
    def asdictoflists(self):
48913
f254fc73d956 global: bulk replace simple pycompat.iteritems(x) with x.items()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48905
diff changeset
    79
        return {k: list(v) for k, v in self._items.items()}
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
    80
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
    81
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
    82
@attr.s(frozen=True)
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48913
diff changeset
    83
class parsedrequest:
36857
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36856
diff changeset
    84
    """Represents a parsed WSGI request.
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36856
diff changeset
    85
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36856
diff changeset
    86
    Contains both parsed parameters as well as a handle on the input stream.
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36856
diff changeset
    87
    """
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
    88
36848
16292bbda39c hgweb: store and use request method on parsed request
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36847
diff changeset
    89
    # Request method.
16292bbda39c hgweb: store and use request method on parsed request
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36847
diff changeset
    90
    method = attr.ib()
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
    91
    # Full URL for this request.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
    92
    url = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
    93
    # URL without any path components. Just <proto>://<host><port>.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
    94
    baseurl = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
    95
    # Advertised URL. Like ``url`` and ``baseurl`` but uses SERVER_NAME instead
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
    96
    # of HTTP: Host header for hostname. This is likely what clients used.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
    97
    advertisedurl = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
    98
    advertisedbaseurl = attr.ib()
36867
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36866
diff changeset
    99
    # URL scheme (part before ``://``). e.g. ``http`` or ``https``.
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36866
diff changeset
   100
    urlscheme = attr.ib()
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36866
diff changeset
   101
    # Value of REMOTE_USER, if set, or None.
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36866
diff changeset
   102
    remoteuser = attr.ib()
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36866
diff changeset
   103
    # Value of REMOTE_HOST, if set, or None.
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36866
diff changeset
   104
    remotehost = attr.ib()
36899
e67a2e05fa8a hgweb: clarify that apppath begins with a forward slash
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36898
diff changeset
   105
    # Relative WSGI application path. If defined, will begin with a
e67a2e05fa8a hgweb: clarify that apppath begins with a forward slash
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36898
diff changeset
   106
    # ``/``.
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   107
    apppath = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   108
    # List of path parts to be used for dispatch.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   109
    dispatchparts = attr.ib()
36898
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36897
diff changeset
   110
    # URL path component (no query string) used for dispatch. Can be
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36897
diff changeset
   111
    # ``None`` to signal no path component given to the request, an
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36897
diff changeset
   112
    # empty string to signal a request to the application's root URL,
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36897
diff changeset
   113
    # or a string not beginning with ``/`` containing the requested
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36897
diff changeset
   114
    # path under the application.
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   115
    dispatchpath = attr.ib()
36868
8ddb5c354906 hgweb: expose repo name on parsedrequest
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
   116
    # The name of the repository being accessed.
8ddb5c354906 hgweb: expose repo name on parsedrequest
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
   117
    reponame = attr.ib()
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   118
    # Raw query string (part after "?" in URL).
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   119
    querystring = attr.ib()
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
   120
    # multidict of query string parameters.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
   121
    qsparams = attr.ib()
36814
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36811
diff changeset
   122
    # wsgiref.headers.Headers instance. Operates like a dict with case
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36811
diff changeset
   123
    # insensitive keys.
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36811
diff changeset
   124
    headers = attr.ib()
36857
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36856
diff changeset
   125
    # Request body input stream.
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36856
diff changeset
   126
    bodyfh = attr.ib()
36909
84110a1d0f7d hgweb: store the raw WSGI environment dict
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36908
diff changeset
   127
    # WSGI environment dict, unmodified.
84110a1d0f7d hgweb: store the raw WSGI environment dict
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36908
diff changeset
   128
    rawenv = attr.ib()
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   129
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   130
37818
877185de62cf hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37616
diff changeset
   131
def parserequestfromenv(env, reponame=None, altbaseurl=None, bodyfh=None):
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   132
    """Parse URL components from environment variables.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   133
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   134
    WSGI defines request attributes via environment variables. This function
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   135
    parses the environment variables into a data structure.
36897
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   136
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   137
    If ``reponame`` is defined, the leading path components matching that
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   138
    string are effectively shifted from ``PATH_INFO`` to ``SCRIPT_NAME``.
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   139
    This simulates the world view of a WSGI application that processes
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   140
    requests from the base URL of a repo.
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   141
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   142
    If ``altbaseurl`` (typically comes from ``web.baseurl`` config option)
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   143
    is defined, it is used - instead of the WSGI environment variables - for
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   144
    constructing URL components up to and including the WSGI application path.
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   145
    For example, if the current WSGI application is at ``/repo`` and a request
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   146
    is made to ``/rev/@`` with this argument set to
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   147
    ``http://myserver:9000/prefix``, the URL and path components will resolve as
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   148
    if the request were to ``http://myserver:9000/prefix/rev/@``. In other
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   149
    words, ``wsgi.url_scheme``, ``SERVER_NAME``, ``SERVER_PORT``, and
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   150
    ``SCRIPT_NAME`` are all effectively replaced by components from this URL.
37818
877185de62cf hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37616
diff changeset
   151
877185de62cf hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37616
diff changeset
   152
    ``bodyfh`` can be used to specify a file object to read the request body
877185de62cf hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37616
diff changeset
   153
    from. If not defined, ``wsgi.input`` from the environment dict is used.
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   154
    """
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   155
    # PEP 3333 defines the WSGI spec and is a useful reference for this code.
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   156
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   157
    # We first validate that the incoming object conforms with the WSGI spec.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   158
    # We only want to be dealing with spec-conforming WSGI implementations.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   159
    # TODO enable this once we fix internal violations.
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   160
    # wsgiref.validate.check_environ(env)
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   161
48905
7eebe5630bcc hgweb: remove Python 3 conditional
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   162
    # PEP-0333 states that environment keys and values are native strings.
7eebe5630bcc hgweb: remove Python 3 conditional
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   163
    # The code points for the Unicode strings on Python 3 must be between
7eebe5630bcc hgweb: remove Python 3 conditional
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   164
    # \00000-\000FF. We deal with bytes in Mercurial, so mass convert string
7eebe5630bcc hgweb: remove Python 3 conditional
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   165
    # keys and values to bytes.
7eebe5630bcc hgweb: remove Python 3 conditional
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   166
    def tobytes(s):
7eebe5630bcc hgweb: remove Python 3 conditional
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   167
        if not isinstance(s, str):
7eebe5630bcc hgweb: remove Python 3 conditional
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48875
diff changeset
   168
            return s
50820
9ed281bbf864 hgweb: encode WSGI environment using the ISO-8859-1 codec
Manuel Jacob <me@manueljacob.de>
parents: 49846
diff changeset
   169
        return s.encode('iso8859-1')
45004
2632c1ed8f34 hgweb: encode WSGI environment like OS environment
Manuel Jacob <me@manueljacob.de>
parents: 45003
diff changeset
   170
48913
f254fc73d956 global: bulk replace simple pycompat.iteritems(x) with x.items()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48905
diff changeset
   171
    env = {tobytes(k): tobytes(v) for k, v in env.items()}
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   172
37616
5e81cf9651c1 hgweb: fallback to checking wsgireq.env for REPO_NAME for 3rd party hosting
Matt Harbison <matt_harbison@yahoo.com>
parents: 37606
diff changeset
   173
    # Some hosting solutions are emulating hgwebdir, and dispatching directly
5e81cf9651c1 hgweb: fallback to checking wsgireq.env for REPO_NAME for 3rd party hosting
Matt Harbison <matt_harbison@yahoo.com>
parents: 37606
diff changeset
   174
    # to an hgweb instance using this environment variable.  This was always
5e81cf9651c1 hgweb: fallback to checking wsgireq.env for REPO_NAME for 3rd party hosting
Matt Harbison <matt_harbison@yahoo.com>
parents: 37606
diff changeset
   175
    # checked prior to d7fd203e36cc; keep doing so to avoid breaking them.
5e81cf9651c1 hgweb: fallback to checking wsgireq.env for REPO_NAME for 3rd party hosting
Matt Harbison <matt_harbison@yahoo.com>
parents: 37606
diff changeset
   176
    if not reponame:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   177
        reponame = env.get(b'REPO_NAME')
37616
5e81cf9651c1 hgweb: fallback to checking wsgireq.env for REPO_NAME for 3rd party hosting
Matt Harbison <matt_harbison@yahoo.com>
parents: 37606
diff changeset
   178
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   179
    if altbaseurl:
46907
ffd3e823a7e5 urlutil: extract `url` related code from `util` into the new module
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
   180
        altbaseurl = urlutil.url(altbaseurl)
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   181
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   182
    # https://www.python.org/dev/peps/pep-0333/#environ-variables defines
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   183
    # the environment variables.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   184
    # https://www.python.org/dev/peps/pep-0333/#url-reconstruction defines
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   185
    # how URLs are reconstructed.
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   186
    fullurl = env[b'wsgi.url_scheme'] + b'://'
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   187
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   188
    if altbaseurl and altbaseurl.scheme:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   189
        advertisedfullurl = altbaseurl.scheme + b'://'
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   190
    else:
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   191
        advertisedfullurl = fullurl
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   192
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   193
    def addport(s, port):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   194
        if s.startswith(b'https://'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   195
            if port != b'443':
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   196
                s += b':' + port
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   197
        else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   198
            if port != b'80':
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   199
                s += b':' + port
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   200
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   201
        return s
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   202
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   203
    if env.get(b'HTTP_HOST'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   204
        fullurl += env[b'HTTP_HOST']
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   205
    else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   206
        fullurl += env[b'SERVER_NAME']
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   207
        fullurl = addport(fullurl, env[b'SERVER_PORT'])
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   208
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   209
    if altbaseurl and altbaseurl.host:
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   210
        advertisedfullurl += altbaseurl.host
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   211
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   212
        if altbaseurl.port:
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   213
            port = altbaseurl.port
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   214
        elif altbaseurl.scheme == b'http' and not altbaseurl.port:
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   215
            port = b'80'
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   216
        elif altbaseurl.scheme == b'https' and not altbaseurl.port:
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   217
            port = b'443'
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   218
        else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   219
            port = env[b'SERVER_PORT']
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   220
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   221
        advertisedfullurl = addport(advertisedfullurl, port)
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   222
    else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   223
        advertisedfullurl += env[b'SERVER_NAME']
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   224
        advertisedfullurl = addport(advertisedfullurl, env[b'SERVER_PORT'])
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   225
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   226
    baseurl = fullurl
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   227
    advertisedbaseurl = advertisedfullurl
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   228
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   229
    fullurl += util.urlreq.quote(env.get(b'SCRIPT_NAME', b''))
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   230
    fullurl += util.urlreq.quote(env.get(b'PATH_INFO', b''))
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   231
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   232
    if altbaseurl:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   233
        path = altbaseurl.path or b''
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   234
        if path and not path.startswith(b'/'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   235
            path = b'/' + path
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   236
        advertisedfullurl += util.urlreq.quote(path)
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   237
    else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   238
        advertisedfullurl += util.urlreq.quote(env.get(b'SCRIPT_NAME', b''))
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   239
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   240
    advertisedfullurl += util.urlreq.quote(env.get(b'PATH_INFO', b''))
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   241
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   242
    if env.get(b'QUERY_STRING'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   243
        fullurl += b'?' + env[b'QUERY_STRING']
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   244
        advertisedfullurl += b'?' + env[b'QUERY_STRING']
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   245
36897
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   246
    # If ``reponame`` is defined, that must be a prefix on PATH_INFO
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   247
    # that represents the repository being dispatched to. When computing
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   248
    # the dispatch info, we ignore these leading path components.
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   249
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   250
    if altbaseurl:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   251
        apppath = altbaseurl.path or b''
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   252
        if apppath and not apppath.startswith(b'/'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   253
            apppath = b'/' + apppath
36900
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36899
diff changeset
   254
    else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   255
        apppath = env.get(b'SCRIPT_NAME', b'')
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   256
36897
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   257
    if reponame:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   258
        repoprefix = b'/' + reponame.strip(b'/')
36808
0031e972ded2 hgweb: use the parsed application path directly
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36806
diff changeset
   259
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   260
        if not env.get(b'PATH_INFO'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   261
            raise error.ProgrammingError(b'reponame requires PATH_INFO')
36897
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   262
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   263
        if not env[b'PATH_INFO'].startswith(repoprefix):
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   264
            raise error.ProgrammingError(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   265
                b'PATH_INFO does not begin with repo '
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   266
                b'name: %s (%s)' % (env[b'PATH_INFO'], reponame)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   267
            )
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   268
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   269
        dispatchpath = env[b'PATH_INFO'][len(repoprefix) :]
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   270
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   271
        if dispatchpath and not dispatchpath.startswith(b'/'):
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   272
            raise error.ProgrammingError(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   273
                b'reponame prefix of PATH_INFO does '
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   274
                b'not end at path delimiter: %s (%s)'
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   275
                % (env[b'PATH_INFO'], reponame)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   276
            )
36897
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36896
diff changeset
   277
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   278
        apppath = apppath.rstrip(b'/') + repoprefix
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   279
        dispatchparts = dispatchpath.strip(b'/').split(b'/')
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   280
        dispatchpath = b'/'.join(dispatchparts)
36898
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36897
diff changeset
   281
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   282
    elif b'PATH_INFO' in env:
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   283
        if env[b'PATH_INFO'].strip(b'/'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   284
            dispatchparts = env[b'PATH_INFO'].strip(b'/').split(b'/')
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   285
            dispatchpath = b'/'.join(dispatchparts)
36898
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36897
diff changeset
   286
        else:
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36897
diff changeset
   287
            dispatchparts = []
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   288
            dispatchpath = b''
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   289
    else:
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   290
        dispatchparts = []
36898
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36897
diff changeset
   291
        dispatchpath = None
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   292
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   293
    querystring = env.get(b'QUERY_STRING', b'')
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   294
36809
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36808
diff changeset
   295
    # We store as a list so we have ordering information. We also store as
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36808
diff changeset
   296
    # a dict to facilitate fast lookup.
36862
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
   297
    qsparams = multidict()
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
   298
    for k, v in util.urlreq.parseqsl(querystring, keep_blank_values=True):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36861
diff changeset
   299
        qsparams.add(k, v)
36809
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36808
diff changeset
   300
36814
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36811
diff changeset
   301
    # HTTP_* keys contain HTTP request headers. The Headers structure should
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36811
diff changeset
   302
    # perform case normalization for us. We just rewrite underscore to dash
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36811
diff changeset
   303
    # so keys match what likely went over the wire.
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36811
diff changeset
   304
    headers = []
48913
f254fc73d956 global: bulk replace simple pycompat.iteritems(x) with x.items()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48905
diff changeset
   305
    for k, v in env.items():
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   306
        if k.startswith(b'HTTP_'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   307
            headers.append((k[len(b'HTTP_') :].replace(b'_', b'-'), v))
36814
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36811
diff changeset
   308
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   309
    from . import wsgiheaders  # avoid cycle
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   310
36814
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36811
diff changeset
   311
    headers = wsgiheaders.Headers(headers)
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36811
diff changeset
   312
36847
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36814
diff changeset
   313
    # This is kind of a lie because the HTTP header wasn't explicitly
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36814
diff changeset
   314
    # sent. But for all intents and purposes it should be OK to lie about
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36814
diff changeset
   315
    # this, since a consumer will either either value to determine how many
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36814
diff changeset
   316
    # bytes are available to read.
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   317
    if b'CONTENT_LENGTH' in env and b'HTTP_CONTENT_LENGTH' not in env:
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   318
        headers[b'Content-Length'] = env[b'CONTENT_LENGTH']
36847
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36814
diff changeset
   319
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   320
    if b'CONTENT_TYPE' in env and b'HTTP_CONTENT_TYPE' not in env:
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   321
        headers[b'Content-Type'] = env[b'CONTENT_TYPE']
37049
55e901396005 hgweb: also set Content-Type header
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36997
diff changeset
   322
37818
877185de62cf hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37616
diff changeset
   323
    if bodyfh is None:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   324
        bodyfh = env[b'wsgi.input']
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   325
        if b'Content-Length' in headers:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   326
            bodyfh = util.cappedreader(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   327
                bodyfh, int(headers[b'Content-Length'] or b'0')
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   328
            )
36857
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36856
diff changeset
   329
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   330
    return parsedrequest(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   331
        method=env[b'REQUEST_METHOD'],
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   332
        url=fullurl,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   333
        baseurl=baseurl,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   334
        advertisedurl=advertisedfullurl,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   335
        advertisedbaseurl=advertisedbaseurl,
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   336
        urlscheme=env[b'wsgi.url_scheme'],
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   337
        remoteuser=env.get(b'REMOTE_USER'),
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   338
        remotehost=env.get(b'REMOTE_HOST'),
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   339
        apppath=apppath,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   340
        dispatchparts=dispatchparts,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   341
        dispatchpath=dispatchpath,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   342
        reponame=reponame,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   343
        querystring=querystring,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   344
        qsparams=qsparams,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   345
        headers=headers,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   346
        bodyfh=bodyfh,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   347
        rawenv=env,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   348
    )
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   349
36806
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36713
diff changeset
   350
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48913
diff changeset
   351
class offsettrackingwriter:
36875
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   352
    """A file object like object that is append only and tracks write count.
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   353
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   354
    Instances are bound to a callable. This callable is called with data
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   355
    whenever a ``write()`` is attempted.
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   356
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   357
    Instances track the amount of written data so they can answer ``tell()``
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   358
    requests.
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   359
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   360
    The intent of this class is to wrap the ``write()`` function returned by
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   361
    a WSGI ``start_response()`` function. Since ``write()`` is a callable and
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   362
    not a file object, it doesn't implement other file object methods.
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   363
    """
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   364
36875
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   365
    def __init__(self, writefn):
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   366
        self._write = writefn
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   367
        self._offset = 0
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   368
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   369
    def write(self, s):
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   370
        res = self._write(s)
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   371
        # Some Python objects don't report the number of bytes written.
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   372
        if res is None:
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   373
            self._offset += len(s)
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   374
        else:
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   375
            self._offset += res
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   376
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   377
    def flush(self):
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   378
        pass
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   379
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   380
    def tell(self):
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   381
        return self._offset
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36868
diff changeset
   382
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   383
48946
642e31cb55f0 py3: use class X: instead of class X(object):
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48913
diff changeset
   384
class wsgiresponse:
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   385
    """Represents a response to a WSGI request.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   386
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   387
    A response consists of a status line, headers, and a body.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   388
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   389
    Consumers must populate the ``status`` and ``headers`` fields and
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   390
    make a call to a ``setbody*()`` method before the response can be
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   391
    issued.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   392
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   393
    When it is time to start sending the response over the wire,
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   394
    ``sendresponse()`` is called. It handles emitting the header portion
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   395
    of the response message. It then yields chunks of body data to be
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   396
    written to the peer. Typically, the WSGI application itself calls
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   397
    and returns the value from ``sendresponse()``.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   398
    """
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   399
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   400
    def __init__(self, req, startresponse):
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   401
        """Create an empty response tied to a specific request.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   402
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   403
        ``req`` is a ``parsedrequest``. ``startresponse`` is the
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   404
        ``start_response`` function passed to the WSGI application.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   405
        """
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   406
        self._req = req
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   407
        self._startresponse = startresponse
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   408
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   409
        self.status = None
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   410
        from . import wsgiheaders  # avoid cycle
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   411
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   412
        self.headers = wsgiheaders.Headers([])
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   413
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   414
        self._bodybytes = None
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   415
        self._bodygen = None
36876
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   416
        self._bodywillwrite = False
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   417
        self._started = False
36876
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   418
        self._bodywritefn = None
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   419
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   420
    def _verifybody(self):
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   421
        if (
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   422
            self._bodybytes is not None
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   423
            or self._bodygen is not None
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   424
            or self._bodywillwrite
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   425
        ):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   426
            raise error.ProgrammingError(b'cannot define body multiple times')
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   427
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   428
    def setbodybytes(self, b):
36878
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   429
        """Define the response body as static bytes.
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   430
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   431
        The empty string signals that there is no response body.
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   432
        """
36876
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   433
        self._verifybody()
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   434
        self._bodybytes = b
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   435
        self.headers[b'Content-Length'] = b'%d' % len(b)
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   436
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   437
    def setbodygen(self, gen):
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   438
        """Define the response body as a generator of bytes."""
36876
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   439
        self._verifybody()
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   440
        self._bodygen = gen
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   441
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   442
    def setbodywillwrite(self):
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   443
        """Signal an intent to use write() to emit the response body.
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   444
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   445
        **This is the least preferred way to send a body.**
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   446
36876
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   447
        It is preferred for WSGI applications to emit a generator of chunks
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   448
        constituting the response body. However, some consumers can't emit
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   449
        data this way. So, WSGI provides a way to obtain a ``write(data)``
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   450
        function that can be used to synchronously perform an unbuffered
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   451
        write.
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   452
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   453
        Calling this function signals an intent to produce the body in this
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   454
        manner.
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   455
        """
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   456
        self._verifybody()
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   457
        self._bodywillwrite = True
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   458
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   459
    def sendresponse(self):
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   460
        """Send the generated response to the client.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   461
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   462
        Before this is called, ``status`` must be set and one of
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   463
        ``setbodybytes()`` or ``setbodygen()`` must be called.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   464
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   465
        Calling this method multiple times is not allowed.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   466
        """
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   467
        if self._started:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   468
            raise error.ProgrammingError(
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   469
                b'sendresponse() called multiple times'
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   470
            )
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   471
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   472
        self._started = True
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   473
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   474
        if not self.status:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   475
            raise error.ProgrammingError(b'status line not defined')
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   476
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   477
        if (
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   478
            self._bodybytes is None
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   479
            and self._bodygen is None
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   480
            and not self._bodywillwrite
49846
fda5a4b853ab hgweb: skip body creation of HEAD for most requests
Joerg Sonnenberger <joerg@bec.de>
parents: 48946
diff changeset
   481
            and self._req.method != b'HEAD'
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   482
        ):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   483
            raise error.ProgrammingError(b'response body not defined')
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   484
36878
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   485
        # RFC 7232 Section 4.1 states that a 304 MUST generate one of
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   486
        # {Cache-Control, Content-Location, Date, ETag, Expires, Vary}
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   487
        # and SHOULD NOT generate other headers unless they could be used
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   488
        # to guide cache updates. Furthermore, RFC 7230 Section 3.3.2
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   489
        # states that no response body can be issued. Content-Length can
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   490
        # be sent. But if it is present, it should be the size of the response
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   491
        # that wasn't transferred.
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   492
        if self.status.startswith(b'304 '):
36878
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   493
            # setbodybytes('') will set C-L to 0. This doesn't conform with the
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   494
            # spec. So remove it.
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   495
            if self.headers.get(b'Content-Length') == b'0':
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   496
                del self.headers[b'Content-Length']
36878
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   497
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   498
            # Strictly speaking, this is too strict. But until it causes
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   499
            # problems, let's be strict.
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   500
            badheaders = {
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   501
                k
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   502
                for k in self.headers.keys()
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   503
                if k.lower()
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   504
                not in (
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   505
                    b'date',
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   506
                    b'etag',
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   507
                    b'expires',
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   508
                    b'cache-control',
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   509
                    b'content-location',
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   510
                    b'content-security-policy',
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   511
                    b'vary',
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   512
                )
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   513
            }
36878
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   514
            if badheaders:
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   515
                raise error.ProgrammingError(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   516
                    b'illegal header on 304 response: %s'
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   517
                    % b', '.join(sorted(badheaders))
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   518
                )
36878
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   519
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   520
            if self._bodygen is not None or self._bodywillwrite:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   521
                raise error.ProgrammingError(
43117
8ff1ecfadcd1 cleanup: join string literals that are already on one line
Martin von Zweigbergk <martinvonz@google.com>
parents: 43106
diff changeset
   522
                    b"must use setbodybytes('') with 304 responses"
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   523
                )
36878
ccb70a77f746 hgweb: refactor 304 handling code
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36876
diff changeset
   524
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   525
        # Various HTTP clients (notably httplib) won't read the HTTP response
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   526
        # until the HTTP request has been sent in full. If servers (us) send a
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   527
        # response before the HTTP request has been fully sent, the connection
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   528
        # may deadlock because neither end is reading.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   529
        #
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   530
        # We work around this by "draining" the request data before
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   531
        # sending any response in some conditions.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   532
        drain = False
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   533
        close = False
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   534
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   535
        # If the client sent Expect: 100-continue, we assume it is smart enough
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   536
        # to deal with the server sending a response before reading the request.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   537
        # (httplib doesn't do this.)
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   538
        if self._req.headers.get(b'Expect', b'').lower() == b'100-continue':
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   539
            pass
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   540
        # Only tend to request methods that have bodies. Strictly speaking,
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   541
        # we should sniff for a body. But this is fine for our existing
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   542
        # WSGI applications.
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   543
        elif self._req.method not in (b'POST', b'PUT'):
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   544
            pass
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   545
        else:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   546
            # If we don't know how much data to read, there's no guarantee
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   547
            # that we can drain the request responsibly. The WSGI
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   548
            # specification only says that servers *should* ensure the
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   549
            # input stream doesn't overrun the actual request. So there's
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   550
            # no guarantee that reading until EOF won't corrupt the stream
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   551
            # state.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   552
            if not isinstance(self._req.bodyfh, util.cappedreader):
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   553
                close = True
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   554
            else:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   555
                # We /could/ only drain certain HTTP response codes. But 200 and
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   556
                # non-200 wire protocol responses both require draining. Since
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   557
                # we have a capped reader in place for all situations where we
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   558
                # drain, it is safe to read from that stream. We'll either do
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   559
                # a drain or no-op if we're already at EOF.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   560
                drain = True
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   561
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   562
        if close:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   563
            self.headers[b'Connection'] = b'Close'
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   564
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   565
        if drain:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   566
            assert isinstance(self._req.bodyfh, util.cappedreader)
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   567
            while True:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   568
                chunk = self._req.bodyfh.read(32768)
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   569
                if not chunk:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   570
                    break
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   571
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   572
        strheaders = [
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   573
            (pycompat.strurl(k), pycompat.strurl(v))
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   574
            for k, v in self.headers.items()
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   575
        ]
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   576
        write = self._startresponse(pycompat.sysstr(self.status), strheaders)
36876
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   577
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   578
        if self._bodybytes:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   579
            yield self._bodybytes
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   580
        elif self._bodygen:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   581
            for chunk in self._bodygen:
40545
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   582
                # PEP-3333 says that output must be bytes. And some WSGI
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   583
                # implementations enforce this. We cast bytes-like types here
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   584
                # for convenience.
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   585
                if isinstance(chunk, bytearray):
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   586
                    chunk = bytes(chunk)
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   587
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   588
                yield chunk
36876
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   589
        elif self._bodywillwrite:
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   590
            self._bodywritefn = write
49846
fda5a4b853ab hgweb: skip body creation of HEAD for most requests
Joerg Sonnenberger <joerg@bec.de>
parents: 48946
diff changeset
   591
        elif self._req.method == b'HEAD':
fda5a4b853ab hgweb: skip body creation of HEAD for most requests
Joerg Sonnenberger <joerg@bec.de>
parents: 48946
diff changeset
   592
            pass
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   593
        else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   594
            error.ProgrammingError(b'do not know how to send body')
36861
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36859
diff changeset
   595
36876
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   596
    def getbodyfile(self):
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   597
        """Obtain a file object like object representing the response body.
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   598
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   599
        For this to work, you must call ``setbodywillwrite()`` and then
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   600
        ``sendresponse()`` first. ``sendresponse()`` is a generator and the
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   601
        function won't run to completion unless the generator is advanced. The
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   602
        generator yields not items. The easiest way to consume it is with
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   603
        ``list(res.sendresponse())``, which should resolve to an empty list -
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   604
        ``[]``.
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   605
        """
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   606
        if not self._bodywillwrite:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   607
            raise error.ProgrammingError(b'must call setbodywillwrite() first')
36876
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   608
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   609
        if not self._started:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   610
            raise error.ProgrammingError(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   611
                b'must call sendresponse() first; did '
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   612
                b'you remember to consume it since it '
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   613
                b'is a generator?'
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   614
            )
36876
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   615
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   616
        assert self._bodywritefn
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   617
        return offsettrackingwriter(self._bodywritefn)
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36875
diff changeset
   618
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   619
5566
d74fc8dec2b4 Less indirection in the WSGI web interface. This simplifies some code, and makes it more compliant with WSGI.
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5563
diff changeset
   620
def wsgiapplication(app_maker):
45942
89a2afe31e82 formating: upgrade to black 20.8b1
Augie Fackler <raf@durin42.com>
parents: 45004
diff changeset
   621
    """For compatibility with old CGI scripts. A plain hgweb() or hgwebdir()
89a2afe31e82 formating: upgrade to black 20.8b1
Augie Fackler <raf@durin42.com>
parents: 45004
diff changeset
   622
    can and should now be used as a WSGI application."""
5760
0145f9afb0e7 Removed tabs and trailing whitespace in python files
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5566
diff changeset
   623
    application = app_maker()
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   624
5760
0145f9afb0e7 Removed tabs and trailing whitespace in python files
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5566
diff changeset
   625
    def run_wsgi(env, respond):
5887
41a3fce17625 hgweb: return iterable, add deprecation note
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5886
diff changeset
   626
        return application(env, respond)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40545
diff changeset
   627
5760
0145f9afb0e7 Removed tabs and trailing whitespace in python files
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5566
diff changeset
   628
    return run_wsgi