mercurial/httpconnection.py
author Augie Fackler <augie@google.com>
Sun, 25 Feb 2018 23:51:32 -0500
changeset 36426 23d12524a202
parent 35358 8549ca7fcde1
child 36651 6b1eb4c610b4
permissions -rw-r--r--
http: drop custom http client logic Eight and a half years ago, as my starter bug on code.google.com, I investigated a mysterious "broken pipe" error from seemingly random clients[0]. That investigation revealed a tragic story: the Python standard library's httplib was (and remains) barely functional. During large POSTs, if a server responds early with an error (even a permission denied error!) the client only notices that the server closed the connection and everything breaks. Such server behavior is implicitly legal under RFC 2616 (the latest HTTP RFC as of when I was last working on this), and my understanding is that later RFCs have made it explicitly legal to respond early with any status code outside the 2xx range. I embarked, probably foolishly, on a journey to write a new http library with better overall behavior. The http library appears to work well in most cases, but it can get confused in the presence of proxies, and it depends on select(2) which limits its utility if a lot of file descriptors are open. I haven't touched the http library in almost two years, and in the interim the Python community has discovered a better way[1] of writing network code. In theory some day urllib3 will have its own home-grown http library built on h11[2], or we could do that. Either way, it's time to declare our current confusingly-named "http2" client logic and move on. I do hope to revisit this some day: it's still garbage that we can't even respond with a 401 or 403 without reading the entire POST body from the client, but the goalposts on writing a new http client library have moved substantially. We're almost certainly better off just switching to requests and eventually picking up their http fixes than trying to live with something that realistically only we'll ever use. Another approach would be to write an adapter so that Mercurial can use pycurl if it's installed. Neither of those approaches seem like they should be investigated prior to a release of Mercurial that works on Python 3: that's where the mindshare is going to be for any improvements to the state of the http client art. 0: http://web.archive.org/web/20130501031801/http://code.google.com/p/support/issues/detail?id=2716 1: http://sans-io.readthedocs.io/ 2: https://github.com/njsmith/h11 Differential Revision: https://phab.mercurial-scm.org/D2444

# httpconnection.py - urllib2 handler for new http support
#
# Copyright 2005, 2006, 2007, 2008 Matt Mackall <mpm@selenic.com>
# Copyright 2006, 2007 Alexis S. L. Carvalho <alexis@cecm.usp.br>
# Copyright 2006 Vadim Gelfer <vadim.gelfer@gmail.com>
# Copyright 2011 Google, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

import os

from .i18n import _
from . import (
    util,
)

urlerr = util.urlerr
urlreq = util.urlreq

# moved here from url.py to avoid a cycle
class httpsendfile(object):
    """This is a wrapper around the objects returned by python's "open".

    Its purpose is to send file-like objects via HTTP.
    It do however not define a __len__ attribute because the length
    might be more than Py_ssize_t can handle.
    """

    def __init__(self, ui, *args, **kwargs):
        self.ui = ui
        self._data = open(*args, **kwargs)
        self.seek = self._data.seek
        self.close = self._data.close
        self.write = self._data.write
        self.length = os.fstat(self._data.fileno()).st_size
        self._pos = 0
        self._total = self.length // 1024 * 2

    def read(self, *args, **kwargs):
        ret = self._data.read(*args, **kwargs)
        if not ret:
            self.ui.progress(_('sending'), None)
            return ret
        self._pos += len(ret)
        # We pass double the max for total because we currently have
        # to send the bundle twice in the case of a server that
        # requires authentication. Since we can't know until we try
        # once whether authentication will be required, just lie to
        # the user and maybe the push succeeds suddenly at 50%.
        self.ui.progress(_('sending'), self._pos // 1024,
                         unit=_('kb'), total=self._total)
        return ret

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()

# moved here from url.py to avoid a cycle
def readauthforuri(ui, uri, user):
    # Read configuration
    groups = {}
    for key, val in ui.configitems('auth'):
        if key in ('cookiefile',):
            continue

        if '.' not in key:
            ui.warn(_("ignoring invalid [auth] key '%s'\n") % key)
            continue
        group, setting = key.rsplit('.', 1)
        gdict = groups.setdefault(group, {})
        if setting in ('username', 'cert', 'key'):
            val = util.expandpath(val)
        gdict[setting] = val

    # Find the best match
    scheme, hostpath = uri.split('://', 1)
    bestuser = None
    bestlen = 0
    bestauth = None
    for group, auth in groups.iteritems():
        if user and user != auth.get('username', user):
            # If a username was set in the URI, the entry username
            # must either match it or be unset
            continue
        prefix = auth.get('prefix')
        if not prefix:
            continue
        p = prefix.split('://', 1)
        if len(p) > 1:
            schemes, prefix = [p[0]], p[1]
        else:
            schemes = (auth.get('schemes') or 'https').split()
        if (prefix == '*' or hostpath.startswith(prefix)) and \
            (len(prefix) > bestlen or (len(prefix) == bestlen and \
                not bestuser and 'username' in auth)) \
             and scheme in schemes:
            bestlen = len(prefix)
            bestauth = group, auth
            bestuser = auth.get('username')
            if user and not bestuser:
                auth['username'] = user
    return bestauth