wireproto: define and implement HTTP handshake to upgrade protocol
authorGregory Szorc <gregory.szorc@gmail.com>
Tue, 10 Apr 2018 14:29:15 -0700
changeset 37557 734515aca84d
parent 37556 b77aa48ba690
child 37558 8a73132214a3
wireproto: define and implement HTTP handshake to upgrade protocol When clients connect to repositories over HTTP, they issue a request to the well-known URL "?cmd=capabilities" to fetch the repository capabilities. This is the handshake portion of the HTTP protocol. This commit defines a mechanism to use that HTTP request to return information about modern server features. If a client sends an X-HgUpgrade-* header containing a list of client-supported API names, the server responds with a response containing information about available services. This includes the normal capabilities string. So if the server doesn't support any newer services, the client can easily fall back. By advertising supported services from clients, server operators can see and log what client support exists in the wild. This will also help with debugging. The response contains the base path to API services. We know there are potential issues with the <repo>/api/ URL space conflicting with hgwebdir and subrepos. By making the API URL dynamic from the perspective of the client, the URL for APIs is not subject to backwards compatibility concerns - at least as long as a ?cmd=capabilities request is made. We've also defined the ``cbor`` client capability for the X-HgProto-* header. This MUST be sent in order to get the modern response from "?cmd=capabilities". During implementation, I initially always sent an application/mercurial-cbor response. However, the handshake mechanism will be more future compatible if the client is in charge of which formats to request. We already perform content negotiation from X-HgProto-*, so keying off this for the capabilities response feels appropriate. In addition, I initially used application/cbor. However, it is conceivable that a non-Mercurial server could serve application/cbor. To rule out this possibility, I've invented a new media type that is Mercurial specific and can't be confused for generic CBOR. Differential Revision: https://phab.mercurial-scm.org/D3242
mercurial/debugcommands.py
mercurial/help/internals/wireprotocol.txt
mercurial/wireprotoserver.py
mercurial/wireprotov2server.py
tests/test-wireproto-command-capabilities.t
--- a/mercurial/debugcommands.py	Tue Apr 10 18:13:28 2018 -0700
+++ b/mercurial/debugcommands.py	Tue Apr 10 14:29:15 2018 -0700
@@ -33,6 +33,9 @@
     nullrev,
     short,
 )
+from .thirdparty import (
+    cbor,
+)
 from . import (
     bundle2,
     changegroup,
@@ -3045,9 +3048,14 @@
             req.get_method = lambda: method
 
             try:
-                opener.open(req).read()
+                res = opener.open(req)
+                body = res.read()
             except util.urlerr.urlerror as e:
                 e.read()
+                continue
+
+            if res.headers.get('Content-Type') == 'application/mercurial-cbor':
+                ui.write(_('cbor> %s\n') % stringutil.pprint(cbor.loads(body)))
 
         elif action == 'close':
             peer.close()
--- a/mercurial/help/internals/wireprotocol.txt	Tue Apr 10 18:13:28 2018 -0700
+++ b/mercurial/help/internals/wireprotocol.txt	Tue Apr 10 14:29:15 2018 -0700
@@ -42,8 +42,44 @@
 The client sends a ``capabilities`` command request (``?cmd=capabilities``)
 as soon as HTTP requests may be issued.
 
-The server responds with a capabilities string, which the client parses to
-learn about the server's abilities.
+By default, the server responds with a version 1 capabilities string, which
+the client parses to learn about the server's abilities. The ``Content-Type``
+for this response is ``application/mercurial-0.1`` or
+``application/mercurial-0.2`` depending on whether the client advertised
+support for version ``0.2`` in its request. (Clients aren't supposed to
+advertise support for ``0.2`` until the capabilities response indicates
+the server's support for that media type. However, a client could
+conceivably cache this metadata and issue the capabilities request in such
+a way to elicit an ``application/mercurial-0.2`` response.)
+
+Clients wishing to switch to a newer API service may send an
+``X-HgUpgrade-<X>`` header containing a space-delimited list of API service
+names the client is capable of speaking. The request MUST also include an
+``X-HgProto-<X>`` header advertising a known serialization format for the
+response. ``cbor`` is currently the only defined serialization format.
+
+If the request contains these headers, the response ``Content-Type`` MAY
+be for a different media type. e.g. ``application/mercurial-cbor`` if the
+client advertises support for CBOR.
+
+The response MUST be deserializable to a map with the following keys:
+
+apibase
+   URL path to API services, relative to the repository root. e.g. ``api/``.
+
+apis
+   A map of API service names to API descriptors. An API descriptor contains
+   more details about that API. In the case of the HTTP Version 2 Transport,
+   it will be the normal response to a ``capabilities`` command.
+
+   Only the services advertised by the client that are also available on
+   the server are advertised.
+
+v1capabilities
+   The capabilities string that would be returned by a version 1 response.
+
+The client can then inspect the server-advertised APIs and decide which
+API to use, including continuing to use the HTTP Version 1 Transport.
 
 HTTP Version 1 Transport
 ------------------------
@@ -123,6 +159,9 @@
 The content of the HTTP response body typically holds text describing the
 error.
 
+The ``application/mercurial-cbor`` media type indicates a CBOR payload
+and should be interpreted as identical to ``application/cbor``.
+
 Behavior of media types is further described in the ``Content Negotiation``
 section below.
 
@@ -1252,6 +1291,12 @@
    Indicates the client supports receiving ``application/mercurial-0.2``
    responses.
 
+cbor
+   Indicates the client supports receiving ``application/mercurial-cbor``
+   responses.
+
+   (Only intended to be used with version 2 transports.)
+
 comp
    Indicates compression formats the client can decode. Value is a list of
    comma delimited strings identifying compression formats ordered from
--- a/mercurial/wireprotoserver.py	Tue Apr 10 18:13:28 2018 -0700
+++ b/mercurial/wireprotoserver.py	Tue Apr 10 14:29:15 2018 -0700
@@ -12,6 +12,9 @@
 import threading
 
 from .i18n import _
+from .thirdparty import (
+    cbor,
+)
 from .thirdparty.zope import (
     interface as zi,
 )
@@ -230,6 +233,18 @@
 
     return True
 
+def _availableapis(repo):
+    apis = set()
+
+    # Registered APIs are made available via config options of the name of
+    # the protocol.
+    for k, v in API_HANDLERS.items():
+        section, option = v['config']
+        if repo.ui.configbool(section, option):
+            apis.add(k)
+
+    return apis
+
 def handlewsgiapirequest(rctx, req, res, checkperm):
     """Handle requests to /api/*."""
     assert req.dispatchparts[0] == b'api'
@@ -247,13 +262,7 @@
     # The URL space is /api/<protocol>/*. The structure of URLs under varies
     # by <protocol>.
 
-    # Registered APIs are made available via config options of the name of
-    # the protocol.
-    availableapis = set()
-    for k, v in API_HANDLERS.items():
-        section, option = v['config']
-        if repo.ui.configbool(section, option):
-            availableapis.add(k)
+    availableapis = _availableapis(repo)
 
     # Requests to /api/ list available APIs.
     if req.dispatchparts == [b'api']:
@@ -287,10 +296,21 @@
                                    req.dispatchparts[2:])
 
 # Maps API name to metadata so custom API can be registered.
+# Keys are:
+#
+# config
+#    Config option that controls whether service is enabled.
+# handler
+#    Callable receiving (rctx, req, res, checkperm, urlparts) that is called
+#    when a request to this API is received.
+# apidescriptor
+#    Callable receiving (req, repo) that is called to obtain an API
+#    descriptor for this service. The response must be serializable to CBOR.
 API_HANDLERS = {
     wireprotov2server.HTTPV2: {
         'config': ('experimental', 'web.api.http-v2'),
         'handler': wireprotov2server.handlehttpv2request,
+        'apidescriptor': wireprotov2server.httpv2apidescriptor,
     },
 }
 
@@ -327,6 +347,54 @@
     opts = {'level': ui.configint('server', 'zliblevel')}
     return HGTYPE, util.compengines['zlib'], opts
 
+def processcapabilitieshandshake(repo, req, res, proto):
+    """Called during a ?cmd=capabilities request.
+
+    If the client is advertising support for a newer protocol, we send
+    a CBOR response with information about available services. If no
+    advertised services are available, we don't handle the request.
+    """
+    # Fall back to old behavior unless the API server is enabled.
+    if not repo.ui.configbool('experimental', 'web.apiserver'):
+        return False
+
+    clientapis = decodevaluefromheaders(req, b'X-HgUpgrade')
+    protocaps = decodevaluefromheaders(req, b'X-HgProto')
+    if not clientapis or not protocaps:
+        return False
+
+    # We currently only support CBOR responses.
+    protocaps = set(protocaps.split(' '))
+    if b'cbor' not in protocaps:
+        return False
+
+    descriptors = {}
+
+    for api in sorted(set(clientapis.split()) & _availableapis(repo)):
+        handler = API_HANDLERS[api]
+
+        descriptorfn = handler.get('apidescriptor')
+        if not descriptorfn:
+            continue
+
+        descriptors[api] = descriptorfn(req, repo)
+
+    v1caps = wireproto.dispatch(repo, proto, 'capabilities')
+    assert isinstance(v1caps, wireprototypes.bytesresponse)
+
+    m = {
+        # TODO allow this to be configurable.
+        'apibase': 'api/',
+        'apis': descriptors,
+        'v1capabilities': v1caps.data,
+    }
+
+    res.status = b'200 OK'
+    res.headers[b'Content-Type'] = b'application/mercurial-cbor'
+    res.setbodybytes(cbor.dumps(m, canonical=True))
+
+    return True
+
 def _callhttp(repo, req, res, proto, cmd):
     # Avoid cycle involving hg module.
     from .hgweb import common as hgwebcommon
@@ -363,6 +431,12 @@
 
     proto.checkperm(wireproto.commands[cmd].permission)
 
+    # Possibly handle a modern client wanting to switch protocols.
+    if (cmd == 'capabilities' and
+        processcapabilitieshandshake(repo, req, res, proto)):
+
+        return
+
     rsp = wireproto.dispatch(repo, proto, cmd)
 
     if isinstance(rsp, bytes):
--- a/mercurial/wireprotov2server.py	Tue Apr 10 18:13:28 2018 -0700
+++ b/mercurial/wireprotov2server.py	Tue Apr 10 14:29:15 2018 -0700
@@ -365,6 +365,11 @@
     def checkperm(self, perm):
         raise NotImplementedError
 
+def httpv2apidescriptor(req, repo):
+    proto = httpv2protocolhandler(req, repo.ui)
+
+    return _capabilitiesv2(repo, proto)
+
 def _capabilitiesv2(repo, proto):
     """Obtain the set of capabilities for version 2 transports.
 
--- a/tests/test-wireproto-command-capabilities.t	Tue Apr 10 18:13:28 2018 -0700
+++ b/tests/test-wireproto-command-capabilities.t	Tue Apr 10 14:29:15 2018 -0700
@@ -1,11 +1,201 @@
   $ . $TESTDIR/wireprotohelpers.sh
 
   $ hg init server
-  $ enablehttpv2 server
+  $ hg -R server serve -p $HGPORT -d --pid-file hg.pid -E error.log
+  $ cat hg.pid > $DAEMON_PIDS
+
+A normal capabilities request is serviced for version 1
+
+  $ sendhttpraw << EOF
+  > httprequest GET ?cmd=capabilities
+  >     user-agent: test
+  > EOF
+  using raw connection to peer
+  s>     GET /?cmd=capabilities HTTP/1.1\r\n
+  s>     Accept-Encoding: identity\r\n
+  s>     user-agent: test\r\n
+  s>     host: $LOCALIP:$HGPORT\r\n (glob)
+  s>     \r\n
+  s> makefile('rb', None)
+  s>     HTTP/1.1 200 Script output follows\r\n
+  s>     Server: testing stub value\r\n
+  s>     Date: $HTTP_DATE$\r\n
+  s>     Content-Type: application/mercurial-0.1\r\n
+  s>     Content-Length: 458\r\n
+  s>     \r\n
+  s>     batch branchmap $USUAL_BUNDLE2_CAPS_SERVER$ changegroupsubset compression=$BUNDLE2_COMPRESSIONS$ getbundle httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx known lookup pushkey streamreqs=generaldelta,revlogv1 unbundle=HG10GZ,HG10BZ,HG10UN unbundlehash
+
+A proper request without the API server enabled returns the legacy response
+
+  $ sendhttpraw << EOF
+  > httprequest GET ?cmd=capabilities
+  >    user-agent: test
+  >    x-hgupgrade-1: foo
+  >    x-hgproto-1: cbor
+  > EOF
+  using raw connection to peer
+  s>     GET /?cmd=capabilities HTTP/1.1\r\n
+  s>     Accept-Encoding: identity\r\n
+  s>     user-agent: test\r\n
+  s>     x-hgproto-1: cbor\r\n
+  s>     x-hgupgrade-1: foo\r\n
+  s>     host: $LOCALIP:$HGPORT\r\n (glob)
+  s>     \r\n
+  s> makefile('rb', None)
+  s>     HTTP/1.1 200 Script output follows\r\n
+  s>     Server: testing stub value\r\n
+  s>     Date: $HTTP_DATE$\r\n
+  s>     Content-Type: application/mercurial-0.1\r\n
+  s>     Content-Length: 458\r\n
+  s>     \r\n
+  s>     batch branchmap $USUAL_BUNDLE2_CAPS_SERVER$ changegroupsubset compression=$BUNDLE2_COMPRESSIONS$ getbundle httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx known lookup pushkey streamreqs=generaldelta,revlogv1 unbundle=HG10GZ,HG10BZ,HG10UN unbundlehash
+
+Restart with just API server enabled. This enables serving the new format.
+
+  $ killdaemons.py
+  $ cat error.log
+
+  $ cat >> server/.hg/hgrc << EOF
+  > [experimental]
+  > web.apiserver = true
+  > EOF
+
   $ hg -R server serve -p $HGPORT -d --pid-file hg.pid -E error.log
   $ cat hg.pid > $DAEMON_PIDS
 
-capabilities request returns an array of capability strings
+X-HgUpgrade-<N> without CBOR advertisement uses legacy response
+
+  $ sendhttpraw << EOF
+  > httprequest GET ?cmd=capabilities
+  >    user-agent: test
+  >    x-hgupgrade-1: foo bar
+  > EOF
+  using raw connection to peer
+  s>     GET /?cmd=capabilities HTTP/1.1\r\n
+  s>     Accept-Encoding: identity\r\n
+  s>     user-agent: test\r\n
+  s>     x-hgupgrade-1: foo bar\r\n
+  s>     host: $LOCALIP:$HGPORT\r\n (glob)
+  s>     \r\n
+  s> makefile('rb', None)
+  s>     HTTP/1.1 200 Script output follows\r\n
+  s>     Server: testing stub value\r\n
+  s>     Date: $HTTP_DATE$\r\n
+  s>     Content-Type: application/mercurial-0.1\r\n
+  s>     Content-Length: 458\r\n
+  s>     \r\n
+  s>     batch branchmap $USUAL_BUNDLE2_CAPS_SERVER$ changegroupsubset compression=$BUNDLE2_COMPRESSIONS$ getbundle httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx known lookup pushkey streamreqs=generaldelta,revlogv1 unbundle=HG10GZ,HG10BZ,HG10UN unbundlehash
+
+X-HgUpgrade-<N> without known serialization in X-HgProto-<N> uses legacy response
+
+  $ sendhttpraw << EOF
+  > httprequest GET ?cmd=capabilities
+  >    user-agent: test
+  >    x-hgupgrade-1: foo bar
+  >    x-hgproto-1: some value
+  > EOF
+  using raw connection to peer
+  s>     GET /?cmd=capabilities HTTP/1.1\r\n
+  s>     Accept-Encoding: identity\r\n
+  s>     user-agent: test\r\n
+  s>     x-hgproto-1: some value\r\n
+  s>     x-hgupgrade-1: foo bar\r\n
+  s>     host: $LOCALIP:$HGPORT\r\n (glob)
+  s>     \r\n
+  s> makefile('rb', None)
+  s>     HTTP/1.1 200 Script output follows\r\n
+  s>     Server: testing stub value\r\n
+  s>     Date: $HTTP_DATE$\r\n
+  s>     Content-Type: application/mercurial-0.1\r\n
+  s>     Content-Length: 458\r\n
+  s>     \r\n
+  s>     batch branchmap $USUAL_BUNDLE2_CAPS_SERVER$ changegroupsubset compression=$BUNDLE2_COMPRESSIONS$ getbundle httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx known lookup pushkey streamreqs=generaldelta,revlogv1 unbundle=HG10GZ,HG10BZ,HG10UN unbundlehash
+
+X-HgUpgrade-<N> + X-HgProto-<N> headers trigger new response format
+
+  $ sendhttpraw << EOF
+  > httprequest GET ?cmd=capabilities
+  >    user-agent: test
+  >    x-hgupgrade-1: foo bar
+  >    x-hgproto-1: cbor
+  > EOF
+  using raw connection to peer
+  s>     GET /?cmd=capabilities HTTP/1.1\r\n
+  s>     Accept-Encoding: identity\r\n
+  s>     user-agent: test\r\n
+  s>     x-hgproto-1: cbor\r\n
+  s>     x-hgupgrade-1: foo bar\r\n
+  s>     host: $LOCALIP:$HGPORT\r\n (glob)
+  s>     \r\n
+  s> makefile('rb', None)
+  s>     HTTP/1.1 200 OK\r\n
+  s>     Server: testing stub value\r\n
+  s>     Date: $HTTP_DATE$\r\n
+  s>     Content-Type: application/mercurial-cbor\r\n
+  s>     Content-Length: 496\r\n
+  s>     \r\n
+  s>     \xa3Dapis\xa0GapibaseDapi/Nv1capabilitiesY\x01\xcabatch branchmap $USUAL_BUNDLE2_CAPS_SERVER$ changegroupsubset compression=$BUNDLE2_COMPRESSIONS$ getbundle httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx known lookup pushkey streamreqs=generaldelta,revlogv1 unbundle=HG10GZ,HG10BZ,HG10UN unbundlehash
+  cbor> {b'apibase': b'api/', b'apis': {}, b'v1capabilities': b'batch branchmap $USUAL_BUNDLE2_CAPS_SERVER$ changegroupsubset compression=$BUNDLE2_COMPRESSIONS$ getbundle httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx known lookup pushkey streamreqs=generaldelta,revlogv1 unbundle=HG10GZ,HG10BZ,HG10UN unbundlehash'}
+
+Restart server to enable HTTPv2
+
+  $ killdaemons.py
+  $ enablehttpv2 server
+  $ hg -R server serve -p $HGPORT -d --pid-file hg.pid -E error.log
+
+Only requested API services are returned
+
+  $ sendhttpraw << EOF
+  > httprequest GET ?cmd=capabilities
+  >    user-agent: test
+  >    x-hgupgrade-1: foo bar
+  >    x-hgproto-1: cbor
+  > EOF
+  using raw connection to peer
+  s>     GET /?cmd=capabilities HTTP/1.1\r\n
+  s>     Accept-Encoding: identity\r\n
+  s>     user-agent: test\r\n
+  s>     x-hgproto-1: cbor\r\n
+  s>     x-hgupgrade-1: foo bar\r\n
+  s>     host: $LOCALIP:$HGPORT\r\n (glob)
+  s>     \r\n
+  s> makefile('rb', None)
+  s>     HTTP/1.1 200 OK\r\n
+  s>     Server: testing stub value\r\n
+  s>     Date: $HTTP_DATE$\r\n
+  s>     Content-Type: application/mercurial-cbor\r\n
+  s>     Content-Length: 496\r\n
+  s>     \r\n
+  s>     \xa3Dapis\xa0GapibaseDapi/Nv1capabilitiesY\x01\xcabatch branchmap $USUAL_BUNDLE2_CAPS_SERVER$ changegroupsubset compression=$BUNDLE2_COMPRESSIONS$ getbundle httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx known lookup pushkey streamreqs=generaldelta,revlogv1 unbundle=HG10GZ,HG10BZ,HG10UN unbundlehash
+  cbor> {b'apibase': b'api/', b'apis': {}, b'v1capabilities': b'batch branchmap $USUAL_BUNDLE2_CAPS_SERVER$ changegroupsubset compression=$BUNDLE2_COMPRESSIONS$ getbundle httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx known lookup pushkey streamreqs=generaldelta,revlogv1 unbundle=HG10GZ,HG10BZ,HG10UN unbundlehash'}
+
+Request for HTTPv2 service returns information about it
+
+  $ sendhttpraw << EOF
+  > httprequest GET ?cmd=capabilities
+  >    user-agent: test
+  >    x-hgupgrade-1: exp-http-v2-0001 foo bar
+  >    x-hgproto-1: cbor
+  > EOF
+  using raw connection to peer
+  s>     GET /?cmd=capabilities HTTP/1.1\r\n
+  s>     Accept-Encoding: identity\r\n
+  s>     user-agent: test\r\n
+  s>     x-hgproto-1: cbor\r\n
+  s>     x-hgupgrade-1: exp-http-v2-0001 foo bar\r\n
+  s>     host: $LOCALIP:$HGPORT\r\n (glob)
+  s>     \r\n
+  s> makefile('rb', None)
+  s>     HTTP/1.1 200 OK\r\n
+  s>     Server: testing stub value\r\n
+  s>     Date: $HTTP_DATE$\r\n
+  s>     Content-Type: application/mercurial-cbor\r\n
+  s>     Content-Length: 879\r\n
+  s>     \r\n
+  s>     \xa3Dapis\xa1Pexp-http-v2-0001\xa2Hcommands\xa7Eheads\xa2Dargs\xa1Jpubliconly\xf4Kpermissions\x81DpullEknown\xa2Dargs\xa1Enodes\x81HdeadbeefKpermissions\x81DpullFlookup\xa2Dargs\xa1CkeyCfooKpermissions\x81DpullGpushkey\xa2Dargs\xa4CkeyCkeyCnewCnewColdColdInamespaceBnsKpermissions\x81DpushHlistkeys\xa2Dargs\xa1InamespaceBnsKpermissions\x81DpullIbranchmap\xa2Dargs\xa0Kpermissions\x81DpullLcapabilities\xa2Dargs\xa0Kpermissions\x81DpullKcompression\x82\xa1DnameDzstd\xa1DnameDzlibGapibaseDapi/Nv1capabilitiesY\x01\xcabatch branchmap $USUAL_BUNDLE2_CAPS_SERVER$ changegroupsubset compression=$BUNDLE2_COMPRESSIONS$ getbundle httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx known lookup pushkey streamreqs=generaldelta,revlogv1 unbundle=HG10GZ,HG10BZ,HG10UN unbundlehash
+  cbor> {b'apibase': b'api/', b'apis': {b'exp-http-v2-0001': {b'commands': {b'branchmap': {b'args': {}, b'permissions': [b'pull']}, b'capabilities': {b'args': {}, b'permissions': [b'pull']}, b'heads': {b'args': {b'publiconly': False}, b'permissions': [b'pull']}, b'known': {b'args': {b'nodes': [b'deadbeef']}, b'permissions': [b'pull']}, b'listkeys': {b'args': {b'namespace': b'ns'}, b'permissions': [b'pull']}, b'lookup': {b'args': {b'key': b'foo'}, b'permissions': [b'pull']}, b'pushkey': {b'args': {b'key': b'key', b'namespace': b'ns', b'new': b'new', b'old': b'old'}, b'permissions': [b'push']}}, b'compression': [{b'name': b'zstd'}, {b'name': b'zlib'}]}}, b'v1capabilities': b'batch branchmap $USUAL_BUNDLE2_CAPS_SERVER$ changegroupsubset compression=$BUNDLE2_COMPRESSIONS$ getbundle httpheader=1024 httpmediatype=0.1rx,0.1tx,0.2tx known lookup pushkey streamreqs=generaldelta,revlogv1 unbundle=HG10GZ,HG10BZ,HG10UN unbundlehash'}
+
+capabilities command returns expected info
 
   $ sendhttpv2peer << EOF
   > command capabilities