# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

"""advertise pre-generated bundles to seed clones

"clonebundles" is a server-side extension used to advertise the existence
of pre-generated, externally hosted bundle files to clients that are
cloning so that cloning can be faster, more reliable, and require less
resources on the server.
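
Like other extensions, clonebundles is enabled in the server's
configuration file; clients require no additional configuration:

    [extensions]
    clonebundles =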

Cloning can be a CPU- and I/O-intensive operation on servers. Traditionally,
the server, in response to a client's request to clone, dynamically generates
a bundle containing the entire repository content and sends it to the client.
There is no caching on the server, so the server must redundantly generate
the same outgoing bundle for every clone request. For servers with large
repositories or high clone volume, the load from clones can make scaling
the server challenging and costly.

This extension allows server operators to offload potentially expensive
clone load to an external service. Here's how it works.

1. A server operator establishes a mechanism for making bundle files available
   on a hosting service where Mercurial clients can fetch them.
2. A manifest file listing available bundle URLs and some optional metadata
   is added to the Mercurial repository on the server.
3. A client initiates a clone against a clone bundles-aware server.
4. The client sees the server is advertising clone bundles and fetches the
   manifest listing available bundles.
5. The client filters and sorts the available bundles based on what it
   supports and prefers.
6. The client downloads and applies an available bundle from the
   server-specified URL.
7. The client reconnects to the original server and performs the equivalent
   of :hg:`pull` to retrieve all repository data not in the bundle. (The
   repository could have been updated between when the bundle was created
   and when the client started the clone.)
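
With this extension in play, a clone might look roughly like the following
(hostnames and figures hypothetical):

    $ hg clone https://hg.example.com/myrepo
    applying clone bundle from https://cdn.example.com/myrepo-full.hg.gz
    adding changesets
    adding manifests
    adding file changes
    added 100000 changesets with 500000 changes to 50000 files
    finished applying clone bundle
    searching for changes
    adding changesets
    ...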

Instead of the server generating full repository bundles for every clone
request, it generates full bundles once and they are subsequently reused to
bootstrap new clones. The server may still transfer data at clone time.
However, this is only data that has been added/changed since the bundle was
created. For large, established repositories, this can reduce server load for
clones to less than 1% of the original.

To work, this extension requires server operators to provide the following:

* Generating bundle files of repository content (typically periodically,
  such as once per day).
* A file server that clients have network access to and that Python knows
  how to talk to through its normal URL handling facility (typically an
  HTTP server).
* A process for keeping the bundles manifest in sync with available bundle
  files.

Strictly speaking, using a static file hosting server isn't required: a server
operator could use a dynamic service for retrieving bundle data. However,
static file hosting services are simple and scalable and should be sufficient
for most needs.

Bundle files can be generated with the :hg:`bundle` command. Typically
:hg:`bundle --all` is used to produce a bundle of the entire repository.
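
For example, a periodic job might regenerate a full bundle along these
lines (paths and bundle type illustrative):

    $ hg -R /srv/hg/myrepo bundle --all --type gzip-v2 /var/www/bundles/myrepo-full.hg.gz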

:hg:`debugcreatestreamclonebundle` can be used to produce a special
*streaming clone bundle*. These are bundle files that are extremely efficient
to produce and consume (read: fast). However, they are larger than
traditional bundles and require that clients support the exact set of
repository data store formats in use by the repository that created them.
Typically, a newer server can serve data that is compatible with older clients.
However, *streaming clone bundles* don't have this guarantee. **Server
operators need to be aware that newer versions of Mercurial may produce
streaming clone bundles incompatible with older Mercurial versions.**
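
For example (output path illustrative):

    $ hg -R /srv/hg/myrepo debugcreatestreamclonebundle /var/www/bundles/myrepo.packed.hg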

A server operator is responsible for creating a ``.hg/clonebundles.manifest``
file containing the list of available bundle files suitable for seeding
clones. If this file does not exist, the repository will not advertise the
existence of clone bundles when clients connect.

The manifest file contains a newline (\n) delimited list of entries.

Each line in this file defines an available bundle. Lines have the format:

    <URL> [<key>=<value>[ <key>=<value>]]

That is, a URL followed by an optional, space-delimited list of key=value
pairs describing additional properties of this bundle. Both keys and values
are URI encoded.

Keys in UPPERCASE are reserved for use by Mercurial and are defined below.
All non-uppercase keys can be used by site installations. For example,
a custom *datacenter* attribute could define which data center a file is
hosted in, and clients could then prefer a server in the data center
closest to them.
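
For example (URLs hypothetical; ``BUNDLESPEC`` is described below):

    https://us-west.example.com/myrepo.hg.gz BUNDLESPEC=gzip-v2 datacenter=us-west
    https://eu-central.example.com/myrepo.hg.gz BUNDLESPEC=gzip-v2 datacenter=eu-central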

The following reserved keys are currently defined:

BUNDLESPEC
   A "bundle specification" string that describes the type of the bundle.

   These are string values that are accepted by the "--type" argument of
   :hg:`bundle`.

   The values are parsed in strict mode, which means they must be of the
   "<compression>-<type>" form. See
   mercurial.exchange.parsebundlespec() for more details.

   :hg:`debugbundle --spec` can be used to print the bundle specification
   string for a bundle file. The output of this command can be used verbatim
   for the value of ``BUNDLESPEC`` (it is already escaped).

   Clients will automatically filter out specifications that are unknown or
   unsupported so they won't attempt to download something that likely won't
   apply.

   The actual value doesn't impact client behavior beyond filtering:
   clients will still sniff the bundle type from the header of downloaded
   files.

   **Use of this key is highly recommended**, as it allows clients to
   easily skip unsupported bundles. If this key is not defined, an old
   client may attempt to apply a bundle that it is incapable of reading.
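
   For example (file name hypothetical):

       $ hg debugbundle --spec myrepo-full.hg.gz
       gzip-v2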

REQUIRESNI
   Whether Server Name Indication (SNI) is required to connect to the URL.
   SNI allows servers to use multiple certificates on the same IP. It is
   somewhat common in CDNs and other hosting providers. Older Python
   versions do not support SNI. Defining this attribute enables clients
   with older Python versions to filter this entry without experiencing
   an opaque SSL failure at connection time.

   If this is defined, it is important to advertise a non-SNI fallback
   URL or clients running old Python releases may not be able to clone
   with the clonebundles facility.

   The value should be ``true``.
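
   For example, an entry requiring SNI might be paired with a non-SNI
   fallback (hostnames hypothetical):

       https://cdn.example.com/myrepo.hg.gz BUNDLESPEC=gzip-v2 REQUIRESNI=true
       https://bundles.example.com/myrepo.hg.gz BUNDLESPEC=gzip-v2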

Manifests can contain multiple entries. Assuming metadata is defined, clients
will filter entries from the manifest that they don't support. The remaining
entries are optionally sorted by client preferences
(``experimental.clonebundleprefers`` config option). The client then attempts
to fetch the bundle at the first URL in the remaining list.
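
For example, a client could express a preference for zstd bundles over
gzip ones (assuming ``COMPRESSION`` attributes are derived from
``BUNDLESPEC`` or set explicitly in the manifest):

    [experimental]
    clonebundleprefers = COMPRESSION=zstd, COMPRESSION=gzip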

**Errors when downloading a bundle will fail the entire clone operation:
clients do not automatically fall back to a traditional clone.** The reason
for this is that if a server is using clone bundles, it is probably doing so
because the feature is necessary to help it scale. In other words, there
is an assumption that clone load will be offloaded to another service and
that the Mercurial server isn't responsible for serving this clone load.
If that other service experiences issues and clients fall back to the
original Mercurial server en masse, the unexpected clone load could
overwhelm the server and effectively take it offline. Not having clients
automatically fall back to cloning from the original server mitigates this
scenario.

Because there is no automatic Mercurial server fallback on failure of the
bundle hosting service, it is important for server operators to view the bundle
hosting service as an extension of the Mercurial server in terms of
availability and service level agreements: if the bundle hosting service goes
down, so does the ability for clients to clone. Note: when a failure occurs,
clients will see a message informing them how to bypass the clone bundles
facility. Server operators should therefore expect some people to follow
these instructions, driving additional load to the original Mercurial
server whenever the bundle hosting service fails.
"""

from __future__ import absolute_import

from mercurial import (
    extensions,
    wireproto,
)

testedwith = 'ships-with-hg-core'

def capabilities(orig, repo, proto):
    caps = orig(repo, proto)

    # Only advertise if a manifest exists. This does add some I/O to requests.
    # But this should be cheaper than a wasted network round trip due to a
    # missing file.
    if repo.opener.exists('clonebundles.manifest'):
        caps.append('clonebundles')

    return caps

def extsetup(ui):
    # Advertise the 'clonebundles' capability (when a manifest exists) by
    # wrapping the wire protocol's capabilities computation.
    extensions.wrapfunction(wireproto, '_capabilities', capabilities)