view mercurial/help/internals/wireprotocol.txt @ 38510:deff7cf7eefd

wireprotov2: change frame type and name for command response There was hole at frame type value 3. And the frame is better named as a command response. Differential Revision: https://phab.mercurial-scm.org/D3384
author Gregory Szorc <gregory.szorc@gmail.com>
date Sat, 14 Apr 2018 14:37:23 -0700
parents e8fba6d578f0
children 3ea8323d6f95

The Mercurial wire protocol is a request-response based protocol
with multiple wire representations.

Each request is modeled as a command name, a dictionary of arguments, and
optional raw input. Command arguments and their types are intrinsic
properties of commands. So is the response type of the command. This means
clients can't always send arbitrary arguments to servers and servers can't
return multiple response types.

The protocol is synchronous and does not support multiplexing (concurrent
commands).

Handshake
=========

It is required or common for clients to perform a *handshake* when connecting
to a server. The handshake serves the following purposes:

* Negotiating protocol/transport level options
* Allowing the client to learn about server capabilities to influence
  future requests
* Ensuring the underlying transport channel is in a *clean* state

An important goal of the handshake is to allow clients to use more modern
wire protocol features. By default, clients must assume they are talking
to an old version of Mercurial server (possibly even the very first
implementation). So, clients should not attempt to call or utilize modern
wire protocol features until they have confirmation that the server
supports them. The handshake implementation is designed to allow both
ends to utilize the latest set of features and capabilities with as
few round trips as possible.

The handshake mechanism varies by transport and protocol and is documented
in the sections below.

HTTP Protocol
=============

Handshake
---------

The client sends a ``capabilities`` command request (``?cmd=capabilities``)
as soon as HTTP requests may be issued.

By default, the server responds with a version 1 capabilities string, which
the client parses to learn about the server's abilities. The ``Content-Type``
for this response is ``application/mercurial-0.1`` or
``application/mercurial-0.2`` depending on whether the client advertised
support for version ``0.2`` in its request. (Clients aren't supposed to
advertise support for ``0.2`` until the capabilities response indicates
the server's support for that media type. However, a client could
conceivably cache this metadata and issue the capabilities request in such
a way to elicit an ``application/mercurial-0.2`` response.)

Clients wishing to switch to a newer API service may send an
``X-HgUpgrade-<X>`` header containing a space-delimited list of API service
names the client is capable of speaking. The request MUST also include an
``X-HgProto-<X>`` header advertising a known serialization format for the
response. ``cbor`` is currently the only defined serialization format.

If the request contains these headers, the response ``Content-Type`` MAY
be for a different media type. e.g. ``application/mercurial-cbor`` if the
client advertises support for CBOR.

The response MUST be deserializable to a map with the following keys:

apibase
   URL path to API services, relative to the repository root. e.g. ``api/``.

apis
   A map of API service names to API descriptors. An API descriptor contains
   more details about that API. In the case of the HTTP Version 2 Transport,
   it will be the normal response to a ``capabilities`` command.

   Only the services advertised by the client that are also available on
   the server are advertised.

v1capabilities
   The capabilities string that would be returned by a version 1 response.

The client can then inspect the server-advertised APIs and decide which
API to use, including continuing to use the HTTP Version 1 Transport.

HTTP Version 1 Transport
------------------------

Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
sent to the base URL of the repository with the command name sent in
the ``cmd`` query string parameter. e.g.
``https://example.com/repo?cmd=capabilities``. The HTTP method is ``GET``
or ``POST`` depending on the command and whether there is a request
body.

Command arguments can be sent multiple ways.

The simplest is as part of the URL query string using ``x-www-form-urlencoded``
encoding (see Python's ``urllib.urlencode()``). However, many servers impose
length limitations on the URL, so this mechanism is typically only used if
the server doesn't support other mechanisms.

If the server supports the ``httpheader`` capability, command arguments can
be sent in HTTP request headers named ``X-HgArg-<N>`` where ``<N>`` is an
integer starting at 1. A ``x-www-form-urlencoded`` representation of the
arguments is obtained. This full string is then split into chunks and sent
in numbered ``X-HgArg-<N>`` headers. The maximum length of each HTTP header
is defined by the server in the ``httpheader`` capability value, which defaults
to ``1024``. The server reassembles the encoded arguments string by
concatenating the ``X-HgArg-<N>`` headers then URL decodes them into a
dictionary.

The list of ``X-HgArg-<N>`` headers should be added to the ``Vary`` request
header to instruct caches to take these headers into consideration when caching
requests.
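
The header splitting described above can be sketched in Python as follows.
This is a minimal illustration, not Mercurial's implementation: the function
names are hypothetical, and it assumes the ``httpheader`` limit applies to
the header value only (whether the header name counts toward the limit is
not stated here).

```python
from urllib.parse import parse_qsl, urlencode


def encode_args_as_headers(args, max_value_len=1024):
    """Split URL-encoded arguments into numbered X-HgArg-<N> headers.

    Assumption: max_value_len bounds the header value, not the whole line.
    """
    encoded = urlencode(sorted(args.items()))
    headers = []
    for start in range(0, len(encoded), max_value_len):
        name = "X-HgArg-%d" % (len(headers) + 1)  # numbering starts at 1
        headers.append((name, encoded[start:start + max_value_len]))
    return headers


def vary_value(headers):
    """Header names to list in ``Vary`` so caches consider the arguments."""
    return ",".join(name for name, _value in headers)


def decode_args_from_headers(headers):
    """Concatenate X-HgArg-<N> values in order, then URL decode them."""
    by_name = dict(headers)
    parts = []
    n = 1
    while "X-HgArg-%d" % n in by_name:
        parts.append(by_name["X-HgArg-%d" % n])
        n += 1
    return dict(parse_qsl("".join(parts), keep_blank_values=True))
```

A chunk boundary may fall in the middle of a percent escape. This is
harmless because the server concatenates all chunks before decoding.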

If the server supports the ``httppostargs`` capability, the client
may send command arguments in the HTTP request body as part of an
HTTP POST request. The command arguments will be URL encoded just like
they would for sending them via HTTP headers. However, no splitting is
performed: the raw arguments are included in the HTTP request body.

The client sends a ``X-HgArgs-Post`` header with the string length of the
encoded arguments data. Additional data may be included in the HTTP
request body immediately following the argument data. The offset of the
non-argument data is defined by the ``X-HgArgs-Post`` header. The
``X-HgArgs-Post`` header is not required if there is no argument data.
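
As an illustration of the ``httppostargs`` layout, the sketch below builds a
request body and splits it again on the receiving side. The helper names are
hypothetical and headers are modeled as a plain dictionary; only the
``X-HgArgs-Post`` header and the "arguments first, then other data" ordering
come from the description above.

```python
from urllib.parse import parse_qsl, urlencode


def build_post_request(args, data=b""):
    """Return (headers, body) with encoded args followed by extra data."""
    encoded = urlencode(sorted(args.items())).encode("ascii")
    headers = {}
    if encoded:
        # The header carries the length of the encoded argument data,
        # which is also the offset of any non-argument data.
        headers["X-HgArgs-Post"] = str(len(encoded))
    return headers, encoded + data


def split_post_body(headers, body):
    """Recover (args, data) from a body built by build_post_request()."""
    arglen = int(headers.get("X-HgArgs-Post", "0"))
    raw_args = body[:arglen].decode("ascii")
    args = dict(parse_qsl(raw_args, keep_blank_values=True))
    return args, body[arglen:]
```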

Additional command data can be sent as part of the HTTP request body. The
default ``Content-Type`` when sending data is ``application/mercurial-0.1``.
A ``Content-Length`` header is currently always sent.

Example HTTP requests::

    GET /repo?cmd=capabilities
    X-HgArg-1: foo=bar&baz=hello%20world

The request media type should be chosen based on server support. If the
``httpmediatype`` server capability is present, the client should send
the newest mutually supported media type. If this capability is absent,
the client must assume the server only supports the
``application/mercurial-0.1`` media type.

The ``Content-Type`` HTTP response header identifies the response as coming
from Mercurial and can also be used to signal an error has occurred.

The ``application/mercurial-*`` media types indicate a generic Mercurial
data type.

The ``application/mercurial-0.1`` media type is raw Mercurial data. It is the
predecessor of the format below.

The ``application/mercurial-0.2`` media type is compression framed Mercurial
data. The first byte of the payload indicates the length of the compression
format identifier that follows. Next are N bytes indicating the compression
format. e.g. ``zlib``. The remaining bytes are compressed according to that
compression format. The decompressed data behaves the same as with
``application/mercurial-0.1``.
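
The framing can be illustrated with a short Python sketch. The function name
is hypothetical, only ``zlib`` is handled, and the sketch assumes the
remaining bytes form a single complete zlib stream held in memory (a real
client would likely decompress incrementally).

```python
import zlib


def decode_mercurial_02(payload):
    """Return (compression name, decompressed data) for a 0.2 payload."""
    name_len = payload[0]                 # first byte: identifier length
    name = payload[1:1 + name_len]        # next N bytes: compression format
    compressed = payload[1 + name_len:]   # remainder: compressed data
    if name == b"zlib":
        return name, zlib.decompress(compressed)
    raise ValueError("unsupported compression format: %r" % name)
```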

The ``application/hg-error`` media type indicates a generic error occurred.
The content of the HTTP response body typically holds text describing the
error.

The ``application/mercurial-cbor`` media type indicates a CBOR payload
and should be interpreted as identical to ``application/cbor``.

Behavior of media types is further described in the ``Content Negotiation``
section below.

Clients should issue a ``User-Agent`` request header that identifies the client.
The server should not use the ``User-Agent`` for feature detection.

A command returning a ``string`` response issues a
``application/mercurial-0.*`` media type and the HTTP response body contains
the raw string value (after compression decoding, if used). A
``Content-Length`` header is typically issued, but not required.

A command returning a ``stream`` response issues a
``application/mercurial-0.*`` media type and the HTTP response is typically
using *chunked transfer* (``Transfer-Encoding: chunked``).

HTTP Version 2 Transport
------------------------

**Experimental - feature under active development**

Version 2 of the HTTP protocol is exposed under the ``/api/*`` URL space.
Its final API name is not yet formalized.

Commands are triggered by sending HTTP POST requests against URLs of the
form ``<permission>/<command>``, where ``<permission>`` is ``ro`` or
``rw``, meaning read-only and read-write, respectively, and ``<command>``
is a named wire protocol command.

Non-POST request methods MUST be rejected by the server with an HTTP
405 response.

Commands that modify repository state in meaningful ways MUST NOT be
exposed under the ``ro`` URL prefix. All available commands MUST be
available under the ``rw`` URL prefix.

Server administrators MAY implement blanket HTTP authentication keyed
off the URL prefix. For example, a server may require authentication
for all ``rw/*`` URLs and let unauthenticated requests to ``ro/*``
URLs proceed. A server MAY issue an HTTP 401, 403, or 407 response
in accordance with RFC 7235. Clients SHOULD recognize the HTTP Basic
(RFC 7617) and Digest (RFC 7616) authentication schemes. Clients SHOULD
make an attempt to recognize unknown schemes using the
``WWW-Authenticate`` response header on a 401 response, as defined by
RFC 7235.

Read-only commands are accessible under ``rw/*`` URLs so clients can
signal the intent of the operation very early in the connection
lifecycle. For example, a ``push`` operation - which consists of
various read-only commands mixed with at least one read-write command -
can perform all commands against ``rw/*`` URLs so that any server-side
authentication requirements are discovered upon attempting the first
command - not potentially several commands into the exchange. This
allows clients to fail faster or prompt for credentials as soon as the
exchange takes place. This provides a better end-user experience.

Requests to unknown commands or URLs result in an HTTP 404.
TODO formally define response type, how error is communicated, etc.

HTTP request and response bodies use the *Unified Frame-Based Protocol*
(defined below) for media exchange. The entirety of the HTTP message
body is 0 or more frames as defined by this protocol.

Clients and servers MUST advertise the ``TBD`` media type via the
``Content-Type`` request and response headers. In addition, clients MUST
advertise this media type value in their ``Accept`` request header in all
requests.
TODO finalize the media type. For now, it is defined in wireprotoserver.py.

Servers receiving requests without an ``Accept`` header SHOULD respond with
an HTTP 406.

Servers receiving requests with an invalid ``Content-Type`` header SHOULD
respond with an HTTP 415.
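
The request validation rules above can be summarized in a small Python
sketch. Everything here is illustrative: the function name is hypothetical,
the media type is a placeholder because the real value is still ``TBD``, the
path is assumed to be relative to the API base, and the order in which the
checks run is an assumption.

```python
# Placeholder only: the real media type is not yet finalized.
MEDIA_TYPE = "application/x-example-hg-frames"


def early_rejection_status(method, path, headers, known_commands):
    """Return an HTTP status for a rejected request, or None to proceed."""
    if method != "POST":
        return 405  # non-POST methods MUST be rejected
    permission, _sep, command = path.partition("/")
    if permission not in ("ro", "rw") or command not in known_commands:
        return 404  # unknown command or URL
    if "Accept" not in headers:
        return 406  # missing Accept header (SHOULD)
    if headers.get("Content-Type") != MEDIA_TYPE:
        return 415  # invalid Content-Type header (SHOULD)
    return None
```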

The command to run is specified in the POST payload as defined by the
*Unified Frame-Based Protocol*. This is redundant with data already
encoded in the URL. This is by design, so server operators can have
better understanding about server activity from looking merely at
HTTP access logs.

In most circumstances, the command specified in the URL MUST match
the command specified in the frame-based payload or the server will
respond with an error. The exception to this is the special
``multirequest`` URL. (See below.) In addition, HTTP requests
are limited to one command invocation. The exception is the special
``multirequest`` URL.

The ``multirequest`` command endpoints (``ro/multirequest`` and
``rw/multirequest``) are special in that they allow the execution of
*any* command and allow the execution of multiple commands. If the
HTTP request issues multiple commands across multiple frames, all
issued commands will be processed by the server. Per the defined
behavior of the *Unified Frame-Based Protocol*, commands may be
issued interleaved and responses may come back in a different order
than they were issued. Clients MUST be able to deal with this.

SSH Protocol
============

Handshake
---------

For all clients, the handshake consists of the client sending 1 or more
commands to the server using version 1 of the transport. Servers respond
to commands they know how to respond to and send an empty response (``0\n``)
for unknown commands (per standard behavior of version 1 of the transport).
Clients then typically look for a response to the newest sent command to
determine which transport version to use and what the available features for
the connection and server are.

Preceding any response from client-issued commands, the server may print
non-protocol output. It is common for SSH servers to print banners, message
of the day announcements, etc. when clients connect. It is assumed that any
such *banner* output will precede any Mercurial server output. So clients
must be prepared to handle server output on initial connect that isn't
in response to any client-issued command and doesn't conform to Mercurial's
wire protocol. This *banner* output should only be on stdout. However,
some servers may send output on stderr.

Pre 0.9.1 clients issue a ``between`` command with the ``pairs`` argument
having the value
``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.

The ``between`` command has been supported since the original Mercurial
SSH server. Requesting the empty range will return a ``\n`` string response,
which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
followed by the value, which happens to be a newline).

For pre 0.9.1 clients and all servers, the exchange looks like::

   c: between\n
   c: pairs 81\n
   c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
   s: 1\n
   s: \n

0.9.1+ clients send a ``hello`` command (with no arguments) before the
``between`` command. The response to this command allows clients to
discover server capabilities and settings.

An example exchange between 0.9.1+ clients and a ``hello`` aware server looks
like::

   c: hello\n
   c: between\n
   c: pairs 81\n
   c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
   s: 324\n
   s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
   s: 1\n
   s: \n

And a similar scenario but with servers sending a banner on connect::

   c: hello\n
   c: between\n
   c: pairs 81\n
   c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
   s: welcome to the server\n
   s: if you find any issues, email someone@somewhere.com\n
   s: 324\n
   s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
   s: 1\n
   s: \n

Note that output from the ``hello`` command is terminated by a ``\n``. This is
part of the response payload and not part of the wire protocol adding a newline
after responses. In other words, the length of the response contains the
trailing ``\n``.

Clients supporting version 2 of the SSH transport send a line beginning
with ``upgrade`` before the ``hello`` and ``between`` commands. The line
(which isn't a well-formed command line because it doesn't consist of a
single command name) serves to both communicate the client's intent to
switch to transport version 2 (transports are version 1 by default) as
well as to advertise the client's transport-level capabilities so the
server may satisfy that request immediately.

The upgrade line has the form::

    upgrade <token> <transport capabilities>

That is the literal string ``upgrade`` followed by a space, followed by
a randomly generated string, followed by a space, followed by a string
denoting the client's transport capabilities.

The token can be anything. However, a random UUID is recommended. (Use
of version 4 UUIDs is recommended because version 1 UUIDs can leak the
client's MAC address.)

The transport capabilities string is a URL/percent encoded string
containing key-value pairs defining the client's transport-level
capabilities. The following capabilities are defined:

proto
   A comma-delimited list of transport protocol versions the client
   supports. e.g. ``ssh-v2``.

If the server does not recognize the ``upgrade`` line, it should issue
an empty response and continue processing the ``hello`` and ``between``
commands. Here is an example handshake between a version 2 aware client
and a non version 2 aware server::

   c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
   c: hello\n
   c: between\n
   c: pairs 81\n
   c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
   s: 0\n
   s: 324\n
   s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
   s: 1\n
   s: \n

(The initial ``0\n`` line from the server indicates an empty response to
the unknown ``upgrade ..`` command/line.)

If the server recognizes the ``upgrade`` line and is willing to satisfy that
upgrade request, it replies with a payload of the following form::

   upgraded <token> <transport name>\n

This line is the literal string ``upgraded``, a space, the token that was
specified by the client in its ``upgrade ...`` request line, a space, and the
name of the transport protocol that was chosen by the server. The transport
name MUST match one of the names the client specified in the ``proto`` field
of its ``upgrade ...`` request line.

If a server issues an ``upgraded`` response, it MUST also read and ignore
the lines associated with the ``hello`` and ``between`` command requests
that were issued by the client. It is assumed that the negotiated transport
will respond with equivalent requested information following the transport
handshake.

All data following the ``\n`` terminating the ``upgraded`` line is the
domain of the negotiated transport. It is common for the data immediately
following to contain additional metadata about the state of the transport and
the server. However, this isn't strictly speaking part of the transport
handshake and isn't covered by this section.

Here is an example handshake between a version 2 aware client and a version
2 aware server::

   c:  upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
   c:  hello\n
   c:  between\n
   c:  pairs 81\n
   c:  0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
   s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
   s: <additional transport specific data>

The client-issued token that is echoed in the response provides a more
resilient mechanism for differentiating *banner* output from Mercurial
output. In version 1, properly formatted banner output could get confused
for Mercurial server output. By submitting a randomly generated token
that is then present in the response, the client can look for that token
in response lines and have reasonable certainty that the line did not
originate from a *banner* message.
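
A client-side sketch of the token mechanism, in Python, might look like the
following. The function names are hypothetical. The upgrade line is built
with a version 4 UUID as recommended above, and server output is scanned for
an ``upgraded`` line carrying that exact token, so banner lines are skipped.

```python
import uuid
from urllib.parse import urlencode


def build_upgrade_line(protos=("ssh-v2",)):
    """Return (token, upgrade line as bytes) for the given protocols."""
    token = str(uuid.uuid4())  # random version 4 UUID
    caps = urlencode({"proto": ",".join(protos)})
    return token, ("upgrade %s %s\n" % (token, caps)).encode("ascii")


def find_upgraded(lines, token, offered=("ssh-v2",)):
    """Return the negotiated transport name, or None if never upgraded.

    Lines that do not carry the client's token are treated as banner or
    version 1 output and ignored by this sketch.
    """
    for line in lines:
        parts = line.rstrip(b"\n").split(b" ")
        if len(parts) != 3 or parts[0] != b"upgraded":
            continue
        if parts[1] != token.encode("ascii"):
            continue
        name = parts[2].decode("ascii")
        if name not in offered:
            raise ValueError("server chose a transport we did not offer")
        return name
    return None
```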

SSH Version 1 Transport
-----------------------

The SSH transport (version 1) is a custom text-based protocol suitable for
use over any bi-directional stream transport. It is most commonly used with
SSH.

A SSH transport server can be started with ``hg serve --stdio``. The stdin,
stderr, and stdout file descriptors of the started process are used to exchange
data. When Mercurial connects to a remote server over SSH, it actually starts
a ``hg serve --stdio`` process on the remote server.

Commands are issued by sending the command name followed by a trailing newline
``\n`` to the server. e.g. ``capabilities\n``.

Command arguments are sent in the following format::

    <argument> <length>\n<value>

That is, the argument string name followed by a space followed by the
integer length of the value (expressed as a string) followed by a newline
(``\n``) followed by the raw argument value.

Dictionary arguments are encoded differently::

    <argument> <# elements>\n
    <key1> <length1>\n<value1>
    <key2> <length2>\n<value2>
    ...
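
The argument encodings above can be sketched in Python as follows. The
function name is hypothetical, and the sketch assumes that nothing separates
one dictionary entry's value from the next entry's key (the line breaks in
the listing above being purely for display).

```python
def encode_command(name, args):
    """Encode a command name and its arguments for the version 1 transport.

    ``name`` is bytes; ``args`` maps bytes keys to bytes values or to
    dictionaries of bytes keys and bytes values.
    """
    out = [name + b"\n"]
    for key, value in args.items():
        if isinstance(value, dict):
            # Dictionary argument: element count, then each key/value pair.
            out.append(b"%s %d\n" % (key, len(value)))
            for subkey, subvalue in value.items():
                out.append(b"%s %d\n%s" % (subkey, len(subvalue), subvalue))
        else:
            # Plain argument: name, value length, newline, raw value.
            out.append(b"%s %d\n%s" % (key, len(value), value))
    return b"".join(out)
```

Encoding the ``between`` command with the all-zero ``pairs`` value produces
the same bytes as the client lines in the handshake examples.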

Non-argument data is sent immediately after the final argument value. It is
encoded in chunks::

    <length>\n<data>

Each command declares a list of supported arguments and their types. If a
client sends an unknown argument to the server, the server should abort
immediately. The special argument ``*`` in a command's definition indicates
that all argument names are allowed.

The definition of supported arguments and types is initially made when a
new command is implemented. The client and server must initially independently
agree on the arguments and their types. This initial set of arguments can be
supplemented through the presence of *capabilities* advertised by the server.

Each command has a defined expected response type.

A ``string`` response type is a length framed value. The response consists of
the string encoded integer length of a value followed by a newline (``\n``)
followed by the value. Empty values are allowed (and are represented as
``0\n``).
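
A minimal Python sketch of the ``string`` response framing, using
hypothetical helper names, is shown here. The reader takes any binary
file-like object, such as the pipe connected to the server's stdout.

```python
import io


def encode_string_response(value):
    """Frame a bytes value as <length>\\n<value>."""
    return b"%d\n%s" % (len(value), value)


def read_string_response(stream):
    """Read one length-framed value from a binary stream."""
    length = int(stream.readline().decode("ascii").strip())
    return stream.read(length)
```

For example, ``encode_string_response(b"\n")`` yields ``1\n\n``, matching the
``between`` response in the handshake, and an empty value yields ``0\n``.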

A ``stream`` response type consists of raw bytes of data. There is no framing.

A generic error response type is also supported. It consists of an error
message written to ``stderr`` followed by ``\n-\n``. In addition, ``\n`` is
written to ``stdout``.

If the server receives an unknown command, it will send an empty ``string``
response.

The server terminates if it receives an empty command (a ``\n`` character).

If the server announces support for the ``protocaps`` capability, the client
should issue a ``protocaps`` command after the initial handshake to announce
its own capabilities. The client capabilities are persistent.

SSH Version 2 Transport
-----------------------

**Experimental and under development**

Version 2 of the SSH transport behaves identically to version 1 of the SSH
transport with the exception of handshake semantics. See above for how
version 2 of the SSH transport is negotiated.

Immediately following the ``upgraded`` line signaling a switch to version
2 of the SSH protocol, the server automatically sends additional details
about the capabilities of the remote server. This has the form::

   <integer length of value>\n
   capabilities: ...\n

For example::

   s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
   s: 240\n
   s: capabilities: known getbundle batch ...\n

Following capabilities advertisement, the peers communicate using version
1 of the SSH transport.

Unified Frame-Based Protocol
============================

**Experimental and under development**

The *Unified Frame-Based Protocol* is a communications protocol between
Mercurial peers. The protocol aims to be mostly transport agnostic
(works similarly on HTTP, SSH, etc).

To operate the protocol, a bi-directional, half-duplex pipe supporting
ordered sends and receives is required. That is, each peer has one pipe
for sending data and another for receiving.

All data is read and written in atomic units called *frames*. These
are conceptually similar to TCP packets. Higher-level functionality
is built on the exchange and processing of frames.

All frames are associated with a *stream*. A *stream* provides a
unidirectional grouping of frames. Streams facilitate two goals:
content encoding and parallelism. There is a dedicated section on
streams below.

The protocol is request-response based: the client issues requests to
the server, which issues replies to those requests. Server-initiated
messaging is not currently supported, but this specification carves
out room to implement it.

All frames are associated with a numbered request. Frames can thus
be logically grouped by their request ID.

Frames begin with an 8 octet header followed by a variable length
payload::

    +------------------------------------------------+
    |                 Length (24)                    |
    +--------------------------------+---------------+
    |         Request ID (16)        | Stream ID (8) |
    +------------------+-------------+---------------+
    | Stream Flags (8) |
    +-----------+------+
    | Type (4)  |
    +-----------+
    | Flags (4) |
    +===========+===================================================|
    |                     Frame Payload (0...)                    ...
    +---------------------------------------------------------------+

The length of the frame payload is expressed as an unsigned 24 bit
little endian integer. Values larger than 65535 MUST NOT be used unless
given permission by the server as part of the negotiated capabilities
during the handshake. The frame header is not part of the advertised
frame length. The payload length is the over-the-wire length. If there
is content encoding applied to the payload as part of the frame's stream,
the length is the output of that content encoding, not the input.

The 16-bit ``Request ID`` field denotes the integer request identifier,
stored as an unsigned little endian integer. Odd numbered requests are
client-initiated. Even numbered requests are server-initiated. This
refers to where the *request* was initiated - not where the *frame* was
initiated, so servers will send frames with odd ``Request ID`` in
response to client-initiated requests. Implementations are advised to
start ordering request identifiers at ``1`` and ``0``, increment by
``2``, and wrap around if all available numbers have been exhausted.

The 8-bit ``Stream ID`` field denotes the stream that the frame is
associated with. Frames belonging to a stream may have content
encoding applied and the receiver may need to decode the raw frame
payload to obtain the original data. Odd numbered IDs are
client-initiated. Even numbered IDs are server-initiated.

The 8-bit ``Stream Flags`` field defines stream processing semantics.
See the section on streams below.

The 4-bit ``Type`` field denotes the type of frame being sent.

The 4-bit ``Flags`` field defines special, per-type attributes for
the frame.

The sections below define the frame types and their behavior.
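
The header layout can be illustrated with a short Python sketch. The
function names are hypothetical. Placing ``Type`` in the high 4 bits and
``Flags`` in the low 4 bits of the final octet is an assumption based on the
field order in the diagram above.

```python
import struct


def pack_frame(request_id, stream_id, stream_flags, frame_type, flags, payload):
    """Serialize one frame: 8 octet header followed by the payload."""
    if len(payload) > 0xFFFFFF:
        raise ValueError("payload too large for a 24-bit length")
    if not (0 <= frame_type <= 0xF and 0 <= flags <= 0xF):
        raise ValueError("type and flags must each fit in 4 bits")
    header = struct.pack("<I", len(payload))[:3]   # 24-bit little endian
    header += struct.pack("<HBB", request_id, stream_id, stream_flags)
    header += bytes([(frame_type << 4) | flags])   # assumed nibble order
    return header + payload


def unpack_header(data):
    """Parse the first 8 octets of a frame into a dictionary."""
    length = int.from_bytes(data[0:3], "little")
    request_id, stream_id, stream_flags, type_flags = struct.unpack(
        "<HBBB", data[3:8]
    )
    return {
        "length": length,
        "request_id": request_id,
        "stream_id": stream_id,
        "stream_flags": stream_flags,
        "type": type_flags >> 4,
        "flags": type_flags & 0x0F,
    }
```

The sketch does not enforce the 65535 byte payload limit, since a larger
limit may be negotiated during the handshake.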

Command Request (``0x01``)
--------------------------

This frame contains a request to run a command.

The payload consists of a CBOR map defining the command request. The
bytestring keys of that map are:

name
   Name of the command that should be executed (bytestring).
args
   Map of bytestring keys to various value types containing the named
   arguments to this command.

   Each command defines its own set of argument names and their expected
   types.

This frame type MUST ONLY be sent from clients to servers: it is illegal
for a server to send this frame to a client.

The following flag values are defined for this type:

0x01
   New command request. When set, this frame represents the beginning
   of a new request to run a command. The ``Request ID`` attached to this
   frame MUST NOT be active.
0x02
   Command request continuation. When set, this frame is a continuation
   from a previous command request frame for its ``Request ID``. This
   flag is set when the CBOR data for a command request does not fit
   in a single frame.
0x04
   Additional frames expected. When set, the command request didn't fit
   into a single frame and additional CBOR data follows in a subsequent
   frame.
0x08
   Command data frames expected. When set, command data frames are
   expected to follow the final command request frame for this request.

``0x01`` MUST be set on the initial command request frame for a
``Request ID``.

``0x01`` or ``0x02`` MUST be set to indicate this frame's role in
a series of command request frames.

If command data frames are to be sent, ``0x08`` MUST be set on ALL
command request frames.
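
The flag rules for command request frames can be sketched in Python as
follows. The constant and function names are hypothetical. The sketch splits
already-encoded CBOR bytes into chunks and computes the flags each resulting
frame would carry.

```python
FLAG_NEW = 0x01            # new command request
FLAG_CONTINUATION = 0x02   # continuation of a previous request frame
FLAG_MORE_FRAMES = 0x04    # additional request frames expected
FLAG_EXPECT_DATA = 0x08    # command data frames expected


def command_request_frames(cbor_bytes, max_payload, have_data=False):
    """Return a list of (flags, payload) pairs for one command request."""
    chunks = [
        cbor_bytes[i:i + max_payload]
        for i in range(0, len(cbor_bytes), max_payload)
    ] or [b""]
    frames = []
    for index, chunk in enumerate(chunks):
        # The first frame starts the request; later frames continue it.
        flags = FLAG_NEW if index == 0 else FLAG_CONTINUATION
        if index < len(chunks) - 1:
            flags |= FLAG_MORE_FRAMES
        if have_data:
            # Must be set on ALL command request frames.
            flags |= FLAG_EXPECT_DATA
        frames.append((flags, chunk))
    return frames
```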

Command Data (``0x02``)
-----------------------

This frame contains raw data for a command.

Most commands can be executed by specifying arguments. However,
arguments have an upper bound to their length. Commands that accept
data beyond this length, or whose data length isn't known when the
command is initially sent, need to stream arbitrary data to the
server. This frame type facilitates the sending of this data.

The payload of this frame type consists of a stream of raw data to be
consumed by the command handler on the server. The format of the data
is command specific.

The following flag values are defined for this type:

0x01
   Command data continuation. When set, the data for this command
   continues into a subsequent frame.

0x02
   End of data. When set, command data has been fully sent to the
   server. The command has been fully issued and no new data for this
   command will be sent. The next frame will belong to a new command.

Command Response Data (``0x03``)
--------------------------------

This frame contains response data to an issued command.

Response data ALWAYS consists of a series of 0 or more CBOR encoded
values. A CBOR value may use indefinite length encoding, and the
bytes constituting the value may span several frames.

The following flag values are defined for this type:

0x01
   Data continuation. When set, an additional frame containing response data
   will follow.
0x02
   End of data. When set, the response data has been fully sent and
   no additional frames for this response will be sent.

The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.
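
As an illustration, a receiver could reassemble the response bytes for one
request as sketched below in Python. The function name is hypothetical, and
frames are modeled as ``(flags, payload)`` pairs that have already been
filtered by request ID. Decoding the resulting bytes as CBOR is left out.

```python
FLAG_CONTINUATION = 0x01   # more response data follows
FLAG_END_OF_DATA = 0x02    # response fully sent


def collect_response(frames):
    """Concatenate response payloads up to the end-of-data frame."""
    chunks = []
    for flags, payload in frames:
        if flags & FLAG_CONTINUATION and flags & FLAG_END_OF_DATA:
            raise ValueError("0x01 and 0x02 are mutually exclusive")
        chunks.append(payload)
        if flags & FLAG_END_OF_DATA:
            return b"".join(chunks)
    raise ValueError("response ended without an end-of-data frame")
```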

Error Response (``0x05``)
-------------------------

An error occurred when processing a request. This could indicate
a protocol-level failure or an application level failure depending
on the flags for this message type.

The payload for this type is an error message that should be
displayed to the user.

The following flag values are defined for this type:

0x01
   The error occurred at the transport/protocol level. If set, the
   connection should be closed.
0x02
   The error occurred at the application level. e.g. invalid command.

Human Output Side-Channel (``0x06``)
------------------------------------

This frame contains a message that is intended to be displayed to
people. Whereas most frames communicate machine readable data, this