view mercurial/help/internals/wireprotocol.txt @ 38510:deff7cf7eefd
wireprotov2: change frame type and name for command response
There was hole at frame type value 3. And the frame is better
named as a command response.
Differential Revision: https://phab.mercurial-scm.org/D3384
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Sat, 14 Apr 2018 14:37:23 -0700 |
parents | e8fba6d578f0 |
children | 3ea8323d6f95 |
The Mercurial wire protocol is a request-response based protocol
with multiple wire representations.

Each request is modeled as a command name, a dictionary of arguments, and
optional raw input. Command arguments and their types are intrinsic
properties of commands. So is the response type of the command. This means
clients can't always send arbitrary arguments to servers and servers can't
return multiple response types.

The protocol is synchronous and does not support multiplexing (concurrent
commands).

Handshake
=========

It is required or common for clients to perform a *handshake* when connecting
to a server. The handshake serves the following purposes:

* Negotiating protocol/transport level options
* Allowing the client to learn about server capabilities to influence
  future requests
* Ensuring the underlying transport channel is in a *clean* state

An important goal of the handshake is to allow clients to use more modern
wire protocol features. By default, clients must assume they are talking
to an old version of Mercurial server (possibly even the very first
implementation). So, clients should not attempt to call or utilize modern
wire protocol features until they have confirmation that the server
supports them. The handshake implementation is designed to allow both
ends to utilize the latest set of features and capabilities with as
few round trips as possible.

The handshake mechanism varies by transport and protocol and is documented
in the sections below.

HTTP Protocol
=============

Handshake
---------

The client sends a ``capabilities`` command request (``?cmd=capabilities``)
as soon as HTTP requests may be issued.

By default, the server responds with a version 1 capabilities string, which
the client parses to learn about the server's abilities. The ``Content-Type``
for this response is ``application/mercurial-0.1`` or
``application/mercurial-0.2`` depending on whether the client advertised
support for version ``0.2`` in its request.

(Clients aren't supposed to advertise support for ``0.2`` until the
capabilities response indicates the server's support for that media type.
However, a client could conceivably cache this metadata and issue the
capabilities request in such a way to elicit an ``application/mercurial-0.2``
response.)

Clients wishing to switch to a newer API service may send an
``X-HgUpgrade-<X>`` header containing a space-delimited list of API service
names the client is capable of speaking. The request MUST also include an
``X-HgProto-<X>`` header advertising a known serialization format for the
response. ``cbor`` is currently the only defined serialization format.

If the request contains these headers, the response ``Content-Type`` MAY
be for a different media type. e.g. ``application/mercurial-cbor`` if the
client advertises support for CBOR.

The response MUST be deserializable to a map with the following keys:

apibase
   URL path to API services, relative to the repository root. e.g. ``api/``.

apis
   A map of API service names to API descriptors. An API descriptor contains
   more details about that API. In the case of the HTTP Version 2 Transport,
   it will be the normal response to a ``capabilities`` command.

   Only the services advertised by the client that are also available on
   the server are advertised.

v1capabilities
   The capabilities string that would be returned by a version 1 response.

The client can then inspect the server-advertised APIs and decide which
API to use, including continuing to use the HTTP Version 1 Transport.

HTTP Version 1 Transport
------------------------

Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
sent to the base URL of the repository with the command name sent in
the ``cmd`` query string parameter. e.g.
``https://example.com/repo?cmd=capabilities``. The HTTP method is ``GET``
or ``POST`` depending on the command and whether there is a request
body.

Command arguments can be sent multiple ways.

The simplest is part of the URL query string using ``x-www-form-urlencoded``
encoding (see Python's ``urllib.urlencode()``). However, many servers impose
length limitations on the URL. So this mechanism is typically only used if
the server doesn't support other mechanisms.

If the server supports the ``httpheader`` capability, command arguments can
be sent in HTTP request headers named ``X-HgArg-<N>`` where ``<N>`` is an
integer starting at 1. A ``x-www-form-urlencoded`` representation of the
arguments is obtained. This full string is then split into chunks and sent
in numbered ``X-HgArg-<N>`` headers. The maximum length of each HTTP header
is defined by the server in the ``httpheader`` capability value, which defaults
to ``1024``. The server reassembles the encoded arguments string by
concatenating the ``X-HgArg-<N>`` headers then URL decodes them into a
dictionary.

The list of ``X-HgArg-<N>`` headers should be added to the ``Vary`` request
header to instruct caches to take these headers into consideration when caching
requests.

If the server supports the ``httppostargs`` capability, the client
may send command arguments in the HTTP request body as part of an
HTTP POST request. The command arguments will be URL encoded just like
they would for sending them via HTTP headers. However, no splitting is
performed: the raw arguments are included in the HTTP request body.

The client sends a ``X-HgArgs-Post`` header with the string length of the
encoded arguments data. Additional data may be included in the HTTP
request body immediately following the argument data. The offset of the
non-argument data is defined by the ``X-HgArgs-Post`` header. The
``X-HgArgs-Post`` header is not required if there is no argument data.

Additional command data can be sent as part of the HTTP request body. The
default ``Content-Type`` when sending data is ``application/mercurial-0.1``.
A ``Content-Length`` header is currently always sent.
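
The ``X-HgArg-<N>`` splitting and reassembly described above can be sketched
in Python as follows. This is a minimal illustration, not Mercurial's own
code: the helper names ``split_args`` and ``join_args`` are hypothetical, and
the sketch assumes the ``httpheader`` limit applies to the whole header line
(name, ``: `` separator, value and line terminator), which the text above
does not spell out.

```python
from urllib.parse import parse_qsl, quote, urlencode

HEADER_NAME = "X-HgArg-%d"  # numbered starting at 1, per the text above


def split_args(args, max_header_len=1024):
    """Hypothetical helper: URL-encode ``args`` and split across headers."""
    # quote_via=quote encodes spaces as %20 rather than "+".
    encoded = urlencode(args, quote_via=quote)
    headers = []
    pos = 0
    while pos < len(encoded):
        name = HEADER_NAME % (len(headers) + 1)
        # Assumption: the limit covers "Name: value\r\n" as a whole.
        room = max_header_len - len(name) - len(": \r\n")
        if room <= 0:
            raise ValueError("header limit too small for any value")
        headers.append((name, encoded[pos:pos + room]))
        pos += room
    return headers


def join_args(headers):
    """Hypothetical helper: concatenate header values, then URL-decode."""
    ordered = sorted(headers, key=lambda h: int(h[0].rsplit("-", 1)[1]))
    encoded = "".join(value for _name, value in ordered)
    return dict(parse_qsl(encoded, keep_blank_values=True))
```

Because the server concatenates the header values before decoding, a chunk
boundary may fall anywhere in the encoded string, including inside a
percent-escape.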

Example HTTP requests::

    GET /repo?cmd=capabilities
    X-HgArg-1: foo=bar&baz=hello%20world

The request media type should be chosen based on server support. If the
``httpmediatype`` server capability is present, the client should send
the newest mutually supported media type. If this capability is absent,
the client must assume the server only supports the
``application/mercurial-0.1`` media type.

The ``Content-Type`` HTTP response header identifies the response as coming
from Mercurial and can also be used to signal an error has occurred.

The ``application/mercurial-*`` media types indicate a generic Mercurial
data type.

The ``application/mercurial-0.1`` media type is raw Mercurial data. It is the
predecessor of the format below.

The ``application/mercurial-0.2`` media type is compression framed Mercurial
data. The first byte of the payload indicates the length of the compression
format identifier that follows. Next are N bytes indicating the compression
format. e.g. ``zlib``. The remaining bytes are compressed according to that
compression format. The decompressed data behaves the same as with
``application/mercurial-0.1``.

The ``application/hg-error`` media type indicates a generic error occurred.
The content of the HTTP response body typically holds text describing the
error.

The ``application/mercurial-cbor`` media type indicates a CBOR payload
and should be interpreted as identical to ``application/cbor``.

Behavior of media types is further described in the ``Content Negotiation``
section below.

Clients should issue a ``User-Agent`` request header that identifies the client.
The server should not use the ``User-Agent`` for feature detection.

A command returning a ``string`` response issues a
``application/mercurial-0.*`` media type and the HTTP response body contains
the raw string value (after compression decoding, if used). A
``Content-Length`` header is typically issued, but not required.
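
The compression framing of ``application/mercurial-0.2`` described above can
be sketched as follows. This is an illustration only: the function name
``decode_mercurial_02`` is hypothetical, the length byte is assumed to be an
unsigned 8-bit integer, and only the ``zlib`` format named in the text is
handled.

```python
import zlib


def decode_mercurial_02(body):
    """Hypothetical helper: strip the compression frame and decompress."""
    if not body:
        raise ValueError("empty payload")
    # First byte: length of the compression format identifier
    # (assumed here to be an unsigned 8-bit integer).
    name_len = body[0]
    name = body[1:1 + name_len]
    if len(name) != name_len:
        raise ValueError("truncated compression format identifier")
    compressed = body[1 + name_len:]
    if name == b"zlib":
        # The result behaves like an application/mercurial-0.1 payload.
        return zlib.decompress(compressed)
    raise ValueError("unsupported compression format: %r" % name)
```

A real client would decompress incrementally rather than buffering the whole
body; the one-shot call keeps the sketch short.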

A command returning a ``stream`` response issues a
``application/mercurial-0.*`` media type and the HTTP response is typically
using *chunked transfer* (``Transfer-Encoding: chunked``).

HTTP Version 2 Transport
------------------------

**Experimental - feature under active development**

Version 2 of the HTTP protocol is exposed under the ``/api/*`` URL space.
Its final API name is not yet formalized.

Commands are triggered by sending HTTP POST requests against URLs of the
form ``<permission>/<command>``, where ``<permission>`` is ``ro`` or
``rw``, meaning read-only and read-write, respectively, and ``<command>``
is a named wire protocol command.

Non-POST request methods MUST be rejected by the server with an HTTP
405 response.

Commands that modify repository state in meaningful ways MUST NOT be
exposed under the ``ro`` URL prefix. All available commands MUST be
available under the ``rw`` URL prefix.

Server administrators MAY implement blanket HTTP authentication keyed
off the URL prefix. For example, a server may require authentication
for all ``rw/*`` URLs and let unauthenticated requests to ``ro/*``
URLs proceed. A server MAY issue an HTTP 401, 403, or 407 response
in accordance with RFC 7235. Clients SHOULD recognize the HTTP Basic
(RFC 7617) and Digest (RFC 7616) authentication schemes. Clients SHOULD
make an attempt to recognize unknown schemes using the
``WWW-Authenticate`` response header on a 401 response, as defined by
RFC 7235.

Read-only commands are accessible under ``rw/*`` URLs so clients can
signal the intent of the operation very early in the connection
lifecycle. For example, a ``push`` operation - which consists of
various read-only commands mixed with at least one read-write command -
can perform all commands against ``rw/*`` URLs so that any server-side
authentication requirements are discovered upon attempting the first
command - not potentially several commands into the exchange.
This allows clients to fail faster or prompt for credentials as soon as the
exchange takes place. This provides a better end-user experience.

Requests to unknown commands or URLs result in an HTTP 404.
TODO formally define response type, how error is communicated, etc.

HTTP request and response bodies use the *Unified Frame-Based Protocol*
(defined below) for media exchange. The entirety of the HTTP message
body is 0 or more frames as defined by this protocol.

Clients and servers MUST advertise the ``TBD`` media type via the
``Content-Type`` request and response headers. In addition, clients MUST
advertise this media type value in their ``Accept`` request header in all
requests.
TODO finalize the media type. For now, it is defined in wireprotoserver.py.

Servers receiving requests without an ``Accept`` header SHOULD respond with
an HTTP 406.

Servers receiving requests with an invalid ``Content-Type`` header SHOULD
respond with an HTTP 415.

The command to run is specified in the POST payload as defined by the
*Unified Frame-Based Protocol*. This is redundant with data already
encoded in the URL. This is by design, so server operators can have
better understanding about server activity from looking merely at
HTTP access logs.

In most circumstances, the command specified in the URL MUST match
the command specified in the frame-based payload or the server will
respond with an error. The exception to this is the special
``multirequest`` URL. (See below.) In addition, HTTP requests
are limited to one command invocation. The exception is the special
``multirequest`` URL.

The ``multirequest`` command endpoints (``ro/multirequest`` and
``rw/multirequest``) are special in that they allow the execution of
*any* command and allow the execution of multiple commands. If the
HTTP request issues multiple commands across multiple frames, all
issued commands will be processed by the server.
Per the defined behavior of the *Unified Frame-Based Protocol*, commands may
be issued interleaved and responses may come back in a different order than
they were issued. Clients MUST be able to deal with this.

SSH Protocol
============

Handshake
---------

For all clients, the handshake consists of the client sending 1 or more
commands to the server using version 1 of the transport. Servers respond
to commands they know how to respond to and send an empty response (``0\n``)
for unknown commands (per standard behavior of version 1 of the transport).
Clients then typically look for a response to the newest sent command to
determine which transport version to use and what the available features for
the connection and server are.

Preceding any response from client-issued commands, the server may print
non-protocol output. It is common for SSH servers to print banners, message
of the day announcements, etc. when clients connect. It is assumed that any
such *banner* output will precede any Mercurial server output. So clients
must be prepared to handle server output on initial connect that isn't
in response to any client-issued command and doesn't conform to Mercurial's
wire protocol. This *banner* output should only be on stdout. However,
some servers may send output on stderr.

Pre 0.9.1 clients issue a ``between`` command with the ``pairs`` argument
having the value
``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.

The ``between`` command has been supported since the original Mercurial
SSH server. Requesting the empty range will return a ``\n`` string response,
which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
followed by the value, which happens to be a newline).

For pre 0.9.1 clients and all servers, the exchange looks like::

   c: between\n
   c: pairs 81\n
   c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
   s: 1\n
   s: \n

0.9.1+ clients send a ``hello`` command (with no arguments) before the
``between`` command. The response to this command allows clients to
discover server capabilities and settings.

An example exchange between 0.9.1+ clients and a ``hello`` aware server looks
like::

   c: hello\n
   c: between\n
   c: pairs 81\n
   c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
   s: 324\n
   s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
   s: 1\n
   s: \n

And a similar scenario but with servers sending a banner on connect::

   c: hello\n
   c: between\n
   c: pairs 81\n
   c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
   s: welcome to the server\n
   s: if you find any issues, email someone@somewhere.com\n
   s: 324\n
   s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
   s: 1\n
   s: \n

Note that output from the ``hello`` command is terminated by a ``\n``. This is
part of the response payload and not part of the wire protocol adding a newline
after responses. In other words, the length of the response contains the
trailing ``\n``.

Clients supporting version 2 of the SSH transport send a line beginning
with ``upgrade`` before the ``hello`` and ``between`` commands. The line
(which isn't a well-formed command line because it doesn't consist of a
single command name) serves to both communicate the client's intent to
switch to transport version 2 (transports are version 1 by default) as
well as to advertise the client's transport-level capabilities so the
server may satisfy that request immediately.

The upgrade line has the form:

   upgrade <token> <transport capabilities>

That is the literal string ``upgrade`` followed by a space, followed by
a randomly generated string, followed by a space, followed by a string
denoting the client's transport capabilities.

The token can be anything. However, a random UUID is recommended. (Use
of version 4 UUIDs is recommended because version 1 UUIDs can leak the
client's MAC address.)

The transport capabilities string is a URL/percent encoded string
containing key-value pairs defining the client's transport-level
capabilities. The following capabilities are defined:

proto
   A comma-delimited list of transport protocol versions the client
   supports. e.g. ``ssh-v2``.

If the server does not recognize the ``upgrade`` line, it should issue
an empty response and continue processing the ``hello`` and ``between``
commands. Here is an example handshake between a version 2 aware client
and a non version 2 aware server:

   c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
   c: hello\n
   c: between\n
   c: pairs 81\n
   c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
   s: 0\n
   s: 324\n
   s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
   s: 1\n
   s: \n

(The initial ``0\n`` line from the server indicates an empty response to
the unknown ``upgrade ..`` command/line.)

If the server recognizes the ``upgrade`` line and is willing to satisfy that
upgrade request, it replies with a payload of the following form:

   upgraded <token> <transport name>\n

This line is the literal string ``upgraded``, a space, the token that was
specified by the client in its ``upgrade ...`` request line, a space, and the
name of the transport protocol that was chosen by the server. The transport
name MUST match one of the names the client specified in the ``proto`` field
of its ``upgrade ...`` request line.
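
The ``upgrade`` request line and the check of the server's ``upgraded`` reply
can be sketched as follows. The function names ``build_upgrade_line`` and
``parse_upgraded_line`` are hypothetical, and the sketch assumes the request
line is terminated by ``\n`` like other lines sent by the client, which the
form shown above does not state explicitly.

```python
import uuid
from urllib.parse import urlencode


def build_upgrade_line(protos, token=None):
    """Hypothetical helper: return (token, upgrade request line)."""
    # A random (version 4) UUID is the recommended token.
    token = token or str(uuid.uuid4())
    # The capabilities string is percent-encoded key-value pairs.
    caps = urlencode({"proto": ",".join(protos)})
    # Assumption: the line ends with a newline.
    return token, ("upgrade %s %s\n" % (token, caps)).encode("ascii")


def parse_upgraded_line(line, token, protos):
    """Hypothetical helper: return the chosen transport name, or None.

    None means the line is not an ``upgraded`` reply carrying our token,
    for example banner output or a version 1 response.
    """
    parts = line.rstrip(b"\n").split(b" ")
    if len(parts) != 3 or parts[0] != b"upgraded":
        return None
    if parts[1] != token.encode("ascii"):
        return None
    name = parts[2].decode("ascii")
    # The server MUST pick one of the names the client offered.
    if name not in protos:
        raise ValueError("server chose unoffered transport: %s" % name)
    return name
```

Matching on the token rather than on the ``upgraded`` keyword alone is what
lets a client tell a real reply apart from banner text.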

If a server issues an ``upgraded`` response, it MUST also read and ignore
the lines associated with the ``hello`` and ``between`` command requests
that were issued by the client. It is assumed that the negotiated transport
will respond with equivalent requested information following the transport
handshake.

All data following the ``\n`` terminating the ``upgraded`` line is the
domain of the negotiated transport. It is common for the data immediately
following to contain additional metadata about the state of the transport and
the server. However, this isn't strictly speaking part of the transport
handshake and isn't covered by this section.

Here is an example handshake between a version 2 aware client and a version
2 aware server:

   c: upgrade 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a proto=ssh-v2
   c: hello\n
   c: between\n
   c: pairs 81\n
   c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
   s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
   s: <additional transport specific data>

The client-issued token that is echoed in the response provides a more
resilient mechanism for differentiating *banner* output from Mercurial
output. In version 1, properly formatted banner output could get confused
for Mercurial server output. By submitting a randomly generated token
that is then present in the response, the client can look for that token
in response lines and have reasonable certainty that the line did not
originate from a *banner* message.

SSH Version 1 Transport
-----------------------

The SSH transport (version 1) is a custom text-based protocol suitable for
use over any bi-directional stream transport. It is most commonly used with
SSH.

A SSH transport server can be started with ``hg serve --stdio``. The stdin,
stderr, and stdout file descriptors of the started process are used to
exchange data. When Mercurial connects to a remote server over SSH, it
actually starts a ``hg serve --stdio`` process on the remote server.

Commands are issued by sending the command name followed by a trailing newline
``\n`` to the server. e.g. ``capabilities\n``.

Command arguments are sent in the following format::

    <argument> <length>\n<value>

That is, the argument string name followed by a space followed by the
integer length of the value (expressed as a string) followed by a newline
(``\n``) followed by the raw argument value.

Dictionary arguments are encoded differently::

    <argument> <# elements>\n
    <key1> <length1>\n<value1>
    <key2> <length2>\n<value2>
    ...

Non-argument data is sent immediately after the final argument value. It is
encoded in chunks::

    <length>\n<data>

Each command declares a list of supported arguments and their types. If a
client sends an unknown argument to the server, the server should abort
immediately. The special argument ``*`` in a command's definition indicates
that all argument names are allowed.

The definition of supported arguments and types is initially made when a
new command is implemented. The client and server must initially independently
agree on the arguments and their types. This initial set of arguments can be
supplemented through the presence of *capabilities* advertised by the server.

Each command has a defined expected response type.

A ``string`` response type is a length framed value. The response consists of
the string encoded integer length of a value followed by a newline (``\n``)
followed by the value. Empty values are allowed (and are represented as
``0\n``).

A ``stream`` response type consists of raw bytes of data. There is no framing.

A generic error response type is also supported. It consists of an error
message written to ``stderr`` followed by ``\n-\n``. In addition, ``\n`` is
written to ``stdout``.

If the server receives an unknown command, it will send an empty ``string``
response.

The server terminates if it receives an empty command (a ``\n`` character).
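
The version 1 request encoding and the ``string`` response framing described
above can be sketched as follows. The helper names ``encode_command`` and
``read_string_response`` are hypothetical, and the sketch assumes argument
names are ASCII and argument values are already ``bytes``.

```python
import io


def encode_command(name, args):
    """Hypothetical helper: serialize a command and its arguments."""
    out = [name.encode("ascii") + b"\n"]
    for key, value in args.items():
        key = key.encode("ascii")
        if isinstance(value, dict):
            # Dictionary argument: element count, then each key/value pair.
            out.append(b"%s %d\n" % (key, len(value)))
            for subkey, subvalue in value.items():
                out.append(b"%s %d\n%s" % (subkey.encode("ascii"),
                                            len(subvalue), subvalue))
        else:
            # Plain argument: name, byte length, newline, raw value.
            out.append(b"%s %d\n%s" % (key, len(value), value))
    return b"".join(out)


def read_string_response(stream):
    """Hypothetical helper: read one length-framed ``string`` response."""
    length = int(stream.readline())
    return stream.read(length)


# The handshake's ``between`` request for the empty range.
null_range = b"0" * 40 + b"-" + b"0" * 40
request = encode_command("between", {"pairs": null_range})
reply = read_string_response(io.BytesIO(b"1\n\n"))
```

Here ``request`` starts with ``between\npairs 81\n`` and ``reply`` is the
single newline described in the handshake section. A real reader would also
have to cope with short reads and banner output.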

If the server announces support for the ``protocaps`` capability, the client
should issue a ``protocaps`` command after the initial handshake to announce
its own capabilities. The client capabilities are persistent.

SSH Version 2 Transport
-----------------------

**Experimental and under development**

Version 2 of the SSH transport behaves identically to version 1 of the SSH
transport with the exception of handshake semantics. See above for how
version 2 of the SSH transport is negotiated.

Immediately following the ``upgraded`` line signaling a switch to version
2 of the SSH protocol, the server automatically sends additional details
about the capabilities of the remote server. This has the form:

   <integer length of value>\n
   capabilities: ...\n

e.g.

   s: upgraded 2e82ab3f-9ce3-4b4e-8f8c-6fd1c0e9e23a ssh-v2\n
   s: 240\n
   s: capabilities: known getbundle batch ...\n

Following capabilities advertisement, the peers communicate using version
1 of the SSH transport.

Unified Frame-Based Protocol
============================

**Experimental and under development**

The *Unified Frame-Based Protocol* is a communications protocol between
Mercurial peers. The protocol aims to be mostly transport agnostic
(works similarly on HTTP, SSH, etc).

To operate the protocol, a bi-directional, half-duplex pipe supporting
ordered sends and receives is required. That is, each peer has one pipe
for sending data and another for receiving.

All data is read and written in atomic units called *frames*. These
are conceptually similar to TCP packets. Higher-level functionality
is built on the exchange and processing of frames.

All frames are associated with a *stream*. A *stream* provides a
unidirectional grouping of frames. Streams facilitate two goals:
content encoding and parallelism. There is a dedicated section on
streams below.

The protocol is request-response based: the client issues requests to
the server, which issues replies to those requests.
Server-initiated messaging is not currently supported, but this
specification carves out room to implement it.

All frames are associated with a numbered request. Frames can thus
be logically grouped by their request ID.

Frames begin with an 8 octet header followed by a variable length
payload::

    +------------------------------------------------+
    |                 Length (24)                    |
    +--------------------------------+---------------+
    |         Request ID (16)        | Stream ID (8) |
    +------------------+-------------+---------------+
    | Stream Flags (8) |
    +-----------+------+
    | Type (4)  |
    +-----------+
    | Flags (4) |
    +===========+===================================================|
    |                     Frame Payload (0...)                    ...
    +---------------------------------------------------------------+

The length of the frame payload is expressed as an unsigned 24 bit
little endian integer. Values larger than 65535 MUST NOT be used unless
given permission by the server as part of the negotiated capabilities
during the handshake. The frame header is not part of the advertised
frame length. The payload length is the over-the-wire length. If there
is content encoding applied to the payload as part of the frame's stream,
the length is the output of that content encoding, not the input.

The 16-bit ``Request ID`` field denotes the integer request identifier,
stored as an unsigned little endian integer. Odd numbered requests are
client-initiated. Even numbered requests are server-initiated. This
refers to where the *request* was initiated - not where the *frame* was
initiated, so servers will send frames with odd ``Request ID`` in
response to client-initiated requests. Implementations are advised to
start ordering request identifiers at ``1`` and ``0``, increment by
``2``, and wrap around if all available numbers have been exhausted.

The 8-bit ``Stream ID`` field denotes the stream that the frame is
associated with.
Frames belonging to a stream may have content
encoding applied and the receiver may need to decode the raw frame
payload to obtain the original data. Odd numbered IDs are
client-initiated. Even numbered IDs are server-initiated.

The 8-bit ``Stream Flags`` field defines stream processing semantics.
See the section on streams below.

The 4-bit ``Type`` field denotes the type of frame being sent.

The 4-bit ``Flags`` field defines special, per-type attributes for
the frame.

The sections below define the frame types and their behavior.

Command Request (``0x01``)
--------------------------

This frame contains a request to run a command.

The payload consists of a CBOR map defining the command request. The
bytestring keys of that map are:

name
   Name of the command that should be executed (bytestring).
args
   Map of bytestring keys to various value types containing the named
   arguments to this command.

   Each command defines its own set of argument names and their expected
   types.

This frame type MUST ONLY be sent from clients to servers: it is illegal
for a server to send this frame to a client.

The following flag values are defined for this type:

0x01
   New command request. When set, this frame represents the beginning
   of a new request to run a command. The ``Request ID`` attached to this
   frame MUST NOT be active.
0x02
   Command request continuation. When set, this frame is a continuation
   from a previous command request frame for its ``Request ID``. This
   flag is set when the CBOR data for a command request does not fit
   in a single frame.
0x04
   Additional frames expected. When set, the command request didn't fit
   into a single frame and additional CBOR data follows in a subsequent
   frame.
0x08
   Command data frames expected. When set, command data frames are
   expected to follow the final command request frame for this request.

``0x01`` MUST be set on the initial command request frame for a
``Request ID``.
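
The 8 octet frame header and the command request flags described above can be
sketched as follows. The names ``pack_frame`` and ``unpack_header`` are
hypothetical. The sketch also assumes that ``Type`` occupies the high 4 bits
and ``Flags`` the low 4 bits of the final header octet; the diagram gives the
field order but not the bit order, so treat that detail as a guess. The
payload is passed as opaque bytes, so CBOR encoding is left out.

```python
import struct

FRAME_HEADER_SIZE = 8
FRAME_TYPE_COMMAND_REQUEST = 0x01
FLAG_COMMAND_REQUEST_NEW = 0x01


def pack_frame(request_id, stream_id, stream_flags, frame_type, flags,
               payload):
    """Hypothetical helper: prepend the 8 octet header to ``payload``."""
    if len(payload) > 0xFFFFFF:
        raise ValueError("payload does not fit in a 24-bit length")
    if frame_type > 0x0F or flags > 0x0F:
        raise ValueError("type and flags are 4-bit fields")
    header = bytearray(FRAME_HEADER_SIZE)
    # 24-bit little endian length: low 3 bytes of a 32-bit value.
    header[0:3] = struct.pack("<I", len(payload))[:3]
    # 16-bit request ID, 8-bit stream ID, 8-bit stream flags.
    struct.pack_into("<HBB", header, 3, request_id, stream_id, stream_flags)
    # Assumption: type in the high nibble, flags in the low nibble.
    header[7] = (frame_type << 4) | flags
    return bytes(header) + payload


def unpack_header(header):
    """Hypothetical helper: split an 8 octet header into its fields."""
    length = int.from_bytes(header[0:3], "little")
    request_id, stream_id, stream_flags, type_flags = struct.unpack_from(
        "<HBBB", header, 3)
    return (length, request_id, stream_id, stream_flags,
            type_flags >> 4, type_flags & 0x0F)


# First client-initiated request (ID 1) on client-initiated stream 1.
frame = pack_frame(1, 1, 0, FRAME_TYPE_COMMAND_REQUEST,
                   FLAG_COMMAND_REQUEST_NEW, b"abc")
```

The length field counts only the payload, so the 3-byte payload here yields
an 11-byte frame.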

``0x01`` or ``0x02`` MUST be set to indicate this frame's role in
a series of command request frames.

If command data frames are to be sent, ``0x08`` MUST be set on ALL
command request frames.

Command Data (``0x02``)
-----------------------

This frame contains raw data for a command.

Most commands can be executed by specifying arguments. However,
arguments have an upper bound to their length. For commands that
accept data that is beyond this length or whose length isn't known
when the command is initially sent, they will need to stream
arbitrary data to the server. This frame type facilitates the sending
of this data.

The payload of this frame type consists of a stream of raw data to be
consumed by the command handler on the server. The format of the data
is command specific.

The following flag values are defined for this type:

0x01
   Command data continuation. When set, the data for this command
   continues into a subsequent frame.

0x02
   End of data. When set, command data has been fully sent to the
   server. The command has been fully issued and no new data for this
   command will be sent. The next frame will belong to a new command.

Command Response Data (``0x03``)
--------------------------------

This frame contains response data to an issued command.

Response data ALWAYS consists of a series of 0 or more CBOR encoded
values. A CBOR value may be using indefinite length encoding. And the
bytes constituting the value may span several frames.

The following flag values are defined for this type:

0x01
   Data continuation. When set, an additional frame containing response data
   will follow.
0x02
   End of data. When set, the response data has been fully sent and
   no additional frames for this response will be sent.

The ``0x01`` flag is mutually exclusive with the ``0x02`` flag.

Error Response (``0x05``)
-------------------------

An error occurred when processing a request.
This could indicate a protocol-level failure or an application level
failure depending on the flags for this message type.

The payload for this type is an error message that should be
displayed to the user.

The following flag values are defined for this type:

0x01
   The error occurred at the transport/protocol level. If set, the
   connection should be closed.
0x02
   The error occurred at the application level. e.g. invalid command.

Human Output Side-Channel (``0x06``)
------------------------------------

This frame contains a message that is intended to be displayed to
people. Whereas most frames communicate machine readable data, this