

Waterken™ Web

Abstract Messaging Protocol

2003-03-22

This specification defines an abstract messaging protocol for a distributed implementation of the Waterken™ Web calculus. [web-amp]

Abstract

A remote edge in the Waterken™ Web calculus is represented by an unguessable URI. An operation message sent using such a URI additionally specifies an unguessable URI for a return continuation on the client host. The return URI is used to implement reliable messaging between remote hosts.

Overview

Design goals

  1. The capability semantics of edges is preserved.
  2. A secure model for handling transmission failures is supported.
  3. Features that magnify a denial-of-service attack are not required.
  4. The protocol is easy to understand and use.
  5. A simple implementation of the protocol is possible.
  6. Interoperation in a heterogeneous network environment is supported.

Capability semantics

The edges in a Waterken™ Web have capability semantics. The remote referencing architecture must preserve these semantics.

Reliable message delivery

The protocol is designed to be used over unreliable networks. "Unreliable" means that messages might be lost or be delivered to a host more than once. The protocol is designed to ensure that a sent message will eventually be processed and that it will be processed at most once. This guarantee survives failure of network connections and temporary failure of both the client and server computers. Given this guarantee, client code need only consider two possible outcomes of a message send: the message is processed once; or the message is never processed because the target does not exist.

Limiting a denial-of-service attack

A "denial-of-service" attack occurs when a user consumes a large enough portion of a service's resources that other users are prevented from using the service. The attacker may generate a volume of service requests that makes up a significant portion of the service's maximum throughput. The attacker may send requests that consume a disproportionate amount of the service's resources as compared to "normal" requests. The first attack is a brute force attack upon the service's resource management. The second attack is an exploit of a flaw in the service's resource management.

Consonant with its design goals, the Waterken™ Web protocol excludes features whose implementations require exploitable flaws in their resource management. Accomplishing this goal requires limiting both the amount of resources consumed by a request and the ability of the user to schedule the consumption of the resources. If a service serves multiple distinct users, every message processed by the service should consume a similar amount of resources. A user must not be able to schedule delayed processing of his requests to force sequential processing of a large group of requests. To ensure that these limitations can be met, the protocol is designed to consume a fixed amount of resources for each processed message. The protocol also gives a service great freedom in scheduling message processing.

Once resource management flaws are eliminated, a denial-of-service attack can succeed only if the attacker has resources comparable in size to those of the attacked service. [ddos]

Ease of use

The Waterken™ Web calculus defines a generic interface that wraps the native interface of a service. If the native interface is easier to use than the generic one, programmers will prefer the native interface. To prevent this phenomenon, the protocol's usability must be great enough that the native protocol cannot compete based on ease of use.

Simple implementation

The Waterken™ Web calculus defines a generic interface for accessing any type of service. As predicted by Metcalfe's Law, the value of such an interface grows with the number of services that implement the interface. Simplifying implementation of the interface facilitates this phenomenon.

Interoperation in a heterogeneous network

A variety of network protocols exist for communicating between hosts. Some of these network protocols may be preferable in some situations and not in others. Some hosts may only support a subset of these protocols. To support interoperation in such a heterogeneous network environment, this messaging protocol is designed to be independent of the underlying network protocol. Protocol independence is achieved by specifying both the information to transfer and how to act upon that information. Other Waterken™ specifications specify how this information is transferred using a particular network protocol. These specifications handle issues such as connection negotiation and management.

Description

Implementing the Waterken™ Web calculus across multiple hosts requires: a mechanism for implementing an edge that crosses the boundary from a client host to a server host; and a mechanism for reliably sending an operation along such an edge and receiving the return value. The implementation of a cross-host edge is described first. An explanation of reliable operation transmission follows.

Capability URI

A cross-host Waterken™ Web edge is implemented by a capability URI. A capability URI must provide:

The URI scheme may identify the host by directly giving its location or by specifying a locating service which can be consulted to ascertain the host's current location. The use of a locating service is encouraged. Any URI scheme which meets these requirements can be used by this protocol.

URI lifetime

A cross-host edge does not prevent the target node from being garbage collected by its host. A host is only required to guarantee that the node lifetime is continuous. Once the host reports a node as deleted, it must forever report the same status. Deletion of a particular node is typically determined by the host's application logic.
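
The lifetime rule can be illustrated with a small sketch. The NodeTable class and its method names are hypothetical, and the sketch assumes that a host keeps a tombstone for each deleted GUID so that its answer never changes.

```python
from typing import Any, Dict, Set


class NodeTable:
    """Hypothetical host-side table of exported nodes, keyed by GUID."""

    def __init__(self) -> None:
        self._live: Dict[str, Any] = {}
        self._deleted: Set[str] = set()   # tombstones, never cleared

    def export(self, guid: str, node: Any) -> None:
        # A deleted GUID is never reused, so its status cannot flip back.
        if guid in self._deleted:
            raise ValueError("GUID was already reported as deleted")
        self._live[guid] = node

    def delete(self, guid: str) -> None:
        self._live.pop(guid, None)
        self._deleted.add(guid)

    def status(self, guid: str) -> str:
        if guid in self._deleted:
            return "deleted"
        return "live" if guid in self._live else "unknown"
```

A host that never reuses GUIDs could equally report every unknown GUID as deleted and skip the tombstone set; the sketch keeps it only to make the rule explicit.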

A common design pattern is to treat each host as a separate space bank. All URIs exported by the space bank remain valid during the lifetime of the space bank. When the space bank is destroyed, so are all nodes in the space bank. The overall application design ensures that this large-grained resource management happens as a natural consequence of application interaction.

Sending an operation

A client sends an operation to a server by first reifying the operation in a message. A message is an <http://waterken.com/amp/Envelope> which specifies: the URI for the source node; the path to the target node; the operation; the argument list; and the URI for the return continuation. The message is sent to the server using the URI for the source node.

For each request message, the client host SHOULD create a continuation for receiving the operation return value. The URI for this node is specified by the 'reply_to' member. If specified, the 'reply_to' URI MUST be unique to the message. [REPLY-TO]
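
As a rough illustration, the envelope described above might be modelled as follows. Only the 'reply_to' member name comes from this specification; the other field names (src, path, op, args), the example URIs, and the new_reply_to helper are assumptions made for the sketch.

```python
import secrets
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass(frozen=True)
class Envelope:
    """Hypothetical in-memory form of an <http://waterken.com/amp/Envelope>."""
    src: str                        # URI for the source node (assumed name)
    path: List[str]                 # path to the target node (assumed name)
    op: str                         # the operation, e.g. "GET" (assumed name)
    args: List[Any] = field(default_factory=list)  # argument list (assumed name)
    reply_to: Optional[str] = None  # URI for the return continuation


def new_reply_to(client_base: str) -> str:
    """Mint an unguessable return-continuation URI, unique to one message."""
    return client_base + secrets.token_hex(16)


request = Envelope(
    src="https://server.example/abcdefghijklmnop",   # made-up capability URI
    path=["balance"],
    op="GET",
    reply_to=new_reply_to("https://client.example/"),
)
```

Drawing the 'reply_to' GUID from a cryptographic random source is one way to make it both unguessable and unique to the message.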

Generating the promise URI

To support message pipelining, providing a return continuation implicitly creates a promise for the operation return value. The client can use this promise in the construction of subsequent messages. [E]

A <http://waterken.com/doc/pointer/Pipeline> is a promise for an operation return value. The return value is identified by two URIs: one that specifies the eventual location of the return value on the server host; and another that specifies the eventual location of the return value on the client host. The server URI is specified by the 'promise' member. The client URI is contained in the 'super' member. Both URIs contain the same GUID. The promise GUID is generated according to the formula:

    promise_guid = to_base32(md5(to_ascii(source_guid + reply_to_guid)))

The input to the MD5 hash function is the concatenation of: the ASCII bytes representing the GUID in the source URI; and the ASCII bytes representing the 'reply_to' GUID, generated by the client.

The binary output of the MD5 hash function is used to generate the promise GUID. The promise GUID is the non-padded base32 encoding of the computed hash. [base32]
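
The formula can be sketched in Python using the standard hashlib and base64 modules. The function name and the example GUIDs are illustrative only; the lower-case alphabet follows the [base32] footnote.

```python
import base64
import hashlib


def promise_guid(source_guid: str, reply_to_guid: str) -> str:
    """Derive the promise GUID from the source GUID and the 'reply_to' GUID."""
    # Hash the ASCII bytes of the two GUIDs, concatenated in that order.
    digest = hashlib.md5((source_guid + reply_to_guid).encode("ascii")).digest()
    # b32encode emits upper case with '=' padding; the footnoted alphabet is
    # { a-z, 2-7 } without padding, so lower-case the text and strip the padding.
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


# Made-up GUIDs, for illustration only.
guid = promise_guid("abcdefghijklmnop", "qrstuvwxyz234567")
```

Because both hosts know the source GUID and the 'reply_to' GUID, each can compute the same promise GUID without an extra round trip. The 16-byte MD5 digest always encodes to 26 base32 characters once the padding is removed.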

Responding to a request

A server MUST immediately respond to a received request. The server may execute the operation and respond with the operation return value, or the server may respond with a promise for the return value. The promise is an <http://waterken.com/doc/pointer/Embed> containing the URI that will eventually be assigned to the return value.

In either case, the response is a POST operation. The operation source is the return continuation from the request <http://waterken.com/amp/Envelope>. The operation path is empty. The sole argument in the argument list is the return value.

After processing a request, the server SHOULD bind the implicit promise GUID, generated according to the procedure specified in the Generating the promise URI section, to a locally held instance of the return value. The server MAY skip this step, but in this case, the benefits of message pipelining will be lost.
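
A minimal sketch of the server side, under stated assumptions: the response is modelled as a plain dictionary whose keys (src, path, op, args) are invented for illustration, and the Server class with its nodes table is hypothetical. The promise_guid helper restates the formula from the Generating the promise URI section.

```python
import base64
import hashlib
from typing import Any, Callable, Dict


def promise_guid(source_guid: str, reply_to_guid: str) -> str:
    digest = hashlib.md5((source_guid + reply_to_guid).encode("ascii")).digest()
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


class Server:
    def __init__(self) -> None:
        self.nodes: Dict[str, Any] = {}   # GUID -> locally held value

    def handle(self, source_guid: str, reply_to_uri: str, reply_to_guid: str,
               operation: Callable[[], Any]) -> Dict[str, Any]:
        value = operation()
        # Bind the implicit promise GUID to the return value so that
        # pipelined requests sent on the promise can find it.
        self.nodes[promise_guid(source_guid, reply_to_guid)] = value
        # The response is a POST whose source is the return continuation,
        # with an empty path and the return value as the sole argument.
        return {"src": reply_to_uri, "path": [], "op": "POST", "args": [value]}
```

This sketch executes the operation before responding; a server that defers execution would respond with a promise instead and bind the GUID once the value is known.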

Redirecting a request

A server can redirect a request by returning a <http://waterken.com/amp/Redirect>. The client SHOULD re-dispatch the operation, using the specified 'src' node and 'path'.
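
Client-side handling of a redirect might look like the sketch below. The 'src' and 'path' member names come from this section; the Envelope fields, the send callable, and the max_hops limit are assumptions, as is the choice to keep the 'reply_to' URI unchanged when re-dispatching.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable, List, Optional


@dataclass(frozen=True)
class Envelope:
    src: str
    path: List[str]
    op: str
    args: tuple = ()
    reply_to: Optional[str] = None


@dataclass(frozen=True)
class Redirect:
    """Stand-in for an <http://waterken.com/amp/Redirect> response."""
    src: str
    path: List[str]


def dispatch(request: Envelope, send: Callable[[Envelope], Any],
             max_hops: int = 5) -> Any:
    """Send a request, following redirects up to an assumed hop limit."""
    for _ in range(max_hops):
        response = send(request)
        if not isinstance(response, Redirect):
            return response
        # Re-dispatch the same operation on the node named by the redirect.
        request = replace(request, src=response.src, path=response.path)
    raise RuntimeError("too many redirects")
```

The hop limit is not part of the protocol; it only guards the sketch against a redirect loop.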

Recovering from a lost request

If a client does not receive a response from a server, the client SHOULD simply resend the request <http://waterken.com/amp/Envelope>.

The client MUST obey any backoff requirements that the underlying network protocol specifies for retrying a connection.
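
The resend rule can be sketched as a retry loop. Everything here is an assumption made for illustration: the send_once callable returns None when no response arrives, and the doubling delay merely stands in for whatever backoff the underlying network protocol actually requires.

```python
import time
from typing import Any, Callable, Optional


def send_reliably(envelope: Any,
                  send_once: Callable[[Any], Optional[Any]],
                  attempts: int = 5,
                  first_delay: float = 0.5,
                  sleep: Callable[[float], None] = time.sleep) -> Optional[Any]:
    """Resend the identical envelope until a response arrives."""
    delay = first_delay
    for attempt in range(attempts):
        response = send_once(envelope)   # None means the response was lost
        if response is not None:
            return response
        if attempt + 1 < attempts:
            sleep(delay)                 # placeholder backoff policy
            delay *= 2
    return None                          # caller may try again later
```

The envelope is resent unchanged, including its 'reply_to' URI, so that the server can recognize the retry as a duplicate rather than a new request.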

Handling a duplicate request

The operations GET, EXPECT, EXTRACT, and SETTLE are idempotent. If a server receives a duplicate request for one of these operations, it can simply process the request as if it were the first.

A POST operation might not be idempotent. A server MUST NOT appear to process a POST operation more than once. A POST request is uniquely identified by its 'reply_to' URI. If a server receives a duplicate POST request, it MUST respond to the request using the previously calculated return value, rather than executing the invocation again. The server MUST maintain this ability until the client confirms receipt of the return value. If the client has confirmed receipt of the return value, the server can ignore future duplicate requests. [idempotent]
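
One way a server might meet these requirements is to key the stored return values by 'reply_to' URI. The PostHandler class, its method names, and the IGNORED sentinel are hypothetical, and the sketch holds its state in memory, whereas a real server would need it to survive a restart.

```python
from typing import Any, Callable, Dict, Set

IGNORED = object()   # sentinel: duplicate arrived after receipt was confirmed


class PostHandler:
    def __init__(self) -> None:
        self.pending: Dict[str, Any] = {}   # reply_to -> stored return value
        self.confirmed: Set[str] = set()    # reply_to URIs already confirmed

    def post(self, reply_to: str, invoke: Callable[[], Any]) -> Any:
        if reply_to in self.confirmed:
            return IGNORED                  # late duplicate, safe to ignore
        if reply_to not in self.pending:
            self.pending[reply_to] = invoke()   # execute at most once
        # A duplicate gets the previously calculated return value.
        return self.pending[reply_to]

    def confirm(self, reply_to: str) -> None:
        """The client has confirmed receipt; the return value can be dropped."""
        self.pending.pop(reply_to, None)
        self.confirmed.add(reply_to)
```

Remembering confirmed 'reply_to' URIs, rather than forgetting them, keeps a late duplicate from being mistaken for a new request.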

If the source node of a POST request is not a local pass-by-reference node, the server MUST NOT process the request. The server MUST respond to the request by returning a <http://waterken.com/amp/Redirect>. If the source node exists on more than one host, the server cannot guarantee that the request will be processed only once. A POST initiated from a pass-by-copy node can only be processed by the host that originated the operation.

Recovering from a broken pipeline

A server may refuse to pipeline some or all requests. In this case, the server will respond to requests on the promise URI by returning a <http://waterken.com/amp/NotFound>. The client SHOULD handle this case by waiting for the promise to settle and then re-dispatching the operation on the resolved value of the promise. [multiple connections]

The above recovery procedure covers two possible cases: the server refused to pipeline the request, meaning that the promise URI was never created; or the promised node was deleted. The recovery procedure correctly handles both of these cases.

If the promise URI was never created, the request will never be processed. The re-dispatched operation takes the place of the rejected request.

In the other possible case, the promise URI was created, but the promised node was deleted before the request was processed. This situation may occur for multiple reasons, such as: the application logic for resource management is faulty, causing the promised node to be prematurely deleted; or a man-in-the-middle attacker intercepted the request response and then later resent the request, after the promised node was deleted. This case itself has two possible sub-cases: the promised node was pass-by-reference; or the promised node was pass-by-copy. If the promised node was pass-by-reference, the re-dispatched operation will be rejected just as the original operation was. If the promised node was pass-by-copy, the original request is guaranteed to produce no side-effects.

In all cases, re-dispatching the operation can have no visible side-effects.
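
The recovery procedure can be sketched as follows, assuming a single connection to the server. The NotFound class stands in for the response type, and the send and wait_settled callables are hypothetical hooks into the client's messaging layer.

```python
from typing import Any, Callable


class NotFound:
    """Stand-in for an <http://waterken.com/amp/NotFound> response."""


def send_pipelined(promise_uri: str,
                   send: Callable[[str], Any],
                   wait_settled: Callable[[], str]) -> Any:
    """Send an operation on a promise URI, recovering if the pipeline broke.

    send(uri) sends the operation to the given target URI.
    wait_settled() blocks until the promise settles and returns the
    URI of its resolved value.
    """
    response = send(promise_uri)
    if isinstance(response, NotFound):
        # The server refused to pipeline, or the promised node was deleted:
        # re-dispatch the operation on the resolved value instead.
        response = send(wait_settled())
    return response
```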

Passing a promise to another host

A client host may pass a generated, or received, promise to a server host; however, the server host MAY reject the promise. If the server host is unwilling to accept the promise, it MUST respond to the request by returning a <http://waterken.com/amp/Rejection> that indicates the rejected promise. The client host SHOULD handle this case by waiting for the specified promise to settle. After the promise has settled, the client SHOULD rewrite the request, replacing the promise with its settled value. The rewritten request can then be retried.
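
The rewrite-and-retry step might be sketched as below. The Promise and Rejection classes, the 'promise' field on Rejection, and both callables are assumptions made for illustration; the specification does not name the members of a Rejection.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class Promise:
    uri: str


@dataclass(frozen=True)
class Rejection:
    """Stand-in for an <http://waterken.com/amp/Rejection> response."""
    promise: Promise      # assumed member naming the rejected promise


def send_resolving_rejections(args: List[Any],
                              send: Callable[[List[Any]], Any],
                              wait_settled: Callable[[Promise], Any]) -> Any:
    """Send a request, replacing each rejected promise with its settled value."""
    args = list(args)
    while True:
        response = send(args)
        if not isinstance(response, Rejection):
            return response
        if response.promise not in args:
            raise ValueError("server rejected a promise that was not sent")
        settled = wait_settled(response.promise)
        # Rewrite the request with the settled value, then retry.
        args = [settled if a == response.promise else a for a in args]
```

Each pass through the loop removes at least one promise from the argument list, so the loop ends after at most one retry per promise sent.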

Solving the race condition

Sending a request on a received promise creates a race condition. The sent request may arrive at the server host before the request that generates the promise URI. In this case, the server will reject the request by returning a <http://waterken.com/amp/NotFound>. The client host SHOULD handle this case by waiting until the promise is settled, and then resending the request to the server. If the server again rejects the request, the client SHOULD re-dispatch the operation on the settled value of the promise.

As with Recovering from a broken pipeline, the above recovery procedure covers multiple cases. From the point after the request is repeated, the logic is the same as before. The preceding step of resending the request, before starting the main recovery procedure, is necessary in order to prevent possible duplication of an operation.
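
The two-step procedure can be sketched as follows. As in the other sketches, NotFound stands in for the response type, and the send and wait_settled callables are hypothetical.

```python
from typing import Any, Callable


class NotFound:
    """Stand-in for an <http://waterken.com/amp/NotFound> response."""


def send_on_received_promise(promise_uri: str,
                             send: Callable[[str], Any],
                             wait_settled: Callable[[], str]) -> Any:
    response = send(promise_uri)
    if not isinstance(response, NotFound):
        return response
    settled_uri = wait_settled()   # blocks until the promise settles
    # Step 1: resend the same request. The request that generates the
    # promise URI may simply have arrived after this one.
    response = send(promise_uri)
    if not isinstance(response, NotFound):
        return response
    # Step 2: the pipeline really is broken, so re-dispatch the operation
    # on the settled value of the promise.
    return send(settled_uri)
```

Skipping step 1 and going straight to the settled value is what would risk running an operation twice, since the first copy of the request may still be processed on the promise URI.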

Requirements for the network protocol

A network protocol that provides a binding for the Waterken™ Web abstract messaging protocol must support:

  • a mechanism for locating the server host based on the source URI
  • a mechanism for authenticating the server host
  • a mechanism to protect the secrecy of a message while in transit between the client and server
  • a mechanism to send a message and receive the response

Footnotes

[web-amp] In informal discussions, this protocol may be referred to as the "web-amp". Otherwise, the protocol MUST be referred to as the "Waterken™ Web abstract messaging protocol". Such references SHOULD link to this specification at: <http://www.waterken.com/dev/Web/AMP/>.

[ddos] Given the current state of the internet, this advantage is largely academic. Almost all computers on the internet are running operating systems that allow a remote attacker to take control of the computer. This means that a user can mount a distributed-denial-of-service attack on a service by using the computing resources of other users. In this way, it is easy to acquire the resources needed to mount a brute force denial-of-service attack on any service. Hopefully someday we will live in a world where script kiddies cannot unilaterally appropriate vast portions of the world's computing resources.

[REPLY-TO] The 'reply_to' request member is similar in purpose to the combination of the REPLY-TO and complaint continuations used in ACT 1.

[E] The message pipelining feature is inspired by that in E. See the E description of message pipelining.

[base32] To ensure compatibility with existing protocols and filesystems, the generated names must not rely on case-sensitivity. It is also desirable to keep the URI length as short as possible. The base32 encoding uses the alphabet { a-z, 2-7 }. See: Base Encoding of Data.

[idempotent] If a server can prove that a given invocation is idempotent, it need not guard against receiving duplicates.

[multiple connections] The recovery procedure assumes that all requests are sent on a single connection from the client to the server. If multiple connections are used to send requests from the client to the server, the first request that generates the pipeline may arrive after the request sent on the pipeline. In this case, the client MUST use the recovery procedure specified in Solving the race condition.


Copyright 2002 - 2003 Waterken Inc. All rights reserved.
