web-calculus

Abstract Messaging Protocol

2006-02-03

This specification defines an abstract messaging protocol, the web-amp, for a distributed implementation of the web-calculus.

Abstract

A remote edge in the web-calculus is represented by an unguessable URL. Non-idempotent operations additionally provide a message identifier used to implement reliable messaging between hosts.

Overview

Design goals

  1. The capability semantics of edges is preserved.
  2. A secure model for handling transmission failures is supported.
  3. Features that magnify a denial-of-service attack are not required.
  4. The protocol is easy to understand and use.
  5. A simple implementation of the protocol is possible.
  6. Interoperation in a heterogeneous network environment is supported.

Capability semantics

The edges in the web-calculus have capability semantics. The remote referencing architecture must preserve these semantics.

Reliable message delivery

The protocol is designed to be used over unreliable networks. "Unreliable" means that messages might be lost or be delivered to a host more than once. The protocol is designed to ensure that, so long as the target exists, a sent message will eventually be processed and that it will be processed at most once. This guarantee survives failure of network connections and temporary failure of both the client and server computers. Given this guarantee, client code need only consider two possible outcomes of a message send: the message is processed once; or the message is never processed because the target does not exist.

Limiting a denial-of-service attack

A "denial-of-service" attack occurs when a user consumes a large enough portion of a service's resources that other users are prevented from using the service. The attacker may generate a volume of service requests that makes up a significant portion of the service's maximum throughput. The attacker may send requests that consume a disproportionate amount of the service's resources as compared to "normal" requests. The first attack is a brute force attack upon the service's resource management. The second attack is an exploit of a flaw in the service's resource management.

Consonant with its design goals, the web-amp excludes features whose implementations require exploitable flaws in their resource management. Accomplishing this goal requires limiting both the amount of resources consumed by a request and the ability of the user to schedule the consumption of the resources. If a service serves multiple distinct users, every message processed by the service should consume a similar amount of resources. A user must not be able to schedule delayed processing of their requests in order to force sequential processing of a large group of requests.

With resource management flaws eliminated, a denial-of-service attack can succeed only if the attacker has resources comparable in size to those of the attacked service. [ddos]

Ease of use

The web-calculus defines a generic interface that wraps the native interface of a service. If the native interface is easier to use than the generic one, programmers will prefer the native interface. To prevent this phenomenon, the protocol's usability must be great enough that the native protocol cannot compete based on ease of use.

Simple implementation

The web-calculus defines a generic interface for accessing any type of service. As predicted by Metcalfe's Law, the value of such an interface grows with the number of services that implement the interface. Simplifying implementation of the interface facilitates this phenomenon.

Interoperation in a heterogeneous network

A variety of network protocols exist for communicating between hosts. Some of these network protocols may be preferable in some situations and not in others. Some hosts may only support a subset of these protocols. To support interoperation in such a heterogeneous network environment, this messaging protocol is designed to be independent of the underlying network protocol. Protocol independence is achieved by specifying only the information to transfer and how to act upon that information. Other specifications define how this information is transferred using a particular network protocol. These specifications handle issues such as connection negotiation and management.

Description

Implementing the web-calculus across multiple hosts requires: a mechanism for implementing an edge that crosses the boundary from a client host to a server host; and a mechanism for reliably sending an operation along such an edge and receiving the return value. The implementation of a cross-host edge is described first. An explanation of reliable operation transmission follows.

Capability URL

A cross-host web-calculus edge is implemented by a capability URL. A capability URL must provide:

Although any URL scheme which meets these requirements can be used by this protocol, use of a YURL scheme is recommended.

URL lifetime

A capability URL does not prevent the target edge from being garbage collected by its host. A host is only required to guarantee that the edge lifetime is continuous. Once the host reports an edge as deleted, it MUST forever report the same status. Deletion of a particular edge is typically determined by the host's application logic.
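The continuity requirement can be illustrated with a small host-side table. This is only a sketch under stated assumptions: the names EdgeTable and EdgeNotFound are hypothetical, and retiring a GUID permanently is one possible way to ensure that a deleted edge is never reported as live again.

```python
class EdgeNotFound(Exception):
    """Hypothetical stand-in for the host's 'edge deleted or unknown' status."""


class EdgeTable:
    """Maps exported GUIDs to local targets; a deleted GUID is never reused."""

    def __init__(self):
        self._live = {}        # guid -> target currently reachable
        self._retired = set()  # guids that were exported and later deleted

    def export(self, guid, target):
        # Refusing reuse keeps each edge's lifetime continuous.
        if guid in self._live or guid in self._retired:
            raise ValueError("GUID has already been used")
        self._live[guid] = target

    def delete(self, guid):
        if guid in self._live:
            del self._live[guid]
            self._retired.add(guid)

    def lookup(self, guid):
        try:
            return self._live[guid]
        except KeyError:
            # Same answer on every later request for this GUID.
            raise EdgeNotFound(guid) from None
```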

A common design pattern is to treat each host as a separate space bank. All URLs exported by the space bank remain valid during the lifetime of the space bank. When the space bank is destroyed, so are all edges in the space bank. The overall application design ensures that this large-grained resource management happens as a natural consequence of application interaction.

Settling a promise

An <http://web-calculus.org/pointer/Embed> is an edge that will eventually refer to a promised value. This value can be fetched using a GET operation on the edge. This process is called "settling" of the promise. The value returned by the GET operation is the "settled value" of the promise. If the promise has not yet settled, the GET operation will return another promise.

Sending an operation

A client sends an operation to a host by first reifying the operation in a message. A message is an <http://web-calculus.org/amp/Envelope> which specifies: the URL for the target edge; the operation; and the argument list. The message is routed to the host using information provided by the target URL.

For a non-idempotent operation, the client host SHOULD include an unguessable message identifier. If a message identifier is not provided, the client MUST have alternate means for preventing message replay attacks.
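As an illustration, a message might be reified on the client as follows. This is a minimal sketch, not the <http://web-calculus.org/amp/Envelope> schema itself: the field names, the make_envelope helper, and the 128-bit identifier length are all assumptions made for the example.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Envelope:
    """Hypothetical in-memory form of a message; field names are illustrative."""
    target: str                 # capability URL of the target edge
    operation: str              # for example "GET" or "POST"
    arguments: tuple = ()       # the argument list
    mid: Optional[str] = None   # message identifier, if one is provided


def new_message_id() -> str:
    # 128 bits from the operating system's CSPRNG, so the value is unguessable.
    # The length is an assumption of this sketch, not a protocol requirement.
    return secrets.token_hex(16)


def make_envelope(target, operation, arguments=(), idempotent=False):
    # Only non-idempotent operations receive a message identifier here.
    mid = None if idempotent else new_message_id()
    return Envelope(target, operation, tuple(arguments), mid)
```

A resent request would reuse the same Envelope, and therefore the same 'mid', rather than generating a new identifier.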

Generating the pipeline URL

To support message pipelining, providing a message identifier implicitly creates a pipeline promise for the invocation return value. The client can use this promise in the construction of subsequent messages. The promise is passed in place of the invocation argument. [E]

A <http://web-calculus.org/pointer/Pipeline> is a promise for an invocation return value. The 'super' member identifies a promise on the client host that will eventually settle to the invocation return value.

The server host may also hold a copy of the return value. The 'pipeline' member of a <Pipeline> is a reference to the return value held on the server host. The client generates the GUID for this pipeline URL according to the formula:

promise_guid = to_base32(sha1(to_ascii(mid)))

The input to the SHA-1 hash function is the ASCII bytes representing the 'mid' generated by the client. The pipeline GUID is the base32 encoding of the binary output of the SHA-1 hash function. [base32]

After processing a request, the server SHOULD also perform this calculation and bind the pipeline GUID to a locally held instance of the return value. If the server skips this step, the benefits of message pipelining are lost.
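The GUID calculation can be sketched in a few lines. The function name pipeline_guid is hypothetical, and the sketch assumes that the base32 encoding referenced in [base32] uses the same bit ordering as the standard RFC 4648 alphabet, lowercased to { a-z, 2-7 }.

```python
import base64
import hashlib


def pipeline_guid(mid: str) -> str:
    """Derive the pipeline GUID from a message identifier."""
    # SHA-1 over the ASCII bytes of the 'mid' yields a 20-byte digest.
    digest = hashlib.sha1(mid.encode("ascii")).digest()
    # 20 bytes encode to exactly 32 base32 characters, so no '=' padding appears.
    # Lowercasing assumes the RFC 4648 alphabet order (an assumption of this sketch).
    return base64.b32encode(digest).decode("ascii").lower()
```

Because the calculation depends only on the 'mid', the client and the server arrive at the same GUID without any extra exchange.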

Recovering from a lost request

If a client does not receive a response from a server, the client MUST resend the request. The client MUST obey any backoff requirements that the underlying network protocol specifies for retrying a connection.
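A client-side resend loop might look like the following sketch. The send_once callable, which returns None when no response was received, and the delay schedule are assumptions for illustration; a real client would take its backoff rules from the underlying network protocol.

```python
import time


def send_until_answered(send_once, envelope,
                        delays=(0.5, 1.0, 2.0, 4.0), sleep=time.sleep):
    """Resend the same envelope until some response is received."""
    attempt = 0
    while True:
        response = send_once(envelope)   # None means no response arrived
        if response is not None:
            return response
        # Wait before retrying; the last delay is reused once the list runs out.
        sleep(delays[min(attempt, len(delays) - 1)])
        attempt += 1
```

The envelope is resent unchanged, so a server that tracks message identifiers can recognize the retries as duplicates.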

Handling a duplicate request

The GET operation is idempotent. If a server receives a duplicate GET request, it can simply process the request as if it were the first.

A POST operation might not be idempotent. If a server receives a duplicate POST request, it MUST respond to the request without causing any side-effects. Duplicate POST requests are requests specifying the same message identifier.
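One way a server could meet this requirement is to remember the response produced for each message identifier and replay it for duplicates. The sketch below assumes this replay strategy, which the protocol does not mandate; the class name DedupingServer and the handler signature are hypothetical.

```python
class DedupingServer:
    """Processes each POST message identifier at most once."""

    def __init__(self, handler):
        self._handler = handler   # callable(operation, args) -> response
        self._responses = {}      # mid -> response already produced

    def handle(self, operation, mid, args):
        if operation == "GET":
            # GET is idempotent, so a duplicate is processed like the first.
            return self._handler(operation, args)
        if mid is not None and mid in self._responses:
            # Duplicate POST: answer without invoking the handler again.
            return self._responses[mid]
        response = self._handler(operation, args)
        if mid is not None:
            self._responses[mid] = response
        return response
```

A deployed server would also need to keep this table in durable storage for the at-most-once guarantee to survive a restart; the in-memory dictionary here only illustrates the lookup.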

Recovering from a broken pipeline

A server may refuse to pipeline some or all requests. In this case, the server will reject requests on the pipeline URL by returning a <http://web-calculus.org/amp/NotFound>. The client MUST handle this case by re-dispatching the operation on the settled value of the promise. [multiple connections]

The above recovery procedure covers two possible cases: the server refused to pipeline the request, meaning that the pipeline URL was not created; or the promised edge was deleted. The recovery procedure correctly handles both of these cases.

If the pipeline URL was not created, the rejected request will never be processed. The re-dispatched operation takes the place of the rejected request.

If the promised edge was deleted before the rejected request was processed, the re-dispatched operation will also be rejected.

In all cases, re-dispatching the operation can have no visible side-effects.

Passing a promise to another host

A host may pass a generated, or received, promise to another host; however, the recipient host MAY reject the promise. If a host is unwilling to accept a promise, it MUST respond to the request by returning a <http://web-calculus.org/amp/Rejection> that indicates the rejected promise. The sender host MUST handle this case by resending the request using the settled value of the promise, instead of the promise.
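The sender's side of this exchange can be sketched as follows. The Promise and Rejection classes, and the send and settle callables, are hypothetical stand-ins introduced for the example; only the control flow reflects the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Promise:
    url: str              # URL of the promise being passed


@dataclass(frozen=True)
class Rejection:
    rejected: Promise     # the promise the recipient refused to accept


def send_with_promises(send, settle, target, operation, args):
    """Send a request, replacing each rejected promise by its settled value."""
    args = list(args)
    while True:
        response = send(target, operation, args)
        if not isinstance(response, Rejection):
            return response
        if response.rejected not in args:
            # Guard against looping on a rejection that names an unknown promise.
            raise RuntimeError("rejection names a promise that was not sent")
        settled = settle(response.rejected)   # waits for the settled value
        args = [settled if a == response.rejected else a for a in args]
```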

Solving the race condition

Sending a request using a received pipeline URL creates a race condition. The sent request may arrive at the server host before the request that generates the pipeline URL. In this case, the server will reject the request by returning a <http://web-calculus.org/amp/NotFound>. The client host MUST handle this case by waiting until the promise is settled, and then resending the request to the server. If the server again rejects the request, the client MUST re-dispatch the operation on the settled value of the promise.

As with Recovering from a broken pipeline, the above recovery procedure covers multiple cases. From the point after the request is repeated, the logic is the same as before. The preceding step of resending the request, before starting the main recovery procedure, is necessary in order to prevent possible duplication of an operation. The client host can skip this step if the underlying network protocol prevents message replay attacks.
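The complete client-side procedure can be sketched as below. The NOT_FOUND constant stands in for a <http://web-calculus.org/amp/NotFound> response, and the send and wait_settled callables are assumptions of the sketch; the message is treated as opaque so that the example makes no claim about its contents.

```python
NOT_FOUND = "NotFound"  # stand-in for <http://web-calculus.org/amp/NotFound>


def send_via_pipeline(send, wait_settled, pipeline_url, message):
    """Send on a received pipeline URL and recover from a NotFound rejection.

    send(url, message) returns the server's response.
    wait_settled() blocks until the promise settles and returns the URL
    of the settled value.
    """
    response = send(pipeline_url, message)
    if response != NOT_FOUND:
        return response
    # The request may have overtaken the one that creates the pipeline URL:
    # wait for the promise to settle, then resend the same request.
    settled_url = wait_settled()
    response = send(pipeline_url, message)
    if response != NOT_FOUND:
        return response
    # Still rejected: fall back to the broken-pipeline recovery and
    # re-dispatch the operation on the settled value.
    return send(settled_url, message)
```

A client whose network protocol already prevents message replay could skip the middle resend and go directly to the final re-dispatch.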

Footnotes

[ddos] Given the current state of the Internet, this advantage is largely academic. Almost all computers on the Internet are running operating systems that allow a remote attacker to take control of the computer. This means that a user can mount a distributed-denial-of-service attack on a service by using the computing resources of other users. In this way, it is easy to acquire the resources needed to mount a brute force denial-of-service attack on any service. Hopefully someday we will live in a world where script kiddies cannot unilaterally appropriate vast portions of the world's computing resources.

[E] The message pipelining feature is inspired by that in E. See the E description of message pipelining.

[base32] To ensure compatibility with existing protocols and filesystems, the generated names must not rely on case-sensitivity. It is also desirable to keep the URL length as short as possible. The base32 encoding uses the alphabet { a-z, 2-7 }. See: <http://www.waterken.com/dev/Enc/base32/>.

[idempotent] If a server can prove that an operation is idempotent, it need not guard against receiving duplicates.

[multiple connections] The recovery procedure assumes that all requests are sent on a single connection from the client to the server. If multiple connections are used to send requests from the client to the server, the request that generates the pipeline may arrive after the request sent on the pipeline. In this case, the client MUST use the recovery procedure specified in Solving the race condition.

Copyright 2002 - 2005 Waterken Inc. All rights reserved.