

Waterken™ Doc

XML Surface Syntax


This specification describes the mapping of the Waterken™ Doc document model onto the XML surface syntax. The mapping provides interoperation between Waterken™ Doc-based tools and XML-based tools.


The Waterken™ Doc document model is mapped onto a defined subset of the XML grammar.


Design goals

  1. Waterken™ Doc-based tools generate output that existing XML-based tools can manipulate.
  2. Existing XML-based tools can generate output destined for an application that uses Waterken™ Doc-based tools.
  3. The shared syntax is a usable textual encoding of the Waterken™ Doc document model.

Each Branch is represented as an XML element

The Branch Name is used as the XML element Name. The Branch Annotation is encoded as XML CharData preceding the element start-tag. The Branch child Node is encoded as the XML element content. The child Node's Schema is encoded in the XML element as an attribute with name 'schema'. The child Node's Annotation is encoded as XML CharData following the end-tag of the last child element in the XML element content.
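The encoding rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` and `Branch` classes and the `encode_branch` function are hypothetical names invented here, not part of any Waterken API, and a Schema of `None` stands in for an implicit Schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from xml.sax.saxutils import escape, quoteattr


@dataclass
class Node:  # hypothetical stand-in for a Doc Node
    schema: Optional[str] = None  # None is used here to mean "implicit Schema"
    branches: List["Branch"] = field(default_factory=list)
    annotation: str = ""


@dataclass
class Branch:  # hypothetical stand-in for a Doc Branch
    name: str  # assumed to already be a valid XML Name
    annotation: str
    child: Node


def encode_branch(branch: Branch) -> str:
    """Return the Branch Annotation followed by the element for the Branch."""
    node = branch.child
    # The child Node's Schema becomes the 'schema' attribute, when present.
    attr = "" if node.schema is None else " schema=" + quoteattr(node.schema)
    # Element content: each child Branch, then the Node Annotation as CharData.
    content = "".join(encode_branch(b) for b in node.branches)
    content += escape(node.annotation)
    return f"{escape(branch.annotation)}<{branch.name}{attr}>{content}</{branch.name}>"


# A root Node wrapped in a 'doc' Branch with an empty Annotation.
root = Node(
    schema="http://example.com/project/NodeSchema",
    branches=[Branch("branch_name", "\nFirst branch comment ",
                     Node(annotation="child node data"))],
    annotation="\nRoot node comment\n",
)
encoded = encode_branch(Branch("doc", "", root))
```

The sketch always writes a start-tag and end-tag pair, even for an element with no content, rather than an empty-element tag.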

Each root Node is referred to using a synthetic 'doc' Branch

For each root Node in a Waterken™ Doc Document, a synthetic Branch is created for the XML representation. These synthetic root Branches use Name 'doc' and have an empty Annotation. Upon parsing of the XML representation, the root 'doc' elements will be stripped, yielding the root Nodes.



The grammar for the XML subset is:

document ::= element

element ::= EmptyElemTag | (STag content ETag)

EmptyElemTag ::= '<' Name (S SchemaAttribute)? S? '/>'

STag ::= '<' Name (S SchemaAttribute)? S? '>'

ETag ::= '</' Name S? '>'

content ::= CharData? ((element | Reference) CharData?)*

SchemaAttribute ::= 'schema' Eq AttValue
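One consequence of this grammar is that 'schema' is the only attribute an element may carry. A minimal, partial check for that single constraint might look like the sketch below. The function name `uses_only_schema_attribute` is hypothetical. The standard-library parser used here silently drops comments and processing instructions, so the sketch cannot detect those and is not a full conformance test for the subset.

```python
import xml.etree.ElementTree as ET


def uses_only_schema_attribute(xml_text: str) -> bool:
    """Return True if no element carries an attribute other than 'schema'.

    Partial check only: other constructs outside the subset grammar
    (comments, processing instructions, and so on) are not detected.
    """
    root = ET.fromstring(xml_text)
    # iter() visits the root element and all of its descendants.
    return all(set(el.attrib) <= {"schema"} for el in root.iter())
```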


An XML document MUST contain a single element, whereas a Waterken™ Doc document can contain a list of zero or more Nodes. If the Document does not contain exactly one Node, the XML representation MUST be wrapped in a synthetic 'list' element. Upon parsing of the XML representation, the 'list' element will be stripped.
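The 'list' wrapping rule can be sketched as a pair of helper functions. Both names, `wrap_roots` and `unwrap_roots`, are hypothetical, and the sketch assumes each root has already been encoded as a 'doc' element string.

```python
import xml.etree.ElementTree as ET
from typing import List


def wrap_roots(doc_elements: List[str]) -> str:
    """Combine already-encoded root 'doc' elements into one XML document."""
    if len(doc_elements) == 1:
        return doc_elements[0]  # exactly one root Node: no wrapper needed
    # Zero roots, or more than one: wrap in the synthetic 'list' element.
    return "<list>" + "".join(doc_elements) + "</list>"


def unwrap_roots(xml_text: str) -> List[ET.Element]:
    """Strip the synthetic 'list' element, if present, on parsing."""
    top = ET.fromstring(xml_text)
    if top.tag == "list":
        return list(top)  # the child 'doc' elements
    return [top]
```

Because every root Branch is named 'doc', a top-level element named 'list' can only be the synthetic wrapper, so the parser does not need any extra marker to tell the two cases apart.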


An EmptyElemTag is semantically the same as an STag followed immediately by an ETag.


The STag carries the Branch Name and the child Node Schema. A root Branch has Name 'doc'.


The content is the list of the Node's Branches followed by the Node Annotation.

The CharData preceding each child element is the Annotation of the corresponding Branch.


Below is a simple example of an XML-encoded representation of a Waterken™ Doc document.

    <doc schema="http://example.com/project/NodeSchema">
    First branch comment <branch_name>child node data</branch_name>
    Root node comment
    </doc>

The document has a single root Node. The root Node Schema is: 'http://example.com/project/NodeSchema'. The root Node Annotation is: '\nRoot node comment\n'. The root Node has a single Branch named 'branch_name', with Annotation: '\nFirst branch comment '. The single Branch points to a child Node with Annotation: 'child node data'. The child Node has an implicit Schema and no Branches.
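The reading above can be reproduced with a short decoding sketch built on Python's standard `xml.etree.ElementTree` module. The function name `decode_node` and the tuple layout it returns are assumptions made for illustration. The example document is written out here in full, including its closing `</doc>` tag.

```python
import xml.etree.ElementTree as ET


def decode_node(element):
    """Return (schema, branches, annotation) for the Node held by element.

    Each entry of branches is (name, branch_annotation, child_node_tuple).
    A schema of None stands for an implicit Schema.
    """
    branches = []
    # CharData before the first child element, or all of it if no children.
    preceding = element.text or ""
    for child in element:
        # CharData preceding a child element is that Branch's Annotation.
        branches.append((child.tag, preceding, decode_node(child)))
        preceding = child.tail or ""
    # Whatever follows the last child element is the Node Annotation.
    return (element.get("schema"), branches, preceding)


EXAMPLE = (
    '<doc schema="http://example.com/project/NodeSchema">\n'
    'First branch comment <branch_name>child node data</branch_name>\n'
    'Root node comment\n'
    '</doc>'
)

# Parsing yields the 'doc' element; decoding it gives the root Node.
schema, branches, annotation = decode_node(ET.fromstring(EXAMPLE))
```

Here `schema` is the root Node Schema, `annotation` is `'\nRoot node comment\n'`, and `branches` holds one entry named `'branch_name'` whose Annotation is `'\nFirst branch comment '`.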


Copyright 2003 Waterken Inc. All rights reserved.
