Waterken™ Doc

XML Surface Syntax

2003-01-24

This specification describes the mapping of the Waterken™ Doc document model onto the XML surface syntax. The mapping lets Waterken™ Doc-based tools interoperate with XML-based tools.

Abstract

The Waterken™ Doc document model is mapped onto a defined subset of the XML grammar.

Overview

Design goals

  1. Waterken™ Doc-based tools generate output that existing XML-based tools can manipulate.
  2. Existing XML-based tools can generate output destined for an application that uses Waterken™ Doc-based tools.
  3. The shared syntax is a usable textual encoding of the Waterken™ Doc document model.

Each Branch is represented as an XML element

The Branch Name is used as the XML element Name. The Branch Annotation is encoded as XML CharData preceding the element start-tag. The Branch child Node is encoded as the XML element content. The child Node's Schema is encoded in the XML element as an attribute with name 'schema'. The child Node's Annotation is encoded as XML CharData following the end-tag of the last child element in the XML element content.
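The encoding rules above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the `Node` and `Branch` classes and the `encode_branch` function are hypothetical names introduced here, and a `schema` of `None` is assumed to stand for an implicit Schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from xml.sax.saxutils import escape, quoteattr


@dataclass
class Node:
    # Hypothetical model class: None is assumed to mean "implicit Schema".
    schema: Optional[str] = None
    annotation: str = ""
    branches: List["Branch"] = field(default_factory=list)


@dataclass
class Branch:
    # Hypothetical model class: name is assumed to be a valid XML Name.
    name: str
    annotation: str
    child: Node


def encode_branch(branch: Branch) -> str:
    """Return the Branch Annotation as CharData, followed by the element."""
    node = branch.child
    # The child Node's Schema becomes the 'schema' attribute, when present.
    attr = "" if node.schema is None else " schema=" + quoteattr(node.schema)
    # Child Branches come first; the Node Annotation follows the last child.
    body = "".join(encode_branch(b) for b in node.branches)
    body += escape(node.annotation)
    return f"{escape(branch.annotation)}<{branch.name}{attr}>{body}</{branch.name}>"
```

Calling `encode_branch` on a Branch with an empty Annotation yields just the element, since no CharData precedes the start-tag.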

Each root Node is referred to using a synthetic 'doc' Branch

For each root Node in a Waterken™ Doc Document, a synthetic Branch is created for the XML representation. These synthetic root Branches use Name 'doc' and have an empty Annotation. When the XML representation is parsed, the root 'doc' elements are stripped, yielding the root Nodes.

Description

Grammar

The grammar for the XML subset is:

document ::= element

element ::= EmptyElemTag | (STag content ETag)

EmptyElemTag ::= '<' Name (S SchemaAttribute)? S? '/>'

STag ::= '<' Name (S SchemaAttribute)? S? '>'

ETag ::= '</' Name S? '>'

content ::= CharData? ((element | Reference) CharData?)*

SchemaAttribute ::= 'schema' Eq AttValue
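One way to check an XML document against this subset is sketched below, using Python's standard `xml.parsers.expat` module. The function name `subset_violations` is hypothetical, and the check assumes a strict reading of the grammar: any attribute other than 'schema', and any comment, processing instruction, or CDATA section, falls outside the subset. Prolog handling (XML declaration, DOCTYPE) is deliberately left out of the sketch.

```python
import xml.parsers.expat


def subset_violations(xml_text):
    """Return a list of constructs that fall outside the subset grammar."""
    problems = []
    parser = xml.parsers.expat.ParserCreate()

    def start_element(name, attrs):
        # STag and EmptyElemTag allow at most the 'schema' attribute.
        for attr in attrs:
            if attr != "schema":
                problems.append(f"attribute '{attr}' on <{name}>")

    parser.StartElementHandler = start_element
    # The content production lists only CharData, element and Reference.
    parser.CommentHandler = lambda data: problems.append("comment")
    parser.ProcessingInstructionHandler = (
        lambda target, data: problems.append("processing instruction")
    )
    parser.StartCdataSectionHandler = lambda: problems.append("CDATA section")
    parser.Parse(xml_text, True)  # raises ExpatError if not well-formed
    return problems
```

An empty result means no violation was detected by these particular checks; it is not a proof of conformance.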

document

An XML document MUST contain exactly one top-level element, whereas a Waterken™ Doc Document can contain a list of zero or more Nodes. If the Document does not contain exactly one Node, the XML representation MUST be wrapped in a synthetic 'list' element. When the XML representation is parsed, the 'list' element is stripped.
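The wrapping rule can be sketched with Python's standard `xml.etree.ElementTree` module. The helper names `wrap_roots` and `unwrap_roots` are hypothetical; the sketch assumes each root Node has already been encoded as a 'doc' element, so the top-level element is always either 'doc' or 'list'.

```python
import xml.etree.ElementTree as ET


def wrap_roots(doc_elements):
    """Return the single top-level element for a list of 'doc' elements."""
    if len(doc_elements) == 1:
        return doc_elements[0]          # exactly one Node: no wrapper needed
    wrapper = ET.Element("list")        # zero or several Nodes: synthetic wrapper
    wrapper.extend(doc_elements)
    return wrapper


def unwrap_roots(top):
    """Inverse of wrap_roots: strip the 'list' wrapper if present."""
    return list(top) if top.tag == "list" else [top]
```

Stripping the 'doc' elements themselves to recover the root Nodes is a separate step, not shown here.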

EmptyElemTag

This is semantically the same as an STag followed immediately by an ETag.
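As a quick illustration of this equivalence, the sketch below parses both forms with Python's standard `xml.etree.ElementTree` module and compares the results. The `shape` helper and the 'branch_name' element are illustrative choices, not part of this specification.

```python
import xml.etree.ElementTree as ET


def shape(xml_text):
    """Summarize an element as (name, attributes, character data, child count)."""
    element = ET.fromstring(xml_text)
    return (element.tag, dict(element.attrib), element.text or "", len(element))


empty_form = shape('<branch_name schema="http://example.com/project/NodeSchema"/>')
paired_form = shape(
    '<branch_name schema="http://example.com/project/NodeSchema"></branch_name>'
)
# Both forms describe a Branch whose child Node has no Branches and an
# empty Annotation.
```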

STag

Carries the Branch Name and the child Node's Schema. A root Branch has Name 'doc'.

content

Encodes the Node's list of Branches, followed by the Node Annotation.

The CharData preceding each child element is the Annotation of the corresponding Branch.

Example

Below is a simple example of an XML-encoded representation of a Waterken™ Doc document.

    <doc schema="http://example.com/project/NodeSchema">
    First branch comment <branch_name>child node data</branch_name>
    Root node comment
    </doc>

The document has a single root Node. The root Node Schema is: 'http://example.com/project/NodeSchema'. The root Node Annotation is: '\nRoot node comment\n'. The root Node has a single Branch named 'branch_name', with Annotation: '\nFirst branch comment '. The single Branch points to a child Node with Annotation: 'child node data'. The child Node has an implicit Schema and no Branches.
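The reading above can be reproduced with a short decoder, sketched here with Python's standard `xml.etree.ElementTree` module. The `decode_node` function and its dict layout are hypothetical; a `schema` of `None` is assumed to represent an implicit Schema. In ElementTree, `element.text` holds the CharData before the first child and `child.tail` holds the CharData after a child, which lines up with the Annotation rules.

```python
import xml.etree.ElementTree as ET

EXAMPLE = (
    '<doc schema="http://example.com/project/NodeSchema">\n'
    'First branch comment <branch_name>child node data</branch_name>\n'
    'Root node comment\n'
    '</doc>'
)


def decode_node(element):
    """Decode the Node carried by an element into a plain dict."""
    branches = []
    pending = element.text or ""        # CharData before the first child
    for child in element:
        branches.append({
            "name": child.tag,
            "annotation": pending,      # CharData preceding this child element
            "child": decode_node(child),
        })
        pending = child.tail or ""      # CharData following this child element
    # CharData left after the last child is the Node Annotation.
    return {
        "schema": element.get("schema"),
        "annotation": pending,
        "branches": branches,
    }


# Decoding the 'doc' element's content yields the root Node.
root = decode_node(ET.fromstring(EXAMPLE))
```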

Copyright 2003 Waterken Inc. All rights reserved.
