######################################################################
####################### THE PXP PREPROCESSOR #########################
######################################################################

**********************************************************************
Declaration of charsets
**********************************************************************

<:pxp_charset< source="ENC1" representation="ENC2" >> ;;

This is a dummy expression evaluating to (). It has an important side effect,
however: it sets the character encodings used by the preprocessor.

source="ENC1": Sets the encoding of the source code. Default is
               ISO-8859-1.

representation="ENC2": Sets the encoding of the representation values.
                       Default is ISO-8859-1.

Example:

<:pxp_charset< representation="UTF-8" >>
  --> Changes the representation encoding to UTF-8.

**********************************************************************
XML expressions
**********************************************************************

The following kinds of XML expressions can be built:

<:pxp_text< TEXT >>
    Just another notation for string literals. This is useful for
    including constant XML text in your program sources, e.g.

    let t = <:pxp_text< <!ENTITY sample "A sample entity"> >> in
    let dtd = Pxp_dtd_parser.parse_dtd_entity config (from_string t)

    This is the same as 

    let t = " <!ENTITY sample \"A sample entity\"> " in ...

    but you do not need to escape the double quotes.
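
    For comparison only: OCaml 4.02 and later also provide quoted string
    literals of the form {| ... |}, which avoid the escaping in a similar
    way. This is a feature of plain OCaml, not of the PXP preprocessor,
    and the sketch below merely checks that both spellings denote the
    same string (including the surrounding blanks):

```ocaml
(* Plain OCaml, no PXP preprocessor involved. *)

(* Ordinary literal: the inner double quotes must be escaped. *)
let escaped = " <!ENTITY sample \"A sample entity\"> "

(* Quoted string literal (OCaml >= 4.02): no escaping needed. *)
let quoted = {| <!ENTITY sample "A sample entity"> |}

let same = (escaped = quoted)
```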

<:pxp_tree< EXPR >>
    Builds a well-formed PXP tree. The variables "spec" and "dtd" are
    assumed to contain the specification and the DTD object. (The
    latter is required even for well-formed mode, but it may be empty.)

<:pxp_vtree< EXPR >>
    Builds a validated PXP tree. Only the elements created by EXPR
    are validated; injected subtrees are assumed to be already valid.
    The variables "spec" and "dtd" are assumed to contain the
    specification and the DTD object.

<:pxp_evlist< EXPR >>
    NOT YET IMPLEMENTED!
    Builds a list of PXP events. The list may contain:
    - E_start_tag
    - E_end_tag
    - E_char_data
    - E_pinstr
    - E_comment
    - CHECK: Super root node?

<:pxp_evpull< EXPR >>
    NOT YET IMPLEMENTED!
    Builds a pull-type generator for PXP events. (Type 'a -> event option)

SYNTAX OF EXPR:

- Elements:
  <name att1="val1" att2="val2">[ SUBNODES ]

  Note that there is no </name>; the list of subnodes is enclosed by
  square brackets.

  Special notations:

  - Instead of "name", there may be a string expression in parentheses,
    e.g. <("myprefix:" ^ variable) att1="val1">

  - Instead of "val1", a string expression is allowed, too:
    <name att1="val1"^"suffix">

  - Instead of "att1", there may be a string expression in parentheses:
    <name ("myprefix:" ^ variable)="val1">

  - The "empty tag" is allowed as in XML: <name/> as an abbreviation for
    <name>[].

  - Instead of the square brackets, a node list expression is allowed,
    e.g. <name>( variable @ [ ... ])

  - As an abbreviation, it is allowed to omit the square brackets when
    the list includes exactly one element:
    <a><b/>  ==  <a>[<b/>]

- Data nodes:
  Any string expression is a data node! Examples:

  <a>[ "ABC" ]              --> Element "a" with one data node "ABC"
  <a>[ "ABC" ^ "DEF" ]      --> Element "a" with one data node "ABCDEF"
  <a>[ "ABC" "DEF" ]        --> Element "a" with two data nodes "ABC" and "DEF"

  There is also the explicit data node:

  <a>[<*>"ABC"]            --> Element "a" with one data node "ABC"

  It is sometimes useful as a typing hint.
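
  The difference between one and two data nodes can be pictured with a
  toy tree type. This is only a sketch: the type "node" and the function
  "subnodes" below are hypothetical and are NOT the Pxp_document API;
  they just make the resulting shapes explicit.

```ocaml
(* Toy model, NOT the Pxp_document API: just enough structure to show
   how many data nodes each notation produces. *)
type node =
  | Element of string * node list   (* element name, subnodes *)
  | Data of string                  (* contents of a data node *)

(* <a>[ "ABC" ]          one data node *)
let one = Element ("a", [ Data "ABC" ])

(* <a>[ "ABC" ^ "DEF" ]  the strings are concatenated first: one data node *)
let concatenated = Element ("a", [ Data ("ABC" ^ "DEF") ])

(* <a>[ "ABC" "DEF" ]    two list items: two data nodes *)
let two = Element ("a", [ Data "ABC"; Data "DEF" ])

(* Hypothetical accessor for the list of subnodes. *)
let subnodes = function
  | Element (_, children) -> children
  | Data _ -> []
```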

- Processing instructions:
  <?> "TARGET" "VALUE"

- XML comments:
  <!> "Contents of the comment node"

- Super root node:
  <^>[ SUBNODES ]

- String expressions:

  - Literals are enclosed in double quotes. It is allowed to use
    numeric entities, and the predefined named entities, e.g.
    "I said: &quot;Hello&quot;"
    "The euro sign: &#8364;"

  - The only operator is ^ to concatenate strings, e.g.
    ("abc"^"def")^"ghi"
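
  To illustrate what a numeric entity denotes: &#8364; refers to the
  code point 8364 (U+20AC). Assuming the representation encoding has
  been set to UTF-8, the literal would be expected to hold the UTF-8
  bytes of that character. The sketch uses only the OCaml standard
  library (Buffer.add_utf_8_uchar, OCaml 4.06 or later);
  "utf8_of_code_point" is a hypothetical helper and not part of PXP.

```ocaml
(* Hypothetical helper, not part of PXP: encode one Unicode code point
   as a UTF-8 string using the standard library. *)
let utf8_of_code_point (cp : int) : string =
  let buf = Buffer.create 4 in
  Buffer.add_utf_8_uchar buf (Uchar.of_int cp);
  Buffer.contents buf

(* The character denoted by &#8364; (euro sign, three bytes in UTF-8). *)
let euro = utf8_of_code_point 8364

(* The character denoted by &#10; (linefeed, a single byte). *)
let linefeed = utf8_of_code_point 10
```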

- Node list expressions:

  - Lists are created by bracket expressions: [ ITEM1 ITEM2 ... ]

  - The only operator is @ to concatenate lists, e.g.
    [ "ABC" <a/> ] @ variable

  - When the first token is "[", the whole expression may be a
    node list expression:

    <:pxp_tree< [ <a/> <b/> ] >>

- Identifiers:

  It is allowed to include O'Caml identifiers in node, node list,
  and string expressions, e.g.

  let s = "abc" in <:pxp_tree< <*>s >>
  (creates a data node with the contents of s)

  let s = "abc" in <:pxp_tree< <element att=s/> >>
  (creates an element where the attribute has the value of s)

  When in doubt, the most general type is assumed, e.g.

  <:pxp_tree< <a>x >>

  Here, x must be a node list, although it could also be a string or
  a node. It is usually possible to give hints when a more specific
  type is needed, e.g.

  <:pxp_tree< <a>[x] >>       (x is a node)
  <:pxp_tree< <a><*>x >>      (x is a string)

- Antiquotations:

  To include arbitrary O'Caml expressions, put them into (: ... :),
  e.g.

  <:pxp_tree< <a att=(: string_of_int n :) /> >>

  They are allowed where a node, node list or string expression is
  expected. In addition to this, antiquotations can also occur
  in attribute lists:

  <:pxp_tree< <a att1="1" (: [ "att2", "2" ] :) /> >>

  In this case, a type of (string*string) list is assumed.
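
  Such an attribute list is an ordinary O'Caml value and can be computed
  beforehand. A minimal sketch in plain O'Caml follows; the helper
  "data_attributes" and the attribute names are made up for
  illustration.

```ocaml
(* Hypothetical helper: turn key/value settings into attributes whose
   names start with "data-". The result has type (string * string) list,
   which is the type assumed for an antiquotation in an attribute list. *)
let data_attributes (settings : (string * string) list)
    : (string * string) list =
  List.map (fun (key, value) -> ("data-" ^ key, value)) settings

let atts = data_attributes [ "id", "42"; "role", "item" ]
```

  With the preprocessor enabled, "atts" could then be spliced into an
  attribute list as (: atts :).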

- Namespace control:

  In order to create trees or events with namespace properties,
  it is required to set the namespace scope. E.g.

  let dtd = new Pxp_dtd.dtd ... in
  let spec = default_namespace_spec in
  let mng = new Pxp_dtd.namespace_manager in
  dtd # set_namespace_manager mng;
  mng # add_namespace "p" "http://a_namespace";
  <:pxp_tree< <:autoscope> <p:element/> >>

  Of course, you need a namespace manager, and it must know all namespaces
  that are used (add_namespace). Furthermore, the notation
  "<:autoscope> EXPR" creates a scope object, and enables namespace
  mode within EXPR. There are three ways of creating or modifying
  scopes:

  <:autoscope> EXPR:
     Creates a namespace scope containing all namespaces of the namespace
     manager (usually enough if detailed control of namespace scoping
     is not necessary)

  <:emptyscope> EXPR:
     Creates an empty namespace scope

  <:scope prefix="URI" ...> EXPR:
     Modifies the current scope, and adds the pairs (prefix,URI) as
     found in the attribute list. More precisely, a new scope is
     created as child of the current scope.

     To set the default namespace, use "<:scope ("")="URI"> EXPR"
     (i.e. empty prefix name).
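
  The parent/child relation between scopes can be pictured with a toy
  model. This is a sketch only: the record type "scope" and the
  functions below are hypothetical and are NOT the Pxp_dtd classes, and
  it is an assumption that declarations of an inner scope take
  precedence over those of its parents.

```ocaml
(* Toy model, NOT Pxp_dtd.namespace_scope: a scope holds its own
   (prefix, URI) pairs and an optional parent scope. *)
type scope = {
  parent : scope option;
  declaration : (string * string) list;
}

(* Roughly what <:emptyscope> stands for: no declarations, no parent. *)
let empty_scope = { parent = None; declaration = [] }

(* Roughly what <:scope prefix="URI" ...> does: create a child scope. *)
let child_scope parent declaration = { parent = Some parent; declaration }

(* Look up a prefix in the innermost scope first, then in the parents. *)
let rec uri_of_prefix scope prefix =
  match List.assoc_opt prefix scope.declaration with
  | Some uri -> Some uri
  | None ->
    (match scope.parent with
     | Some p -> uri_of_prefix p prefix
     | None -> None)

let outer = child_scope empty_scope [ "p", "http://a_namespace" ]

(* The empty prefix "" stands for the default namespace. *)
let inner = child_scope outer [ "", "http://www.w3.org/1999/xhtml" ]
```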

  The O'Caml variable "scope" contains the current namespace scope
  object. <:autoscope> and <:emptyscope> define "scope" for the
  code generated for EXPR, and <:scope> redefines "scope".

  Note that the prefixes set by <:scope> only become visible when
  the "display" method is called to print an XML tree. The "write"
  method ignores the scopes.

  In order to create XML nodes that all have the same namespace
  scope, it is possible to define the "scope" variable manually, e.g.

  let scope = new Pxp_dtd.namespace_scope_impl ... in
  let x1 = <:pxp_tree< <:scope> ... >> in
  let x2 = <:pxp_tree< <:scope> ... >>

  Here, <:scope> without attributes simply enables the namespace
  mode without changing the namespace scope found in "scope".

- Comments:

  The normal O'Caml comments (* ... *) are also allowed in PXP
  expressions.

**********************************************************************
Traps
**********************************************************************

- It is not checked whether the representation charset is the charset
  actually in use (e.g. as found in dtd#encoding).

- It is possible to create namespace-aware nodes that are not fully
  initialised. This is the case when "spec" is a namespace-aware
  specification, but <:scope> is missing. These nodes work only
  partially.

**********************************************************************
Examples
**********************************************************************

- Constant HTML page:

<:pxp_tree< 
  <html>[ <head>[ <title>"My page" ] 
          <body>[ <h1>"Headline" "Paragraph" ] 
        ] >>

- HTML page with placeholder:

let title = "My page" in
<:pxp_tree< 
  <html>[ <head>[ <title><*>title ] 
          <body>[ <h1><*>title "Paragraph" ] 
        ] >>

Note that we use "<*>title". Without "<*>", the variable "title"
would have type Pxp_document.node and not string.

- Placeholder in attribute:

let style = "font-weight:bold" in
<:pxp_tree< 
  <html>[ <head>[ <title>"My page" ] 
          <body>[ <h1 style=style>"My page" "Paragraph" ] 
        ] >>

- Iteration:

let data = [ "Text1"; "Text2"; "Text3" ] in
let make_item s = <:pxp_tree< <li><*>s >> in
<:pxp_tree<
  <ul>
    (: List.map make_item data :) >>

- A complete example with namespaces:

  let dtd = parse_dtd_entity default_namespace_config (from_string "") in
  let spec = default_namespace_spec in
  let mng = new namespace_manager in
  dtd # set_namespace_manager mng;
  mng # add_namespace "html" "http://www.w3.org/1999/xhtml";
  let scope = 
    new namespace_scope_impl dtd#namespace_manager None mng#as_declaration in
  let data = [ "Text1"; "Text2"; "Text3" ] in
  let make_item s = <:pxp_tree< <:scope><html:li><*>s >> in
  let ul_node =
    <:pxp_tree<
      <:scope>
        <html:ul>
          (: List.map make_item data :) >> in
  <:pxp_tree< 
    <:scope> 
      <html:html>[ <html:head>[ <html:title>"My page" ] 
                   <html:body>[ ul_node ] ] >>

  When printed with "display", the XML text will use the prefix "html",
  e.g. "html:body". To use the default namespace (i.e. the empty prefix)
  instead, modify the line defining "scope" as follows:

  let scope = 
    new namespace_scope_impl dtd#namespace_manager None 
    [ "", "http://www.w3.org/1999/xhtml"]

  At least when generating trees, it is possible to omit
  <:scope>, and to set the scope afterwards:

  iter_tree ~pre:(fun n -> n#set_namespace_scope scope) tree

- How to include linefeeds in strings:

  <:pxp_tree< <*>"A line!&#10;" >>

  Or define a variable lf:

  let lf = "\n" in
  <:pxp_tree< <*>("A line!"^lf) >>
