[36] | 1 |
|
---|
| 2 | /*
|
---|
| 3 | *@@sourcefile xmldefs.c:
|
---|
| 4 | * this file is just for xdoc and contains glossary items for
|
---|
| 5 | * XML. It is never compiled.
|
---|
| 6 | *
|
---|
| 7 | *@@added V0.9.6 (2000-10-29) [umoeller]
|
---|
| 8 | */
|
---|
| 9 |
|
---|
| 10 | /*
|
---|
| 11 | * Copyright (C) 2001 Ulrich Möller.
|
---|
| 12 | * This file is part of the "XWorkplace helpers" source package.
|
---|
| 13 | * This is free software; you can redistribute it and/or modify
|
---|
| 14 | * it under the terms of the GNU General Public License as published
|
---|
| 15 | * by the Free Software Foundation, in version 2 as it comes in the
|
---|
| 16 | * "COPYING" file of the XWorkplace main distribution.
|
---|
| 17 | * This program is distributed in the hope that it will be useful,
|
---|
| 18 | * but WITHOUT ANY WARRANTY; without even the implied warranty of
|
---|
| 19 | * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
---|
| 20 | * GNU General Public License for more details.
|
---|
| 21 | */
|
---|
| 22 |
|
---|
| 23 | /*
|
---|
[38] | 24 | *@@gloss: expat expat
|
---|
| 25 | * Expat is one of the most well-known XML processors (parsers).
|
---|
| 26 | * I (umoeller) have ported expat to the XWorkplace Helpers
|
---|
| 27 | * library. See xmlparse.c for an introduction to expat. See
|
---|
| 28 | * xml.c for an introduction to XML support in the XWorkplace
|
---|
| 29 | * Helpers in general.
|
---|
[36] | 30 | */
|
---|
| 31 |
|
---|
[38] | 32 |
|
---|
[36] | 33 | /*
|
---|
[38] | 34 | *@@gloss: XML XML
|
---|
| 35 | * XML is the Extensible Markup Language, as defined by
|
---|
| 36 | * the W3C. XML isn't really a language, but a meta-language
|
---|
| 37 | * for describing markup languages. It is a simplified subset
|
---|
| 38 | * of SGML.
|
---|
| 39 | *
|
---|
| 40 | * You should be familiar with the following:
|
---|
| 41 | *
|
---|
| 42 | * -- XML parsers operate on XML @documents.
|
---|
| 43 | *
|
---|
| 44 | * -- Each XML document has both a physical and a logical
|
---|
| 45 | * structure.
|
---|
| 46 | *
|
---|
| 47 | * Physically, the document is composed of units called
|
---|
| 48 | * @entities.
|
---|
| 49 | *
|
---|
| 50 | * Logically, the document is composed of @markup and
|
---|
| 51 | * @content. Among other things, markup separates the content
|
---|
| 52 | * into @elements.
|
---|
| 53 | *
|
---|
| 54 | * -- The logical and physical structures must nest properly (be
|
---|
| 55 | * @well-formed) for each entity, which results in the entire
|
---|
| 56 | * XML document being well-formed as well.
|
---|
| 57 | */
|
---|
| 58 |
|
---|
| 59 | /*
|
---|
[36] | 60 | *@@gloss: entities entities
|
---|
[38] | 61 | * An "entity" is an XML storage unit. It's a very abstract
|
---|
| 62 | * concept, and the term doesn't make much sense, but it was
|
---|
| 63 | * in SGML already, and XML chose to inherit it.
|
---|
[36] | 64 | *
|
---|
[38] | 65 | * In the simplest case, an XML document has only one entity,
|
---|
| 66 | * which is an XML file (or memory buffer from wherever).
|
---|
[36] | 67 | * The document entity serves as the root of the entity tree
|
---|
| 68 | * and a starting-point for an XML processor. Unlike other
|
---|
| 69 | * entities, the document entity has no name and might well
|
---|
| 70 | * appear on a processor input stream without any identification
|
---|
| 71 | * at all.
|
---|
| 72 | *
|
---|
[38] | 73 | * Entities are defined to be either parsed or unparsed.
|
---|
| 74 | *
|
---|
[36] | 75 | * Other than that, there are @internal_entities,
|
---|
| 76 | * @external_entities, and @parameter_entities.
|
---|
| 77 | *
|
---|
| 78 | * See @entity_references for how to reference entities.
|
---|
| 79 | */
|
---|
| 80 |
|
---|
| 81 | /*
|
---|
| 82 | *@@gloss: entity_references entity references
|
---|
| 83 | * An "entity reference" refers to the content of a named
|
---|
| 84 | * entity (see: @entities). It is enclosed by "&" and ";"
|
---|
| 85 | * characters.
|
---|
| 86 | *
|
---|
| 87 | * If you declare @internal_entities in the @DTD, referencing
|
---|
| 88 | * them allows for text replacements as in SGML:
|
---|
| 89 | *
|
---|
| 90 | + This document was prepared on &PrepDate;.
|
---|
| 91 | *
|
---|
| 92 | * The same works for @external_entities though. Assuming
|
---|
| 93 | * that "SecondFile" has been declared in the DTD to point
|
---|
| 94 | * to another file,
|
---|
| 95 | *
|
---|
| 96 | + See the following README: &SecondFile;
|
---|
| 97 | *
|
---|
| 98 | * would then insert the complete contents of the second
|
---|
| 99 | * file into the document. The XML processor will parse
|
---|
| 100 | * that file as if it were at that position in the original
|
---|
| 101 | * document.
|
---|
| 102 | *
|
---|
| 103 | * An entity is "included" when its replacement text
|
---|
| 104 | * is retrieved and processed, in place of the reference itself,
|
---|
| 105 | * as though it were part of the document at the location the
|
---|
| 106 | * reference was recognized.
|
---|
| 107 | * The replacement text may contain
|
---|
| 108 | * both @content and (except for @parameter_entities)
|
---|
| 109 | * @markup, which must be recognized in the usual way, except
|
---|
| 110 | * that the replacement text of entities used to escape markup
|
---|
| 111 | * delimiters (the entities amp, lt, gt, apos, quot) is always
|
---|
| 112 | * treated as data. (The string "AT&amp;T;" expands to "AT&T;"
|
---|
| 113 | * and the remaining ampersand is not recognized as an
|
---|
| 114 | * entity-reference delimiter.) A @character_reference is
|
---|
| 115 | * included when the indicated character is processed in
|
---|
| 116 | * place of the reference itself.
|
---|
| 117 | *
|
---|
| 118 | * The following are forbidden, and constitute fatal errors:
|
---|
| 119 | *
|
---|
| 120 | * -- the appearance of a reference to an unparsed entity;
|
---|
| 121 | *
|
---|
| 122 | * -- the appearance of any character or general-entity reference
|
---|
| 123 | * in the @DTD except within an EntityValue or AttValue;
|
---|
| 124 | *
|
---|
| 125 | * -- a reference to an external entity in an attribute value.
|
---|
| 126 | */
|
---|
| 127 |
|
---|
| 128 | /*
|
---|
| 129 | *@@gloss: internal_entities internal entities
|
---|
| 130 | * An "internal entity" has no separate physical storage.
|
---|
| 131 | * Its contents appear in the document's @DTD as an
|
---|
| 132 | * @entity_declaration, like this:
|
---|
| 133 | *
|
---|
| 134 | + <!ENTITY PrepDate "Feb 11, 2001">
|
---|
| 135 | *
|
---|
| 136 | * This can later be referenced with @entity_references
|
---|
| 137 | * and allows you to define shortcuts for frequently typed
|
---|
| 138 | * text or text that is expected to change, such as the
|
---|
| 139 | * revision status of a document.
|
---|
| 140 | *
|
---|
| 141 | * XML has five built-in internal entities:
|
---|
| 142 | *
|
---|
| 143 | * -- "&amp;" refers to the ampersand ("&") character,
|
---|
| 144 | * which normally introduces @markup and can therefore
|
---|
| 145 | * only be literally used in @comments, @processing_instructions,
|
---|
| 146 | * or @CDATA sections. This is also legal within the literal
|
---|
| 147 | * entity value of declarations of internal entities.
|
---|
| 148 | *
|
---|
| 149 | * -- "&lt;" and "&gt;" refer to the angle brackets
|
---|
| 150 | * ("<", ">") which normally introduce @elements.
|
---|
| 151 | * They must be escaped unless used in a @CDATA section.
|
---|
| 152 | *
|
---|
[38] | 153 | * -- To allow values in @attributes to contain both single and double
|
---|
[36] | 154 | * quotes, the apostrophe or single-quote character (') may be
|
---|
| 155 | * represented as "&apos;", and the double-quote character
|
---|
| 156 | * (") as "&quot;".
|
---|
| 157 | *
|
---|
[38] | 158 | * A numeric @character_reference is a special case of an entity reference.
|
---|
[36] | 159 | *
|
---|
| 160 | * An internal entity is always parsed.
|
---|
| 161 | *
|
---|
| 162 | * Also see @entities.
|
---|
| 163 | */
|
---|
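The five built-in entities described above are what an application needs when it writes arbitrary text into an XML document. The following is a minimal sketch in C; the names `xml_predefined_entity` and `xml_escape` are hypothetical and are not part of expat or the XWorkplace Helpers.

```c
#include <stddef.h>
#include <string.h>

/* Returns the predefined entity reference for c, or NULL if c
   does not need escaping. (Hypothetical helper.) */
const char *xml_predefined_entity(char c)
{
    switch (c) {
    case '&':  return "&amp;";
    case '<':  return "&lt;";
    case '>':  return "&gt;";
    case '\'': return "&apos;";
    case '"':  return "&quot;";
    default:   return NULL;
    }
}

/* Copies src to dst, replacing the five special characters with
   their predefined entity references. Returns the number of bytes
   written (without the terminating NUL), or (size_t)-1 if dst is
   too small. */
size_t xml_escape(const char *src, char *dst, size_t dstsize)
{
    size_t used = 0;

    for (; *src != '\0'; ++src) {
        const char *ent = xml_predefined_entity(*src);
        size_t len = ent ? strlen(ent) : 1;

        /* always keep one byte free for the terminating NUL */
        if (used + len >= dstsize)
            return (size_t)-1;
        if (ent)
            memcpy(dst + used, ent, len);
        else
            dst[used] = *src;
        used += len;
    }
    if (used >= dstsize)
        return (size_t)-1;
    dst[used] = '\0';
    return used;
}
```

Escaping all five characters everywhere is more than strictly required (for example, ">" rarely needs escaping in content), but it is the simplest rule that is safe in both content and attribute values.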
| 164 |
|
---|
| 165 | /*
|
---|
| 166 | *@@gloss: parameter_entities parameter entities
|
---|
| 167 | * Parameter entities can only be referenced in the @DTD.
|
---|
| 168 | * A parameter entity is identified by placing "% " (percent-space)
|
---|
| 169 | * in front of its name in the declaration. The percent sign is
|
---|
| 170 | * also used in references to parameter entities, instead of the
|
---|
| 171 | * ampersand. Parameter entity references are immediately expanded
|
---|
| 172 | * in the DTD and their replacement text is
|
---|
| 173 | * part of the declaration, whereas normal @entity_references are not
|
---|
| 174 | * expanded.
|
---|
| 175 | */
|
---|
| 176 |
|
---|
| 177 | /*
|
---|
| 178 | *@@gloss: external_entities external entities
|
---|
| 179 | * As opposed to @internal_entities, "external entities" refer
|
---|
| 180 | * to different storage.
|
---|
| 181 | *
|
---|
| 182 | * They must have a "system ID" with the URI specifying where
|
---|
| 183 | * the entity can be retrieved. Those URIs may be absolute
|
---|
| 184 | * or relative. Unless otherwise provided (e.g. by a special
|
---|
| 185 | * XML element type defined by a particular @DTD, or
|
---|
| 186 | * @processing_instructions defined by a particular application
|
---|
| 187 | * specification), relative URIs are relative to the location
|
---|
| 188 | * of the resource within which the entity declaration occurs.
|
---|
| 189 | *
|
---|
| 190 | * Optionally, external entities may specify a "public ID"
|
---|
| 191 | * as well. An XML processor attempting to retrieve the entity's
|
---|
| 192 | * content may use the public identifier to try to generate an
|
---|
| 193 | * alternative URI. If the processor is unable to do so, it must
|
---|
| 194 | * use the URI specified in the system literal. Before a match
|
---|
| 195 | * is attempted, all strings of @whitespace in the public
|
---|
| 196 | * identifier must be normalized to single space characters (#x20),
|
---|
| 197 | * and leading and trailing white space must be removed.
|
---|
| 198 | *
|
---|
| 199 | * An external entity is not always parsed.
|
---|
| 200 | *
|
---|
| 201 | * External entities allow an XML document to refer to an external
|
---|
| 202 | * file. External entities contain either text or binary data. If
|
---|
| 203 | * they contain text, the content of the external file is inserted
|
---|
| 204 | * at the point of reference and parsed as part of the referring
|
---|
| 205 | * document. Binary data is not parsed and may only be referenced
|
---|
| 206 | * in an attribute that has been declared as ENTITY or ENTITIES.
|
---|
| 207 | * Binary data is used to reference figures and
|
---|
| 208 | * other non-XML content in the document.
|
---|
| 209 | *
|
---|
| 210 | * Examples of external entity declarations:
|
---|
| 211 | +
|
---|
| 212 | + <!ENTITY open-hatch
|
---|
| 213 | + SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
|
---|
| 214 | + <!ENTITY open-hatch
|
---|
| 215 | + PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
|
---|
| 216 | + "http://www.textuality.com/boilerplate/OpenHatch.xml">
|
---|
| 217 | + <!ENTITY hatch-pic
|
---|
| 218 | + SYSTEM "../grafix/OpenHatch.gif" NDATA gif >
|
---|
| 219 | *
|
---|
| 220 | * Character @encoding is processed on a per-external-entity basis.
|
---|
| 221 | * As a result, each external parsed entity in an XML document may
|
---|
| 222 | * use a different encoding for its characters.
|
---|
| 223 | *
|
---|
[249] | 224 | * In the document entity, the encoding declaration is part of the
|
---|
| 225 | * XML @text_declaration.
|
---|
[36] | 226 | *
|
---|
| 227 | * Also see @entities.
|
---|
| 228 | */
|
---|
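The public ID normalization described above (collapse each run of whitespace to a single space, strip leading and trailing whitespace) can be sketched as follows. The function names are hypothetical, and the sketch works in place on a NUL-terminated byte string.

```c
#include <stddef.h>

/* XML whitespace: space, tab, line feed, carriage return. */
int xml_is_space(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/* Normalizes a public identifier in place and returns its new
   length. (Hypothetical helper, not an expat function.) */
size_t xml_normalize_public_id(char *s)
{
    size_t in = 0, out = 0;
    int pending_space = 0;

    for (; s[in] != '\0'; ++in) {
        if (xml_is_space(s[in])) {
            /* remember the run; it is only written if more text
               follows, and never at the very start */
            pending_space = (out > 0);
        } else {
            if (pending_space)
                s[out++] = ' ';
            pending_space = 0;
            s[out++] = s[in];
        }
    }
    s[out] = '\0';
    return out;
}
```

Because the output is never longer than the input, the rewrite can safely reuse the input buffer.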
| 229 |
|
---|
| 230 | /*
|
---|
| 231 | *@@gloss: external_parsed_entities external parsed entities
|
---|
| 232 | * An external parsed entity is an external entity that has
|
---|
| 233 | * been parsed, which is not necessarily the case.
|
---|
| 234 | *
|
---|
| 235 | * See @external_entities.
|
---|
| 236 | */
|
---|
| 237 |
|
---|
| 238 | /*
|
---|
| 239 | *@@gloss: markup markup
|
---|
| 240 | * XML "markup" encodes a description of the @document's storage
|
---|
| 241 | * layout and logical structure.
|
---|
| 242 | *
|
---|
| 243 | * Markup is either @elements, @entity_references, @comments, @CDATA
|
---|
[38] | 244 | * section delimiters, @DTD's, or @processing_instructions.
|
---|
[36] | 245 | *
|
---|
| 246 | * XML "text" consists of markup and @content.
|
---|
| 247 | */
|
---|
| 248 |
|
---|
| 249 | /*
|
---|
| 250 | *@@gloss: whitespace whitespace
|
---|
| 251 | * In XML, "whitespace" consists of one or more space (0x20)
|
---|
| 252 | * characters, carriage returns, line feeds, or tabs.
|
---|
| 253 | *
|
---|
| 254 | * Whitespace handling in XML can vary. In @markup, this is
|
---|
| 255 | * used to separate the various @entities of course. However,
|
---|
| 256 | * in @content (i.e. non-markup), an application may
|
---|
| 257 | * or may not be interested in white space. Whitespace
|
---|
| 258 | * handling can therefore be controlled separately for each
|
---|
[38] | 259 | * element with the use of the special "xml:space" @attributes.
|
---|
[36] | 260 | */
|
---|
| 261 |
|
---|
| 262 | /*
|
---|
| 263 | *@@gloss: character_reference character reference
|
---|
| 264 | * Character references escape Unicode characters. They are
|
---|
| 265 | * a special case of @entity_references.
|
---|
| 266 | *
|
---|
| 267 | * They may be used to refer to a specific character in the
|
---|
| 268 | * ISO/IEC 10646 character set, for example one not directly
|
---|
| 269 | * accessible from available input devices.
|
---|
| 270 | *
|
---|
| 271 | * If the character reference thus begins with "&#x", the
|
---|
| 272 | * digits and letters up to the terminating ";" provide a
|
---|
| 273 | * hexadecimal representation of the character's code point in
|
---|
| 274 | * ISO/IEC 10646. If it begins just with "&#", the
|
---|
| 275 | * digits up to the terminating ";" provide a decimal
|
---|
| 276 | * representation of the character's code point.
|
---|
| 277 | */
|
---|
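To illustrate the two forms of character reference described above, here is a small sketch that turns "&#...;" or "&#x...;" into a code point. The name `xml_decode_char_ref` is hypothetical. The sketch only checks the syntax and the upper limit 0x10FFFF; it does not check whether the code point is a legal XML character.

```c
/* Decodes a character reference such as "&#65;" or "&#x41;".
   Returns the code point, or -1 if ref is not a syntactically
   correct character reference. (Hypothetical helper.) */
long xml_decode_char_ref(const char *ref)
{
    long base = 10, cp = 0;
    int ndigits = 0;

    if (ref[0] != '&' || ref[1] != '#')
        return -1;
    ref += 2;
    if (*ref == 'x') {          /* only a lowercase "x" is allowed */
        base = 16;
        ++ref;
    }
    for (; *ref != ';'; ++ref, ++ndigits) {
        int digit;

        if (*ref >= '0' && *ref <= '9')
            digit = *ref - '0';
        else if (base == 16 && *ref >= 'a' && *ref <= 'f')
            digit = *ref - 'a' + 10;
        else if (base == 16 && *ref >= 'A' && *ref <= 'F')
            digit = *ref - 'A' + 10;
        else
            return -1;          /* also catches a missing ";" */
        cp = cp * base + digit;
        if (cp > 0x10FFFF)      /* beyond the Unicode range */
            return -1;
    }
    /* need at least one digit and nothing after the ";" */
    return (ndigits > 0 && ref[1] == '\0') ? cp : -1;
}
```

The digits are parsed by hand rather than with `strtol`, since `strtol` would also accept signs, leading whitespace and a "0x" prefix, none of which are allowed in a character reference.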
| 278 |
|
---|
| 279 | /*
|
---|
| 280 | *@@gloss: content content
|
---|
| 281 | * XML "text" consists of @markup and "content" (the XML spec
|
---|
| 282 | * calls this "character data"). Content is simply everything
|
---|
| 283 | * that is not markup.
|
---|
| 284 | *
|
---|
| 285 | * To access characters that would either otherwise be recognized
|
---|
| 286 | * as @markup or are difficult to reach via the keyboard, XML
|
---|
| 287 | * allows for using a @character_reference.
|
---|
| 288 | *
|
---|
| 289 | * Within @elements, content is any string of
|
---|
| 290 | * characters which does not contain the start-delimiter of
|
---|
| 291 | * any markup. In a @CDATA section, content is any
|
---|
| 292 | * string of characters not including the CDATA-section-close
|
---|
| 293 | * delimiter, "]]>".
|
---|
| 294 | *
|
---|
| 295 | * The character @encodings may vary between @external_parsed_entities.
|
---|
| 296 | */
|
---|
| 297 |
|
---|
| 298 | /*
|
---|
| 299 | *@@gloss: names names
|
---|
| 300 | * In XML, a "name" is a token beginning with a letter or one of a
|
---|
| 301 | * few punctuation characters, and continuing with letters,
|
---|
| 302 | * digits, hyphens, underscores, colons, or full stops,
|
---|
| 303 | * together known as name characters. The colon has a
|
---|
| 304 | * special meaning with XML namespaces.
|
---|
| 305 | */
|
---|
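The name rule above can be sketched for the ASCII subset as follows. This is a deliberate simplification: real XML names may also contain many non-ASCII letters, which this sketch rejects. The function names are hypothetical, and the "few punctuation characters" allowed at the start are taken to be the underscore and the colon.

```c
#include <ctype.h>

/* ASCII-only approximation of a name start character. */
int xml_is_name_start(unsigned char c)
{
    return isalpha(c) || c == '_' || c == ':';
}

/* ASCII-only approximation of a name character. */
int xml_is_name_char(unsigned char c)
{
    return xml_is_name_start(c) || isdigit(c) || c == '-' || c == '.';
}

/* Returns 1 if s is a non-empty name under the ASCII-only rules
   above, 0 otherwise. */
int xml_is_name(const char *s)
{
    if (!xml_is_name_start((unsigned char)*s))
        return 0;
    for (++s; *s != '\0'; ++s)
        if (!xml_is_name_char((unsigned char)*s))
            return 0;
    return 1;
}
```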
| 306 |
|
---|
| 307 | /*
|
---|
| 308 | *@@gloss: elements elements
|
---|
| 309 | * Elements are the most common form of XML @markup.
|
---|
| 310 | * They are identified by their @names.
|
---|
| 311 | *
|
---|
| 312 | * As opposed to HTML, there are two types of elements:
|
---|
| 313 | *
|
---|
| 314 | * A non-empty element starts and ends with a start-tag
|
---|
| 315 | * and an end-tag:
|
---|
| 316 | *
|
---|
| 317 | + <LI>...</LI>
|
---|
| 318 | *
|
---|
| 319 | * As opposed to HTML, an empty element must have an
|
---|
| 320 | * empty-element tag:
|
---|
| 321 | *
|
---|
| 322 | + <P /> <IMG align="left" src="http://www.w3.org/Icons/WWW/w3c_home" />
|
---|
| 323 | *
|
---|
[38] | 324 | * In addition, @attributes supply extra parameters for elements.
|
---|
[36] | 325 | * If the element has attributes, they must be in the start-tag
|
---|
| 326 | * (or empty-element tag).
|
---|
| 327 | *
|
---|
| 328 | * For non-empty elements, the text between the start-tag
|
---|
| 329 | * and end-tag is called the element's content and may
|
---|
| 330 | * contain other elements, character data, an entity
|
---|
| 331 | * reference, a @CDATA section, a processing instruction,
|
---|
| 332 | * or a comment.
|
---|
| 333 | *
|
---|
| 334 | * The XML specs break this into "content particles".
|
---|
| 335 | *
|
---|
| 336 | * An element has "mixed content" when it may contain
|
---|
| 337 | * @content, optionally interspersed with child
|
---|
| 338 | * elements. In this case, the types of the child
|
---|
| 339 | * elements may be constrained by a document's @DTD, but
|
---|
| 340 | * not their order or their number of occurrences.
|
---|
| 341 | */
|
---|
| 342 |
|
---|
| 343 | /*
|
---|
[38] | 344 | *@@gloss: attributes attributes
|
---|
[36] | 345 | * "Attributes" are name-value pairs that have been associated
|
---|
| 346 | * with @elements. Attributes can only appear in start-tags
|
---|
| 347 | * or empty-element tags.
|
---|
| 348 | *
|
---|
| 349 | * Attributes are identified by their @names. Each such
|
---|
| 350 | * identifier may only appear once per element.
|
---|
| 351 | *
|
---|
| 352 | * As opposed to HTML, attribute values must be quoted (either
|
---|
| 353 | * in single or double quotes). You may use a @character_reference
|
---|
| 354 | * to escape quotes in attribute values.
|
---|
| 355 | *
|
---|
| 356 | * Example of an attribute:
|
---|
| 357 | *
|
---|
| 358 | + <IMG SRC="mypic.gif" />
|
---|
| 359 | *
|
---|
| 360 | * SRC="mypic.gif" is the attribute here.
|
---|
| 361 | *
|
---|
| 362 | * There are a few <B>special attributes</B> defined by XML.
|
---|
| 363 | * In @valid documents, these attributes, like any other,
|
---|
| 364 | * must be declared if they are used. These attributes are
|
---|
| 365 | * recursive, i.e. they are considered to apply to all elements
|
---|
| 366 | * within the content of the element where they are specified,
|
---|
| 367 | * unless overridden in a sub-element.
|
---|
| 368 | *
|
---|
| 369 | * -- "xml:space" may be attached to an element to signal
|
---|
| 370 | * that @whitespace should be preserved for this element.
|
---|
| 371 | *
|
---|
| 372 | * The value "default" signals that applications' default
|
---|
| 373 | * whitespace processing modes are acceptable for this
|
---|
| 374 | * element; the value "preserve" indicates the intent that
|
---|
| 375 | * applications preserve all the white space.
|
---|
| 376 | *
|
---|
| 377 | * -- "xml:lang" may be inserted in documents to specify the
|
---|
| 378 | * language used in the contents and attribute values of
|
---|
| 379 | * any element in an XML document.
|
---|
| 380 | *
|
---|
| 381 | * The value is either a two-letter language code (e.g. "en")
|
---|
| 382 | * or a combination of language and country code. Interestingly,
|
---|
| 383 | * the English W3C XML spec gives the following examples:
|
---|
| 384 | *
|
---|
| 385 | + <p xml:lang="en">The quick brown fox jumps over the lazy dog.</p>
|
---|
| 386 | + <p xml:lang="en-GB">What colour is it?</p>
|
---|
| 387 | + <p xml:lang="en-US">What color is it?</p>
|
---|
| 388 | + <sp who="Faust" desc='leise' xml:lang="de">
|
---|
| 389 | + <l>Habe nun, ach! Philosophie,</l>
|
---|
| 390 | + <l>Juristerei, und Medizin</l>
|
---|
| 391 | + <l>und leider auch Theologie</l>
|
---|
| 392 | + <l>durchaus studiert mit heißem Bemüh'n.</l>
|
---|
| 393 | + </sp>
|
---|
| 394 | */
|
---|
| 395 |
|
---|
| 396 | /*
|
---|
| 397 | *@@gloss: comments comments
|
---|
| 398 | * Comments may appear anywhere in a document outside other
|
---|
| 399 | * markup; in addition, they may appear within the @DTD at
|
---|
| 400 | * places allowed by the grammar. They are not part of the
|
---|
| 401 | * document's @content; an XML processor may, but
|
---|
| 402 | * need not, make it possible for an application to retrieve
|
---|
[38] | 403 | * the text of comments (@expat has a handler for this).
|
---|
[36] | 404 | *
|
---|
| 405 | * Comments may contain any text except "--" (double-hyphen).
|
---|
| 406 | *
|
---|
| 407 | * Example of a comment:
|
---|
| 408 | *
|
---|
| 409 | + <!-- declarations for <head> & <body> -->
|
---|
| 410 | */
|
---|
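An application that generates comments has to make sure the text does not contain a double hyphen. A minimal sketch with a hypothetical function name follows; it also rejects text ending in a single hyphen, because that hyphen would combine with the closing "-->" to form "--->", which the comment grammar of the XML specification does not allow.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if text may be placed between "<!--" and "-->",
   0 otherwise. (Hypothetical helper.) */
int xml_is_comment_text(const char *text)
{
    size_t len = strlen(text);

    if (strstr(text, "--") != NULL)
        return 0;               /* double hyphen is forbidden */
    if (len > 0 && text[len - 1] == '-')
        return 0;               /* would yield "--->" */
    return 1;
}
```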
| 411 |
|
---|
| 412 | /*
|
---|
| 413 | *@@gloss: CDATA CDATA
|
---|
| 414 | * CDATA sections can appear anywhere where @content
|
---|
| 415 | * is allowed. They are used to escape blocks of
|
---|
| 416 | * text containing characters which would otherwise be
|
---|
| 417 | * recognized as @markup.
|
---|
| 418 | *
|
---|
| 419 | * CDATA sections begin with the string <![CDATA[ and end
|
---|
| 420 | * with the string ]]>. Within a CDATA section, only the
|
---|
| 421 | * ]]> string is recognized as @markup, so that left angle
|
---|
| 422 | * brackets and ampersands may occur in their literal form.
|
---|
| 423 | * They need not (and cannot) be escaped using "&lt;" and
|
---|
| 424 | * "&amp;". (This implies that not even @comments are
|
---|
| 425 | * recognized).
|
---|
| 426 | *
|
---|
| 427 | * CDATA sections cannot nest.
|
---|
| 428 | *
|
---|
| 429 | * Examples:
|
---|
| 430 | *
|
---|
| 431 | + <![CDATA[<greeting>Hello, world!</greeting>]]>
|
---|
| 432 | +
|
---|
| 433 | + <![CDATA[
|
---|
| 434 | + *p = &q;
|
---|
| 435 | + b = (i <= 3);
|
---|
| 436 | + ]]>
|
---|
| 437 | */
|
---|
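Since CDATA sections cannot nest and "]]>" always ends a section, text that itself contains "]]>" has to be spread over two adjacent sections. The sketch below shows one common way to do this; `xml_wrap_cdata` and `append` are hypothetical names, and the caller is assumed to supply a sufficiently large buffer.

```c
#include <stddef.h>
#include <string.h>

/* Appends len bytes of s to dst, keeping dst NUL-terminated.
   Returns 0 if dst is too small. */
static int append(char *dst, size_t dstsize, size_t *used,
                  const char *s, size_t len)
{
    if (*used + len >= dstsize)
        return 0;
    memcpy(dst + *used, s, len);
    *used += len;
    dst[*used] = '\0';
    return 1;
}

/* Wraps text in one or more CDATA sections. Every "]]>" in text is
   split so that "]]" ends one section and ">" starts the next.
   Returns the output length, or (size_t)-1 if dst is too small. */
size_t xml_wrap_cdata(const char *text, char *dst, size_t dstsize)
{
    size_t used = 0;
    const char *hit;

    if (!append(dst, dstsize, &used, "<![CDATA[", 9))
        return (size_t)-1;
    while ((hit = strstr(text, "]]>")) != NULL) {
        /* copy up to and including "]]", then close and reopen;
           the ">" is copied with the remaining text */
        if (!append(dst, dstsize, &used, text, (size_t)(hit - text) + 2)
            || !append(dst, dstsize, &used, "]]><![CDATA[", 12))
            return (size_t)-1;
        text = hit + 2;
    }
    if (!append(dst, dstsize, &used, text, strlen(text))
        || !append(dst, dstsize, &used, "]]>", 3))
        return (size_t)-1;
    return used;
}
```

A parser reading the result reports the character data of both sections one after the other, so the application sees the original text again.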
| 438 |
|
---|
| 439 | /*
|
---|
| 440 | *@@gloss: processing_instructions processing instructions
|
---|
| 441 | * "Processing instructions" (PIs) contain additional
|
---|
| 442 | * data for applications.
|
---|
| 443 | *
|
---|
| 444 | * Like @comments, they are not textually part of the XML
|
---|
| 445 | * document, but the XML processor is required to pass
|
---|
| 446 | * them to an application.
|
---|
| 447 | *
|
---|
| 448 | * PIs have the form:
|
---|
| 449 | *
|
---|
| 450 | + <?name pidata?>
|
---|
| 451 | *
|
---|
| 452 | *
|
---|
| 453 | * The "name", called the PI "target", identifies the PI to
|
---|
| 454 | * the application. Applications should process only the
|
---|
| 455 | * targets they recognize and ignore all other PIs. Any
|
---|
| 456 | * data that follows the PI target is optional; it is for
|
---|
| 457 | * the application that recognizes the target. The names
|
---|
| 458 | * used in PIs may be declared in a @notation_declaration in order to
|
---|
| 459 | * formally identify them.
|
---|
| 460 | *
|
---|
| 461 | * PI names beginning with "xml" are reserved.
|
---|
| 462 | */
|
---|
| 463 |
|
---|
| 464 | /*
|
---|
| 465 | *@@gloss: well-formed well-formed
|
---|
| 466 | * XML @documents (the sum of all @entities) are "well-formed"
|
---|
| 467 | * if the following conditions are met (among others):
|
---|
| 468 | *
|
---|
| 469 | * -- They contain one or more @elements.
|
---|
| 470 | *
|
---|
| 471 | * -- There is exactly one element, called the root, or document
|
---|
| 472 | * element, no part of which appears in the @content of any
|
---|
| 473 | * other element.
|
---|
| 474 | *
|
---|
| 475 | * -- For all other elements, if the start-tag is in the content
|
---|
| 476 | * of another element, the end-tag is in the content of the
|
---|
| 477 | * same element. More simply stated, the elements nest
|
---|
| 478 | * properly within each other. (This is unlike HTML.)
|
---|
| 479 | *
|
---|
| 480 | * -- Values of string @attributes cannot contain references to
|
---|
| 481 | * @external_entities.
|
---|
| 482 | *
|
---|
| 483 | * -- No attribute may appear more than once in the same element.
|
---|
| 484 | *
|
---|
| 485 | * -- All entities except the amp, lt, gt, apos, and quot must be
|
---|
| 486 | * declared before they are used. Binary @external_entities
|
---|
| 487 | * cannot be referenced in the flow of @content; they can only
|
---|
| 488 | * be used in an attribute declared as ENTITY or ENTITIES.
|
---|
| 489 | *
|
---|
| 490 | * -- Neither text nor @parameter_entities are allowed to be
|
---|
| 491 | * recursive, directly or indirectly.
|
---|
| 492 | */
|
---|
| 493 |
|
---|
| 494 | /*
|
---|
| 495 | *@@gloss: valid valid
|
---|
| 496 | * XML @documents are said to be "valid" if they have a @DTD
|
---|
[38] | 497 | * associated and they conform to it. While XML documents
|
---|
| 498 | * must always be @well-formed, validation and validity are up
|
---|
| 499 | * to the implementation (i.e. at the option of the application).
|
---|
[36] | 500 | *
|
---|
| 501 | * Validating processors must report violations of the constraints
|
---|
| 502 | * expressed by the declarations in the @DTD, and failures to
|
---|
| 503 | * fulfill the validity constraints given in this specification.
|
---|
| 504 | * To accomplish this, validating XML processors must read and
|
---|
| 505 | * process the entire DTD and all @external_parsed_entities
|
---|
| 506 | * referenced in the document.
|
---|
| 507 | *
|
---|
[38] | 508 | * Non-validating processors (such as @expat) are required to
|
---|
| 509 | * check only the document entity (see @entities), including the
|
---|
| 510 | * entire internal DTD subset, for whether it is @well-formed.
|
---|
| 511 | *
|
---|
| 512 | * While they are not required to check the document for validity,
|
---|
[36] | 513 | * they are required to process all the declarations they
|
---|
| 514 | * read in the internal DTD subset and in any parameter entity
|
---|
| 515 | * that they read, up to the first reference to a parameter
|
---|
| 516 | * entity that they do not read; that is to say, they must
|
---|
| 517 | * use the information in those declarations to normalize
|
---|
[38] | 518 | * values of @attributes, include the replacement text of
|
---|
[36] | 519 | * @internal_entities, and supply default attribute values.
|
---|
| 520 | * They must not process entity declarations or attribute-list
|
---|
| 521 | * declarations encountered after a reference to a
|
---|
| 522 | * parameter entity that is not read, since the entity may have
|
---|
| 523 | * contained overriding declarations.
|
---|
| 524 | */
|
---|
| 525 |
|
---|
| 526 | /*
|
---|
| 527 | *@@gloss: encodings encodings
|
---|
[38] | 528 | * XML supports a wide variety of character encodings. These
|
---|
| 529 | * must be specified in the XML @text_declaration.
|
---|
[36] | 530 | *
|
---|
[38] | 531 | * There are too many character encodings on the planet to
|
---|
| 532 | * be listed here. The most common ones are:
|
---|
| 533 | *
|
---|
| 534 | * -- "UTF-8", "UTF-16", "ISO-10646-UCS-2", and "ISO-10646-UCS-4"
|
---|
| 535 | * should be used for the various encodings and transformations
|
---|
| 536 | * of Unicode / ISO/IEC 10646.
|
---|
| 537 | *
|
---|
| 538 | * -- "ISO-8859-x" (with "x" being a number from 1 to 9) represent
|
---|
| 539 | * the various ISO 8859 ("Latin") encodings.
|
---|
| 540 | *
|
---|
| 541 | * -- "ISO-2022-JP", "Shift_JIS", and "EUC-JP" should be used for
|
---|
| 542 | * the various encoded forms of JIS X-0208-1997.
|
---|
| 543 | *
|
---|
| 544 | * Example of a @text_declaration:
|
---|
| 545 | *
|
---|
| 546 | + <?xml version="1.0" encoding="ISO-8859-2"?>
|
---|
| 547 | *
|
---|
[36] | 548 | * All XML processors must be able to read @entities in either
|
---|
[97] | 549 | * UTF-8 or UTF-16. @expat directly supports the following
|
---|
| 550 | * (see XML_SetUnknownEncodingHandler):
|
---|
[36] | 551 | *
|
---|
[97] | 552 | * -- UTF-8: 8-bit encoding of Unicode.
|
---|
| 553 | *
|
---|
| 554 | * -- UTF-16: 16-bit encoding of Unicode.
|
---|
| 555 | *
|
---|
| 556 | * -- ISO-8859-1: that's "latin 1".
|
---|
| 557 | *
|
---|
| 558 | * -- US-ASCII.
|
---|
| 559 | *
|
---|
[36] | 560 | * Entities encoded in UTF-16 must begin with the ZERO WIDTH NO-BREAK
|
---|
| 561 | * SPACE character (#xFEFF). This is an encoding signature, not part
|
---|
| 562 | * of either the @markup or the @content of the XML @document.
|
---|
| 563 | * XML processors must be able to use this character to differentiate
|
---|
| 564 | * between UTF-8 and UTF-16 encoded documents.
|
---|
| 565 | */
|
---|
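The encoding signature mentioned above can be checked by looking at the first bytes of an entity. The following is a minimal sketch with hypothetical names (`bom_kind`, `detect_bom`). It covers only UTF-16 in both byte orders plus the optional UTF-8 signature, and it ignores UTF-32 and the further autodetection steps that a real processor such as expat performs.

```c
#include <stddef.h>

enum bom_kind { BOM_NONE, BOM_UTF16_BE, BOM_UTF16_LE, BOM_UTF8 };

/* Inspects the first bytes of an entity for #xFEFF in one of its
   encoded forms. (Hypothetical helper.) */
enum bom_kind detect_bom(const unsigned char *buf, size_t len)
{
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return BOM_UTF16_BE;    /* big-endian UTF-16 */
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return BOM_UTF16_LE;    /* little-endian UTF-16 */
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return BOM_UTF8;        /* optional UTF-8 signature */
    return BOM_NONE;            /* no signature: assume UTF-8 unless
                                   an encoding declaration says otherwise */
}
```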
| 566 |
|
---|
| 567 | /*
|
---|
| 568 | *@@gloss: text_declaration text declaration
|
---|
| 569 | * XML @documents and @external_parsed_entities may (and
|
---|
| 570 | * should) start with the XML text declaration, exactly like
|
---|
| 571 | * this:
|
---|
| 572 | *
|
---|
| 573 | + <?xml version="1.0" encoding="enc"?>
|
---|
| 574 | *
|
---|
| 575 | * where "1.0" is the only currently defined XML version
|
---|
| 576 | * and "enc" must be the encoding of the document.
|
---|
| 577 | *
|
---|
| 578 | * External parsed entities may begin with a text declaration,
|
---|
| 579 | * which looks like an XML declaration with just an encoding
|
---|
| 580 | * declaration:
|
---|
| 581 | *
|
---|
| 582 | + <?xml encoding="Big5"?>
|
---|
| 583 | *
|
---|
| 584 | * See @encodings.
|
---|
| 585 | *
|
---|
| 586 | * Example:
|
---|
| 587 | *
|
---|
| 588 | + <?xml version="1.0" encoding="ISO-8859-1"?>
|
---|
| 589 | */
|
---|
| 590 |
|
---|
| 591 | /*
|
---|
| 592 | *@@gloss: documents documents
|
---|
| 593 | * XML documents are made up of storage units called @entities,
|
---|
| 594 | * which contain either parsed or unparsed data. Parsed data is
|
---|
| 595 | * made up of characters, some of which form @content,
|
---|
| 596 | * and some of which form @markup.
|
---|
| 597 | *
|
---|
| 598 | * XML documents should start with the XML @text_declaration.
|
---|
| 599 | *
|
---|
| 600 | * The function of the @markup in an XML document is to describe
|
---|
| 601 | * its storage and logical structure and to associate attribute-value
|
---|
| 602 | * pairs with its logical structures. XML provides a mechanism,
|
---|
| 603 | * the document type declaration (@DTD), to define constraints
|
---|
| 604 | * on the logical structure and to support the use of predefined
|
---|
| 605 | * storage units.
|
---|
| 606 | *
|
---|
| 607 | * A data object is an XML document if it is @well-formed.
|
---|
| 608 | * A well-formed XML document may in addition be @valid if it
|
---|
| 609 | * meets certain further constraints.
|
---|
| 610 | *
|
---|
| 611 | * A very simple XML document looks like this:
|
---|
| 612 | *
|
---|
| 613 | + <?xml version="1.0"?>
|
---|
| 614 | + <oldjoke>
|
---|
| 615 | + <burns>Say <quote>goodnight</quote>, Gracie.</burns>
|
---|
| 616 | + <allen><quote>Goodnight, Gracie.</quote></allen>
|
---|
| 617 | + <applause/>
|
---|
| 618 | + </oldjoke>
|
---|
| 619 | *
|
---|
| 620 | * This document is @well-formed, but not @valid (because it
|
---|
| 621 | * has no @DTD).
|
---|
| 622 | *
|
---|
| 623 | */
|
---|
| 624 |
|
---|
| 625 | /*
|
---|
| 626 | *@@gloss: element_declaration element declaration
|
---|
| 627 | * Element declarations identify the @names of elements and the
|
---|
| 628 | * nature of their content. They look like this:
|
---|
| 629 | +
|
---|
[38] | 630 | + <!ELEMENT name contentspec>
|
---|
[36] | 631 | +
|
---|
[38] | 632 | * No element may be declared more than once.
|
---|
| 633 | *
|
---|
| 634 | * The "name" of the element is obvious. The "contentspec"
|
---|
[36] | 635 | * is not. This specifies what may appear in the element
|
---|
[38] | 636 | * and can be one of the following:
|
---|
[36] | 637 | *
|
---|
[38] | 638 | * -- "EMPTY" marks the element as being empty (i.e.
|
---|
| 639 | * having no content at all).
|
---|
[36] | 640 | *
|
---|
[38] | 641 | * -- "ANY" does not impose any restrictions.
|
---|
[36] | 642 | *
|
---|
[38] | 643 | * -- (mixed): a "list" which declares the element to have
|
---|
| 644 | * mixed content. See below.
|
---|
[36] | 645 | *
|
---|
[38] | 646 | * -- (children): a "list" which declares the element to
|
---|
| 647 | * have child elements only, but no content. See below.
|
---|
[36] | 648 | *
|
---|
[38] | 649 | * <B>(mixed): content with elements</B>
|
---|
[36] | 650 | *
|
---|
[38] | 651 | * With the (mixed) contentspec, an element may either contain
|
---|
| 652 | * @content only or @content with subelements.
|
---|
[36] | 653 | *
|
---|
[38] | 654 | * While the (children) contentspec allows you to define sequences
|
---|
| 655 | * and orders, this is not possible with (mixed).
|
---|
[36] | 656 | *
|
---|
[38] | 657 | * "contentspec" must then be a pair of parentheses, optionally
|
---|
| 658 | * followed by "*". Inside the parentheses, there must be at least the
|
---|
| 659 | * keyword "#PCDATA", optionally followed by "|" and element
|
---|
| 660 | * names. Note that if no #PCDATA appears, the (children) model
|
---|
| 661 | * is assumed (see below).
|
---|
[36] | 662 | *
|
---|
[38] | 663 | * Examples:
|
---|
[36] | 664 | *
|
---|
[38] | 665 | + <!ELEMENT name (#PCDATA)* >
|
---|
| 666 | + <!ELEMENT name (#PCDATA | subname1 | subname2)* >
|
---|
| 667 | + <!ELEMENT name (#PCDATA) >
|
---|
| 668 | *
|
---|
| 669 | * Note that if you specify sub-element names, you must terminate
|
---|
| 670 | * the contentspec with "*". Again, there's no way to specify
|
---|
| 671 | * orders etc. with (mixed).
|
---|
| 672 | *
|
---|
| 673 | * <B>(children): Element content only</B>
|
---|
| 674 | *
|
---|
| 675 | * With the (children) contentspec, an element may contain
|
---|
| 676 | * only other elements (and @whitespace), but no other @content.
|
---|
| 677 | *
|
---|
| 678 | * This can become fairly complicated. "contentspec" then must be
|
---|
| 679 | * a "list" followed by a "repeater".
|
---|
| 680 | *
|
---|
| 681 | * A "repeater" can be:
|
---|
| 682 | *
|
---|
| 683 | * -- Nothing: the preceding item _must_ appear exactly once.
|
---|
| 684 | *
|
---|
| 685 | * -- "+": the preceding item _must_ appear at _least_ once.
|
---|
| 686 | *
|
---|
| 687 | * -- "?": the preceding item _may_ appear exactly once.
|
---|
| 688 | *
|
---|
| 689 | * -- "*": the preceding item _may_ appear once or more than
|
---|
| 690 | * once or not at all.
|
---|
| 691 | *
|
---|
| 692 | * Here's the simplest example (presuming that "SUBELEMENT"
|
---|
| 693 | * is a valid "list" here):
|
---|
| 694 | *
|
---|
| 695 | + <!ELEMENT name (SUBELEMENT)* >
|
---|
| 696 | *
|
---|
| 697 | * In other words, in (children) mode, "contentspec" must always
|
---|
| 698 | * be in parentheses and is followed by a "repeater" (which can be
|
---|
| 699 | * nothing).
|
---|
| 700 | *
|
---|
| 701 | * About "lists"... since these declarations may nest, this is
|
---|
| 702 | * where the recursive definition of a "content particle" comes
|
---|
| 703 | * in:
|
---|
| 704 | *
|
---|
| 705 | * -- A "content particle" is either a sub-element name or
|
---|
| 706 | * a nested list, followed by a "repeater".
|
---|
| 707 | *
|
---|
| 708 | * -- A "list" is defined as an enumeration of content particles,
|
---|
| 709 | * enclosed in parentheses, where the content particles are
|
---|
[39] | 710 | * separated by "connectors".
|
---|
[38] | 711 | *
|
---|
[39] | 712 | * There are two types of "connectors":
|
---|
[38] | 713 | *
|
---|
[36] | 714 | * -- Commas (",") indicate that the elements must appear
|
---|
[38] | 715 | * in the specified order ("sequence").
|
---|
[36] | 716 | *
|
---|
| 717 | * -- Vertical bars ("|") specify that the elements may
|
---|
[38] | 718 | * occur alternatively ("choice").
|
---|
[36] | 719 | *
|
---|
[39] | 720 | * The connectors cannot be mixed; the list must be
|
---|
[38] | 721 | * either completely "sequence" or "choice".
|
---|
[36] | 722 | *
|
---|
[38] | 723 | * Examples of content particles:
|
---|
| 724 | *
|
---|
| 725 | + SUBELEMENT+
|
---|
| 726 | + list*
|
---|
| 727 | *
|
---|
| 728 | * Examples of lists:
|
---|
| 729 | *
|
---|
| 730 | + ( cp | cp | cp | cp )
|
---|
| 731 | + ( cp , cp , cp , cp )
|
---|
| 732 | *
|
---|
| 733 | * Full examples for (children):
|
---|
| 734 | *
|
---|
| 735 | + <!ELEMENT oldjoke ( burns+, allen, applause? ) >
|
---|
| 736 | + | | +cp-+ | |
|
---|
| 737 | + | | | |
|
---|
| 738 | + | +------- list ---------+ |
|
---|
| 739 | + +-------contentspec--------+
|
---|
| 740 | *
|
---|
| 741 | * This specifies a "sequence" list for the "oldjoke" element. The
|
---|
| 742 | * list is not nested, so the content particles are element
|
---|
| 743 | * names only.
|
---|
| 744 | *
|
---|
| 745 | * Within "oldjoke", "burns" must appear first and can appear
|
---|
| 746 | * once or several times.
|
---|
| 747 | *
|
---|
| 748 | * Next must be "allen", exactly once (since there's no repeater).
|
---|
| 749 | *
|
---|
| 750 | * Optionally ("?"), there can be "applause" at the end.
|
---|
| 751 | *
|
---|
| 752 | * Now, a nested example:
|
---|
| 753 | *
|
---|
[39] | 754 | + <!ELEMENT poem (title?, (stanza+ | couplet+ | line+) ) >
|
---|
| 755 | *
|
---|
| 756 | * That is, a poem consists of an optional title, followed by one or
|
---|
| 757 | * several stanzas, or one or several couplets, or one or several lines.
|
---|
| 758 | * This is different from:
|
---|
| 759 | *
|
---|
| 760 | + <!ELEMENT poem (title?, (stanza | couplet | line)+ ) >
|
---|
| 761 | *
|
---|
| 762 | * The latter allows for a single poem to contain a mixture of stanzas,
|
---|
| 763 | * couplets or lines.
|
---|
| 764 | *
|
---|
| 765 | * And for WarpIN:
|
---|
| 766 | *
|
---|
[38] | 767 | + <!ELEMENT WARPIN (REXX*, VARPROMPT*, MSG?, TITLE?, (GROUP | PCK)+, PAGE+) >
|
---|
| 768 | *
|
---|
[36] | 769 | */
|
---|
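The repeaters and the flat "sequence" list described in the element declaration entry above can be sketched in C as follows. All names here (`PARTICLE`, `repeater_allows`, `matches_sequence`) are hypothetical and not part of the XWorkplace helpers; the character '1' is an assumption of this sketch standing for "no repeater". Nested lists and "choice" lists are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One content particle of a flat "sequence" list: a sub-element name
 * plus its repeater. '1' stands for "no repeater" (exactly once). */
typedef struct
{
    const char *name;
    char        repeater;   /* '1', '+', '?' or '*' */
} PARTICLE;

/* Returns 1 if "count" occurrences satisfy the repeater, 0 otherwise. */
static int repeater_allows(char repeater, size_t count)
{
    switch (repeater)
    {
        case '1': return count == 1;    /* must appear exactly once */
        case '+': return count >= 1;    /* must appear at least once */
        case '?': return count <= 1;    /* may appear once */
        case '*': return 1;             /* may appear any number of times */
    }
    return 0;                           /* unknown repeater */
}

/* Checks the child element names of one element against a flat sequence
 * list. Each particle greedily consumes the children carrying its name;
 * this is sufficient as long as every name occurs only once in the list. */
static int matches_sequence(const PARTICLE *list, size_t particles,
                            const char *const *children, size_t count)
{
    size_t pos = 0, i;

    assert(list != NULL && (children != NULL || count == 0));

    for (i = 0; i < particles; i++)
    {
        size_t seen = 0;
        while (pos < count && strcmp(children[pos], list[i].name) == 0)
        {
            pos++;
            seen++;
        }
        if (!repeater_allows(list[i].repeater, seen))
            return 0;
    }
    return pos == count;    /* no unexpected children may be left over */
}
```

With the list `{ "burns", '+' }, { "allen", '1' }, { "applause", '?' }`, the children "burns, burns, allen" are accepted, while "allen, applause" is rejected because "burns" is missing.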
| 770 |
|
---|
| 771 | /*
|
---|
| 772 | *@@gloss: attribute_declaration attribute declaration
|
---|
| 773 | * Attribute declarations identify the @names of attributes
|
---|
| 774 | * of @elements and their possible values. They look like this:
|
---|
| 775 | *
|
---|
| 776 | + <!ATTLIST elementname
|
---|
| 777 | + attname atttype defaultvalue
|
---|
| 778 | + attname atttype defaultvalue
|
---|
| 779 | + ... >
|
---|
| 780 | *
|
---|
| 781 | * "elementname" is the element name for which the
|
---|
| 782 | * attributes are being defined.
|
---|
| 783 | *
|
---|
| 784 | * For each attribute, you must then specify three
|
---|
| 785 | * columns:
|
---|
| 786 | *
|
---|
| 787 | * -- "attname" is the attribute name.
|
---|
| 788 | *
|
---|
| 789 | * -- "atttype" is the attribute type (one of six values,
|
---|
| 790 | * see below).
|
---|
| 791 | *
|
---|
| 792 | * -- "defaultvalue" specifies the default value.
|
---|
| 793 | *
|
---|
| 794 | * The attribute type (specifying the value type) must be
|
---|
| 795 | * one of six:
|
---|
| 796 | *
|
---|
| 797 | * -- "CDATA" is any character data. (This has nothing to
|
---|
| 798 | * do with @CDATA sections.)
|
---|
| 799 | *
|
---|
| 800 | * -- "ID": the value must be a unique @name within the
|
---|
| 801 | * document. Only one such attribute is allowed per
|
---|
| 802 | * element.
|
---|
| 803 | *
|
---|
| 804 | * -- "IDREF" or "IDREFS": a reference to some other
|
---|
| 805 | * element which has an "ID" attribute with this value.
|
---|
| 806 | * "IDREFS" is the plural and may contain several of
|
---|
| 807 | * those separated by @whitespace.
|
---|
| 808 | *
|
---|
| 809 | * -- "ENTITY" or "ENTITIES": a reference to an
|
---|
| 810 | * external entity (see @external_entities).
|
---|
| 811 | * "ENTITIES" is the plural and may contain several of
|
---|
| 812 | * those separated by @whitespace.
|
---|
| 813 | *
|
---|
| 814 | * -- "NMTOKEN" or "NMTOKENS": a single-word string.
|
---|
| 815 | * This is not a reference though.
|
---|
| 816 | * "NMTOKENS" is the plural and may contain several of
|
---|
| 817 | * those separated by @whitespace.
|
---|
| 818 | *
|
---|
| 819 | * -- an enumeration: an explicit list of allowed
|
---|
| 820 | * values for this attribute. Additionally, you can specify
|
---|
| 821 | * that the names must match a particular @notation_declaration.
|
---|
| 822 | *
|
---|
| 823 | * The "defaultvalue" (third column) can be one of these:
|
---|
| 824 | *
|
---|
| 825 | * -- "#REQUIRED": the attribute may not be omitted.
|
---|
| 826 | *
|
---|
| 827 | * -- "#IMPLIED": the attribute is optional, and there's
|
---|
| 828 | * no default value.
|
---|
| 829 | *
|
---|
| 830 | * -- "'value'": the attribute is optional, and it has
|
---|
| 831 | * this default.
|
---|
| 832 | *
|
---|
| 833 | * -- "#FIXED 'value'": the attribute is optional, but if
|
---|
| 834 | * it appears, it must have this value.
|
---|
| 835 | *
|
---|
| 836 | * Example:
|
---|
| 837 | *
|
---|
| 838 | + <!ATTLIST oldjoke
|
---|
| 839 | + name ID #REQUIRED
|
---|
| 840 | + label CDATA #IMPLIED
|
---|
| 841 | + status ( funny | notfunny ) 'funny'>
|
---|
| 842 | */
|
---|
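How the four "defaultvalue" variants from the attribute declaration entry above affect an attribute can be sketched in C like this. The types and the function (`DEFAULTKIND`, `ATTDECL`, `effective_value`) are hypothetical illustrations, not the data structures that expat or the XWorkplace helpers actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The four forms the third ATTLIST column can take. */
typedef enum { DEF_REQUIRED, DEF_IMPLIED, DEF_VALUE, DEF_FIXED } DEFAULTKIND;

typedef struct
{
    const char  *name;      /* attribute name */
    DEFAULTKIND  kind;
    const char  *value;     /* default or fixed value, otherwise NULL */
} ATTDECL;

/* Resolves one attribute against its declaration. "specified" is the
 * value given in the start tag, or NULL if the attribute was omitted.
 * Returns the effective value (NULL if there is none) and sets *valid
 * to 0 if the declaration is violated. */
static const char *effective_value(const ATTDECL *decl,
                                   const char *specified,
                                   int *valid)
{
    assert(decl != NULL && valid != NULL);

    *valid = 1;
    switch (decl->kind)
    {
        case DEF_REQUIRED:              /* may not be omitted */
            if (specified == NULL)
                *valid = 0;
            return specified;

        case DEF_IMPLIED:               /* optional, no default */
            return specified;

        case DEF_VALUE:                 /* optional, with a default */
            return (specified != NULL) ? specified : decl->value;

        case DEF_FIXED:                 /* if present, must equal the value */
            if (specified != NULL && strcmp(specified, decl->value) != 0)
                *valid = 0;
            return decl->value;
    }
    return NULL;
}
```

For the "oldjoke" example, omitting "status" yields the default 'funny', omitting "label" yields no value at all, and omitting "name" is a validity error.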
| 843 |
|
---|
| 844 | /*
|
---|
| 845 | *@@gloss: entity_declaration entity declaration
|
---|
| 846 | * Entity declarations define @entities.
|
---|
| 847 | *
|
---|
| 848 | * An example of @internal_entities:
|
---|
| 849 | *
|
---|
| 850 | + <!ENTITY ATI "ArborText, Inc.">
|
---|
| 851 | *
|
---|
| 852 | * Examples of @external_entities:
|
---|
| 853 | *
|
---|
| 854 | + <!ENTITY boilerplate SYSTEM "/standard/legalnotice.xml">
|
---|
| 855 | + <!ENTITY ATIlogo SYSTEM "/standard/logo.gif" NDATA GIF87A>
|
---|
| 856 | */
|
---|
| 857 |
|
---|
| 858 | /*
|
---|
| 859 | *@@gloss: notation_declaration notation declaration
|
---|
| 860 | * Notation declarations identify specific types of external
|
---|
| 861 | * binary data. This information is passed to the processing
|
---|
| 862 | * application, which may make whatever use of it it wishes.
|
---|
| 863 | *
|
---|
| 864 | * Example:
|
---|
| 865 | *
|
---|
| 866 | + <!NOTATION GIF87A SYSTEM "GIF">
|
---|
| 867 | */
|
---|
| 868 |
|
---|
| 869 | /*
|
---|
| 870 | *@@gloss: DTD DTD
|
---|
| 871 | * The XML document type declaration contains or points to
|
---|
| 872 | * markup declarations that provide a grammar for a class of @documents.
|
---|
| 873 | * This grammar is known as a Document Type Definition, or DTD.
|
---|
| 874 | *
|
---|
| 875 | * The document type declaration must look like the following:
|
---|
| 876 | *
|
---|
| 877 | + <!DOCTYPE name ... >
|
---|
| 878 | *
|
---|
| 879 | * "name" must match the document's root element.
|
---|
| 880 | *
|
---|
| 881 | * "..." can be the reference to an external subset (being a special
|
---|
| 882 | * case of @external_entities):
|
---|
| 883 | *
|
---|
| 884 | + <!DOCTYPE name SYSTEM "whatever.dtd">
|
---|
| 885 | *
|
---|
[63] | 886 | * The SYSTEM identifier is required with XML, while a public
|
---|
| 887 | * identifier is not. (In SGML, neither is required, but at
|
---|
| 888 | * least one must be present.)
|
---|
[36] | 889 | *
|
---|
[63] | 890 | * Alternatively, specify an internal subset in brackets, which
|
---|
| 891 | * contains the markup directly:
|
---|
| 892 | *
|
---|
[36] | 893 | + <!DOCTYPE name [
|
---|
| 894 | + <!ELEMENT greeting (#PCDATA)>
|
---|
| 895 | + ]>
|
---|
| 896 | *
|
---|
| 897 | * You can even mix both.
|
---|
| 898 | *
|
---|
| 899 | * A markup declaration is either an @element_declaration, an
|
---|
| 900 | * @attribute_declaration, an @entity_declaration,
|
---|
| 901 | * or a @notation_declaration. These declarations may be contained
|
---|
| 902 | * in whole or in part within @parameter_entities.
|
---|
| 903 | */
|
---|
[38] | 904 |
|
---|
| 905 | /*
|
---|
| 906 | *@@gloss: DOM DOM
|
---|
| 907 | * DOM is the "Document Object Model", as defined by the W3C.
|
---|
| 908 | *
|
---|
| 909 | * The DOM is a programming interface for @XML @documents.
|
---|
| 910 | * (XML is a metalanguage and describes the documents
|
---|
| 911 | * themselves. DOM is a programming interface -- an API --
|
---|
| 912 | * to access XML documents.)
|
---|
| 913 | *
|
---|
| 914 | * The W3C calls this "a platform- and language-neutral
|
---|
| 915 | * interface that allows programs and scripts to dynamically
|
---|
| 916 | * access and update the content, structure and style of
|
---|
| 917 | * documents. The Document Object Model provides
|
---|
| 918 | * a standard set of objects for representing HTML and XML
|
---|
| 919 | * documents, a standard model of how these objects can
|
---|
| 920 | * be combined, and a standard interface for accessing and
|
---|
| 921 | * manipulating them. Vendors can support the DOM as an
|
---|
| 922 | * interface to their proprietary data structures and APIs,
|
---|
| 923 | * and content authors can write to the standard DOM
|
---|
| 924 | * interfaces rather than product-specific APIs, thus
|
---|
| 925 | * increasing interoperability on the Web."
|
---|
| 926 | *
|
---|
| 927 | * In short, DOM specifies that an XML document is broken
|
---|
| 928 | * up into a tree of "nodes", representing the various parts
|
---|
| 929 | * of an XML document. Such nodes represent @documents,
|
---|
| 930 | * @elements, @attributes, @processing_instructions,
|
---|
| 931 | * @comments, @content, and more.
|
---|
| 932 | *
|
---|
| 933 | * See xml.c for an introduction to XML and DOM support in
|
---|
| 934 | * the XWorkplace helpers.
|
---|
| 935 | *
|
---|
| 936 | * Example: Take this HTML table definition:
|
---|
| 937 | +
|
---|
| 938 | + <TABLE>
|
---|
| 939 | + <TBODY>
|
---|
| 940 | + <TR>
|
---|
| 941 | + <TD>Column 1-1</TD>
|
---|
| 942 | + <TD>Column 1-2</TD>
|
---|
| 943 | + </TR>
|
---|
| 944 | + <TR>
|
---|
| 945 | + <TD>Column 2-1</TD>
|
---|
| 946 | + <TD>Column 2-2</TD>
|
---|
| 947 | + </TR>
|
---|
| 948 | + </TBODY>
|
---|
| 949 | + </TABLE>
|
---|
| 950 | *
|
---|
| 951 | * In the DOM, this would be represented by a tree as follows:
|
---|
| 952 | +
|
---|
| 953 | +                      ┌────────────┐
|
---|
| 954 | +                      │   TABLE    │        (only ELEMENT node in root DOCUMENT node)
|
---|
| 955 | +                      └─────┬──────┘
|
---|
| 956 | +                            │
|
---|
| 957 | +                      ┌─────┴──────┐
|
---|
| 958 | +                      │   TBODY    │        (only ELEMENT node in root "TABLE" node)
|
---|
| 959 | +                      └─────┬──────┘
|
---|
| 960 | +                ┌───────────┴───────────┐
|
---|
| 961 | +          ┌─────┴──────┐          ┌─────┴──────┐
|
---|
| 962 | +          │     TR     │          │     TR     │
|
---|
| 963 | +          └─────┬──────┘          └─────┬──────┘
|
---|
| 964 | +            ┌───┴──────┐            ┌───┴──────┐
|
---|
| 965 | +        ┌───┴─┐     ┌──┴──┐     ┌───┴─┐     ┌──┴──┐
|
---|
| 966 | +        │ TD  │     │ TD  │     │ TD  │     │ TD  │
|
---|
| 967 | +        └──┬──┘     └──┬──┘     └───┬─┘     └──┬──┘
|
---|
| 968 | +     ╔═════╩════╗ ╔════╩═════╗ ╔════╩═════╗ ╔══╩═══════╗
|
---|
| 969 | +     ║Column 1-1║ ║Column 1-2║ ║Column 2-1║ ║Column 2-2║  (one TEXT node in each parent node)
|
---|
| 970 | +     ╚══════════╝ ╚══════════╝ ╚══════════╝ ╚══════════╝
|
---|
| 971 | */
|
---|
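The node tree described in the DOM entry above can be modelled with a very small C structure. This is a minimal sketch with hypothetical names (`NODE`, `append_child`, `count_nodes`); it is not the node layout of the XWorkplace helpers, which is documented in xml.c, and it only distinguishes ELEMENT and TEXT nodes.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NODE_ELEMENT, NODE_TEXT } NODETYPE;

/* A tree node with a singly linked list of children. */
typedef struct NODE
{
    NODETYPE     type;
    const char  *value;         /* element name or text content */
    struct NODE *first_child;
    struct NODE *next_sibling;
} NODE;

/* Appends "child" as the last child of "parent". */
static void append_child(NODE *parent, NODE *child)
{
    NODE **link;

    assert(parent != NULL && child != NULL);

    link = &parent->first_child;
    while (*link != NULL)
        link = &(*link)->next_sibling;
    *link = child;
    child->next_sibling = NULL;
}

/* Counts the nodes of the given type in the subtree rooted at "node",
 * including "node" itself. */
static size_t count_nodes(const NODE *node, NODETYPE type)
{
    size_t      n;
    const NODE *child;

    if (node == NULL)
        return 0;

    n = (node->type == type) ? 1 : 0;
    for (child = node->first_child; child != NULL; child = child->next_sibling)
        n += count_nodes(child, type);
    return n;
}
```

Building the HTML table above with `append_child` gives a tree with nine ELEMENT nodes (TABLE, TBODY, two TR, four TD) and four TEXT nodes, one below each TD.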
| 972 |
|
---|
| 973 | /*
|
---|
| 974 | *@@gloss: DOM_DOCUMENT DOCUMENT
|
---|
| 975 | * Representation of XML @documents in the @DOM.
|
---|
| 976 | *
|
---|
| 977 | * The xwphelpers implementation has the following differences
|
---|
| 978 | * from the DOM specs:
|
---|
| 979 | *
|
---|
| 980 | * -- The "doctype" member points to the document's @DTD, or is NULL.
|
---|
| 981 | * In our implementation, this is the pvExtra pointer, which points
|
---|
| 982 | * to a _DOMDTD.
|
---|
| 983 | *
|
---|
| 984 | * -- The "implementation" member points to a DOMImplementation object.
|
---|
| 985 | * This is not supported here.
|
---|
| 986 | *
|
---|
| 987 | * -- The "documentElement" member is a convenience pointer to the
|
---|
| 988 | * document's root element. We don't supply this field; instead,
|
---|
| 989 | * the llChildren list only contains a single ELEMENT node for the
|
---|
| 990 | * root element.
|
---|
| 991 | *
|
---|
| 992 | * -- The "createElement" method is implemented by xmlCreateElementNode.
|
---|
| 993 | *
|
---|
| 994 | * -- The "createAttribute" method is implemented by xmlCreateAttributeNode.
|
---|
| 995 | *
|
---|
| 996 | * -- The "createTextNode" method is implemented by xmlCreateTextNode,
|
---|
| 997 | * which has an extra parameter though.
|
---|
| 998 | *
|
---|
| 999 | * -- The "createComment" method is implemented by xmlCreateCommentNode.
|
---|
| 1000 | *
|
---|
| 1001 | * -- The "createProcessingInstruction" method is implemented by
|
---|
| 1002 | * xmlCreatePINode.
|
---|
| 1003 | *
|
---|
| 1004 | * -- The "createDocumentFragment", "createCDATASection", and
|
---|
| 1005 | * "createEntityReference" methods are not supported.
|
---|
| 1006 | */
|
---|
| 1007 |
|
---|
| 1008 |
|
---|