source: trunk/src/helpers/xmldefs.c@ 46

Last change on this file since 46 was 39, checked in by umoeller, 24 years ago

Misc. fixes.

1
2/*
3 *@@sourcefile xmldefs.c:
4 * this file is just for xdoc and contains glossary items for
5 * XML. It is never compiled.
6 *
7 *@@added V0.9.6 (2000-10-29) [umoeller]
8 */
9
10/*
11 * Copyright (C) 2001 Ulrich Möller.
12 * This file is part of the "XWorkplace helpers" source package.
13 * This is free software; you can redistribute it and/or modify
14 * it under the terms of the GNU General Public License as published
15 * by the Free Software Foundation, in version 2 as it comes in the
16 * "COPYING" file of the XWorkplace main distribution.
17 * This program is distributed in the hope that it will be useful,
18 * but WITHOUT ANY WARRANTY; without even the implied warranty of
19 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
20 * GNU General Public License for more details.
21 */
22
23/*
24 *@@gloss: expat expat
25 * Expat is one of the most well-known XML processors (parsers).
26 * I (umoeller) have ported expat to the XWorkplace Helpers
27 * library. See xmlparse.c for an introduction to expat. See
28 * xml.c for an introduction to XML support in the XWorkplace
29 * Helpers in general.
30 */
31
32
33/*
34 *@@gloss: XML XML
35 * XML is the Extensible Markup Language, as defined by
36 * the W3C. XML isn't really a language, but a meta-language
37 * for describing markup languages. It is a simplified subset
38 * of SGML.
39 *
40 * You should be familiar with the following:
41 *
42 * -- XML parsers operate on XML @documents.
43 *
44 * -- Each XML document has both a physical and a logical
45 * structure.
46 *
47 * Physically, the document is composed of units called
48 * @entities.
49 *
50 * Logically, the document is composed of @markup and
51 * @content. Among other things, markup separates the content
52 * into @elements.
53 *
54 * -- The logical and physical structures must nest properly (be
55 * @well-formed) for each entity, which results in the entire
56 * XML document being well-formed as well.
57 */
58
59/*
60 *@@gloss: entities entities
61 * An "entity" is an XML storage unit. It's a very abstract
62 * concept, and the term doesn't make much sense, but it was
63 * in SGML already, and XML chose to inherit it.
64 *
65 * In the simplest case, an XML document has only one entity,
66 * which is an XML file (or memory buffer from wherever).
67 * The document entity serves as the root of the entity tree
68 * and a starting-point for an XML processor. Unlike other
69 * entities, the document entity has no name and might well
70 * appear on a processor input stream without any identification
71 * at all.
72 *
73 * Entities are defined to be either parsed or unparsed.
74 *
75 * Other than that, there are @internal_entities,
76 * @external_entities, and @parameter_entities.
77 *
78 * See @entity_references for how to reference entities.
79 */
80
81/*
82 *@@gloss: entity_references entity references
83 * An "entity reference" refers to the content of a named
84 * entity (see: @entities). It is enclosed between "&amp;" and ";"
85 * characters.
86 *
87 * If you declare @internal_entities in the @DTD, referencing
88 * them allows for text replacements as in SGML:
89 *
90 + This document was prepared on &PrepDate;.
91 *
92 * The same works for @external_entities though. Assuming
93 * that "SecondFile" has been declared in the DTD to point
94 * to another file,
95 *
96 + See the following README: &SecondFile;
97 *
98 * would then insert the complete contents of the second
99 * file into the document. The XML processor will parse
100 * that file as if it were at that position in the original
101 * document.
102 *
103 * An entity is "included" when its replacement text
104 * is retrieved and processed, in place of the reference itself,
105 * as though it were part of the document at the location the
106 * reference was recognized.
107 * The replacement text may contain
108 * both @content and (except for @parameter_entities)
109 * @markup, which must be recognized in the usual way, except
110 * that the replacement text of entities used to escape markup
111 * delimiters (the entities amp, lt, gt, apos, quot) is always
112 * treated as data. (The string "AT&amp;amp;T;" expands to "AT&amp;T;"
113 * and the remaining ampersand is not recognized as an
114 * entity-reference delimiter.) A @character_reference is
115 * included when the indicated character is processed in
116 * place of the reference itself.
117 *
118 * The following are forbidden, and constitute fatal errors:
119 *
120 * -- the appearance of a reference to an unparsed entity;
121 *
122 * -- the appearance of any character or general-entity reference
123 * in the @DTD except within an EntityValue or AttValue;
124 *
125 * -- a reference to an external entity in an attribute value.
126 */
127
128/*
129 *@@gloss: internal_entities internal entities
130 * An "internal entity" has no separate physical storage.
131 * Its contents appear in the document's @DTD as an
132 * @entity_declaration, like this:
133 *
134 + <!ENTITY PrepDate "Feb 11, 2001">
135 *
136 * This can later be referenced with @entity_references
137 * and allows you to define shortcuts for frequently typed
138 * text or text that is expected to change, such as the
139 * revision status of a document.
140 *
141 * XML has five built-in internal entities:
142 *
143 * -- "&amp;amp;" refers to the ampersand ("&amp;") character,
144 * which normally introduces @markup and can therefore
145 * only be literally used in @comments, @processing_instructions,
146 * or @CDATA sections. This is also legal within the literal
147 * entity value of declarations of internal entities.
148 *
149 * -- "&amp;lt;" and "&amp;gt;" refer to the angle brackets
150 * ("&lt;", "&gt;") which normally introduce @elements.
151 * They must be escaped unless used in a @CDATA section.
152 *
153 * -- To allow values in @attributes to contain both single and double
154 * quotes, the apostrophe or single-quote character (') may be
155 * represented as "&amp;apos;", and the double-quote character
156 * (") as "&amp;quot;".
157 *
158 * A numeric @character_reference is a special case of an entity reference.
159 *
160 * An internal entity is always parsed.
161 *
162 * Also see @entities.
163 */
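The five built-in entities listed above can be sketched as a small lookup table. This is only an illustration: the function name predefined_entity is hypothetical and belongs neither to expat nor to the XWorkplace helpers.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not an expat or XWorkplace API):
 * returns the character that one of the five built-in
 * entity names stands for, or 0 if the name is not one
 * of the built-in entities.
 */
char predefined_entity(const char *name)
{
    static const struct { const char *name; char ch; } table[] = {
        { "amp",  '&'  },
        { "lt",   '<'  },
        { "gt",   '>'  },
        { "apos", '\'' },
        { "quot", '"'  }
    };
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (strcmp(name, table[i].name) == 0)
            return table[i].ch;

    return 0;   /* not built in; must be declared in the DTD */
}
```

The caller passes the name without the surrounding "&" and ";" delimiters. A return value of 0 means the name would have to be declared in the DTD, like "PrepDate" in the example above.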
164
165/*
166 *@@gloss: parameter_entities parameter entities
167 * Parameter entities can only be referenced in the @DTD.
168 * A parameter entity is identified by placing "% " (percent-space)
169 * in front of its name in the declaration. The percent sign is
170 * also used in references to parameter entities, instead of the
171 * ampersand. Parameter entity references are immediately expanded
172 * in the DTD and their replacement text is
173 * part of the declaration, whereas normal @entity_references are not
174 * expanded.
175 */
176
177/*
178 *@@gloss: external_entities external entities
179 * As opposed to @internal_entities, "external entities" refer
180 * to different storage.
181 *
182 * They must have a "system ID" with the URI specifying where
183 * the entity can be retrieved. Those URIs may be absolute
184 * or relative. Unless otherwise provided (e.g. by a special
185 * XML element type defined by a particular @DTD, or
186 * @processing_instructions defined by a particular application
187 * specification), relative URIs are relative to the location
188 * of the resource within which the entity declaration occurs.
189 *
190 * Optionally, external entities may specify a "public ID"
191 * as well. An XML processor attempting to retrieve the entity's
192 * content may use the public identifier to try to generate an
193 * alternative URI. If the processor is unable to do so, it must
194 * use the URI specified in the system literal. Before a match
195 * is attempted, all strings of @whitespace in the public
196 * identifier must be normalized to single space characters (#x20),
197 * and leading and trailing white space must be removed.
198 *
199 * An external entity is not always parsed.
200 *
201 * External entities allow an XML document to refer to an external
202 * file. External entities contain either text or binary data. If
203 * they contain text, the content of the external file is inserted
204 * at the point of reference and parsed as part of the referring
205 * document. Binary data is not parsed and may only be referenced
206 * in an attribute that has been declared as ENTITY or ENTITIES.
207 * Binary data is used to reference figures and
208 * other non-XML content in the document.
209 *
210 * Examples of external entity declarations:
211 +
212 + <!ENTITY open-hatch
213 + SYSTEM "http://www.textuality.com/boilerplate/OpenHatch.xml">
214 + <!ENTITY open-hatch
215 + PUBLIC "-//Textuality//TEXT Standard open-hatch boilerplate//EN"
216 + "http://www.textuality.com/boilerplate/OpenHatch.xml">
217 + <!ENTITY hatch-pic
218 + SYSTEM "../grafix/OpenHatch.gif" NDATA gif >
219 *
220 * Character @encoding is processed on a per-external-entity basis.
221 * As a result, each external parsed entity in an XML document may
222 * use a different encoding for its characters.
223 *
224 * In the document entity, the encoding declaration is part of the XML
225 * @text_declaration.
226 *
227 * Also see @entities.
228 */
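The whitespace normalization that the public ID undergoes before matching can be sketched as follows. The function names is_xml_space and normalize_public_id are made up for this example; how a processor then maps the normalized identifier to an alternative URI is not shown.

```c
#include <stddef.h>

/* The four XML whitespace characters: space, tab, LF, CR. */
static int is_xml_space(char c)
{
    return c == 0x20 || c == 0x09 || c == 0x0A || c == 0x0D;
}

/*
 * Hypothetical helper: normalizes a public identifier in place.
 * Every run of whitespace becomes a single space (#x20), and
 * leading and trailing whitespace is removed.
 * Returns the new length of the string.
 */
size_t normalize_public_id(char *s)
{
    size_t in, out = 0;
    int pending_space = 0;

    for (in = 0; s[in] != '\0'; in++) {
        if (is_xml_space(s[in])) {
            /* remember the run, unless it is leading whitespace */
            if (out > 0)
                pending_space = 1;
        } else {
            if (pending_space) {
                s[out++] = ' ';
                pending_space = 0;
            }
            s[out++] = s[in];
        }
    }
    /* a pending run at the end is trailing whitespace: drop it */
    s[out] = '\0';
    return out;
}
```

The buffer is rewritten in place, which is safe because the normalized string is never longer than the input.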
229
230/*
231 *@@gloss: external_parsed_entities external parsed entities
232 * An external parsed entity is an external entity whose content is
233 * parsed by the XML processor, which is not the case for unparsed (NDATA) entities.
234 *
235 * See @external_entities.
236 */
237
238/*
239 *@@gloss: markup markup
240 * XML "markup" encodes a description of the @document's storage
241 * layout and logical structure.
242 *
243 * Markup is either @elements, @entity_references, @comments, @CDATA
244 * section delimiters, @DTD's, or @processing_instructions.
245 *
246 * XML "text" consists of markup and @content.
247 */
248
249/*
250 *@@gloss: whitespace whitespace
251 * In XML, "whitespace" consists of one or more space (0x20)
252 * characters, carriage returns, line feeds, or tabs.
253 *
254 * Whitespace handling in XML can vary. In @markup, whitespace
255 * separates the individual tokens, of course. However,
256 * in @content (i.e. non-markup), an application may
257 * or may not be interested in white space. Whitespace
258 * can therefore be treated differently for each
259 * element with the use of the special "xml:space" @attributes.
260 */
261
262/*
263 *@@gloss: character_reference character reference
264 * Character references escape Unicode characters. They are
265 * a special case of @entity_references.
266 *
267 * They may be used to refer to a specific character in the
268 * ISO/IEC 10646 character set, for example one not directly
269 * accessible from available input devices.
270 *
271 * If the character reference begins with "&amp;#x", the
272 * digits and letters up to the terminating ";" provide a
273 * hexadecimal representation of the character's code point in
274 * ISO/IEC 10646. If it begins just with "&amp;#", the
275 * digits up to the terminating ";" provide a decimal
276 * representation of the character's code point.
277 */
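The decimal and hexadecimal forms described above can be decoded with a few lines of C. This is a minimal sketch: parse_char_ref is a hypothetical name, and the upper bound 0x10FFFF is the Unicode code point limit rather than anything stated in this glossary.

```c
/*
 * Hypothetical helper: decodes a character reference such as
 * "&#8364;" (decimal) or "&#x20AC;" (hexadecimal).
 * Returns the code point, or -1 if the string does not start
 * with a well-formed character reference.
 */
long parse_char_ref(const char *s)
{
    int base = 10;
    int digits = 0;
    long cp = 0;

    if (s[0] != '&' || s[1] != '#')
        return -1;
    s += 2;

    if (*s == 'x') {            /* "&#x" introduces hex digits */
        base = 16;
        s++;
    }

    for (; *s != '\0' && *s != ';'; s++) {
        int d;

        if (*s >= '0' && *s <= '9')
            d = *s - '0';
        else if (base == 16 && *s >= 'a' && *s <= 'f')
            d = *s - 'a' + 10;
        else if (base == 16 && *s >= 'A' && *s <= 'F')
            d = *s - 'A' + 10;
        else
            return -1;          /* not a digit in this base */

        cp = cp * base + d;
        if (cp > 0x10FFFF)      /* beyond the Unicode range */
            return -1;
        digits++;
    }

    /* need at least one digit and the terminating ";" */
    if (digits == 0 || *s != ';')
        return -1;

    return cp;
}
```

The sketch only decodes the number. Whether the resulting code point is a character that XML actually allows is a separate check that is left out here.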
278
279/*
280 *@@gloss: content content
281 * XML "text" consists of @markup and "content" (the XML spec
282 * calls this "character data"). Content is simply everything
283 * that is not markup.
284 *
285 * To access characters that would either otherwise be recognized
286 * as @markup or are difficult to reach via the keyboard, XML
287 * allows for using a @character_reference.
288 *
289 * Within @elements, content is any string of
290 * characters which does not contain the start-delimiter of
291 * any markup. In a @CDATA section, content is any
292 * string of characters not including the CDATA-section-close
293 * delimiter, "]]>".
294 *
295 * The character @encodings may vary between @external_parsed_entities.
296 */
297
298/*
299 *@@gloss: names names
300 * In XML, a "name" is a token beginning with a letter or one of a
301 * few punctuation characters, and continuing with letters,
302 * digits, hyphens, underscores, colons, or full stops,
303 * together known as name characters. The colon has a
304 * special meaning with XML namespaces.
305 */
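As a rough illustration of the name rule, here is an ASCII-only check. It is deliberately simplified: the real XML grammar also admits many non-ASCII letters, which this sketch rejects, and it assumes that underscore and colon are the punctuation characters allowed at the start. The function name is_ascii_xml_name is hypothetical.

```c
#include <ctype.h>

/*
 * Hypothetical helper: returns 1 if s looks like an XML name,
 * restricted to ASCII and assuming the default "C" locale;
 * returns 0 otherwise. Non-ASCII name characters, which real
 * XML permits, are rejected by this simplified check.
 */
int is_ascii_xml_name(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    /* first character: a letter, underscore, or colon */
    if (!(isalpha(*p) || *p == '_' || *p == ':'))
        return 0;

    /* the rest: letters, digits, hyphens, underscores,
       colons, or full stops */
    for (p++; *p != '\0'; p++)
        if (!(isalnum(*p) || *p == '-' || *p == '_'
              || *p == ':' || *p == '.'))
            return 0;

    return 1;
}
```

With this check, "oldjoke", "open-hatch" and "xml:lang" pass, while "1abc" and "-x" fail because of their first character.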
306
307/*
308 *@@gloss: elements elements
309 * Elements are the most common form of XML @markup.
310 * They are identified by their @names.
311 *
312 * As opposed to HTML, there are two types of elements:
313 *
314 * A non-empty element starts and ends with a start-tag
315 * and an end-tag:
316 *
317 + <LI>...</LI>
318 *
319 * As opposed to HTML, an empty element must have an
320 * empty-element tag:
321 *
322 + <P /> <IMG align="left" src="http://www.w3.org/Icons/WWW/w3c_home" />
323 *
324 * In addition, @attributes contain extra parameters to elements.
325 * If the element has attributes, they must be in the start-tag
326 * (or empty-element tag).
327 *
328 * For non-empty elements, the text between the start-tag
329 * and end-tag is called the element's content and may
330 * contain other elements, character data, an entity
331 * reference, a @CDATA section, a processing instruction,
332 * or a comment.
333 *
334 * The XML specs break this into "content particles".
335 *
336 * An element has "mixed content" when it may contain
337 * @content, optionally interspersed with child
338 * elements. In this case, the types of the child
339 * elements may be constrained by a document's @DTD, but
340 * not their order or their number of occurrences.
341 */
342
343/*
344 *@@gloss: attributes attributes
345 * "Attributes" are name-value pairs that have been associated
346 * with @elements. Attributes can only appear in start-tags
347 * or empty-element tags.
348 *
349 * Attributes are identified by their @names. Each such
350 * identifier may only appear once per element.
351 *
352 * As opposed to HTML, attribute values must be quoted (either
353 * in single or double quotes). You may use a @character_reference
354 * to escape quotes in attribute values.
355 *
356 * Example of an attribute:
357 *
358 + <IMG SRC="mypic.gif" />
359 *
360 * SRC="mypic.gif" is the attribute here.
361 *
362 * There are a few <B>special attributes</B> defined by XML.
363 * In @valid documents, these attributes, like any other,
364 * must be declared if they are used. These attributes are
365 * recursive, i.e. they are considered to apply to all elements
366 * within the content of the element where they are specified,
367 * unless overridden in a sub-element.
368 *
369 * -- "xml:space" may be attached to an element to signal
370 * that @whitespace should be preserved for this element.
371 *
372 * The value "default" signals that applications' default
373 * whitespace processing modes are acceptable for this
374 * element; the value "preserve" indicates the intent that
375 * applications preserve all the white space.
376 *
377 * -- "xml:lang" may be inserted in documents to specify the
378 * language used in the contents and attribute values of
379 * any element in an XML document.
380 *
381 * The value is either a two-letter language code (e.g. "en")
382 * or a combination of language and country code. Interestingly,
383 * the English W3C XML spec gives the following examples:
384 *
385 + <p xml:lang="en">The quick brown fox jumps over the lazy dog.</p>
386 + <p xml:lang="en-GB">What colour is it?</p>
387 + <p xml:lang="en-US">What color is it?</p>
388 + <sp who="Faust" desc='leise' xml:lang="de">
389 + <l>Habe nun, ach! Philosophie,</l>
390 + <l>Juristerei, und Medizin</l>
391 + <l>und leider auch Theologie</l>
392 + <l>durchaus studiert mit heißem Bemüh'n.</l>
393 + </sp>
394 */
395
396/*
397 *@@gloss: comments comments
398 * Comments may appear anywhere in a document outside other
399 * markup; in addition, they may appear within the @DTD at
400 * places allowed by the grammar. They are not part of the
401 * document's @content; an XML processor may, but
402 * need not, make it possible for an application to retrieve
403 * the text of comments (@expat has a handler for this).
404 *
405 * Comments may contain any text except "--" (double-hyphen).
406 *
407 * Example of a comment:
408 *
409 + <!-- declarations for <head> & <body> -->
410 */
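When generating XML, the double-hyphen rule is easy to check before writing a comment. The following sketch uses a made-up function name, comment_text_is_legal. It also rejects text ending in a hyphen, since the XML 1.0 comment grammar does not allow a comment to end in "--->".

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if text may be placed between
 * the comment delimiters, 0 otherwise. The text must not
 * contain "--" and must not end with "-" (the latter follows
 * from the XML 1.0 comment grammar).
 */
int comment_text_is_legal(const char *text)
{
    size_t n = strlen(text);

    if (strstr(text, "--") != NULL)
        return 0;
    if (n > 0 && text[n - 1] == '-')
        return 0;

    return 1;
}
```

The text of the example comment above passes this check, because angle brackets and ampersands are not treated as markup inside a comment.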
411
412/*
413 *@@gloss: CDATA CDATA
414 * CDATA sections can appear anywhere where @content
415 * is allowed. They are used to escape blocks of
416 * text containing characters which would otherwise be
417 * recognized as @markup.
418 *
419 * CDATA sections begin with the string &lt;![CDATA[ and end
420 * with the string ]]&gt;. Within a CDATA section, only the
421 * ]]&gt; string is recognized as @markup, so that left angle
422 * brackets and ampersands may occur in their literal form.
423 * They need not (and cannot) be escaped using "&amp;lt;" and
424 * "&amp;amp;". (This implies that not even @comments are
425 * recognized).
426 *
427 * CDATA sections cannot nest.
428 *
429 * Examples:
430 *
431 + <![CDATA[<greeting>Hello, world!</greeting>]]>
432 +
433 + <![CDATA[
434 + *p = &q;
435 + b = (i <= 3);
436 + ]]>
437 */
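Because only the closing delimiter is recognized inside a CDATA section, extracting the content is a plain substring search. A minimal sketch follows; the function name cdata_content is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: if s starts with a CDATA section,
 * returns a pointer to its content and stores the content
 * length in *len. Returns NULL if s does not start with the
 * CDATA start string or if the close delimiter is missing.
 */
const char *cdata_content(const char *s, size_t *len)
{
    static const char open[] = "<![CDATA[";
    const size_t open_len = sizeof(open) - 1;
    const char *start, *end;

    if (strncmp(s, open, open_len) != 0)
        return NULL;

    start = s + open_len;

    /* the first "]]>" always ends the section, which is
       why CDATA sections cannot nest */
    end = strstr(start, "]]>");
    if (end == NULL)
        return NULL;

    *len = (size_t)(end - start);
    return start;
}
```

An inner start string is simply part of the content: the search stops at the first close delimiter it finds.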
438
439/*
440 *@@gloss: processing_instructions processing instructions
441 * "Processing instructions" (PIs) contain additional
442 * data for applications.
443 *
444 * Like @comments, they are not textually part of the XML
445 * document, but the XML processor is required to pass
446 * them to an application.
447 *
448 * PIs have the form:
449 *
450 + <?name pidata?>
451 *
452 *
453 * The "name", called the PI "target", identifies the PI to
454 * the application. Applications should process only the
455 * targets they recognize and ignore all other PIs. Any
456 * data that follows the PI target is optional, it is for
457 * the application that recognizes the target. The names
458 * used in PIs may be declared in a @notation_declaration in order to
459 * formally identify them.
460 *
461 * PI names beginning with "xml" are reserved.
462 */
463
464/*
465 *@@gloss: well-formed well-formed
466 * XML @documents (the sum of all @entities) are "well-formed"
467 * if the following conditions are met (among others):
468 *
469 * -- They contain one or more @elements.
470 *
471 * -- There is exactly one element, called the root, or document
472 * element, no part of which appears in the @content of any
473 * other element.
474 *
475 * -- For all other elements, if the start-tag is in the content
476 * of another element, the end-tag is in the content of the
477 * same element. More simply stated, the elements nest
478 * properly within each other. (This is unlike HTML.)
479 *
480 * -- Values of string @attributes cannot contain references to
481 * @external_entities.
482 *
483 * -- No attribute may appear more than once in the same element.
484 *
485 * -- All entities except the amp, lt, gt, apos, and quot must be
486 * declared before they are used. Binary @external_entities
487 * cannot be referenced in the flow of @content; they can only
488 * be used in an attribute declared as ENTITY or ENTITIES.
489 *
490 * -- Neither text nor @parameter_entities are allowed to be
491 * recursive, directly or indirectly.
492 */
493
494/*
495 *@@gloss: valid valid
496 * XML @documents are said to be "valid" if they have a @DTD
497 * associated and they conform to it. While XML documents
498 * must always be @well-formed, validation and validity are up
499 * to the implementation (i.e. at the option of the application).
500 *
501 * Validating processors must report violations of the constraints
502 * expressed by the declarations in the @DTD, and failures to
503 * fulfill the validity constraints given in this specification.
504 * To accomplish this, validating XML processors must read and
505 * process the entire DTD and all @external_parsed_entities
506 * referenced in the document.
507 *
508 * Non-validating processors (such as @expat) are required to
509 * check only the document entity (see @entities), including the
510 * entire internal DTD subset, for whether it is @well-formed.
511 *
512 * While they are not required to check the document for validity,
513 * they are required to process all the declarations they
514 * read in the internal DTD subset and in any parameter entity
515 * that they read, up to the first reference to a parameter
516 * entity that they do not read; that is to say, they must
517 * use the information in those declarations to normalize
518 * values of @attributes, include the replacement text of
519 * @internal_entities, and supply default attribute values.
520 * They must not process entity declarations or attribute-list
521 * declarations encountered after a reference to a
522 * parameter entity that is not read, since the entity may have
523 * contained overriding declarations.
524 */
525
526/*
527 *@@gloss: encodings encodings
528 * XML supports a wide variety of character encodings. These
529 * must be specified in the XML @text_declaration.
530 *
531 * There are too many character encodings on the planet to
532 * be listed here. The most common ones are:
533 *
534 * -- "UTF-8", "UTF-16", "ISO-10646-UCS-2", and "ISO-10646-UCS-4"
535 * should be used for the various encodings and transformations
536 * of Unicode / ISO/IEC 10646.
537 *
538 * -- "ISO-8859-x" (with "x" being a number from 1 to 9) represent
539 * the various ISO 8859 ("Latin") encodings.
540 *
541 * -- "ISO-2022-JP", "Shift_JIS", and "EUC-JP" should be used for
542 * the various encoded forms of JIS X-0208-1997.
543 *
544 * Example of a @text_declaration:
545 *
546 + <?xml version="1.0" encoding="ISO-8859-2"?>
547 *
548 * All XML processors must be able to read @entities in either
549 * UTF-8 or UTF-16. See XML_SetUnknownEncodingHandler for additional
550 * encodings directly supported by @expat.
551 *
552 * Entities encoded in UTF-16 must begin with a byte order mark (the
553 * ZERO WIDTH NO-BREAK SPACE character, #xFEFF). This is an encoding
554 * signature, not part of either the @markup or the @content of the XML @document.
555 * XML processors must be able to use this character to differentiate
556 * between UTF-8 and UTF-16 encoded documents.
557 */
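The encoding signature can be checked by looking at the first bytes of an entity. The sketch below is an assumption-laden illustration, not how expat does it: the type and function names are invented, the UTF-8 signature bytes (EF BB BF) are the UTF-8 encoding of #xFEFF, and UTF-32 signatures are ignored.

```c
#include <stddef.h>

/* Hypothetical result type for this example. */
typedef enum {
    SIG_NONE,       /* no signature found */
    SIG_UTF8,       /* EF BB BF           */
    SIG_UTF16_BE,   /* FE FF              */
    SIG_UTF16_LE    /* FF FE              */
} EncodingSignature;

/*
 * Hypothetical helper: inspects the first bytes of an entity
 * for the #xFEFF signature in its UTF-8, UTF-16 big-endian,
 * or UTF-16 little-endian byte form.
 */
EncodingSignature detect_signature(const unsigned char *buf,
                                   size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB
        && buf[2] == 0xBF)
        return SIG_UTF8;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return SIG_UTF16_BE;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return SIG_UTF16_LE;

    return SIG_NONE;
}
```

SIG_NONE does not mean the encoding is unknown. A processor would then fall back to the encoding named in the text declaration, or to UTF-8.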
558
559/*
560 *@@gloss: text_declaration text declaration
561 * XML @documents and @external_parsed_entities may (and
562 * should) start with the XML text declaration, exactly like
563 * this:
564 *
565 + <?xml version="1.0" encoding="enc"?>
566 *
567 * where "1.0" is the only currently defined XML version
568 * and "enc" must be the encoding of the document.
569 *
570 * External parsed entities may begin with a text declaration,
571 * which looks like an XML declaration with just an encoding
572 * declaration:
573 *
574 + <?xml encoding="Big5"?>
575 *
576 * See @encodings.
577 *
578 * Example:
579 *
580 + <?xml version="1.0" encoding="ISO-8859-1"?>
581 */
582
583/*
584 *@@gloss: documents documents
585 * XML documents are made up of storage units called @entities,
586 * which contain either parsed or unparsed data. Parsed data is
587 * made up of characters, some of which form @content,
588 * and some of which form @markup.
589 *
590 * XML documents should start with the XML @text_declaration.
591 *
592 * The function of the @markup in an XML document is to describe
593 * its storage and logical structure and to associate attribute-value
594 * pairs with its logical structures. XML provides a mechanism,
595 * the document type declaration (@DTD), to define constraints
596 * on the logical structure and to support the use of predefined
597 * storage units.
598 *
599 * A data object is an XML document if it is @well-formed.
600 * A well-formed XML document may in addition be @valid if it
601 * meets certain further constraints.
602 *
603 * A very simple XML document looks like this:
604 *
605 + <?xml version="1.0"?>
606 + <oldjoke>
607 + <burns>Say <quote>goodnight</quote>, Gracie.</burns>
608 + <allen><quote>Goodnight, Gracie.</quote></allen>
609 + <applause/>
610 + </oldjoke>
611 *
612 * This document is @well-formed, but not @valid (because it
613 * has no @DTD).
614 *
615 */
616
617/*
618 *@@gloss: element_declaration element declaration
619 * Element declarations identify the @names of elements and the
620 * nature of their content. They look like this:
621 +
622 + <!ELEMENT name contentspec>
623 +
624 * No element may be declared more than once.
625 *
626 * The "name" of the element is obvious. The "contentspec"
627 * is not. This specifies what may appear in the element
628 * and can be one of the following:
629 *
630 * -- "EMPTY" marks the element as being empty (i.e.
631 * having no content at all).
632 *
633 * -- "ANY" does not impose any restrictions.
634 *
635 * -- (mixed): a "list" which declares the element to have
636 * mixed content. See below.
637 *
638 * -- (children): a "list" which declares the element to
639 * have child elements only, but no content. See below.
640 *
641 * <B>(mixed): content with elements</B>
642 *
643 * With the (mixed) contentspec, an element may either contain
644 * @content only or @content with subelements.
645 *
646 * While the (children) contentspec allows you to define sequences
647 * and orders, this is not possible with (mixed).
648 *
649 * "contentspec" must then be a pair of parentheses, optionally
650 * followed by "*". Inside the parentheses, there must be at least the
651 * keyword "#PCDATA", optionally followed by "|" and element
652 * names. Note that if no #PCDATA appears, the (children) model
653 * is assumed (see below).
654 *
655 * Examples:
656 *
657 + <!ELEMENT name (#PCDATA)* >
658 + <!ELEMENT name (#PCDATA | subname1 | subname2)* >
659 + <!ELEMENT name (#PCDATA) >
660 *
661 * Note that if you specify sub-element names, you must terminate
662 * the contentspec with "*". Again, there's no way to specify
663 * orders etc. with (mixed).
664 *
665 * <B>(children): Element content only</B>
666 *
667 * With the (children) contentspec, an element may contain
668 * only other elements (and @whitespace), but no other @content.
669 *
670 * This can become fairly complicated. "contentspec" then must be
671 * a "list" followed by a "repeater".
672 *
673 * A "repeater" can be:
674 *
675 * -- Nothing: the preceding item _must_ appear exactly once.
676 *
677 * -- "+": the preceding item _must_ appear at _least_ once.
678 *
679 * -- "?": the preceding item _may_ appear exactly once.
680 *
681 * -- "*": the preceding item _may_ appear once or more than
682 * once or not at all.
683 *
684 * Here's the simplest example (assuming that "SUBELEMENT"
685 * is a valid "list" here):
686 *
687 + <!ELEMENT name (SUBELEMENT)* >
688 *
689 * In other words, in (children) mode, "contentspec" must always
690 * be in parentheses and is followed by a "repeater" (which can be
691 * nothing).
692 *
693 * About "lists"... since these declarations may nest, this is
694 * where the recursive definition of a "content particle" comes
695 * in:
696 *
697 * -- A "content particle" is either a sub-element name or
698 * a nested list, followed by a "repeater".
699 *
700 * -- A "list" is defined as an enumeration of content particles,
701 * enclosed in parentheses, where the content particles are
702 * separated by "connectors".
703 *
704 * There are two types of "connectors":
705 *
706 * -- Commas (",") indicate that the elements must appear
707 * in the specified order ("sequence").
708 *
709 * -- Vertical bars ("|") specify that the elements may
710 * occur alternatively ("choice").
711 *
712 * The connectors cannot be mixed; the list must be
713 * either completely "sequence" or "choice".
714 *
715 * Examples of content particles:
716 *
717 + SUBELEMENT+
718 + list*
719 *
720 * Examples of lists:
721 *
722 + ( cp | cp | cp | cp )
723 + ( cp , cp , cp , cp )
724 *
725 * Full examples for (children):
726 *
727 + <!ELEMENT oldjoke ( burns+, allen, applause? ) >
728 + | | +cp-+ | |
729 + | | | |
730 + | +------- list ---------+ |
731 + +-------contentspec--------+
732 *
733 * This specifies a "seqlist" for the "oldjoke" element. The
734 * list is not nested, so the content particles are element
735 * names only.
736 *
737 * Within "oldjoke", "burns" must appear first and can appear
738 * once or several times.
739 *
740 * Next must be "allen", exactly once (since there's no repeater).
741 *
742 * Optionally ("?"), there can be "applause" at the end.
743 *
744 * Now, a nested example:
745 *
746 + <!ELEMENT poem (title?, (stanza+ | couplet+ | line+) ) >
747 *
748 * That is, a poem consists of an optional title, followed by one or
749 * several stanzas, or one or several couplets, or one or several lines.
750 * This is different from:
751 *
752 + <!ELEMENT poem (title?, (stanza | couplet | line)+ ) >
753 *
754 * The latter allows for a single poem to contain a mixture of stanzas,
755 * couplets or lines.
756 *
757 * And for WarpIN:
758 *
759 + <!ELEMENT WARPIN (REXX*, VARPROMPT*, MSG?, TITLE?, (GROUP | PCK)+, PAGE+) >
760 *
761 */
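The four "repeater" rules from the (children) section above come down to a check on how often an item occurs. A small sketch, using the hypothetical function name repeater_allows and '\0' to stand for "no repeater":

```c
/*
 * Hypothetical helper: returns 1 if an item that occurs
 * count times satisfies the given repeater, 0 otherwise.
 * Pass '\0' when the item has no repeater.
 */
int repeater_allows(char repeater, unsigned count)
{
    switch (repeater) {
    case '\0': return count == 1;   /* exactly once       */
    case '+':  return count >= 1;   /* at least once      */
    case '?':  return count <= 1;   /* at most once       */
    case '*':  return 1;            /* any number, even 0 */
    default:   return 0;            /* not a repeater     */
    }
}
```

Applied to the "oldjoke" declaration above: two "burns" elements satisfy "+", a missing "applause" satisfies "?", but two "allen" elements violate the implicit "exactly once". This only counts occurrences. Checking the order imposed by the "," connector is a separate matter.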
762
763/*
764 *@@gloss: attribute_declaration attribute declaration
765 * Attribute declarations identify the @names of attributes
766 * of @elements and their possible values. They look like this:
767 *
768 + <!ATTLIST elementname
769 + attname atttype defaultvalue
770 + attname atttype defaultvalue
771 + ... >
772 *
773 * "elementname" is the element name for which the
774 * attributes are being defined.
775 *
776 * For each attribute, you must then specify three
777 * columns:
778 *
779 * -- "attname" is the attribute name.
780 *
781 * -- "atttype" is the attribute type (one of six values,
782 * see below).
783 *
784 * -- "defaultvalue" specifies the default value.
785 *
786 * The attribute type (specifying the value type) must be
787 * one of six:
788 *
789 * -- "CDATA" is any character data. (This has nothing to
790 * do with @CDATA sections.)
791 *
792 * -- "ID": the value must be a unique @name within the
793 * document. Only one such attribute is allowed per
794 * element.
795 *
796 * -- "IDREF" or "IDREFS": a reference to some other
797 * element which has an "ID" attribute with this value.
798 * "IDREFS" is the plural and may contain several of
799 * those separated by @whitespace.
800 *
801 * -- "ENTITY" or "ENTITIES": a reference to an
802 * external entity (see @external_entities).
803 * "ENTITIES" is the plural and may contain several of
804 * those separated by @whitespace.
805 *
806 * -- "NMTOKEN" or "NMTOKENS": a single-word string.
807 * This is not a reference though.
808 * "NMTOKENS" is the plural and may contain several of
809 * those separated by @whitespace.
810 *
811 * -- an enumeration: an explicit list of allowed
812 * values for this attribute. Additionally, you can specify
813 * that the names must match a particular @notation_declaration.
814 *
815 * The "defaultvalue" (third column) can be one of these:
816 *
817 * -- "#REQUIRED": the attribute may not be omitted.
818 *
819 * -- "#IMPLIED": the attribute is optional, and there's
820 * no default value.
821 *
822 * -- "'value'": the attribute is optional, and it has
823 * this default.
824 *
825 * -- "#FIXED 'value'": the attribute is optional, but if
826 * it appears, it must have this value.
827 *
828 * Example:
829 *
830 + <!ATTLIST oldjoke
831 + name ID #REQUIRED
832 + label CDATA #IMPLIED
833 + status ( funny | notfunny ) 'funny'>
834 */
835
836/*
837 *@@gloss: entity_declaration entity declaration
838 * Entity declarations define @entities.
839 *
840 * An example of @internal_entities:
841 *
842 + <!ENTITY ATI "ArborText, Inc.">
843 *
844 * Examples of @external_entities:
845 *
846 + <!ENTITY boilerplate SYSTEM "/standard/legalnotice.xml">
847 + <!ENTITY ATIlogo SYSTEM "/standard/logo.gif" NDATA GIF87A>
848 */
849
850/*
851 *@@gloss: notation_declaration notation declaration
852 * Notation declarations identify specific types of external
853 * binary data. This information is passed to the processing
854 * application, which may make whatever use of it it wishes.
855 *
856 * Example:
857 *
858 + <!NOTATION GIF87A SYSTEM "GIF">
859 */
860
861/*
862 *@@gloss: DTD DTD
863 * The XML document type declaration contains or points to
864 * markup declarations that provide a grammar for a class of @documents.
865 * This grammar is known as a Document Type Definition, or DTD.
866 *
867 * The document type declaration must look like the following:
868 *
869 + <!DOCTYPE name ... >
870 *
871 * "name" must match the document's root element.
872 *
873 * "..." can be the reference to an external subset (being a special
874 * case of @external_entities):
875 *
876 + <!DOCTYPE name SYSTEM "whatever.dtd">
877 *
878 * or an internal subset in brackets, which contains the markup
879 * directly:
880 *
881 + <!DOCTYPE name [
882 + <!ELEMENT greeting (#PCDATA)>
883 + ]>
884 *
885 * You can also combine both: the external reference comes first, then the bracketed internal subset.
886 *
887 * A markup declaration is either an @element_declaration, an
888 * @attribute_declaration, an @entity_declaration,
889 * or a @notation_declaration. These declarations may be contained
890 * in whole or in part within @parameter_entities.
891 */
892
893/*
894 *@@gloss: DOM DOM
895 * DOM is the "Document Object Model", as defined by the W3C.
896 *
897 * The DOM is a programming interface for @XML @documents.
898 * (XML is a metalanguage and describes the documents
899 * themselves. DOM is a programming interface -- an API --
900 * to access XML documents.)
901 *
902 * The W3C calls this "a platform- and language-neutral
903 * interface that allows programs and scripts to dynamically
904 * access and update the content, structure and style of
905 * documents. The Document Object Model provides
906 * a standard set of objects for representing HTML and XML
907 * documents, a standard model of how these objects can
908 * be combined, and a standard interface for accessing and
909 * manipulating them. Vendors can support the DOM as an
910 * interface to their proprietary data structures and APIs,
911 * and content authors can write to the standard DOM
912 * interfaces rather than product-specific APIs, thus
913 * increasing interoperability on the Web."
914 *
915 * In short, DOM specifies that an XML document is broken
916 * up into a tree of "nodes", representing the various parts
917 * of an XML document. Such nodes represent @documents,
918 * @elements, @attributes, @processing_instructions,
919 * @comments, @content, and more.
920 *
921 * See xml.c for an introduction to XML and DOM support in
922 * the XWorkplace helpers.
923 *
924 * Example: Take this HTML table definition:
925 +
926 + <TABLE>
927 + <TBODY>
928 + <TR>
929 + <TD>Column 1-1</TD>
930 + <TD>Column 1-2</TD>
931 + </TR>
932 + <TR>
933 + <TD>Column 2-1</TD>
934 + <TD>Column 2-2</TD>
935 + </TR>
936 + </TBODY>
937 + </TABLE>
938 *
939 * In the DOM, this would be represented by a tree as follows:
940 +
941 +                  ┌────────────┐
942 +                  │   TABLE    │ (only ELEMENT node in root DOCUMENT node)
943 +                  └─────┬──────┘
944 +                        │
945 +                  ┌─────┴──────┐
946 +                  │   TBODY    │ (only ELEMENT node in root "TABLE" node)
947 +                  └─────┬──────┘
948 +            ┌───────────┴───────────┐
949 +      ┌─────┴──────┐          ┌─────┴──────┐
950 +      │     TR     │          │     TR     │
951 +      └─────┬──────┘          └─────┬──────┘
952 +        ┌───┴──────┐            ┌───┴──────┐
953 +    ┌───┴─┐     ┌──┴──┐     ┌───┴─┐     ┌──┴──┐
954 +    │ TD  │     │ TD  │     │ TD  │     │ TD  │
955 +    └──┬──┘     └──┬──┘     └───┬─┘     └──┬──┘
956 + ╔═════╩════╗ ╔════╩═════╗ ╔════╩═════╗ ╔══╩═══════╗
957 + ║Column 1-1║ ║Column 1-2║ ║Column 2-1║ ║Column 2-2║ (one TEXT node in each parent node)
958 + ╚══════════╝ ╚══════════╝ ╚══════════╝ ╚══════════╝
959 */
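
/*
 * The tree for the TABLE example can also be built by hand to see
 * how the node kinds nest. The following is a minimal sketch with a
 * hypothetical Node structure; it is not the node layout used by
 * the XWorkplace helpers or prescribed by the W3C. As in the
 * diagram, whitespace between the tags is ignored, so each TD holds
 * exactly one TEXT node.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node kinds; the DOM defines more than these three. */
typedef enum { NODE_DOCUMENT, NODE_ELEMENT, NODE_TEXT } NodeType;

typedef struct Node {
    NodeType type;
    const char *data;           /* element name or text; not copied */
    struct Node *first_child;
    struct Node *last_child;
    struct Node *next_sibling;
} Node;

/* Allocates a node and appends it to parent (parent may be NULL).
   Aborts the program if memory is exhausted. */
Node *node_append(Node *parent, NodeType type, const char *data)
{
    Node *n = calloc(1, sizeof *n);
    if (n == NULL)
        abort();
    assert(data != NULL || type == NODE_DOCUMENT);
    n->type = type;
    n->data = data;
    if (parent != NULL) {
        if (parent->last_child != NULL)
            parent->last_child->next_sibling = n;
        else
            parent->first_child = n;
        parent->last_child = n;
    }
    return n;
}

/* Counts the nodes of the given type in the subtree rooted at n. */
size_t node_count(const Node *n, NodeType type)
{
    size_t total;
    const Node *c;
    if (n == NULL)
        return 0;
    total = (n->type == type) ? 1 : 0;
    for (c = n->first_child; c != NULL; c = c->next_sibling)
        total += node_count(c, type);
    return total;
}

/* Frees n and all of its descendants. */
void node_free(Node *n)
{
    Node *c, *next;
    if (n == NULL)
        return;
    for (c = n->first_child; c != NULL; c = next) {
        next = c->next_sibling;
        node_free(c);
    }
    free(n);
}

/* Builds DOCUMENT -> TABLE -> TBODY -> 2 x TR -> 2 x TD -> TEXT. */
Node *build_table_example(void)
{
    static const char *cells[2][2] = {
        { "Column 1-1", "Column 1-2" },
        { "Column 2-1", "Column 2-2" }
    };
    Node *doc = node_append(NULL, NODE_DOCUMENT, NULL);
    Node *table = node_append(doc, NODE_ELEMENT, "TABLE");
    Node *tbody = node_append(table, NODE_ELEMENT, "TBODY");
    int r, c;
    for (r = 0; r < 2; r++) {
        Node *tr = node_append(tbody, NODE_ELEMENT, "TR");
        for (c = 0; c < 2; c++) {
            Node *td = node_append(tr, NODE_ELEMENT, "TD");
            node_append(td, NODE_TEXT, cells[r][c]);
        }
    }
    return doc;
}
```

/*
 * Counting nodes in the result gives one DOCUMENT node, eight
 * ELEMENT nodes and four TEXT nodes, matching the diagram.
 */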
960
961/*
962 *@@gloss: DOM_DOCUMENT DOCUMENT
963 * Representation of XML @documents in the @DOM.
964 *
965 * The xwphelpers implementation differs from the DOM
966 * specification as follows:
967 *
968 * -- The "doctype" member points to the document's @DTD, or is NULL.
969 * In our implementation, this is the pvExtra pointer, which points
970 * to a _DOMDTD.
971 *
972 * -- The "implementation" member points to a DOMImplementation object.
973 * This is not supported here.
974 *
975 * -- The "documentElement" member is a convenience pointer to the
976 * document's root element. We don't supply this field; instead,
977 * the llChildren list only contains a single ELEMENT node for the
978 * root element.
979 *
980 * -- The "createElement" method is implemented by xmlCreateElementNode.
981 *
982 * -- The "createAttribute" method is implemented by xmlCreateAttributeNode.
983 *
984 * -- The "createTextNode" method is implemented by xmlCreateTextNode,
985 * which has an extra parameter though.
986 *
987 * -- The "createComment" method is implemented by xmlCreateCommentNode.
988 *
989 * -- The "createProcessingInstruction" method is implemented by
990 * xmlCreatePINode.
991 *
992 * -- The "createDocumentFragment", "createCDATASection", and
993 * "createEntityReference" methods are not supported.
994 */
995
996